Variational Partitioned Runge-Kutta methods for Lagrangians linear in velocities

In this paper we construct higher-order variational integrators for a class of degenerate systems described by Lagrangians that are linear in velocities. We analyze the geometry underlying such systems and develop the appropriate theory for variational integration. Our main observation is that the evolution takes place on the primary constraint and the 'Hamiltonian' equations of motion can be formulated as an index-1 differential-algebraic system. We also construct variational Runge-Kutta methods and analyze their properties. The general properties of Runge-Kutta methods depend on the 'velocity' part of the Lagrangian. If the 'velocity' part is also linear in the position coordinate, then we show that non-partitioned variational Runge-Kutta methods are equivalent to integration of the corresponding first-order Euler-Lagrange equations, which have the form of a Poisson system with a constant structure matrix, and the classical properties of the Runge-Kutta method are retained. If the 'velocity' part is nonlinear in the position coordinate, we observe a reduction of the order of convergence, which is typical of numerical integration of DAEs. We verify our results through numerical experiments for various dynamical systems.


Introduction
Geometric integrators are numerical methods that preserve geometric structures and properties of the flow of a differential equation. Structure-preserving integrators have attracted considerable interest due to their excellent numerical behavior, especially for long-time integration of equations possessing geometric properties (see [11], [20], [25]).
An important class of structure-preserving integrators are variational integrators (see [19]). This type of numerical schemes is based on discrete variational principles and provides a natural framework for the discretization of Lagrangian systems, including forced, dissipative, or constrained ones. Variational integrators were introduced in the context of finite-dimensional mechanical systems, but were later generalized to Lagrangian field theories (see [18]) and applied in many computations, for example in elasticity ( [15]), electrodynamics ( [26]), or fluid dynamics ( [22]).

Equations of motion
The Lagrangian (2.1) is degenerate, since the associated Legendre transform is not invertible. The local representation of the Legendre transform is FL(q µ ,q µ ) = q µ , ∂L ∂q µ = q µ , α µ (q) , (2.4) that is, where (q µ , p µ ) denote canonical coordinates on T * Q. The dynamics is defined by the action functional S[q(t)] = b a L q(t),q(t) dt (2.6) and Hamilton's principle, which seeks the curves q(t) such that the functional S[q(t)] is stationary under variations of q(t) with fixed endpoints, i.e., we seek q(t) such that for all δq(t) with δq(a) = δq(b) = 0, where q ǫ (t) is a smooth family of curves satisfying q 0 = q and d dǫ ǫ=0 q ǫ = δq. The resulting Euler-Lagrange equations M µν (q)q ν = ∂ µ H(q) (2.8) form a system of first-order ODEs, where we assume that the even-dimensional antisymmetric matrix M µν (q) = ∂ µ α ν (q) − ∂ ν α µ (q) is invertible for all q ∈ Q. Without loss of generality we can further assume that the coordinate mapping p µ = α µ (q) is invertible and its inverse is smooth: if the Jacobian ∂α µ ∂q ν is singular, we can redefine α µ (q) → α µ (q) + b µ (q µ ), where b µ (q µ ) are arbitrary functions; the Euler-Lagrange equations remain the same, and with the right choice of the functions b µ (q µ ), the redefined Jacobian can be made nonsingular. Using B = M −1 , Eq. (2.8) can be equivalently written as the Poisson systeṁ q µ = B µν (q) ∂ ν H(q). (2.9) The Euler-Lagrange equations (2.8) can also be formulated as the implicit 'Hamiltonian' system p µ = α µ (q), p µ = ∂ µ α ν (q)q ν − ∂ µ H(q). (2.10) Since the Lagrangian L is degenerate, (2.10) is an index-1 system of differential-algebraic equations (DAE), rather than a Hamiltonian ODE system: the Legendre transform is an algebraic equation and has to be differentiated once with respect to time in order to turn this system into (2.8).
This reflects the fact that the evolution of the considered degenerate system takes place on the primary constraint N = FL(T Q) ⊊ T * Q. It is easy to see that the primary constraint N is (locally) diffeomorphic to the configuration manifold Q, where the diffeomorphism η ∶ Q ∋ q → α q ∈ N is locally, in the coordinates on T * Q, given by η(q) = (q, α(q)), (2.11) where by a slight abuse of notation α(q) = (α 1 (q), . . . , α n (q)). This shows that q µ can also be used as local coordinates on N . Note that η is simply the restriction of α to N , i.e., η = α Q →N .

Symplectic forms
The spaces Q, T Q, T * Q and N can be equipped with several symplectic or pre-symplectic forms. It is instructive to investigate the relationships between them in order to later avoid confusion regarding the sense in which variational integrators for Lagrangians linear in velocities are symplectic. On the configuration space Q we can define the two-form Ω = −dα, (2.12) which in local coordinates can be expressed as The two-form Ω is symplectic if it is nondegenerate, i.e., if the matrix M µν is invertible for all q.
Using the Legendre transform (2.3) we can define the Lagrangian two-formΩ L on T Q byΩ L = FL * Ω , which in canonical coordinates (q µ ,q µ ) is given bỹ The Lagrangian formΩ L is only pre-symplectic, because it is degenerate. Noting that FL = α ○ π T Q , where π T Q ∶ T Q → Q is the tangent bundle projection, we can relate Ω andΩ L through the formulã The symplectic structure on N can be introduced in two ways: by pushing forward Ω from Q, or pulling backΩ from T * Q. Both ways are equivalent where i ∶ N → T * Q is the inclusion map. This follows from the calculation where we used α = i ○ η. If we use q µ as coordinates on N , then the local representation ofΩ N will be given by (2.13).

Symplectic flows
Let ϕ t ∶ Q → Q denote the flow of (2.8) or (2.9). This flow is symplectic on Q, that is This fact can be proven by considering the Hamiltonian or Poisson properties of Equation (2.8) or Equation (2.9) (see [11], [17]). It also follows directly from the action principle (2.7) (see [24]). Since the Lagrangian (2.1) is degenerate, the dynamics of the system is defined on Q rather than T Q. However, we can obtain the associated flow on T Q through lifting ϕ t by its tangent map T ϕ t ∶ T Q → T Q. This flow preserves the Lagrangian two-form This can be seen from the calculation where we used (2.20), (2.23), and the property π T Q ○ T ϕ t = ϕ t ○ π T Q . The flow ϕ t induces the flowφ t ∶ N → N in a natural way as This flow is symplectic on N , i.e.,φ * which can be established through the simple calculatioñ

Discrete Mechanics
For a Veselov-type discretization we consider the discrete state space Q × Q, which serves as a discrete approximation of the tangent bundle (see [19]). We define a discrete Lagrangian L d as a smooth map L d ∶ Q × Q → R and the corresponding discrete action The variational principle now seeks a sequence q 0 , q 1 , ..., q N that extremizes S for variations holding the endpoints q 0 and q N fixed. The Discrete Euler-Lagrange equations follow Assuming that these equations can be solved for q k+1 , i.e., L d is non-degenerate, they implicitly define the discrete Lagrangian map Let (q µ ,q µ ) denote local coordinates on Q × Q. We can define the discrete Legendre transforms which in local coordinates on Q × Q and T * Q are respectively given by where q = (q 1 , . . . , q n ) andq = (q 1 , . . . ,q n ). The Discrete Euler-Lagrange equations (3.2) can be equivalently written as Using either of the transforms, one can define the discrete Lagrange two-form on Q × Q by ω L d = (F ± L d ) * Ω , which in coordinates gives It then follows that the discrete flow F L d is symplectic, i.e., F * L d ω L d = ω L d . Using the Legendre transforms we can pass to the cotangent bundle and define the discrete This map is also symplectic, i.e.,F * L dΩ =Ω.

Exact discrete Lagrangian
To relate discrete and continuous mechanics it is necessary to introduce a timestep h ∈ R. If the continuous Lagrangian L is non-degenerate, it is possible to define a particular choice of discrete Lagrangian which gives an exact correspondence between discrete and continuous systems (see [19]), the so-called exact discrete Lagrangian where q E (t) is the solution to the continuous Euler-Lagrange equations associated with L such that it satisfies the boundary conditions q E (0) = q and q E (h) =q. Note, however, that in the case of a regular Lagrangian the associated Euler-Lagrange equations are second order, therefore boundary value problems are solvable, at least for sufficiently small h andq sufficiently close to q. In the case of the Lagrangian (2.1) the associated Euler-Lagrange equations (2.8) are first order in time, therefore we have the freedom to choose an initial condition either at t = 0 or t = h, but not both. An exact discrete Lagrangian analogous to (3.6) cannot thus be defined on whole Q × Q. We will therefore assume the following definition: where q E (t) is the solution to (2.8) that satisfies the initial condition q E (0) = q.
Note that in this definition we automatically have q E (h) =q.

Singular perturbation problem
As mentioned, the purpose of introducing an exact discrete Lagrangian is to establish an exact correspondence between the continuous and discrete systems. For a regular Lagrangian L and its exact discrete Lagrangian L E d , one can show that the exact discrete Hamiltonian mapF L E d is equal toφ h , whereφ t is the symplectic flow for the Hamiltonian system associated with L. The problem is that the exact discrete Lagrangian (3.7) is not defined on the whole space Q × Q, so the discrete Euler-Lagrange equations (3.2) do not make sense, and it is not entirely clear how to define the associated discrete Lagrangian map F L E d . One possible way to deal with this issue is to consider a singular perturbation problem. Assume that Q is a Riemannian manifold equipped with the nondegenerate scalar product ⟪., .⟫. Define the ǫ-regularized Lagrangian (3.8) or in coordinates where g µν denotes the local coordinates of the metric tensor. Without loss of generality assume that in the chosen coordinates g µµ = 1 and g µν = 0 if µ = ν. For ǫ > 0 this Lagrangian is nondegenerate and the Legendre transform FL ǫ ∶ T Q → T * Q is given by that is, The Euler-Lagrange equations are second order. The corresponding Hamiltonian equations (in implicit form) are p µ = ǫ g µνq ν + α µ (q), There is no reason to expect that the solutions of (3.12) or (3.13) unconditionally approximate the solutions of (2.8) or (2.10), respectively. The equations (3.13) form a system of first-order ordinary differential equations, and therefore it is possible to specify arbitrary initial conditions q(0) = q init and p(0) = p init , whereas initial conditions for (2.10) have to satisfy the algebraic constraint p init = α(q init ). Under certain restrictive analytic assumptions, for some singular perturbation problems it is possible to show that, in order to satisfy the initial conditions, the solutions initially develop a steep boundary layer, but then rapidly converge to the solution of the corresponding DAE system (see [13]). On the other hand, for other singular perturbation problems, when the initial conditions do not satisfy the algebraic constraint, it may happen that the solutions do not converge to the solution of the DAE, but instead rapidly oscillate (see [16], [23]). We expect the latter behavior for (3.13), as will be demonstrated by a simple example in Section 3.5. Since our main goal here is to show how the notion of a discrete Legendre transform can be introduced for the exact discrete Lagrangian (3.7), we will make two intuitive, although nontrivial, assumptions. We refer the interested reader to [13] and [16] for techniques that can be used to prove these statements rigorously.

Exact discrete Legendre transform
Since L ǫ is regular, L ǫ,E d is properly defined on the whole space Q × Q (or at least in a neighborhood of Γ(ϕ h )) and the associated exact discrete Legendre transforms satisfy the properties (see [19]) where q ǫ E (t) is the solution to the regularized Euler-Lagrange equations (3.12) satisfying the boundary conditions q ǫ E (0) = q and q ǫ E (h) =q, and we denotedq ǫ =q ǫ E (0),q ǫ =q ǫ E (h). In the spirit of (3.14), we can assume the following definitions of the exact discrete Legendre transforms where ǫq ǫ → 0 and ǫq ǫ → 0 by uniform convergence ofq ǫ . This is a close analogy to FL = α ○ π T Q (see Section 2). We also note the property where q E (t) is the solution of (2.8) satisfying the initial condition q E (0) = q. This further indicates that our definition of the exact discrete Legendre transforms is sensible.
so that both transforms are diffeomorphisms between Γ(ϕ h ) and N .
The discrete Euler-Lagrange equations for L E d can be obtained as the limit of the discrete Euler-Lagrange equations for L ǫ,E d , that is, one can substitute L ǫ,E d in (3.4) and take the limit ǫ → 0 + on both sides to obtain This equation implicitly defines the exact discrete Lagrangian map Using the discrete Legendre transforms F ± L E d we can define the corresponding exact discrete 'Hamil- shows that the discrete 'Hamiltonian' map associated with the exact discrete Lagrangian L E d is equal to the 'Hamiltonian' flowφ h for (2.10), i.e., the evolution of the discrete systems described by L E d coincides with the evolution of the continuous system described by L at times t k = kh, k = 0, 1, 2, . . .

Example
Let us illustrate these ideas with a very simple example for which analytic solutions are known. Let Q = R 2 and let (x, y) denote local coordinates on Q. The tangent bundle is T Q = R 2 × R 2 , and the induced local coordinates are (x, y,ẋ,ẏ). Consider the Lagrangian The corresponding Euler-Lagrange equations (2.8) are simplẏ Let us now consider the ǫ-regularized Lagrangian The corresponding Euler-Lagrange equations (3.12) take the form One can easily verify analytically that is the solution to (3.25) satisfying the boundary conditions ( 27) and this solution converges uniformly (in this simple example it is in fact equal) to the solution of (3.21) with the same initial condition. We can also find an analytic expression for the exact discrete Lagrangian (3.6) associated with (3.24) as Restricting the domain to Γ(ϕ h ) we get L ǫ,E d (x, y, x, y) = 0, and comparing to (3.23) we verify that (3.14) indeed holds. The discrete Legendre transforms (3.3) associated with L ǫ,E d take the form Restricting the domain to Γ(ϕ h ) and taking the limit ǫ → 0 + as in (3.16), we can define the exact discrete Legendre transforms associated with (3.23) Comparing with (3.22), we see that the property (3.17) is satisfied, which replicates the analogous property for regular Lagrangians.

Variational error analysis
For a given continuous system described by the Lagrangian L, a variational integrator is constructed by choosing a discrete Lagrangian L d which approximates the exact discrete Lagrangian L E d . We can define the order of accuracy of the discrete Lagrangian in a way similar to that for discrete Lagrangians resulting from regular continuous Lagrangians (see [19]).
for all solutions q(t) of the Euler-Lagrange equations (2.8) with initial conditions q(0) ∈ U and for all h ≤h.
We will always assume that the discrete Lagrangian L d is non-degenerate, so that the discrete Euler-Lagrange equations (3.2) can be solved for q k+1 . This defines the discrete Lagrangian map F L d ∶ Q × Q → Q × Q and the associated discrete Hamiltonian mapF L d ∶ T * Q → T * Q, as in Section 3.1. Of particular interest is the rate of convergence ofF L d toφ h . One usually considers a local error (error made after one step) and a global error (error made after many steps). We will assume the following definitions, which are appropriate for differential-algebraic systems (see [11], [12], [13], [19]).
for all (q, p) ∈ U and h ≤h.

33)
where h = T K, for all (q, p) ∈ U , h ≤h, and T ≤T .
If the Lagrangian L is regular, then one can show that a discrete Lagrangian L d is of order r if and only if the corresponding Hamiltonian mapF L d is of order r (see [19]). Also, the associated Hamiltonian equations are a set of ordinary differential equations, and under some smoothness assumptions one can show that ifF L d is of order r, then it is also convergent of order r (see [12]). However, in the case of the Lagrangian (2.1) it is not true in general-both the order of the discrete Lagrangian and the local order of the discrete Hamiltonian map may be different than the actual global order of convergence (see [13], [10]), as will be demonstrated in Section 4.
Example: Midpoint Rule. In a simple example we will demonstrate that the variational order of accuracy of a discretization method is unaffected by a degeneracy of a Lagrangian L. In order to calculate the order of a discrete Lagrangian L d , we can expand L d (q(0), q(h)) in a Taylor series in h and compare it to the analogous expansion for L E d . If the two expansions agree up to r terms, then L d is of order r. Expanding q(t) in a Taylor series about t = 0 and substituting it in (3.7), we get the expression (0), etc., and the Lagrangian L and its derivatives are computed at (q,q). For the Lagrangian (2.1) the values ofq,q, ... q are determined by differentiating (2.8) sufficiently many times and substituting the initial condition q(0). Note that in case of regular Lagrangians the value ofq is determined by the boundary conditions q(0), q(h), and the higherorder derivatives by differentiating the corresponding Euler-Lagrange equations, but apart from that the expression (3.34) remains qualitatively unaffected.
The midpoint rule is an integrator obtained by defining the discrete Lagrangian Calculating the expansion in h yields Comparing this to (3.34) shows that the discrete Lagrangian defined by the midpoint rule is second order regardless of the degeneracy of L. However, as mentioned before, if L is degenerate we cannot conclude about the global order of convergence of the corresponding discrete Hamiltonian map. The midpoint rule can be formulated as a Runge-Kutta method, namely the 1-stage Gauss method. We discuss Gauss and other Runge-Kutta methods and their convergence properties in more detail in Section 4. Note that low-order variational integrators for Lagrangians (2.1) based on the midpoint rule have been studied in [24] and [30] in the context of the dynamics of point vortices.

VPRK methods as PRK methods for the 'Hamiltonian' DAE
To construct higher-order variational integrators one may consider a class of partitioned Runge-Kutta (PRK) methods. Variational partitioned Runge-Kutta (VPRK) methods for regular Lagrangians are described in [11] and [19]. In this section we show how VPRK methods can be applied to systems described by Lagrangians such as (2.1). As in the case of regular Lagrangians, we will construct an s-stage variational partitioned Runge-Kutta integrator for the Lagrangian (2.1) by considering the discrete Lagrangian where the internal stages Q i ,Q i , i = 1, . . . , s, satisfy the relation and are chosen so that the right-hand side of (4.1) is extremized under the constraint A variational integrator is then obtained by forming the corresponding discrete Euler-Lagrange equations (3.2).
Proof. See Theorem VI.6.4 in [11] or Theorem 2.6.1 in [19]. The proof is essentially identical. The only qualitative difference is the fact that in our case the Lagrangian (2.1) is degenerate, so the corresponding Hamiltonian system is in fact the index-1 differential-algebraic system (2.10) rather than a typical system of ordinary differential equations.
Existence and uniqueness of the numerical solution. Given q and p, one can use Equations (4.4) to compute the new positionq and momentump. First, one needs to solve (4.4a)-(4.4d) for the internal stages Q i ,Q i , P i , andṖ i . This is a system of 4sn equations for 4sn variables, but one has to make sure these equations are independent, so that a unique solution exists. One may be tempted to calculate the Jacobian of this system for h = 0, and then use the Implicit Function Theorem. However, even if we start with consistent initial values (q 0 , p 0 ), the numerical solution (q k , p k ) for k > 0 will only approximately satisfy the algebraic constraint; so Q i = q and P i = p cannot be assumed to be the solution of (4.4a)-(4.4d) for h = 0, and consequently, the Implicit Function Theorem will not yield a useful result. Let us therefore regard q and p as h-dependent, as they result from the previous iterations of the method with the timestep h. If the method is convergent, it is reasonable to expect that p − α(q) is small and converges to zero as h is refined.
The following approach was inspired by Theorem 4.1 in [10]. be invertible with the inverse bounded in U s , i.e., there exists C > 0 such that where A = (a ij ) i,j=1,...,s ,Ā = (ā ij ) i,j=1,...,s , I n is the n × n identity matrix, and {Dα} denotes the block diagonal matrix Suppose also that (q, p) satisfy Then there existsh > 0 such that the nonlinear system (4.4a)-(4.4d) has a solution for h ≤h. The solution is locally unique and satisfies Proof. Substitute (4.4c) and (4.4d) in (4.4a) and (4.4b) to obtain for i = 1, . . . , s, where for notational convenience we left the Q i 's as arguments of α, Dα T and DH, but we keep in mind they are defined by (4.4c), so that (4.11) is a nonlinear system forQ i andṖ i . Let us consider the homotopy for i = 1, . . . , s. It is easy to see that for τ = 0 the system (4.12) has the solutionQ i = 0 andṖ i = 0, and for τ = 1 it is equivalent to (4.11). Let us treatQ i andṖ i as functions of τ , and differentiate (4.12) with respect to this parameter. The resulting ODE system can be written as with where D 2 denotes the Hessian matrix of the respective function, and summation over β is implied. The system (4.13) is further simplified if we substitute (4.13b) in (4.13a). This way we obtain an ODE system for the variablesQ of the form Since α is smooth, we have where Q i − Q j ≤ δ for δ assumed small, but independent of h. Moreover, since α and H are smooth, the term {B}, as a function ofQ, is bounded in a neighborhood of 0. Therefore, we can write (4.15) as (4.17) By (4.7), for sufficiently small h and δ, the matrix W (Q 1 , . . . , Q s ) + O(δ) + O(h) has a bounded inverse, provided that Q 1 , . . . , Q s remain in U . Therefore, the ODE (4.17) with the initial conditioṅ Q(0) = 0 has a unique solutionQ(τ ) on a non-empty interval [0,τ ), which can be extended until any of the corresponding Q i (τ ) leaves U . Let us argue that for a sufficiently small h we haveτ > 1. Given (4.7) and (4.9), the ODE (4.17) implies that Therefore, we haveQ and further for τ <τ . This implies that all Q i (τ ) remain in U for τ ≤ 1 if h is sufficiently small. Consequently, the ODE (4.15) has a solution on the interval [0, 1]. ThenQ i (1) and Q i (1) satisfy the estimates (4.10), and are a solution to the nonlinear system (4.4a)-(4.4d). The correspondingṖ i and P i can be computed using (4.4b) and (4.4d), and the remaining estimates (4.10) can be proved using the fact that α and H are smooth. This completes the proof of the existence of a numerical solution to (4.4a)-(4.4d). In order to prove local uniqueness, we substitute the second equation of (4.11) in the first one to obtain a nonlinear system forQ i , namely for i = 1, . . . , s. Subtract (4.21) from (4.22), and linearize aroundQ i . Based on the fact that ∆Q i = O(1), and using the notation introduced before, we get By a similar argument as before, for sufficiently small h the matrix {Dα}(A⊗I n )−(Ā⊗I n ){Dα T } has a bounded inverse, therefore (4.23) implies ∆Q = O(h ∆Q ), that is, for some constantC > 0. Note that for h < 1 C we have (1 −Ch) > 0, and therefore ∆Q = 0, which completes the proof of the local uniqueness of a numerical solution to (4.4a)-(4.4d). .

Remarks.
The condition (4.7) may be tedious to verify, especially if one uses a Runge-Kutta method with many stages. However, this condition is significantly simplified in the following special cases: 1. For a non-partitioned Runge-Kutta method we have A =Ā, and the condition (4.7) is satisfied if A is invertible, and the mass matrix M (q) = Dα T (q) − Dα(q), as defined in Section 2.1, is invertible in U and its inverse is bounded.
2. If Dα is antisymmetric, then the condition (4.7) is satisfied if (A +Ā) is invertible, and the matrix Dα(q) is invertible in U and its inverse is bounded.

Linear α µ (q)
An interesting special case is obtained if we have, in some local chart on Q, α µ (q) = − 1 2 Λ µν q ν for some constant matrix Λ. Without loss of generality assume that Λ is invertible and antisymmetric. The Lagrangian (2.2) then takes the form Since Λ is antisymmetric and invertible, then by Theorem 4.2 the scheme (4.28) yields a unique numerical solution to (4.27) if the Runge-Kutta matrix A = (a ij ) is invertible. Since A is invertible, this implieṡ Substituting this in (4.28b) yields Together with (4.28c) and (4.28e), this gives a Runge-Kutta method for (4.26). Moreover, substituting (4.30) and p = − 1 2 Λq in (4.28f), and using (4.28e), one has that is, (q,p) satisfy the algebraic constraint. If the coefficients of the method (4.28) satisfy the condition (4.5), then (4.28) is a variational integrator and the associated discrete Hamiltonian mapF L d is symplectic on T * Q, as explained in Section 3.1. Given Corollary 4.4, we further have: Corollary 4.5. If the coefficients a ij and b i in (4.28) satisfy the condition (4.5), then the discrete Hamiltonian mapF L d associated with (4.1) is symplectic on the primary constraint N , that is, Convergence. Various Runge-Kutta methods and their classical orders of convergence, that is, orders of convergence when applied to (non-stiff) ordinary differential equations, are discussed in many textbooks on numerical analysis, for instance [12] and [13]. When applied to differentialalgebraic equations, the order of convergence of a Runge-Kutta method may be reduced (see [1], [13], [23]). However, in the case of (4.27) Theorem 4.3 implies that the classical order of convergence of non-partitioned Runge-Kutta methods (4.28) is retained.
Theorem 4.6. A Runge-Kutta method with the coefficients a ij and b i applied to the DAE system (4.27) retains its classical order of convergence.
Proof. Let r be the classical order of the considered Runge-Kutta method, (q, p) ∈ N an initial condition, (q E (t), p E (t)) the exact solution to (4.27) such that (q E (0), p E (0)) = (q, p), and (q k , p k ) the numerical solution obtained by applying the method (4.28) iteratively k times with (q 0 , p 0 ) = (q, p). Theorem 4.3 states that the method (4.28) is equivalent to applying the same Runge-Kutta method to the ODE system (4.26). Hence, we obtain convergence of order r in the q variable, that is, for a fixed time T > 0 and an integer K such that h = T K, we have the estimate for some constant C > 0 (cf. Definition 3.6). By Corollary 4.4 we know that p K = − 1 2 Λq K , so we have the estimate which completes the proof, since Λ < +∞.
Of particular interest to us are Runge-Kutta methods that satisfy the condition (4.5), for instance symplectic diagonally-implicit Runge-Kutta methods (DIRK) or Gauss collocation methods (see [11]). The s-stage Gauss method is of classical order 2s, therefore we have: Corollary 4.7. The s-stage Gauss collocation method applied to the DAE system (4.27) is convergent of order 2s.
As mentioned in Section 3.6, the midpoint rule is a 1-stage Gauss method, therefore it retains its classical second order of convergence.
Backward error analysis. The system (4.26) can be rewritten as the Poisson systeṁ with the structure matrix Λ −1 (see [17], [11]). The flow ϕ t for this equation is a Poisson map, that is, it satisfies the property which is in fact equivalent to the symplecticity property (2.23) or (2.27) written in local coordinates on Q or N , respectively. Let F h ∶ Q → Q represent the numerical flow defined by some numerical algorithm applied to (4.35). We say this flow is a Poisson integrator if The left-hand side of (4.36) can be regarded as a quadratic invariant of (4.35). By Theorem 4.3 the method (4.28) is equivalent to applying the same Runge-Kutta method to (4.35). If its coefficients also satisfy the condition (4.5), then it can be shown that the method preserves quadratic invariants (see Theorem IV.2.2 in [11]). Therefore, we have: is invertible, the coefficients a ij and b i satisfy the condition (4.5), and p = − 1 2 Λq, then the method (4.28) is a Poisson integrator for (4.35).
The true power of symplectic integrators for Hamiltonian equations is revealed through their backward error analysis: a symplectic integrator for a Hamiltonian system with the Hamiltonian H(q, p) defines the exact flow for a nearby Hamiltonian system, whose Hamiltonian can be expressed as the asymptotic seriesH (q, p) = H(q, p) + hH 2 (q, p) + h 2 H 3 (q, p) + . . . (4.38) Owing to this fact, under some additional assumptions, symplectic numerical schemes nearly conserve the original Hamiltonian H(q, p) over exponentially long time intervals (see [11] for details). A similar result holds for Poisson integrators for Poisson systems: a Poisson integrator defines the exact flow for a nearby Poisson system, whose structure matrix is the same and whose Hamiltonian has the asymptotic expansion (4.38) (see Theorem IX.3.6 in [11]). Therefore, we expect the non-partitioned Runge-Kutta schemes (4.28) satisfying the condition (4.5) to demonstrate good preservation of the original Hamiltonian H. See Section 5 for numerical examples. Partitioned Runge-Kutta methods do not seem to have special properties when applied to systems with linear α µ (q), therefore we describe them in the general case in Section 4.3.

Nonlinear α µ (q)
When the coordinates α µ (q) are nonlinear functions of q, then the Runge-Kutta methods discussed in Section 4.2 lose some of their properties: a theorem similar to Theorem 4.3 cannot be proved, most of the Runge-Kutta methods (whether non-partitioned or partitioned) do not preserve the algebraic constraint p = α(q), i.e., the numerical solution does not stay on the primary constraint N , and therefore their order of convergence is reduced, unless they are stiffly accurate.

Runge-Kutta methods
Let us again consider non-partitioned methods with a ij =ā ij . Convergence results for some classical Runge-Kutta schemes of interest can be obtained by transforming (2.10) into a semi-explicit index-2 DAE system. Let us briefly review this approach. More details can be found in [10] and [13].
The system (2.10) can be written as the quasi-linear DAE where y = (q, p) and where I n denotes the n × n identity matrix. Let us introduce a slack variable z and rewrite (4.39) as the index-2 DAE systemẏ = z, (4.41a) This system is of index 2, because it has 4n dependent variables, but only 2n differential equations (4.41a), and some components of the algebraic equations (4.41b) have to be differentiated twice with respect to time in order to derive the missing differential equations for z. Note that C(y) is a singular matrix of constant rank n, therefore it can be decomposed (using Gauss elimination or the singular value decomposition) as C(y) = S(y) I n 0 0 0 T (y) (4.42) for some non-singular matrices S(y) and T (y). Since α(q) is assumed to be smooth, one can choose S and T so that they are also smooth (at least in a neighborhood of y). Premultiplying both sides of (4.41b) by S −1 (y) turns the DAE (4.41) intȯ where we introduced the block structure y = (y 1 , y 2 ), z = (z 1 , z 2 ), and Since T (y) is invertible, we can assume without loss of generality that the block T 11 (y) is invertible, too (one can always permute the columns of T (y) otherwise). Let us compute z 1 from (4.43c) and substitute it in (4.43a). The resulting system, has the form of a semi-explicit index-2 DAEẏ provided that has a bounded inverse. It is an elementary exercise to show that the partitioned Runge-Kutta method (4.4) is invariant under the presented transformation, that is, it defines a numerically equivalent partitioned Runge-Kutta method for (4.45). Runge-Kutta methods for semi-explicit index-2 DAEs have been studied and some convergence results are available. Convergence estimates for the y component of (4.45) can be readily applied to the solution of (4.39).
As in Section 4.2, of particular interest to us are variational Runge-Kutta methods, i.e., methods satisfying the condition (4.5), for example Gauss collocation methods (see [11], [12]). However, in the case when α(q) is a nonlinear function, the solution generated by the Gauss methods does not stay on the primary constraint N and this affects their rate of convergence, as will be shown below. For comparison, we will also consider the Radau IIA methods (see [13]), which, although not variational/symplectic, are stiffly accurate, that is, their coefficients satisfy a sj = b j for j = 1, . . . , s, so the numerical value of the solution at the new time step is equal to the value of the last internal stage, and therefore the numerical solution stays on the submanifold N . We cite the following convergence rates for the y component of (4.46) after [13] and [10]: • s-stage Gauss method-convergent of order s + 1 for s odd s for s even , • s-stage Radau IIA method-convergent of order 2s − 1.
With the exception of the midpoint rule (s = 1), we see that the order of convergence of the Gauss methods is reduced. On the other hand, the Radau IIA methods retain their classical order 2s − 1.
Symplecticity. Since the Gauss methods satisfy the condition (4.5), they generate a flow which preserves the canonical symplectic formΩ on T * Q, as explained in Section 3.1. However, since the primary constraint N is not invariant under this flow, a result analogous to Corollary 4.5 does not hold, i.e., the flow is not symplectic on N .

Partitioned Runge-Kutta methods
In Section 5 we present numerical results for the Lobatto IIIA-IIIB methods (see [11]). Their numerical performance appears rather unattractive, therefore our theoretical results regarding partitioned Runge-Kutta methods are less complete. Below we summarize the experimental orders of convergence of the Lobatto IIIA-IIIB schemes that we observed in our numerical computations (see Figure 5.2, Figure 5.6, and Figure 5.10): • 2-stage Lobatto IIIA-IIIB-inconsistent, • 3-stage Lobatto IIIA-IIIB-convergent of order 2, • 4-stage Lobatto IIIA-IIIB-convergent of order 2.
Comments regarding the symplecticity of these schemes are the same as for the Gauss methods mentioned above in Section 4.3.1.

Numerical experiments
In this section we present the results of the numerical experiments we performed to test the methods discussed in Section 4. We consider Kepler's problem, the dynamics of planar point vortices, and the Lotka-Volterra model, and we show how each of these models can be formulated as a Lagrangian system linear in velocities.

Kepler's problem
A particle or a planet moving in a central potential in two dimensions can be described by the Hamiltonian where (x, y) denotes the position of the planet and (p x , p y ) its momentum; H 0 is an arbitrary constant. The corresponding Lagrangian can be obtained in the usual way as If one performs the standard Legendre transformẋ = ∂H ∂p x ,ẏ = ∂H ∂p y , then L = L(x, y,ẋ,ẏ) will take the usual nondegenerate form, quadratic in velocities. However, one can also introduce the variable q = (x, y, p x , p y ) and view L = L(q,q) as (2.2), that is, a Lagrangian linear in velocities (see [6]). Comparing (5.2) and (4.25) we see that the corresponding Λ is singular. Without loss of generality we replace Λ with its antisymmetric part (Λ − Λ T ) 2, which is invertible, and consider the Lagrangian As a test problem we considered an elliptic orbit with eccentricity e = 0.5 and semi-major axis a = 1. We took the initial condition at the pericenter, i.e., q 1 init = (1 − e)a = 0.5, q 2 init = 0, q 3 init = 0, q 4 init = a (1 + e) (1 − e) ≈ 1.73. This is a periodic orbit with period T period = 2π. A reference solution was computed by integrating (4.26) until the time T = 7 using Verner's method (a 6-th order explicit Runge-Kutta method; see [12]) with the small time step h = 2 × 10 −7 . The reference solution is depicted in Figure 5.1.
We solved the same problem using several of the methods discussed in Section 4 for a number of time steps ranging from h = 3.5 × 10 −3 to h = 3.5 × 10 −1 . The value of the solutions at T = 7 was then compared against the reference solution. The max norm errors are depicted in Figure 5.2. We see that the rates of convergence of the Gauss and the 3-stage Radau IIA methods are consistent with Theorem 4.6 and Corollary 4.7. For the Lobatto IIIA-IIIB methods we observe a reduction of order. The 2-stage Lobatto IIIA-IIIB method turns out to be inconsistent and is not depicted in Figure 5 down on the center of gravity, and the computations cannot be continued too far in time. The Hamiltonian shows major variations whose amplitude grows in time. The non-variational Radau IIA scheme yields an accurate solution, but it demonstrates a gradual energy dissipation.

Point vortices
Point vortices in the plane are another interesting example of a system with linear α µ (q) (see [21], [24], [30]). A system of K interacting point vortices in two dimensions can be described by the Lagrangian with the Hamiltonian where (x i , y i ) denotes the location of the i-th vortex, Γ i is its circulation, and H 0 is an arbitrary constant.
As a test problem we considered the system of K = 2 vortices with circulations Γ 1 = 4 and Γ 2 = 2, respectively, and distance D = 1 between them. The vortices rotate on concentric circles about their center of vorticity at x C = 0 and y C = 0. We took the initial condition at x The analytic solution can be found (see [21]) as where ω = (Γ 1 + Γ 2 ) (2πD 2 ). This is a periodic solution with period T period ≈ 6.58. See Figure 5.5.
We performed similar convergence tests as in Section 5.1. The value of the numerical solutions at time T=7 were compared against the exact solution (5.6). The max norm errors are depicted in Figure 5.6. The results are qualitatively the same as for Kepler's problem.
We set H 0 = 0 in (5.5), so that H = 0 for the considered solution.

Lotka-Volterra model
The dynamics of the growth of two interacting species can be modeled by the Lotka-Volterra equationsu where u(t) denotes the number of predators and v(t) the number of prey, and the constants 1 and 2 were chosen arbitrarily. These equations can be rewritten as the Poisson system where the Hamiltonian is given by  with an arbitraty constant H 0 (see [11]). Using an approach similar to the one presented in Section 5.1, one can easily verify that the Lagrangian L(q,q) = log q 2 q 1 + q 2 q 1 + q 1q2 − H(q) (5.10) reproduces the same equations of motion, where q = (u, v). The coordinates α µ (q) (cf. Equation (2.2)) were chosen, so that the assumptions of Theorem 4.2 are satisfied for the considered Runge-Kutta methods.
As a test problem we considered the solution with the initial condition q 1 init = 1 and q 2 init = 1 (note that q = (1, 2) is an equilibrium point). This is a periodic solution with period T period ≈ 4.66. A reference solution was computed by integrating (4.26) until the time T = 5 using Verner's method with the small time step h = 10 −7 . The reference solution is depicted in Figure 5.9.
Convergence plots are shown in Figure 5.10. The convergence rates for the Gauss and Radau IIA methods are consistent with the theoretical results presented in Section 4.3.1-we see that the orders of the 2-and 3-stage Gauss schemes are reduced. The 2-stage Lobatto IIIA-IIIB scheme again proves to be inconsistent, and the 3-and 4-stage schemes converge quadratically, just as in Section 5.1 and Section 5.2.
We performed another series of numerical experiments with the time step h = 0.1 to investigate the long time behavior of the considered integrators. The results are shown in Figure 5.11 and Figure 5.12. We set H 0 = 2 in (5.9), so that H = 0 for the considered solution. The 1-and 3-stage

Summary
We analyzed a class of degenerate systems described by Lagrangians that are linear in velocities, and presented a way to construct appropriate higher-order variational integrators. We pointed out how the theory underlying variational integration is different from the non-degenerate case and we made a connection with numerical integration of differential-algebraic equations. We also performed numerical experiments for several example models.
Our work can be extended in several ways. In Section 5.3 we presented our numerical results for the Lotka-Volterra model, which is an example of a system for which the coordinate functions α µ (q) are nonlinear. The 1-and 3-stage Gauss methods performed exceptionally well and preserved the Hamiltonian over a very long integration time. It would be interesting to perform a backward error (or similar) analysis to check if this behavior is generic. If confirmed, our variational approach could provide a new way to construct geometric integrators for a broader class of Poisson systems.
It would also be interesting to further consider constrained systems with Lagrangians that are linear in velocities and construct associated higher-order variational integrators. This would allow to generalize the space-adaptive methods presented in [28], [29] to degenerate field theories, such as