On the significance of the stress-energy tensor in Finsler spacetimes

We revisit the physical arguments which lead to the definition of the stress-energy tensor $T$ in the Lorentz-Finsler setting $(M,L)$, starting from classical Relativity. Both the standard heuristic approach using fluids and the Lagrangian one are taken into account. In particular, we argue that the Finslerian breaking of Lorentz symmetry makes $T$ an anisotropic 2-tensor (i.e., a tensor for each $L$-timelike direction), in contrast with the energy-momentum vectors defined on $M$. Such a tensor is compared with different ones obtained by using a Lagrangian approach. The notion of divergence is revised from a geometric viewpoint and, then, the conservation laws of $T$ for each observer field are revisited. We introduce a natural {\em anisotropic Lie bracket derivation}, which leads to a divergence obtained from the volume element and the non-linear connection associated with $L$ alone. The computation of this divergence selects the Chern anisotropic connection, thus giving a geometric interpretation to previous choices in the literature.


Introduction
This article has a double aim in Lorentz-Finsler Geometry. The first one is to revisit the physical grounds of the stress-energy tensor $T$ (§3). The possible extensions of the relativistic $T$ are discussed from the viewpoint of both fluid mechanics and Lagrangian systems. The second one is to revise geometrically the notion of divergence (§4), yielding consequences about the conservation of $T$ (§5). With this aim, we introduce new notions of Lie bracket and derivative associated with a nonlinear connection and applicable to anisotropic tensor fields, which appear naturally in Finsler Geometry.
Finslerian modifications of General Relativity aim to find a tensor $T$ collecting the possible anisotropies in the distribution of energy, momentum and stress, which will serve as a source for the (now Lorentz-Finsler) geometry of the spacetime [14,15,24,28,40]. Some of these proposals may be waiting for experimental evidence, which postpones the discussion of how the basic relativistic notions would be affected. However, such a discussion is relevant to understand the scope and implications of the introduced Finslerian elements. In a previous reference [1], the fundamentals of observers in the Finslerian setting were extensively studied, including their compatibility with the Ehlers-Pirani-Schild approach. Now we focus on the stress-energy tensor $T$.
The difficulty of studying such a $T$ is apparent. Recall that, using the principle of equivalence, General Relativity is reduced infinitesimally into the Special one, which provides a background for interpretations. However, in the Lorentz-Finsler case, the infinitesimal model is changed into a Lorentz norm (instead of a scalar product), implying a breaking of Lorentz invariance. This is a substantial issue in its own right which has been studied in the context of Very Special Relativity and others [3,5,10,8,23]. As an additional difficulty, the infinitesimal model changes with the point (footnote 1). Two noticeable prerequisites are the following: (a) only the value of the Lorentz-Finsler metric on causal directions is relevant [1,19] (this is briefly commented in the setup §2.3), and (b) there is a big variety of possible extensions of the relativistic kinematic objects to the Finsler case, at least from the geometric viewpoint (see the appendix §7). Taking into account these issues, the extension of the notion of stress-energy tensor to the Finslerian setting is discussed in §3.
Footnote 1. Berwald spaces [7,9] are an exception, as the parallel transport becomes an isometry between the Lorentz norms. Thus, in some sense, these spaces would admit a principle of equivalence with respect to a Lorentz normed space (not necessarily the Lorentz-Minkowski spacetime).
We start with the fluids approach. As a preliminary question, energy-momentum is discussed in §3.1. We emphasize that, even though this is well-defined as a tangent vector in each tangent space $T_pM$, $p \in M$, different observers $u, u'$ at $p$ will use coordinates related by non-trivial linear transformations. Indeed, the latter will depend on both $L$ and the chosen way to measure relative velocities. Moreover, when the stress-energy $T$ is considered (§3.2), the arguments in Classical Mechanics and Relativity which support its status as a tensor hold only partially in the Lorentz-Finsler setting. Indeed, $T$ acquires a nonlinear nature which is codified in an (observer-dependent) anisotropic tensor, rather than in a tensor on $M$.
The Lagrangian approach is discussed in §3.3. This approach has been developed recently by Hohmann, Pfeifer and Voicu [13,16], who introduced an energy-momentum scalar function. Here, we discuss the analogies and differences of this function with the canonical relativistic stress-energy tensor $\delta S_{\mathrm{matter}}/\delta g^{\mu\nu}$ and the 2-tensor $T$ obtained from the fluids approach above. Relevant issues are the existence of different ways to obtain a 2-tensor starting from a scalar function, the recovery of this function from a matter Lagrangian and the possibility to consider the Palatini Lagrangian as the background one (rather than the Einstein-Hilbert type Lagrangians used by the cited authors; recall that Palatini's becomes especially meaningful in the Finslerian case [22]). The important case of kinetic gases is considered explicitly (Ex. 3.2).
Once the definition of $T$ has been discussed, we focus on its conservation (§5), revisiting first the divergence theorem (§4). This is crucial in the Finslerian setting because, as discussed before, the Lagrangian approach above does not guarantee a conservation law such as the relativistic $\mathrm{div}(G) = 0$.
§4 analyzes the divergence from a purely mathematical viewpoint. Now, $L$ is regarded as pseudo-Finsler (the results will be useful not only in any indefinite signature but also in the classical positive definite case) and $T$ will not be assumed to be symmetric a priori. Classically, the divergence of a vector field $Z$ is defined with the derivation associated with the Lie bracket $[Z, X] = \mathcal{L}_Z X$, applied to the volume element. In the Finslerian case, however, the Lie derivative and bracket do not make sense for arbitrary anisotropic vector fields. This difficulty was circumvented by Rund [36], who redefined $\mathrm{div}(Z)$ in such a way that a type of divergence theorem held. However, the Lie viewpoint is restored here.
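For reference, the classical (isotropic) construction recalled above can be written out explicitly. This is only a standard sketch for a pseudo-Riemannian metric $g$ on an oriented manifold, in a positively oriented chart; the symbols $d\mathrm{Vol}_g$ and $|\det g|$ are notation assumed here for illustration.

```latex
% Classical divergence of a vector field Z via the Lie derivative of the volume form
\mathcal{L}_Z\,(d\mathrm{Vol}_g) \;=\; \operatorname{div}(Z)\, d\mathrm{Vol}_g ,
\qquad
d\mathrm{Vol}_g \;=\; \sqrt{|\det g|}\; dx^1 \wedge \dots \wedge dx^n .
% By Cartan's formula, L_Z = d i_Z + i_Z d, and d(dVol_g) = 0 (top-degree form):
\mathcal{L}_Z\,(d\mathrm{Vol}_g) \;=\; d\big(i_Z\, d\mathrm{Vol}_g\big)
\;=\; \partial_i\big(\sqrt{|\det g|}\, Z^i\big)\; dx^1 \wedge \dots \wedge dx^n ,
% hence, in coordinates,
\operatorname{div}(Z) \;=\; \frac{1}{\sqrt{|\det g|}}\; \partial_i\big(\sqrt{|\det g|}\, Z^i\big) .
```

When $Z$ depends also on the direction, the flow of $Z$ on $M$ used in this computation is no longer available, which is the obstruction mentioned above.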
§4.1. Once a nonlinear connection $HA$ (seen as a horizontal distribution on $A$) is prescribed, we can define a Lie bracket $l^H_Z X$ and, then, a Lie derivative $\mathcal{L}^H_Z X$ (Defs. 4.1 and 4.5; Th. 4.4 (C)). Noticeably, the former $l^H_Z$ is expressible in terms of the infinitesimal flow of $Z$ (Prop. 4.7).
§4.2. The divergence of $Z$ is naturally defined by using this Lie bracket (Def. 4.9). For the computation of $\mathrm{div}(Z)$, however, one can use an anisotropic connection $\nabla$ (this can be seen as a Finsler connection dropping its vertical part, see §2) and a priori Chern's one is not especially privileged (Prop. 4.11).
§4.3. We give a general Finslerian version of the divergence theorem for any anisotropic vector field $Z$, emphasizing the role of the choice of an (admissible) vector field $V : M \to A$, which in the Lorentzian case can be interpreted as an observer field; this is expressed in terms of integration of forms in the spirit of Cartan's formula (Th. 4.13, Rem. 4.14). We also explain how the boundary term can be expressed in different ways by using a normal either with respect to the pseudo-Riemannian metric $g_V$ or to the fundamental tensor, which were the choices of Rund [36] and Minguzzi [30], resp.
§5 gives some applications to conservation laws. §5.1. First, we discuss the definition of divergence for the case of T . Our definition for vector fields was not biased to the Chern anisotropic connection, but this will be used for div(T ) (Def. 5.3). The reason is that div(T ) should behave under contraction in a similar way as in the isotropic case (namely, as in formula (11)), which privileges Chern's connection (Prop. 5.1).

§5.2. As an interlude about the appearance of Chern's $\nabla$, we compare it with the possible use of Berwald's connection and with previous approaches in the literature.
§5.3. A conservation law for the flow of $T_V(X_V)$ is obtained (Cor. 5.9), stressing three hypotheses on the vanishing for $V$ of elements related to the stress-energy $T$ ($\mathrm{div}(T) = 0$), the anisotropic vector $X$ ($l^H_X g = 0$, generalizing the isotropic case) and a derivative of $V$. The latter hypothesis is genuinely Finslerian and it means that some terms related to the nonlinear covariant derivative $DV$ must vanish globally ($V$ can always be chosen such that they vanish at some point). It is worth pointing out that our general formula for the integral of the divergence (36) recovers the classical interpretation of the divergence as an infinitesimal growth of the flow (now observer-dependent). So, $\mathrm{div}(T) = 0$ is equivalent to the conservation of energy-momentum in the instantaneous restspace of each observer, see Rem. 5.8.
We finish by applying this general result to two examples. The first one concerns Lorentz norms, showing that the conservation laws of Special Relativity still hold even though, now, the conserved quantity may be different for different observers. As a second example, we give natural conditions so that the flow of $T_V(X_V)$ (whenever it exists as a Lebesgue integral, possibly equal to $\pm\infty$) is the same across any two Cauchy hypersurfaces of a globally hyperbolic Finsler spacetime. Indeed, we refine a previous result by Minguzzi [30], who assumed that $L$ was defined on the whole $TM$ and $T_V(X_V)$ was compactly supported. We show that a combination of Rund's and Minguzzi's ways to compute the boundary terms allows one to obtain appropriate decay rates (namely, the properly Finslerian hypothesis (49)) which ensure the conservation.

Preliminaries and setup
First, let us set up some notation. Throughout the present text, $M$ is a connected smooth ($C^\infty$) manifold of dimension $n \geq 2$. As in previous references [21,22], any coordinate chart $(U, (x^1, \ldots, x^n))$ of $M$ naturally induces a chart $(TU, (x^1, \ldots, x^n, y^1, \ldots, y^n))$ of $TM$ defined by the fact that $v = y^i(v)\, \partial/\partial x^i|_{\pi(v)}$ for $v \in TU$, where $\pi : TM \to M$ is the canonical projection. We abbreviate $\partial/\partial x^i =: \partial_i$, $\partial/\partial y^i =: \dot\partial_i$; these are vector fields on $TU$. At any rate, we will express our results in coordinate-free and geometric terms.
2.1. Anisotropic tensors. We shall employ the framework of anisotropic tensors, following [17,18,21], as it is simpler than previous ones. An open subset $A \subseteq TM$ with $\pi(A) = M$ is fixed; the elements $v \in A$ are called observers. We will denote by $T^r_s(M_A)$ the space of (smooth) $r$-contravariant $s$-covariant $A$-anisotropic tensor fields ($r, s \in \mathbb{N} \cup \{0\}$), and by $T(M_A) := \bigoplus_{r,s} T^r_s(M_A)$ the full anisotropic tensor algebra. $F(A) = T^0_0(M_A)$ will be the space of functions on $A$. This time we will also put $X(M_A) := T^1_0(M_A)$ for the space of anisotropic vector fields and $\Omega_s(M_A)$ for the space of anisotropic $s$-forms (alternating anisotropic tensors, so that $\Omega_s(M_A) \subseteq T^0_s(M_A)$). The space $T(M)$ of classical tensor fields will be seen as a subspace of $T(M_A)$, formed by the isotropic elements, namely those which depend only on the point $p \in M$ and not on the observer at it. In particular, $X(M) \subseteq X(M_A)$. There is a distinguished element of $X(M_A)$: the canonical (or Liouville) anisotropic vector field $C$, given by $C_v = v$ for $v \in A$ (in coordinates, $C = y^i \partial_i$). For an open set $U \subseteq M$, we will put $X^A(U)$ for the set of (local) observer fields, that is, those $V \in X(U)$ such that $V_p \in A \cap T_pM$ for all $p \in U$. Given one of these and $T \in T^r_s(M_A)$, their composition, denoted by $T_V \in T^r_s(U)$, makes sense. Finally, for $X \in X(M_A)$, there is also a canonical derivation $\dot\partial_X$ of $T(M_A)$, the vertical derivative in the direction of $X$. For other options and the rudiments, see [21].
Nonlinear connections are characterized by their nonlinear coefficients $N^i_j$, and also by their nonlinear covariant derivative $D_X$, for $X \in X(U)$. They also provide (at least locally) a nonlinear parallel transport of observers $v \in A \cap T_{\gamma(0)}M$ along curves $\gamma : [0, t] \to M$. Namely, a map $P_t : A_{\gamma(0)} \to A_{\gamma(t)}$ defined as $P_t(v) = V(t)$, where $V$ is the unique vector field along $\gamma$ such that $V(0) = v$ and $D_{\dot\gamma} V = 0$ (see [21, Def. 12] and the comment below). An $A$-anisotropic connection is an operator $\nabla : X(M) \times X(M) \to X(M_A)$ satisfying the usual Koszul derivation properties, see [17,18,22].
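In a chart domain $U$, they are characterized by their Christoffel symbols $\Gamma^i_{jk}$: $\nabla_{\partial_j} \partial_k = \Gamma^i_{jk}\, \partial_i$. They can be seen as vertically trivial linear connections on the vector bundle $VA \to A$ [21, Th. 3]. On the other hand, every anisotropic connection has an underlying nonlinear connection, the only one with nonlinear coefficients $N^i_j := \Gamma^i_{jk}\, y^k$. As a consequence, they define the covariant derivative $\nabla : T^r_s(M_A) \to T^r_{s+1}(M_A)$ for any anisotropic tensor: $\nabla_l T^{i_1,\ldots,i_r}_{j_1,\ldots,j_s} = \delta_l T^{i_1,\ldots,i_r}_{j_1,\ldots,j_s} + \sum_{\mu} \Gamma^{i_\mu}_{lk}\, T^{i_1,\ldots,k,\ldots,i_r}_{j_1,\ldots,j_s} - \sum_{\mu} \Gamma^{k}_{l j_\mu}\, T^{i_1,\ldots,i_r}_{j_1,\ldots,k,\ldots,j_s}$, where $\delta_l := \partial_l - N^k_l\, \dot\partial_k$.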

2.3. Lorentz-Finsler metrics. From now on, we will always assume that $A$ is conic ($\lambda v \in A$ for $v \in A$ and $\lambda \in (0, \infty)$). We shall follow the definitions and conventions in [20,21]. In particular, a Finsler spacetime $(M, L)$ is a (connected) manifold $M$ endowed with a (properly) Lorentz-Finsler metric $L$. Here, $L$ is required to be smooth and positively 2-homogeneous and, when restricted to each $A_p := T_pM \cap A$ ($p \in M$), its vertical Hessian $g$ must be non-degenerate with signature $(+, -, \ldots, -)$; $A_p$ must be connected and salient, and its boundary in $TM \setminus 0$, which must be equal to $L^{-1}(0)$, is a (strong) cone structure $\mathcal{C}$. In particular, at each point $p$, $L$ is a Lorentz norm. By positive homogeneity, $L$ is determined by its indicatrix $\Sigma := L^{-1}(1)$. Notice that the cone $\mathcal{C}$ yields a natural notion of timelike, lightlike and spacelike tangent vectors but $L$ is not defined on the latter. Indeed, we are not interested in the value of $L$ on spacelike vectors for physical reasons which are analyzed in [1]. Roughly, only particles (massive, massless) can be measured and, so, experimental evidence can only affect $\Sigma$ and $\mathcal{C}$. Even though this also happens in classical Relativity, there the value of the Lorentz metric on the (future-directed) timelike vectors is enough to extend it to all the directions. Indeed, the anisotropies in Finsler spacetimes should be regarded as originated by the distribution of matter and energy in the causal directions rather than by (unobservable) spacelike anisotropies.
Even though it is the Lorentz-Finsler case which has a physical interpretation, in all other aspects the theory carries over if $L$ is just pseudo-Finsler, namely positively 2-homogeneous with non-degenerate $g$ on $A$. In fact, this is the context in which we will develop §4 and §5, as they are of a more mathematical character.
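A basic consequence of the positive 2-homogeneity can be checked with Euler's theorem. The sketch below assumes the normalization $g_{ij} = \frac{1}{2}\,\dot\partial_i \dot\partial_j L$ for the vertical Hessian, with $\dot\partial_i = \partial/\partial y^i$; this factor is an assumption made here for illustration, and with a different one the identities change by the corresponding constant.

```latex
% Positive 2-homogeneity: L(\lambda v) = \lambda^2 L(v) for \lambda > 0.
% Differentiating in \lambda at \lambda = 1 (Euler's theorem):
y^i\, \dot\partial_i L \;=\; 2L .
% \dot\partial_i L is positively 1-homogeneous, so applying Euler's theorem again:
y^j\, \dot\partial_j \dot\partial_i L \;=\; \dot\partial_i L .
% With the assumed normalization g_{ij} = (1/2) \dot\partial_i \dot\partial_j L:
g_{ij}(v)\, v^j \;=\; \tfrac{1}{2}\, \dot\partial_i L(v),
\qquad
g_v(v,v) \;=\; g_{ij}(v)\, v^i v^j \;=\; \tfrac{1}{2}\, v^i\, \dot\partial_i L(v) \;=\; L(v) .
% The fundamental tensor is positively 0-homogeneous:
g_{\lambda v} \;=\; g_v \qquad (\lambda > 0) .
```

In particular, under this normalization every $u$ in the indicatrix $L^{-1}(1)$ is $g_u$-unit, and $g$ depends only on the direction of the observer.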
The Cartan tensor of $L$ is defined from the vertical derivatives $\partial g_{ij}/\partial y^k$ of the fundamental tensor. It is actually symmetric, so one can define the mean Cartan tensor as its $g$-trace, evaluated on $X \in X(M_A)$. $L$ also has a canonically associated connection: the metric nonlinear connection $HA$, whose nonlinear coefficients $N^i_j$ are those of (4). This is the underlying nonlinear connection of several anisotropic connections. One is the (Levi-Civita)-Chern $\nabla$, the only symmetric anisotropic connection that parallelizes $g$. It is the horizontal part of Chern-Rund's and Cartan's classical connections and it has Christoffel symbols $\Gamma^i_{jk} = \frac{1}{2}\, g^{il} (\delta_j g_{lk} + \delta_k g_{jl} - \delta_l g_{jk})$, where the $\delta_i$ are those associated with (4). Another one is the Berwald $\nabla$. This is the horizontal part of Berwald's and Hashiguchi's classical connections and its Christoffel symbols differ from Chern's ones by $\mathrm{Lan}^i_{jk}$. Here, $\mathrm{Lan}^i_{jk}$ are the components of a tensor metrically equivalent to the Landsberg tensor of $L$, which, among many other ways, can be defined in terms of the $N^l_k$ of (4) (see [17, (37)]). The Landsberg tensor is actually symmetric too, so one can define the mean Landsberg tensor of $L$ analogously, as its $g$-trace.
3. Basic interpretations on the stress-energy tensor T
Let us start with a discussion at each event $p \in M$ of a Finsler spacetime $(M, L)$. We can consider $T_pM$ endowed with the Lorentz norm $L|_{T_pM}$. In most of this section, the discussion relies essentially on the particular case when $M$ is a real affine $n$-space with associated vector space $V$ (which plays the role of $T_pM$ in the general case) and $L$ is a Lorentz-Finsler norm on $V$ with indicatrix $\Sigma$ and cone $\mathcal{C}$ included in $V$. Given $u, u' \in \Sigma$, consider the corresponding fundamental tensors $g_u$ and $g_{u'}$ and take orthonormal bases $B_u$, $B_{u'}$, obtained by extending $u, u'$. In a natural way, these bases live in $T_uV$, $T_{u'}V$ and they can be identified with bases in $V$ itself. Assuming this, the change of coordinates between $B_u$ and $B_{u'}$ is linear but not a Lorentz transformation, in general.
Extending the interpretations in Relativity, p ∈ M is an event, the affine simplification includes the case of Very Special Relativity [3,5,10], u ∈ Σ can be regarded as an observer, the tangent space to the indicatrix T u Σ (i.e., the subspace g u -orthogonal to u in T u V ≡ V ) becomes the restspace of the observer u, and B u is an inertial reference frame for this observer. The Lorentz invariance breaking corresponds to the fact that the bases B u and B u ′ are orthonormal for the different metrics g u , g u ′ and, thus, the linear transformation between the coordinates of B u and B u ′ (when regarded as elements of the same vector space T u V ≡ V ≡ T u ′ V ) is not a Lorentz one.
If the affine simplification is dropped, such elements (observers, restspaces) must be regarded as instantaneous at p ∈ M .
It is worth emphasizing that, according to the viewpoint introduced in [19] and discussed extensively in [1], the spacelike directions are not physically relevant for the Lorentz-Finsler metric. However, each (instantaneous) observer does have a restspace with a Euclidean scalar product. In the case of classical Relativity, Lorentz-invariance permits natural identifications between these restspaces, and they become consistent with the value of the scalar product on spacelike directions. Certainly, a Lorentz norm L could be extended outside these directions (maintaining the Lorentz signature for its fundamental tensor) but this can be done in many different ways, and no relation with the scalar products g u , u ∈ Σ would hold.
The dropping of natural identifications associated with the Lorentz invariance implies that many notions which are unambiguously defined in classical Relativity admit many different alternatives now. In the Appendix we analyze some of them for the relative velocity between observers as well as other kinematical concepts. This is taken into account in the following discussion about how the Finslerian setting affects the notion of energy-momentum-stress tensor.
3.1. Particles and dusts: anisotropic picture of isotropic elements. In principle, there is no reason to modify the classical relativistic interpretation of $p = mu$ as the (energy-) momentum vector of a particle of (rest) mass $m > 0$ moving in the observer's direction $u \in \Sigma$. Moreover, if the particle moves in such a way that $m$ is constant, it will be represented by a unit timelike curve $\gamma(\tau)$ such that $p(\tau) = m\gamma'(\tau)$ will be its instantaneous momentum at each proper time $\tau$. The (covariant) derivative $p' = m\gamma''$ would be the force $F$ acting on the particle, which is necessarily $g_{\gamma'}$-orthogonal to $\gamma'$ (i.e., the force lies in the instantaneous restspace of the particle). Then, the relativistic conservation of the momentum in the absence of external forces would retain its natural meaning, namely, if the particle represented by $(m, \gamma)$ splits into two $(m_1, \gamma_1)$ and $(m_2, \gamma_2)$ at some $\tau_0$ then $m\gamma'(\tau_0) = m_1\gamma_1'(\tau_0) + m_2\gamma_2'(\tau_0)$. The Appendix suggests that the way in which an observer $u$ may measure the energy-momentum and its conservation may be non-trivial. In particular, if one assumes that an observer $u$ measures $m\gamma' \in T_pM$ by using a $g_u$-orthonormal basis $B_u$ then, in general, $g_u(m\gamma', m\gamma') \neq m^2$ $(= L(m\gamma'))$. Moreover, as we have already commented, the coordinates for another observer $u'$ will not transform by means of a Lorentz transformation. However, the transformation of their coordinates is still linear, and both of them will write consistently $m\gamma'(\tau_0) = m_1\gamma_1'(\tau_0) + m_2\gamma_2'(\tau_0)$ in their coordinates.
Particles are also the basis to model dusts, which constitute the simplest class of relativistic fluids. A dust is represented by a number-flux vector field $N = nU$, where $U$ represents the intrinsic velocity of the particles in the dust, i.e., a comoving observer, and $n$ is the density of the dust for each momentarily comoving reference frame. Comparing with the case of energy-momentum, $N$ is also an intrinsic object which lives in the tangent space of each point, and $U$ gives the privileged observer who measures $n$.
However, the measures of n by different observers involve different measures of the volume. As explained in the Appendix, the length contraction may be fairly unrelated to the relative velocities of the observers. This implies a more complicated transformation of the coordinates by different observers. Anyway, the transformations between these coordinates would remain linear and, so, they could still agree in the fact that they are measuring the same intrinsic vector field.
Summing up, in the case of both particles and dusts, one assumes that the physical property lives in $V$ (or, more properly, in each tangent space $T_pM$ of the affine space) and there is a privileged (comoving) observer $u$. The transformation of coordinates for another observer $u'$ may be complicated but, in the end, it is a linear transformation which can be determined by specifying the geometric quantities which are being measured as well as the geometry of $\Sigma$. Thus, by using the coordinates measured by each observer one could construct an anisotropic vector field at each $p \in M$, which will fulfill some constraints, as the measurement by one of the observers (in particular, the privileged one) would determine the measurements by all the others.
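The consistency claim for particles can be made explicit. In the following sketch, $[\,\cdot\,]_u$ denotes the coordinate column of a vector in the basis $B_u$, and $\Lambda$ is a hypothetical name (introduced only here) for the invertible matrix relating the coordinates in $B_u$ and $B_{u'}$; nothing is assumed on $\Lambda$ beyond linearity.

```latex
% Coordinates of the same vector X \in T_pM seen by two observers u, u':
[\,X\,]_{u'} \;=\; \Lambda\, [\,X\,]_{u}, \qquad \Lambda \in GL(n,\mathbb{R}) \ \ \text{(not necessarily Lorentz)} .
% Intrinsic conservation at the splitting time \tau_0:
m\,\gamma'(\tau_0) \;=\; m_1\,\gamma_1'(\tau_0) + m_2\,\gamma_2'(\tau_0) .
% Observer u writes this identity in B_u; applying \Lambda gives the same law for u':
[\,m\gamma'\,]_{u'}
\;=\; \Lambda\,[\,m\gamma'\,]_{u}
\;=\; \Lambda\,[\,m_1\gamma_1'\,]_{u} + \Lambda\,[\,m_2\gamma_2'\,]_{u}
\;=\; [\,m_1\gamma_1'\,]_{u'} + [\,m_2\gamma_2'\,]_{u'} .
```

Only linearity is used, which is why the conservation law survives the loss of Lorentz invariance, while quadratic quantities such as $g_u(m\gamma', m\gamma')$ do depend on the observer.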

3.2. Emergence of an anisotropic stress-energy tensor. The situation, however, is subtler for more general fluids, which are modelled classically by a 2-tensor on the underlying manifold.
Let us start by recalling the Newtonian and Lorentzian cases. In Classical Mechanics one starts working in an orthonormal basis of Euclidean space to obtain the components $T^{ij}$ of the Cauchy stress tensor, which give the flux of $i$-momentum (or force) across the $j$-surface in the background (footnote 2). The laws of conservation of linear momentum and static equilibrium of forces imply that these components truly give a 2-tensor (linear in each variable), and the conservation of angular momentum implies that this tensor is symmetric.
In the relativistic setting, each observer will determine some symmetric components $T^{ij}$ in its restspace by essentially the same procedure as above. Additionally, the observer constructs $T^{00}$, $T^{0i}$ and $T^{i0}$ as the energy density, the energy flux across the $i$-surface and the $i$-momentum density, resp. The interpretation of these magnitudes completes the symmetry $T^{0i} = T^{i0}$ (footnote 3) as well as the linearity in the 0-component. However, the bilinearity in the components $T^{\mu\nu}$ has only been ensured for vectors in the restspace of the observer. In Relativity, one can invoke Lorentz invariance in order to complete the reasons justifying that, finally, the components $T^{\mu\nu}$ will transform as a tensor (footnote 4).
Nevertheless, it is not clear in Lorentz-Finsler geometry why the transformation of the components T ij from an observer u to a second one u ′ must be linear, taking into account that they apply to spacelike coordinates in distinct Euclidean subspaces and no Lorentz-invariance is assumed. Indeed, the following simple academic example shows that this is not the case.
Example 3.1. Assume that $(M, L)$ is an affine space with a Lorentz norm with domain $A$ and consider the anisotropic tensor (footnote 5) $T = L^{-1} \phi\, C \otimes C$, where $C$ is the canonical (Liouville) vector field and $\phi : \Sigma \to \mathbb{R}$ is a smooth function which is extended as a 0-homogeneous function on $A$. Then, for each $u \in \Sigma$ and $w \in T_u\Sigma$ one has $T_u(u, u) = \phi(u)$, $T_u(w, w) = 0$, $T_u(u, w) = 0$. In this case, each $T_u$ is a symmetric 2-tensor, but the information on $T$ requires the knowledge of $\phi(u)$ for all possible $u \in \Sigma$. Recall that this example holds even if $(M, L)$ is the Lorentz-Minkowski spacetime regarded as a Finsler spacetime (but no Lorentz-invariance is assumed for $T$).
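The three values in Example 3.1 follow from a short computation. The sketch below lowers indices with $g_u$ and assumes the normalization in which $g_u(u,u) = L(u)$ (i.e., $g$ is one half of the vertical Hessian of $L$); $x, y$ denote arbitrary vectors of the underlying vector space and are introduced only for this check.

```latex
% C is the Liouville field, C_u = u, so at each u \in A:
T_u \;=\; \frac{\phi(u)}{L(u)}\; u \otimes u ,
\qquad
T_u(x,y) \;=\; \frac{\phi(u)}{L(u)}\; g_u(u,x)\, g_u(u,y) .
% For u \in \Sigma: L(u) = 1 = g_u(u,u). For w \in T_u\Sigma: g_u(u,w) = 0. Hence
T_u(u,u) \;=\; \phi(u), \qquad T_u(u,w) \;=\; 0, \qquad T_u(w,w) \;=\; 0 .
```

Since $\phi$ is an arbitrary function on $\Sigma$, the single tensor $T_u$ only records the number $\phi(u)$ and says nothing about the value $\phi(u')$ measured by another observer $u'$.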
Therefore, the following issues about $T$ appear:
(a) Observer dependence: even if we assume that the components $T^{\mu\nu}$ measured by any observer $u$ are bilinear and, thus, define a standard tensor, the components measured by a second observer $u'$ may transform by a linear map which depends on $\Sigma$ as well as on the experimental way of measuring (as in the case of the energy-momentum vector).
(b) Nonlinearity: it is not even clear why such a linear transformation must exist, as bilinearity is only ensured in the direction of $u$ and of its restspace. Thus, the tensor $T_u$ measured by a single observer $u$ would not be enough to grasp the physics of the fluid at each event $p \in M$, as in the example above.
(c) Contribution of the anisotropies of $\Sigma$: as an additional possibility, the local geometry of $\Sigma$ at $u$ underlies the measurements of this observer and might provide a contribution to the stress-energy tensor itself.
Summing up, Lorentz-Finsler geometry leads one to assume that the measurements by $u$ are not enough to determine the state of the fluid and that the stress-energy tensor should be regarded as a non-isotropic tensor field, determined by the measurements of all the observers.
Formally, this means an anisotropic tensor $T \in T^2_0(M_A)$ (see [21] for a summary of the formal approach), which can be expressed locally as $T_v = T^{\mu\nu}(v)\, \partial_\mu \otimes \partial_\nu$ (where $T_v$ depends only on the direction of $v$). As a first approach (recall footnote 3), we can assume $T^{\mu\nu} = T^{\nu\mu}$. Consistently, we will assume that there exists a Lorentz-Finsler metric $L$ on $M$ with indicatrix $\Sigma \subset TM$ and, so, indexes can be raised and lowered by using its fundamental tensor $g$. The fact that $T$ has order 2 is important to establish classical analogies. However, other tensors might appear as more fundamental energy-momentum tensors and, then, one would try to derive a semi-classical 2-tensor as in §3.3.
In principle, the intuitive relativistic interpretations would be transplanted directly to each $v$, whenever $v \in \Sigma$. That is, given two $g_v$-unit vectors $u, w$, the value $T_v(u, w)$ of the 2-covariant stress-energy tensor perceived by the observer $v$ (at $x = \pi(v)$) is obtained as the flux of $w$-energy-momentum per unit of $g_v$-volume orthogonal to $u$. More precisely, let $B(u)$ be a small coordinate 3-cube in a hypersurface $g_v$-orthogonal to $u$ and let $P_B$ be the total flux of the energy-momentum of particles crossing $B(u)$ (being positive from the $-u$ side to the $u$ side and negative in the opposite direction); then the $w$-energy-momentum per unit of $g_v$-volume is $T_v(u, w) = \epsilon \lim_{\mathrm{vol}(B(u)) \to 0} g_v(P_B, w)/\mathrm{vol}(B(u))$, where $\epsilon = g_v(w, w)$. As a Finslerian subtlety, recall that $g_v$ is only defined in $T_v(T_xM)$ and then in $T_xM$ (i.e., it is trivially extended to $B(u)$ in a coordinate-dependent way), but the above limit depends only on the value of $g_v$. Namely, if one considers two semi-Riemannian metrics $g$ and $\tilde g$ in a neighborhood of $p$ such that $g_p = \tilde g_p$, and $B_m$ are open subsets with $p$ in the interior of $B_m$ for all $m \in \mathbb{N}$ and $\lim_{m \to +\infty} \mathrm{vol}_g(B_m) = 0$, then the limit computed with the volumes of $g$ coincides with the one computed with those of $\tilde g$. In particular, we have the interpretations (recall signature $(+, -, -, -)$):
(1) $T_v(v, v)$ is the energy density measured by $v$,
(2) if $w$ is $g_v$-orthogonal to $v$ and $g_v$-unit, $T_v(w, v)$ measures the flow of energy per unit of $g_v$-volume in a surface $g_v$-orthogonal to $v$ and $w$ (i.e., some small surface of area $A$ flowing a lapse $\Delta t$), while $T_v(v, w)$ measures the $w$-momentum density,
(3) if $z, w$ are $g_v$-orthogonal to $v$ and $g_v$-unit, $T_v(z, w)$ measures the flow of $w$-momentum per unit of $g_v$-volume in a surface $g_v$-orthogonal to $v$ and $z$.

3.3. Lagrangian viewpoint. In the Lagrangian approach for Special Relativity, the background spacetime is assumed to be endowed with a flat metric $\eta$. So, the Lagrangian $\mathcal{L}$ is constructed by using the prescribed $\eta$ and some matter fields $\phi_\alpha$. The stress-energy tensor coincides with the canonical energy-momentum tensor associated with the Lagrangian in most cases (the exceptions include theories involving spin). This canonical tensor (8) appears as the Noether current associated with the invariance by spacetime translations. In principle, these interpretations would hold unaltered for the case of an affine space with a Lorentz norm, including the case of Very Special Relativity.
In General Relativity, however, the Lagrangian formulation introduces a background Lagrangian independent of matter fields (the Einstein-Hilbert one, possibly with a cosmological constant) and, then, a matter Lagrangian $\mathcal{L}_{\mathrm{matter}}$ which includes a coupling constant with the background. Then, the safest way to define the stress-energy is the canonical one obtained as the corresponding action term $\delta S_{\mathrm{matter}}/\delta g^{\mu\nu}$ in the Euler-Lagrange equations (footnote 7). Any tensor obtained in this way will have some advantages to play the role of a stress-energy tensor, because it will be automatically symmetric (in contrast to (8)) and will have vanishing divergence.
In the Finslerian setting, the variational viewpoint has been systematically studied in a very recent paper by Hohmann, Pfeifer and Voicu [16]. Previously, the background Lagrangian closest to the Einstein-Hilbert functional in the Finslerian setting had been studied in [35,13]. Such a functional is obtained as the integral of the Ricci scalar function on the indicatrix of the Lorentz-Finsler metric $L$ (footnote 8). Taking into account this background functional, they define the energy-momentum scalar function $\mathfrak{T}$ by taking the corresponding variational action term [16, formula (84)]. Notice that, here, the functional coordinate for the Lagrangian is $L$ and, thus, an (anisotropic) function rather than a 2-tensor is obtained. However, starting from this function some tensors become useful [16, formulas (88), (91)], in particular a canonically associated (anisotropic Liouville) 2-tensor. Notice that, essentially, the information of these tensors is codified in $\mathfrak{T}$. Even though such a tensor is justified by the procedure of Gotay-Marsden in [11], some issues such as the following ones might deserve further discussion:
(1) This is not the unique natural possibility to construct an anisotropic 2-tensor starting from $\mathfrak{T}$. For example, an alternative would be the vertical Hessian (10) of $L\mathfrak{T}$ (footnote 9). It is natural to wonder about the choice closer to the relativistic intuitions about the stress-energy.
(2) Recently, the Palatini approach has also been studied for the Finslerian setting [22]. There, the dynamic variables are $L$ and the components of an (independent) non-linear connection. Thus, a similar Lagrangian procedure would lead to a higher order tensor. In the relativistic setting this approach supports classical Relativity, as it recovers both equations and (in the symmetric case) the Levi-Civita connection. However, the Palatini approach is no longer equivalent in the Finslerian case, as it yields non-equivalent connections and it shows a variety of possibilities for the non-linear connections. So, it is natural to wonder about the most natural choice of a Lagrangian-based stress-energy tensor in this setting.
Finally, let us discuss an example analyzed from the Lagrangian viewpoint in [14,16], taking into account also the observers' one in §3.2.
Example 3.2. The gravitational field sourced by a kinetic gas has been deeply studied in [14,16]. In the relativistic setting, this is derived from the Einstein-Vlasov equations in terms of a 1-particle distribution function (1PDF) $\phi(x, \dot x)$ which encodes how many gas particles at a given spacetime point $x$ propagate on worldlines with normalized 4-velocity $\dot x$. Specifically, the stress-energy tensor is $T^{\mu\nu}(x) = \int_{\Sigma_x} \phi(x, \dot x)\, \dot x^\mu \dot x^\nu\, d\mathrm{Vol}_x$, where $\Sigma_x$ is the indicatrix (future-directed unit vectors of the Lorentz metric) and $d\mathrm{Vol}_x$ is its volume element at each $x$. In [14], the authors propose to derive the gravitational field of a kinetic gas directly from the 1PDF without averaging, i.e., taking into account the full information on the velocity distribution. This leads one to consider the function $\phi : \Sigma \to \mathbb{R}$, $u \equiv (x, \dot x) \mapsto \phi(u) \geq 0$, as an energy-momentum function which plays the role of a stress-energy tensor (even though it is a scalar rather than a 2-tensor). Moreover, the original Lorentz metric is naturally allowed to be Lorentz-Finsler, which permits one to obtain more general cosmological models [14, §III].
Indeed, up to a coupling constant, φ is regarded directly as the matter source in the Finslerian Einstein-Hilbert equation (i. e., it is placed at the right-hand side of this equation, [14, eqn. (7)]). It is worth pointing out:
• φ can be recovered as a Lagrangian energy-momentum by inserting it directly as a term in the background Lagrangian [16, eqn. (75)]. However, the Lagrangian is not natural then, as it depends on the variables of M (recall [16, Appendix 3, §(a)]).
• As discussed above, such a function allows one to construct several tensors, in particular the vertical Hessian ∂²φ/∂ẋ^µ ∂ẋ^ν (as in (10)), which might also play a role in the comparison with the relativistic T µν (x).
9 The multiplication by L is so that taking second vertical derivatives of the 2-homogeneous TL produces a 0-homogeneous tensor, in the same way that the vertical Hessian of the 2-homogeneous function L is the 0-homogeneous fundamental tensor g.
Anyway, starting at the 1PDF φ, other Finslerian interpretations would be possible. In particular, one can define the energy-momentum distribution φ(u)u. Then, given an observer v ∈ Σ and a $g_v$-unit vector w, the w-energy-momentum might be defined as $g_v(\varphi(u)u, w) = \varphi(u)\, g_v(u,w)$. In particular, when w = v this would be the energy perceived by v, and when w is unit and $g_v$-orthogonal to v it would be (minus) the momentum in the direction w (compare with the discussion at the end of §3.2). So, an alternative stress-energy tensor perceived by each observer v ∈ Σ might be defined as an anisotropic tensor obtained by integration over u, where the integration in u is carried out with the volume form of $(\Sigma_{\pi(v)}, g_v)$, denoted by $dvol_{g_v}$.

Divergence of anisotropic vector fields
After studying the basic properties of the Finslerian stress-energy tensor T, our next aim is to analyze the meaning and significance of the infinitesimal conservation law div(T) = 0. Throughout this and the next section, we will always consider an anisotropic tensor $T \in \mathcal{T}^1_1(M_A)$ interpreted as an endomorphism of anisotropic vector fields. $T^\flat \in \mathcal{T}^0_2(M_A)$ and $T^\sharp \in \mathcal{T}^2_0(M_A)$ will be defined on vectors and 1-forms by $T^\flat(X,Y) := g(X,T(Y))$ and $T^\sharp(\theta,\eta) := g^*(T^*(\theta),\eta)$ resp., where $g^*$ is the inverse fundamental tensor and $T^*$ is the transpose of T. They will have components $(T^\flat)_{ij} = g_{il}T^l{}_j =: T_{ij}$ and $(T^\sharp)^{ij} = T^i{}_l\, g^{lj} =: T^{ij}$, and in principle we will not even assume that these are symmetric. We will be assuming that M is orientable and oriented. This is not restrictive: one could always reduce the theory to this case by pulling back all the objects (the fibered manifold A → M included) to the oriented double cover of M [27, Ch. 15].
Let us briefly recall the mathematically precise meaning of the conservation laws in classical General Relativity (g, T and X isotropic). One has
$$\mathrm{div}(T(X)) = \mathrm{div}(T)(X) + \mathrm{trace}(T(\nabla X)), \tag{11}$$
with ∇ the Levi-Civita connection. The first contribution vanishes due to div(T) = 0, and there are different situations in which the second one vanishes as well. For instance, if $T^\flat(-, \nabla_- X)$ is antisymmetric, then
$$\mathrm{trace}(T(\nabla X)) = 0, \tag{12}$$
and if $T^\flat$ is symmetric and ∇X♯ is antisymmetric (equiv., X is a Killing vector field), then also
$$\mathrm{trace}(T(\nabla X)) = 0. \tag{13}$$
Anyway, whenever trace(T(∇X)) = 0, one can integrate (11) and apply the pseudo-Riemannian divergence theorem to get the integral conservation law
$$\int_{\partial D} \imath_{T(X)}(dVol) = 0, \tag{14}$$
where D is a domain of appropriate regularity, ı is the interior product operator and dVol is the metric volume form. In a sense that will be made more precise in §5, this expresses that the total amount of X-momentum in a space region only changes along time as much as it flows across the spatial boundary of the region. In this way, there is no "creation" nor "destruction" of X-momentum in any space region.
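As an illustration, the classical identity behind (11) can be written out in indices. The following is only a sketch: it assumes the index conventions $T(X)^i = T^i{}_j X^j$, $\mathrm{div}(T)_j = \nabla_i T^i{}_j$ and $\mathrm{trace}(T(\nabla X)) = T^i{}_j \nabla_i X^j$, which appear to agree with the conventions of this section but are stated here as assumptions.

```latex
% Assumed conventions: T(X)^i = T^i{}_j X^j,  div(T)_j = \nabla_i T^i{}_j.
\begin{align*}
\operatorname{div}(T(X))
  &= \nabla_i\bigl(T^i{}_j X^j\bigr)
     && \text{(trace of } \nabla(T(X))\text{)} \\
  &= (\nabla_i T^i{}_j)\,X^j + T^i{}_j\,\nabla_i X^j
     && \text{(Leibniz rule)} \\
  &= \operatorname{div}(T)(X) + \operatorname{trace}\bigl(T(\nabla X)\bigr).
\end{align*}
% Killing case: if T^{ij} = T^i{}_l g^{lj} is symmetric and
% \nabla_i X_j is antisymmetric, metric compatibility gives
\[
T^i{}_j\,\nabla_i X^j = T^{ij}\,\nabla_i X_j = 0 .
\]
```

In the Killing case the second term vanishes because a symmetric tensor is contracted with an antisymmetric one, which is the mechanism behind (13).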
Extending the infinitesimal or the integral conservation laws poses, first and foremost, the problem of appropriately defining the divergence of an anisotropic T. Observe that a priori it is not even clear how to define the divergence of a vector field Z, isotropic or not, as one could consider trace(∇Z) for different anisotropic connections ∇, mainly Chern's and Berwald's. An alternative is to seek a more geometric, hence unbiased, definition. For instance, the metric (anisotropic) volume form of L,
$$dVol_v = \sqrt{|\det g_{ab}(v)|}\; dx^1\wedge\cdots\wedge dx^n \tag{15}$$
for $(x^1, ..., x^n)$ positively oriented, is well-defined, and when Z ∈ X(M) (i. e., Z is isotropic), so is the Lie derivative $L_Z$ (see [17, §5]). So, by analogy with the classical case, one could think of $L_Z(dVol)$ for defining div(Z). It turns out that the unbiased definition, including all $Z \in X(M_A)$, is achieved with a modification of this Lie derivative that we will regard as an extension of the classical Lie bracket. We devote the next subsection to the technical mathematical foundations of such an anisotropic Lie bracket, which needs a nonlinear connection on A → M to be well-defined. All the maps $\mathcal{T}(M_A) \to \mathcal{T}(M_A)$ that will appear in §4.1 will be (anisotropic) tensor derivations in the sense of [17, Def. 2.6] and their local nature will be apparent, so we will not explicitly discuss it. For example, the Lie derivative along Z ∈ X(M) is the only tensor derivation such that, for X ∈ X(M) and f ∈ F(A),
$$L_Z(f) = Z^c(f), \qquad L_Z(X) = [Z, X], \tag{16}$$
where $Z^c$ denotes the complete lift of Z to A.
4.1. Mathematical formalism of the anisotropic Lie bracket. During this subsection, we fix an arbitrary nonlinear connection given by TA = HA ⊕ VA or by the nonlinear covariant derivative D (keep in mind (1) and (2)), and also an anisotropic vector field $Z \in X(M_A)$.
For $X \in X(M_A)$, it is very natural to consider the commutator of the horizontal lifts of Z and X:
$$[Z^H, X^H] = \bigl(Z^j\,\delta_j X^i - X^j\,\delta_j Z^i\bigr)\,\delta_i + Z^j X^k\,[\delta_j, \delta_k].$$
We recall that $Z^j X^k [\delta_j, \delta_k]$ is always vertical. Indeed, $[\delta_j, \delta_k] = R^i{}_{jk}\,\dot\partial_i$, where R is the curvature tensor of the nonlinear connection (see [22], where this curvature is regarded as an anisotropic tensor and the homogeneity of the connection is not really required). This means that the horizontal part of $[Z^H, X^H]$ has coordinates $Z^j\delta_j X^i - X^j\delta_j Z^i$, and this corresponds to a globally well-defined A-anisotropic vector field:
$$l^H_Z X := \bigl(Z^j\,\delta_j X^i - X^j\,\delta_j Z^i\bigr)\,\partial_i. \tag{17}$$
Definition 4.1. $l^H_Z X$ is the anisotropic Lie bracket of Z and X with respect to the nonlinear connection HA.
Remark 4.2. The word "anisotropic" could be omitted in the previous definition, in the sense that for $Z, X \in X(M_A)$, there is no other Lie bracket, isotropic or not, defined in general. Nonetheless, (17) makes apparent that when Z, X ∈ X(M) (i. e., when Z and X are isotropic), $l^H_Z X$ coincides with the standard Lie bracket [Z, X] regardless of the connection.
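For concreteness, the reduction to the classical bracket on isotropic fields can be spelled out in coordinates. The sketch below assumes the usual local expression $\delta_j = \partial_j - N^k{}_j\,\dot\partial_k$ for the horizontal lifts, where $N^k{}_j$ denotes the coefficients of the nonlinear connection and $\dot\partial_k$ the vertical partial derivatives; this notation is an assumption made for the illustration.

```latex
% Assumed notation: \delta_j = \partial_j - N^k{}_j\,\dot\partial_k .
\begin{align*}
(l^H_Z X)^i
  &= Z^j\,\delta_j X^i - X^j\,\delta_j Z^i \\
  &= Z^j\,\partial_j X^i - X^j\,\partial_j Z^i
     - N^k{}_j\bigl(Z^j\,\dot\partial_k X^i - X^j\,\dot\partial_k Z^i\bigr).
\end{align*}
% Isotropic fields do not depend on the direction, so
% \dot\partial_k X^i = \dot\partial_k Z^i = 0 and the N-terms drop out:
\[
(l^H_Z X)^i = Z^j\,\partial_j X^i - X^j\,\partial_j Z^i = [Z, X]^i .
\]
```

The nonlinear connection only enters through the vertical derivatives of the components, which is why the result is connection-independent exactly when both fields are isotropic.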
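We also recall that the torsion of an A-anisotropic connection ∇ (see [17]) has coordinates
$$\mathrm{Tor}^i{}_{jk} = \Gamma^i{}_{jk} - \Gamma^i{}_{kj},$$
where the $\Gamma^i{}_{jk}$'s are the Christoffel symbols of ∇ (see footnote 10).
Theorem 4.4. Let a nonlinear connection TA = HA ⊕ VA and an anisotropic vector field $Z \in X(M_A)$ be fixed.
(A) If ∇ is any A-anisotropic connection whose underlying nonlinear connection is HA, then for any $X \in X(M_A)$,
$$\mathrm{Tor}(Z, X) = \nabla_Z X - \nabla_X Z - l^H_Z X \tag{21}$$
(where Tor is the torsion of ∇).
(B) By imposing the Leibniz rule with respect to tensor products and the commutativity with contractions, the map $X \mapsto l^H_Z X$ extends unequivocally to an (anisotropic) tensor derivation $l^H_Z$, which acts on $T \in \mathcal{T}^r_s(M_A)$ by
$$(l^H_Z T)(\theta^1, ..., \theta^r, X_1, ..., X_s) = l^H_Z\bigl(T(\theta^1, ..., X_s)\bigr) - \sum_{\mu} T(..., l^H_Z\theta^\mu, ...) - \sum_{\nu} T(..., l^H_Z X_\nu, ...) \tag{22}$$
for $\theta^\mu \in \Omega^1(M)$ and $X_\nu \in X(M)$. In coordinates, if $T = T^{i_1 ... i_r}_{j_1 ... j_s}\,\partial_{i_1}\otimes\cdots\otimes dx^{j_s}$, then
$$(l^H_Z T)^{i_1 ... i_r}_{j_1 ... j_s} = Z^k\,\delta_k T^{i_1 ... i_r}_{j_1 ... j_s} - \sum_{\mu} T^{i_1 ... k ... i_r}_{j_1 ... j_s}\,\delta_k Z^{i_\mu} + \sum_{\nu} T^{i_1 ... i_r}_{j_1 ... k ... j_s}\,\delta_{j_\nu} Z^k. \tag{23}$$
(C) The map $L^H_Z := l^H_Z - \dot\partial_{l^H_Z C}$ is also a tensor derivation. When Z ∈ X(M),
$$L^H_Z T = L_Z T \tag{24}$$
for all $T \in \mathcal{T}(M_A)$, where $L_Z$ is the Lie derivative (16), regardless of the nonlinear connection.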
Proof. (A) It is straightforward to compute that the right hand side of (21) is F(A)-multilinear. Moreover, the identity is trivial on isotropic vector fields X, Z ∈ X(M), as $l^H_Z X = [Z, X]$ in this case, which concludes the proof of (A).
10 This is not to be mistaken for the torsion of the nonlinear connection HA, which would have coordinates N i j ·k − N i k ·j (even though this can be seen as a particular case of the torsion of some ∇ and hence it is also denoted by Tor in [22]).
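(B) From (17), one has $l^H_Z(fX) = Z^H(f)\,X + f\,l^H_Z X$ for f ∈ F(A). Thus, in order to respect the Leibniz rule, the only possibility is to define
$$l^H_Z f := Z^H(f). \tag{26}$$
Now, given $\theta \in \mathcal{T}^0_1(M_A) = \Omega^1(M_A)$, in order to respect again the Leibniz rule and the commutativity with contractions, the only possibility is to define $l^H_Z\theta$ on every $X \in X(M_A)$ by
$$(l^H_Z\theta)(X) := l^H_Z(\theta(X)) - \theta(l^H_Z X). \tag{27}$$
(26), (17) and (27) make apparent that $l^H_Z$ is already local on functions, vector fields and 1-forms, and they allow one to compute
$$l^H_Z\,\partial_j = -(\delta_j Z^i)\,\partial_i, \qquad l^H_Z\,dx^i = (\delta_j Z^i)\,dx^j. \tag{28}$$
Finally, given $T \in \mathcal{T}^r_s(M_A)$, one is led to define $l^H_Z T$ by (22). Clearly, this indeed provides a tensor derivation and (23) follows from the evaluation of (22) at $(dx^{i_1}, ..., dx^{i_r}, \partial_{j_1}, ..., \partial_{j_s})$ together with (26) and (28).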
(C) The vertical derivative $\dot\partial_X$ is a tensor derivation for any $X \in X(M_A)$, in particular for $X = l^H_Z C$ (see (17)). Thus, the difference $L^H_Z = l^H_Z - \dot\partial_{l^H_Z C}$ is again a derivation. As for the last assertion, where Z ∈ X(M), we are going to use [17, Prop. 2.7]. For X ∈ X(M), we have $L^H_Z X = l^H_Z X = [Z, X] = L_Z X$, and for f ∈ F(A), $L^H_Z f = L_Z f$ (use (26), (29), (1) and (16)). As $L^H_Z$ and $L_Z$ act the same on isotropic vector fields and anisotropic functions, they are equal.
(D) Observe that for X ∈ X(M), the term $\dot\partial_{D_Z V} X$ vanishes in (19).
. Given a local reference frame E 1 , ..., E n ∈ X(U ), and taking into account the last two identities and the definitions of l H and L, it follows that ω(E 1 , ...,∂ D E i V Z, ..., E n ).
As ω(E 1 , ...,    (24): whenever the Lie derivative along Z was already defined, L H Z coincides with it. Even though the Lie bracket and the Lie derivative are equal in the classical regime, it is heuristically useful to regard l H as the anisotropic generalization of the former and L H as that of the latter, in order to distinguish them. It is actually l H , and not L, which will be relevant for the definition of divergence. The reason is that the former, as we will see below, has a clear geometric interpretation in terms of flows, while the latter would just add the term∂ l H Z C to that interpretation. Moreover, Th. 4.4 (D) actually corresponds to a Cartan formula for L Z whose full development we postpone for a future work. Thus, L Z (dVol) = L H Z (dVol) can be regarded as an initial guess for the divergence of Z, but we will not employ L H from now on.
Let us observe that given a diffeomorphism $\psi_t : M \to M$ that is the flow of an isotropic vector field Z, we can define the pullback $\psi_t^*(\omega)$ of an anisotropic differential form $\omega \in \Omega^s(M_A)$ as the anisotropic form given by $\psi_t^*(\omega)_v(u_1, ..., u_s) := \omega_{P_t(v)}(d\psi_t(u_1), ..., d\psi_t(u_s))$, where $P_t(v)$ is the HA-parallel transport of v along the integral curve of Z and $u_1, ..., u_s \in T_{\pi(v)}M$.
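Proposition 4.7. For Z ∈ X(M) and $\omega \in \Omega^s(M_A)$,
$$l^H_Z\,\omega = \frac{d}{dt}\Big|_{t=0}\,\psi_t^*(\omega), \tag{31}$$
where $\psi_t$ is the (possibly local) flow of Z.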
Proof. Observe that ψ * t (ω) v can be obtained as ψ * t (ω V ) with V an extension of v such that D Z V = 0. Then (25) and the classical formula for the Lie derivative in terms of the flow imply (31).
Remark 4.8. Even though, for convenience, we stated the previous geometrical interpretation for an s-form ω, it should be clear that it holds true for any r-contravariant s-covariant A-anisotropic tensor.

4.2. Lie bracket definition of divergence. Finally, in this and the next subsections a pseudo-Finsler metric L defined on A is fixed again. In its presence, and in view of the Riemannian case and Prop. 4.7, the most natural way of defining the divergence of an anisotropic vector field Z is by $l^H_Z(dVol)$. Here there is a canonical choice for HA: the metric nonlinear connection of L. The definition obtained this way is unbiased, in that one does not choose any anisotropic connection a priori. Notwithstanding, it will turn out to be most conveniently expressed in terms of the Chern connection.
Definition 4.9. The divergence of $Z \in X(M_A)$ is the function div(Z) ∈ F(A) determined by
$$\mathrm{div}(Z)\,dVol = l^H_Z(dVol),$$
where HA and dVol are, resp., the metric nonlinear connection (4) and the metric volume form (15) of L.
Remark 4.10. Even though we will keep assuming it for simplicity, the hypothesis of M being orientable is not really needed for this definition. As in pseudo-Riemannian geometry, on small enough open sets U ⊆ M it is always possible to choose an orientation, define $dVol_U \in \Omega^n(M_A)$ with respect to it and put $\mathrm{div}(Z)|_{A\cap TU}\,dVol_U := l^H_Z(dVol_U)$. The different definitions will be coherent because when the orientation changes, $dVol_U$ changes to $-dVol_U$ and $l^H_Z(-dVol_U) = -l^H_Z(dVol_U)$. In particular, when M is orientable, div(Z) is independent of the orientation choice.
Proposition 4.11. Let L be a fixed pseudo-Finsler metric defined on A, and let $Z \in X(M_A)$. If ∇ is any symmetric A-anisotropic connection such that its underlying nonlinear connection is the metric one and $\nabla_Z(dVol) = 0$, then
$$\mathrm{div}(Z) = \mathrm{trace}(\nabla Z), \tag{32}$$
or in coordinates,
$$\mathrm{div}(Z) = \delta_i Z^i + \Gamma^i{}_{ij}\,Z^j, \tag{33}$$
where the $\Gamma^i{}_{jk}$'s are the Christoffel symbols of ∇. This, in particular, is true for the (Levi-Civita)-Chern anisotropic connection of L, so one can take the Christoffel symbols to be those of (5).
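Proof. One expresses the Z-Lie bracket of the volume form in terms of the anisotropic connection, analogously to the isotropic case. From (15) and the fact that $l^H_Z$ is a tensor derivation, we obtain
$$\mathrm{div}(Z)\sqrt{|\det g_{ab}|} = \mathrm{div}(Z)\,dVol(\partial_1, ..., \partial_n) = l^H_Z\bigl(dVol(\partial_1, ..., \partial_n)\bigr) - \sum_i dVol(\partial_1, ..., l^H_Z\partial_i, ..., \partial_n).$$
(26) and the fact that HA is the underlying nonlinear connection of ∇ give $l^H_Z(dVol(\partial_1, ..., \partial_n)) = Z^H(dVol(\partial_1, ..., \partial_n)) = \nabla_Z(dVol(\partial_1, ..., \partial_n))$. From these and $\nabla_Z(dVol) = 0$,
$$\mathrm{div}(Z)\,dVol(\partial_1, ..., \partial_n) = \sum_i dVol(\partial_1, ..., \nabla_Z\partial_i - l^H_Z\partial_i, ..., \partial_n) = \sum_i dVol(\partial_1, ..., \nabla_{\partial_i}Z, ..., \partial_n) = \mathrm{trace}(\nabla Z)\,dVol(\partial_1, ..., \partial_n),$$
where the second equality uses (21) and the symmetry of ∇, and the last equality is reasoned analogously as in the proof of (25). For the Chern connection, it can be checked that ∇(dVol) = 0 by considering a parallel orthonormal basis with respect to a parallel observer V along the integral curves of any vector field. The coordinate expression of trace(∇Z) in this case concludes (33).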
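A short worked computation may help to connect the coordinate expression of the divergence with the volume density. The sketch below assumes that the Chern Christoffel symbols (5) have the usual form $\Gamma^i{}_{jk} = \tfrac12 g^{il}(\delta_j g_{lk} + \delta_k g_{jl} - \delta_l g_{jk})$ and that $\mathrm{div}(Z) = \delta_i Z^i + \Gamma^i{}_{ik} Z^k$; both are assumptions made here for illustration and should be checked against (5) and (33).

```latex
% Assumed: \Gamma^i{}_{jk} = \tfrac12 g^{il}
%          (\delta_j g_{lk} + \delta_k g_{jl} - \delta_l g_{jk}).
\begin{align*}
\Gamma^i{}_{ik}
  &= \tfrac12\,g^{il}\bigl(\delta_i g_{lk} + \delta_k g_{il} - \delta_l g_{ik}\bigr)
   = \tfrac12\,g^{il}\,\delta_k g_{il}
   = \delta_k \log\sqrt{|\det g_{ab}|}\,, \\
\operatorname{div}(Z)
  &= \delta_i Z^i + \Gamma^i{}_{ik}\,Z^k
   = \frac{1}{\sqrt{|\det g_{ab}|}}\;
     \delta_i\!\Bigl(\sqrt{|\det g_{ab}|}\;Z^i\Bigr).
\end{align*}
% First line: the first and third terms cancel by the symmetry of g^{il};
% the last step is Jacobi's formula applied to the derivation \delta_k.
```

When g and Z are isotropic, $\delta_i$ reduces to $\partial_i$ on their components and the classical pseudo-Riemannian formula for the divergence is recovered.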

4.3. Divergence theorem and boundary term representations. Our Lie bracket derivation allows us to obtain a statement of the Finslerian divergence theorem that subsumes both Rund's [36, (3.17)] and Minguzzi's [30, Th. 2]. This way, it requires neither computations in coordinates from the beginning nor the "pullback metric" ($g_V$ in our notation). Naturally, our statement does not include Shen's [38, Th. 2.4.2], as this one is an independent generalization of the Riemannian theorem that deals with neither anisotropic differential forms nor anisotropic vector fields.
Lemma 4.12. For $X \in X(M_A)$, the vertical derivative of dVol is given by
$$\dot\partial_X(dVol) = C_m(X)\,dVol,$$
where $C_m$ is the mean Cartan tensor of L (see (3)).
In the present article, by a domain D we understand a nonempty connected set which coincides with the closure of its interior $\mathring{D}$; then its boundary is $\partial D = \partial\mathring{D}$. Physically, it is very important to include examples in which different parts of ∂D have different causal characters, and this typically leads to the boundary not being totally smooth. Hence, we will make a weaker regularity assumption that still allows one to apply Stokes' theorem on D. A subset of M has 0 m-dimensional measure if its intersection with any embedded m-dimensional submanifold σ ⊆ M is of 0 measure in the smooth manifold σ. Finally, the interior product of an s-form ω with a vector field X will be ı X ω := ω(X, −, ..., −).
where C m is the mean Cartan tensor and DV is computed with the metric nonlinear connection (4).
Proof. The idea is to apply Stokes' theorem to $L_{Z_V}(dVol_V)$. Taking into account (25) and Lem. 4.12, one obtains (36).
Remark 4.14.
(i) Even though we do not use the pseudo-Riemannian metric $g_V$ to derive Th. 4.13, from our physical viewpoint it is natural to use it to re-express the boundary term. If Γ is non-$g_V$-lightlike, then for a $g_V$-normal field $N_V$ and a transverse field X along the inclusion i of Γ, the form $i^*(\imath_X(dVol_V))/g_V(N_V, X)$ is nonvanishing and independent of X. We will always assume $N_V$ to be $g_V$-unitary and D-salient, so that $d\sigma_V := i^*(\imath_{N_V}(dVol_V))$ coincides with the hypersurface $g_V$-volume form of Γ. Taking into account that $i^*(\imath_{Z_V}(dVol_V))$ vanishes wherever $Z_V$ is tangent to Γ and that $g_V(N_V, N_V) = \pm 1$, (37) allows us to represent $i^*(\imath_{Z_V}(dVol_V)) = g_V(N_V, N_V)\,g_V(Z_V, N_V)\,d\sigma_V$ and the right hand side of (36) as
$$\int_{\partial D} g_V(N_V, N_V)\,g_V(Z_V, N_V)\,d\sigma_V. \tag{38}$$
In fact, this is how Rund's divergence theorem follows from Th. 4.13.
(ii) There is another way that one can try to represent the boundary term. Namely, assume that there exists a smooth $\xi : p \in \Gamma \mapsto \xi_p \in A \cap T_pM$ with $T_p\Gamma = \mathrm{Ker}\,g_{\xi_p}(\xi_p, -)$ and $L(\xi_p) = \pm 1$ (in the Lorentz-Finsler case, it will necessarily be L(ξ) = 1). This is called a Finslerian unit normal along Γ. Analogously as in (i), one can represent the boundary term in terms of $g_\xi(\xi, Z_V)$ and the form $d\Sigma^\xi_V$ induced by ξ and $dVol_V$, up to a sign due to the possible orientation difference between both sides (39). In fact, this is how Minguzzi deduces his divergence theorem [30, Th. 2]. Note, however, that he does it under the hypothesis of vanishing mean Cartan tensor ($C_m = 0$), which implies that $d\Sigma^\xi_V$ is independent of V. As we do not require this, Th. 4.13 is a more general statement than Minguzzi's.
(iii) The Finslerian unit normal presents some issues in the general case, as we are not taking A = TM \ 0. In our physical interpretation, with L Lorentz-Finsler, A consists of timelike vectors, so asking for a Finslerian unit normal is only reasonable when Γ is L-spacelike, that is, $T_p\Gamma \cap (A \cup \partial A) = \emptyset$ for p ∈ Γ. In such a case, the strong concavity of the indicatrix $\{v \in A_p : L(v) = 1\}$ guarantees the existence and uniqueness of ξ: one defines $\xi_p$ to be the unique vector such that $T_p\Gamma + \xi_p$ and the indicatrix are tangent at $\xi_p$.
(iv) Of course, if L comes from a pseudo-Riemannian metric on M, then both representations reduce to the classical one.
(v) It should be clear from this discussion that the form that one integrates on the right hand side of (36) is always the same and that the only difference between Rund's and Minguzzi's divergence theorems is how each of them represents it. Notwithstanding, this is an important difference, for the boundary terms (38) and (39) could potentially have different physical interpretations.

Divergence of anisotropic tensor fields
Our developments of the previous section will allow us to obtain integral Finslerian conservation laws for a tensor T with div(T ) = 0. We obtain one for each V ∈ X A (U ) satisfying certain hypotheses. Physically, T can be interpreted as an anisotropic stress-energy tensor and V as an observer field. We will also revisit two of the main examples with a clearer physical interpretation: Special Relativity and the conservation of the "total energy of the universe". In order to do all this, let us see how the Chern connection enters the Finslerian definition of div(T ).

5.1. Definition of divergence with the Chern connection. Prop. 4.11 motivates the most natural definition of divergence of T ∈ T 1 1 (M A ). Namely, by analogy with the classical case, we shall require (11) to hold for any anisotropic vector field X ∈ X(M A ). This makes the Chern connection appear now: it is the only Finslerian connection ∇ for which one can ensure that (32) holds independently of Z := T (X). We shall also explore the conditions under which the term trace(T (∇X)) vanishes in the general Finslerian setting.
Proposition 5.1. Let L be a fixed pseudo-Finsler metric defined on A with metric nonlinear connection HA and Chern anisotropic connection ∇. Also, let S ∈ T 0 2 (M A ) be symmetric, v ∈ A, T ∈ T 1 1 (M A ) and X ∈ X(M A ). (A) The following are equivalent.
2 is the operator that contracts the contravariant index with the covariant one introduced by ∇. (C) One has trace(T (∇X))(v) = 0 assuming any of the following conditions.
, which is exactly the anti-self-adjointness of ∇ v X with respect to S v . Besides, (26) and (21) together with Tor = 0 for the Chern connection give which shows that l H X S v = ∇ v X S also is equivalent to the anti-self-adjointness. For (B), all the computations in (11) hold formally the same in the general Finslerian case due to Prop. 4.11.
As for the vanishing of trace(T(∇X))(v), it follows from (Ci) by the same computations as in (12). Indeed, the antisymmetry can be expressed as $T_{ij}\,\nabla_k X^j = -T_{kj}\,\nabla_i X^j$ at v, whence $T^i{}_j\,\nabla_i X^j = g^{ik}\,T_{kj}\,\nabla_i X^j = 0$. It also follows from (Cii) by (13). Indeed, $l^H_X g_v = 0$ is equivalent to $\nabla^v X$ being anti-self-adjoint with respect to $g_v$, and this can be expressed as $g_{ij}\,\nabla_k X^j = -g_{kj}\,\nabla_i X^j$ at v.
Remark 5.2 ($l^H_X g$ and Finslerian Killing fields). In classical Relativity (g, T and X isotropic), the second condition in (C ii) above would read $(L_X g)_{\pi(v)} = 0$, and $L_X g = 0$ would be equivalent to X being a Killing vector field. In the general case, X being Killing can be defined by the conditions X ∈ X(M) and $L_X L = 0$ [17, §5], but $L_X g$ and $l^H_X g$ differ in general (this is seen using Th. 4.4 (C), the facts that $\dot\partial C = \mathrm{Id}$ and C(C, −, −) = 0, and also (40)). This way, we see that neither of X being Killing or $l^H_X g = 0$ implies the other, and additionally we recover the characterization of [12, Prop. 6.1 (i)].
for the Christoffel symbols of (5).

Remark 5.4 (Divergence vs. raising and lowering indices).
(i) First and foremost, by construction, (11) indeed holds for any $X \in X(M_A)$. At this point, it is important that the connection with which one defines trace(∇X) is the Chern one.
(ii) Thanks to the fact that the Chern connection parallelizes g, namely $\nabla_k g_{ij} = 0$ and $\nabla_k g^{ij} = 0$, the following hold:
$$g^{ik}\,\nabla_i T_{kj} = \nabla_i T^i{}_j = g_{jl}\,\nabla_i T^{il}. \tag{43}$$
This means that one could define the divergences of $S \in \mathcal{T}^0_2(M_A)$ and $R \in \mathcal{T}^2_0(M_A)$ straightforwardly (see footnote 11), $\mathrm{div}(S) = C_{1,3}(\nabla S) \in \mathcal{T}^0_1(M_A) = \Omega^1(M_A)$ and $\mathrm{div}(R) = C^1_1(\nabla R) \in \mathcal{T}^1_0(M_A) = X(M_A)$.
(iii) Regardless of this, in general we are not assuming the symmetry of $T^\flat$ or $T^\sharp$; we only did so in Prop. 5.1 (Cii). Instead, at the beginning of §4 we fixed a convention for the order of the indices in $T_{ij}$ and $T^{ij}$ (for example, $T^\flat(X, Y) = g(X, T(Y)) \neq g(T(X), Y)$ in general). In the remainder of §5, symmetry will be used together with said condition (Cii) only.
11 Here, $C_{1,3}$ is the operator that (metrically) contracts the first index of S with the one introduced by ∇, and $C^1_1$ is the operator that (naturally) contracts the first index of R with the one introduced by ∇.

5.2. Chern vs. Berwald. One needs to keep in mind a discussion present in [21]. The metric connection HA is the underlying nonlinear connection of an infinite family of A-anisotropic connections ∇. One of them is the (Levi-Civita)-Chern connection of L, which is the horizontal part of Chern-Rund's and Cartan's classical connections and has Christoffel symbols (5). All the others are this one plus an anisotropic tensor Q ∈ T 1 2 (M A ) with Q(−, C) = 0 when viewed as an F(A)-bilinear map X(M A ) × X(M A ) → X(M A ). In particular, for Q = −Lan ♯ , one gets the Berwald anisotropic connection of L, which is the horizontal part of Berwald's and Hashiguchi's classical connections and has Christoffel symbols (6). We did not select any of these ∇'s a priori.
In some of the previous literature [6,29,32,33], the Finslerian divergence of vector fields was chosen to be defined directly with the Chern connection. In [36,30], the quantity trace(∇Z), with ∇ the Chern anisotropic connection, was referred to as the divergence of Z, though only after it had appeared in the divergence theorem. We have proven that the most natural definition leads to this characterization, hence clarifying why using Chern's covariant derivative is not arbitrary. Moreover, we have seen that said derivative fulfills the natural requirement (11) and is compatible with the lowering and raising of indices; these are key properties when it comes to the stress-energy tensor T . Still, it is important to compare this with what happens when one uses the other most natural covariant derivative: Berwald's.
Remark 5.5 (Divergence in terms of the Berwald connection). Let ∇ be the Chern anisotropic connection of L, with Christoffel symbols (5), and $\widetilde\nabla$ be the Berwald one, with symbols (6).
(i) (33) and (41) can be rewritten in terms of $\widetilde\nabla$ and the mean Landsberg tensor $\mathrm{Lan}_m$ (see (7)), with contraction operators that have the obvious meanings. Moreover, for $X \in X(M_A)$,
$$\mathrm{trace}(T(\nabla X)) = T^i{}_j\,\nabla_i X^j = T^i{}_j\,\widetilde\nabla_i X^j + T^i{}_j\,\mathrm{Lan}^j{}_{ik}\,X^k = \mathrm{trace}(T(\widetilde\nabla X)) + \mathrm{trace}(\mathrm{Lan}^\sharp(T(-), X)),$$
which makes (11) consistent with the previous formulas.
(ii) One sees that the vanishing of $\mathrm{Lan}_m$ (or of the mean Cartan tensor $C_m$, see [39, (6.37)]) implies that the divergence of elements of $X(M_A)$ coincides with the trace of their Berwald covariant derivative. However, $\mathrm{Lan}_m = 0$ (or even $C_m = 0$) is not enough if one wants to obtain the same characterization for elements of $\mathcal{T}^1_1(M_A)$.
Remark 5.6 (Sufficient conditions for $l^H_X g = 0$ and being Finslerian Killing). In Rem. 5.2 one could see that X ∈ X(M) together with $\nabla_C X = 0$ is sufficient for X to be Killing. This condition does not privilege the Chern connection ∇ against the Berwald $\widetilde\nabla$: $\nabla_C X = \widetilde\nabla_C X$ (see [17, (38)], where $L^\flat$ is what here we would denote $\mathrm{Lan}^\sharp$). However, when it comes to the stress-energy tensor, we have seen that the relevant condition is not this, but rather $l^H_X g = 0$. Prop. 5.1 (A) implies that $\nabla^v X = 0$ is sufficient for $l^H_X g_v = 0$, and this does privilege ∇ against $\widetilde\nabla$.
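To make the comparison concrete for vector fields, the following sketch traces the difference of the two connections. It assumes that the Christoffel symbols are related by $\Gamma^i{}_{jk} = \widetilde\Gamma^i{}_{jk} + \mathrm{Lan}^i{}_{jk}$ (the relation suggested by the identity for trace(T(∇X)) in Rem. 5.5 (i)) and that the mean Landsberg tensor is the contraction $\mathrm{Lan}^i{}_{ik}$; the normalization in (7) may differ, so this is an illustration under stated assumptions.

```latex
% Assumed: \Gamma^i{}_{jk} = \widetilde\Gamma^i{}_{jk} + \mathrm{Lan}^i{}_{jk}
%          (Chern = Berwald + Landsberg), with \mathrm{Lan}^i{}_{ik}
%          playing the role of the mean Landsberg tensor.
\begin{align*}
\operatorname{div}(Z)
  &= \delta_i Z^i + \Gamma^i{}_{ik}\,Z^k \\
  &= \delta_i Z^i + \widetilde\Gamma^i{}_{ik}\,Z^k + \mathrm{Lan}^i{}_{ik}\,Z^k \\
  &= \operatorname{trace}(\widetilde\nabla Z) + \mathrm{Lan}^i{}_{ik}\,Z^k .
\end{align*}
```

Under these assumptions the Chern and Berwald traces of ∇Z agree exactly when the contracted Landsberg term vanishes on Z, in line with Rem. 5.5 (ii).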

5.3. Finslerian conservation laws and main examples.
Compare the results here with the classical case (14) and also with [30].
is an anisotropic 2-tensor, and (iv) D ⊆ U is a domain with ∂D smooth up to a subset of 0 (n − 1)-dimensional measure on M and Supp(X V ) ∩ D compact, then where C m is the mean Cartan tensor and DV is computed with the metric nonlinear connection (4).
Proof. Just take Z = T (X) in Th. 4.13 and use part (B) of Prop. 5.1.
Remark 5.8. Observe that (44) allows for an interpretation of the divergence of T in terms of the flow across the boundary. Consider a sequence of domains $D_m$ such that their volumes go to zero when m → +∞, an observer V that is infinitesimally parallel at p ∈ M, namely, DV = 0 at p, and X such that $\nabla^v X = 0$. Then (44) and the mean value theorem imply that
$$\mathrm{div}(T)_v(X) = \lim_{m\to+\infty} \frac{1}{\mathrm{Vol}_V(D_m)} \int_{\partial D_m} \imath_{T(X)_V}(dVol_V),$$
where $\mathrm{Vol}_V(D_m)$ is the volume of $D_m$ computed with $dVol_V$. In particular, $\mathrm{div}(T)_v = 0$ can be interpreted as saying that the observer v measures conservation of energy in its restspace.
Corollary 5.9. In the setting of the previous corollary, assume: (i) div(T ) V = 0.
Proof. It follows from Cor. 5.7, taking into account that the hypotheses (i), (ii) and (iii) imply that the first three integrals in (44) vanish.
Remark 5.10. (i) Obviously, div(T) = 0 suffices, but we do not need to assume that the divergence vanishes for all observers. (iii) Although the hypothesis may seem artificial as it stands, there are a number of natural situations in which it is guaranteed. First, in classical Relativity (g, T and X isotropic), because $C_m = 0$ and $\dot\partial(T(X)) = 0$; the result is then independent of V. Second, when the observer field is parallel (DV = 0), trivially. Third, when DV = θ ⊗ V for some 1-form θ and T(X) is 0-homogeneous, because of Euler's theorem. And fourth, in the situation described in [30, §5.1] (Z is our T(X), s is our V and I is our $C_m$).
Remark 5.11 (Representations of (45)). One needs to keep in mind Rem. 4.14. For a smooth part Γ of ∂D, one can use the (salient) Riemannian unit normal to represent the boundary term as in Rem. 4.14 (i) (46) when Γ is non-$g_V$-lightlike, and the Finslerian unit normal to represent it as in Rem. 4.14 (ii) when L is Lorentz-Finsler and Γ is L-spacelike. This makes it possible to have the very same conservation law (45) written in distinct ways, and in the examples below we will see that different expressions are preferable in different situations.
In the remainder of the section, we analyze the Finslerian conservation laws in two settings in which L is Lorentz-Finsler. In particular, g has signature (+, −, ..., −), A determines a time orientation, L > 0 on A, and (A, L) is maximal with these properties. We also have regularity conditions at ∂A, and in fact one sees that Th. 4.13 and Cor. 5.9 still hold when allowing that Z, X ∈ X(M A ), T ∈ T 1 1 (M A ) and V ∈ X A (U ). Despite this, in both settings it will be necessary to take V as L-timelike, so the regularity at ∂A will not be used.

5.3.1. Example: Lorentz norms on an affine space. In this example, we shall particularize Cor. 5.9 to the easiest Finslerian setting in which we can ensure that its hypothesis (iii) holds. Namely, the structure of an affine space automatically provides an infinite number of parallel observer fields, V ∈ X A (M ) with DV = 0.
To be precise, suppose that M = E is an affine space equipped with a Lorentz norm on an open conic subset A * ⊆ E \ 0 (a positive pseudo-Minkowski norm with Lorentzian signature in [20, Def. 2.11]). Under the usual identifications, such a norm can be seen as a Lorentz-Finsler L on A ⊆ TE \ 0 ≡ E × E \ 0 that is independent of the first factor. Consequently, its fundamental tensor is nothing more than a Lorentzian scalar product $g_v$ for each v ∈ A *. The metric nonlinear connection of L coincides with the canonical connection of E, hence so do the Chern and Berwald anisotropic connections (see footnote 12). This is what implies that the parallel V ∈ X A (E) correspond exactly to the elements v ∈ A *.
Let us introduce some notation. Given $(p_0, v) \in A$ with L(v) = 1, we can consider the Lorentzian scalar product $g_v$ and the hyperplane R through $p_0$ that is $g_v$-orthogonal to v. Let Ω be a compact domain of R with ∂Ω ⊆ R smooth up to a null (n − 2)-dimensional measure set, and let $n_v$ be its salient unit $(-g_v|_R)$-normal. Then for $t_0 < t_1$, the compact domain $D \equiv [t_0, t_1] \times \Omega \subseteq E$ has the required smoothness to apply Cor. 5.9, its boundary is $\partial D = \{t_1\} \times \Omega \,\cup\, [t_0, t_1] \times \partial\Omega \,\cup\, \{t_0\} \times \Omega$, and its salient $g_v$-normal is given by v on $\{t_1\} \times \Omega$, by $n_v$ on $[t_0, t_1] \times \partial\Omega$ and by −v on $\{t_0\} \times \Omega$.
Remark 5.12. For a V ∈ X A (E) identifiable with v ∈ A *, we know that the hypothesis (iii) of Cor. 5.9 holds automatically. If (i) and (ii) hold too, then we get (45), for which we can use the representation (46). However, given the nature of the metric "nonlinear" and Chern "anisotropic" connections, it is easy to convince oneself that evaluating the result of anisotropic computations on this V is the same as first evaluating on V and then computing with isotropic tensors. For instance, $\mathrm{div}(T)_V = \mathrm{div}(T_V)$ and $l^H_X g_V = L_{X_V}(g_V)$. As a consequence, mathematically we get exactly the same conservation laws as if we just were in the Lorentzian affine space $(E, g_v)$. Physically, though, different observers will measure different momenta.
where dσ V is identifiable with the volume form of − g v | Ω on {t µ } × Ω and coincides with the volume form of Physically, even though Lorentz norms generalize Very Special Relativity [3], the classical interpretations of Special Relativity are still valid; we list them for completeness: v is an instantaneous observer at an event p 0 , R is its restspace and R is the simultaneity hyperplane of v, namely the "universe at an instant, say t = 0, as seen by v". The affine space structure allows for a canonical propagation of v to all of the spacetime. Hence, if Ω is a space region at t = 0, then D is the "evolution of Ω along the time interval [t 0 , t 1 ] as witnessed by v". (47) expresses that the variation after some time of the total amount of X v -momentum in Ω is exactly equal to the amount of it that flowed across ∂Ω.

5.3.2. Example: Cauchy hypersurfaces in a Finsler spacetime. Here we present a construction which manifestly generalizes that of the previous example, again with straightforward physical interpretations, and we find an estimate that allows us to interpret (47) when ∂Ω is "at infinity". We will take V ∈ X A (U ) with U ⊆ M open, and we recall that we will assume the hypotheses of Cor. 5.9.
Suppose that the Finsler spacetime (M, L) is globally hyperbolic. By this, we mean that there is some (smooth, for simplicity) L-Cauchy hypersurface S ⊆ M: every inextensible L-timelike curve γ : I → M (thus $\dot\gamma(t) \in A$) meets S exactly once. Let us assume that there are two L-spacelike Cauchy hypersurfaces $S_0, S_1 \subseteq U$ which do not intersect (see footnote 13). Then the results of [2] can be automatically transplanted: there exists a foliation by spacelike Cauchy hypersurfaces M ≡ R × S such that $S_0 \equiv \{t_0\} \times S$ and $S_1 \equiv \{t_1\} \times S$. Taking the Finslerian unit normal ξ to each level {t} × S produces an L-timelike field ξ ∈ X A (M). We can take this ξ to be our V, but we will not do so for most of this example.
Suppose also that $\Omega_{0,m}$ is an exhaustion by compact domains of $S_0$, namely $\Omega_{0,m} \subseteq \Omega_{0,m+1}$ and $\bigcup_{m\in\mathbb{N}} \Omega_{0,m} = S_0$, such that $\partial\Omega_{0,m} \subseteq S_0$ is smooth a. e. For $p \in S_0$, let $\gamma_p$ be the integral curve of V starting at p, which necessarily meets $S_1$ at a unique instant $t_p \in \mathbb{R}$.
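Put $\Omega_{1,m} := \{\gamma_p(t_p) : p \in \Omega_{0,m}\}$, and let $D_m$ (resp. $\Gamma_m$) be the union of the segments of $\gamma_p$ between p and $\gamma_p(t_p)$ for $p \in \Omega_{0,m}$ (resp. $p \in \partial\Omega_{0,m}$).
13 The case when they intersect can also be considered by taking into account that, then, the open set M \ J + (S1 ∪ S2) is still globally hyperbolic and a Cauchy hypersurface S3 of this open subset will also be Cauchy for M (and it will not intersect any of the previous ones).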
Remark 5.14. By construction, (i) Ω 1,m is again an exhaustion by compact domains of S 1 such that (ii) D m is a compact domain of U with ∂D m = Ω 1,m ∪ Γ m ∪ Ω 0,m ⊆ U smooth a. e. We do not really need to consider the union of all the D m 's.
Next, for Z ∈ X(M A ), we shall give the quantitative decay condition on (some components of) Z V so that the integral vanishes in the limit. The key fact for it will be that V is everywhere tangent to Γ m (this is composed of γ p 's). In particular, as V is g V -timelike, so must be Γ m .
Remark 5.15. The presence of $V$ allows us to define an auxiliary Riemannian metric $h_V$ on $U$ with norm $\|-\|_V$, which gives a very natural way of quantifying the decay. Namely, if $\{e_0 = V_p/F(V_p), e_1, \ldots, e_n\}$ is an orthonormal basis for $g_{V_p}$, then we prescribe it to be also $h_{V_p}$-orthonormal; equivalently,
\[
h_V := g_V - \frac{2}{g_V(V,V)}\, g_V(V,-) \otimes g_V(V,-).
\]
Then, by construction:
(i) The volume form of $h_V$ coincides with that of $g_V$, namely $d\mathrm{Vol}_V$.
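(ii) The salient unit $h_V$-normal to $\Gamma_m$ coincides with the corresponding $g_V$-normal. We denote it by $N_V$, as in Rem. 5.11.
(iii) The hypersurface volume form of $\Gamma_m$ with respect to $h_V$ coincides with the one computed with $g_V$, namely $d\sigma_V = i_m^*(\imath_{N_V}(d\mathrm{Vol}_V))$ with $i_m : \Gamma_m \hookrightarrow U$ the inclusion. Hence we speak just of the hypersurface volume of $\Gamma_m$, namely $\sigma_V(\Gamma_m)$.
As $N_V$ is $g_V$-orthogonal to $V$, and hence $g_V$-spacelike, we can use the representation
\[
\int_{\Gamma_m} \imath_{Z_V}(d\mathrm{Vol}_V) = \int_{\Gamma_m} g_V(N_V, Z_V)\, d\sigma_V. \tag{48}
\]
Thanks to (48) and the fact that $g_V(N_V, V) = 0$, we intuitively see that if $Z_V$ is proportional to $V$ at infinity and the hypersurface volume does not grow too much, then the integral will be negligible. To be precise, we require that
\[
\lim_{m \to \infty} K_m\, \sigma_V(\Gamma_m) = 0, \tag{49}
\]
where
\[
K_m := \max_{\Gamma_m} \|Z_V^{\perp}\|_V, \qquad Z_V^{\perp} := Z_V - \frac{g_V(Z_V, V)}{g_V(V,V)}\, V
\]
is computed from the component of $Z_V$ that is $g_V$-orthogonal to $V$.
Corollary 5.16. In the above set-up, let $T \in \mathcal{T}^1_1(M_A)$, $X \in \mathfrak{X}(M_A)$ and $V \in \mathfrak{X}^A(U)$ be such that the hypotheses of Cor. 5.9 hold on all the $D_m$'s, and put $Z := T(X)$. If the decay condition (49) holds too, then
\[
\lim_{m \to \infty} \left( \int_{\Omega_{1,m}} \imath_{Z_V}(d\mathrm{Vol}_V) + \int_{\Omega_{0,m}} \imath_{Z_V}(d\mathrm{Vol}_V) \right) = 0, \tag{50}
\]
where $\Omega_{1,m}$ is constructed from $\Omega_{0,m}$ by intersecting the integral curves of $V$ with $S_1$.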
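Proof. Cor. 5.9 can be applied on $D_m$, as $\mathrm{Supp}(Z_V) \cap D_m$ is always compact. This and the representation (48) give
\[
\int_{\Omega_{1,m}} \imath_{Z_V}(d\mathrm{Vol}_V) + \int_{\Gamma_m} g_V(N_V, Z_V)\, d\sigma_V + \int_{\Omega_{0,m}} \imath_{Z_V}(d\mathrm{Vol}_V) = 0. \tag{51}
\]
Let $Z_V^{\perp}$ denote the component of $Z_V$ that is $g_V$-orthogonal to $V$. Using the definition of $h_V$ (Rem. 5.15), the fact that $g_V(N_V, V) = 0$ and the Cauchy-Schwarz inequality,
\[
\left| \int_{\Gamma_m} g_V(N_V, Z_V)\, d\sigma_V \right|
= \left| \int_{\Gamma_m} h_V(N_V, Z_V^{\perp})\, d\sigma_V \right|
\leq \int_{\Gamma_m} \|N_V\|_V\, \|Z_V^{\perp}\|_V\, d\sigma_V
\leq K_m\, \sigma_V(\Gamma_m),
\]
so if $K_m\, \sigma_V(\Gamma_m)$ tends to 0, then so does the integral along $\Gamma_m$ in (51).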
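The pointwise estimate behind this proof can be checked numerically at a single tangent space. The following is only a minimal sketch under stated assumptions: $g_V$ is replaced by a fixed Lorentzian product of signature $(-,+,+)$, the vectors `V`, `N_raw` and `Z` are arbitrary illustrative choices, and the helper names `g`, `h` and `orthogonal_part` are hypothetical. It uses the description of $h_V$ in Rem. 5.15 (a $g_V$-orthonormal basis containing $V/F(V)$ is declared $h_V$-orthonormal), which amounts to reversing the sign of $g_V$ along $V$.

```python
import math

def g(a, b):
    """Lorentzian product of signature (-, +, +): a stand-in for g_V at one point."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def h(a, b, V):
    """Auxiliary Riemannian product: g with the sign reversed along V."""
    return g(a, b) - 2.0 * g(V, a) * g(V, b) / g(V, V)

def orthogonal_part(Z, V):
    """Component of Z that is g-orthogonal to V."""
    coef = g(Z, V) / g(V, V)
    return [z - coef * v for z, v in zip(Z, V)]

V = [2.0, 1.0, 0.5]        # timelike: g(V, V) = -2.75 (illustrative choice)
N_raw = [1.0, 2.0, 0.0]    # g(N_raw, V) = 0, so N_raw is spacelike
N = [x / math.sqrt(g(N_raw, N_raw)) for x in N_raw]  # unit normal orthogonal to V
Z = [3.0, -1.0, 4.0]       # arbitrary vector playing the role of Z_V

Z_perp = orthogonal_part(Z, V)
flux_density = g(N, Z)                   # integrand g_V(N_V, Z_V) of the lateral integral
bound = math.sqrt(h(Z_perp, Z_perp, V))  # h-norm of the part of Z orthogonal to V

print(abs(flux_density) <= bound)
```

Only the part of `Z` orthogonal to `V` contributes to `flux_density`, which is why a field proportional to $V$ produces no flux through $\Gamma_m$.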
Remark 5.17. In Cor. 5.16, if one of the integrals of $\imath_{Z_V}(d\mathrm{Vol}_V)$ along $S_0$ or $S_1$ exists in the Lebesgue sense, then so does the other and (50) reads
\[
\int_{S_1} \imath_{Z_V}(d\mathrm{Vol}_V) = - \int_{S_0} \imath_{Z_V}(d\mathrm{Vol}_V).
\]
Note that they could be $\pm\infty$, as we have not assumed, for instance, that $Z_V$ is compactly supported in the union of all the $D_m$'s. Rather, we have assumed the decay condition (49) alone.
Remark 5.18 (Sufficient conditions for (49)). As for ensuring the decay condition, there are two possible scenarios.
(i) The hypersurface volume σ V (Γ m ) stays bounded. Then, it is enough for (49) that K m → 0, and one could instead postulate the stronger condition that the maximum outside D m tends to 0, which is independent of the concrete compact exhaustion.
(ii) σ V (Γ m ) grows without bound. In this case, one can just postulate that the decay of K m compensates the growth of σ V (Γ m ), but this does depend on the compact exhaustion.
Notice that this is a purely Finslerian difficulty. Indeed, suppose that g, T and X were isotropic and that Z = T (X) was timelike. Then one could just set V := Z and then carry out all the construction. Cor. 5.9 would be independent of the observer field (and its hypothesis (iii) would hold trivially), and K m = 0 regardless of Γ m . This is how we get the following statement of the classical law.
Corollary 5.19. In the above set-up, suppose that L comes from a Lorentzian metric g on M . Let T ∈ T 1 1 (M ) and X ∈ X(M ) be such that div(T ) = 0 and either T ♭ (−, ∇ − X) is antisymmetric, or T ♭ is symmetric and L X g = 0. If Z := T (X) is timelike, then (50) holds with V := Z, where Ω 1,m is constructed from Ω 0,m by intersecting the integral curves of Z with S 1 .
Remark 5.20 (Conservation in terms of the Finslerian unit normal).
(i) One could try to represent also the integrals of (50) in terms of dσ V , as in §5.3.1. However, according to Rem. 5.11, that would require assuming that S µ is non-g V -lightlike, which is not very reasonable when all we know is that S µ is L-spacelike and L-Cauchy.
(ii) On the other hand, in terms of the Finslerian unit normal ξ, (50) takes the form (52) when m → ∞. The sign in front of the second integral is explained as follows (see Rem. 4.14 (ii)). dΣ ξ V selects an orientation on each Ω µ,m : the one for which dVol V (ξ, −, ..., −) is positive. However, in (50) Ω 1,m already had an orientation O 1 and Ω 0,m had O 0 : the D m -salient ones. Necessarily,$^{14}$ exactly one of these agrees with the dΣ ξ V -orientation: O 1 if S 1 lies in the future of S 0 and O 0 if it is the opposite. Notice that this, and hence (52), would fail if the Cauchy hypersurfaces crossed.
(iii) In the case V = ξ, (52) becomes a conservation law in which all the terms are purely Finslerian.
Summing up, in this example we have proven a Finslerian (observer-dependent) version of the classical law that the total amount of X-momentum in the universe is conserved (Cor. 5.16). Our formulation is asymptotic, so it is valid even for infinite total X V -momentum (Rem. 5.17). We have recovered the classical law (Cor. 5.19), which always holds under hypotheses on T and X alone, while in the general Finslerian case nontrivial difficulties appear in the regime of large separation between the Cauchy hypersurfaces (high σ V (Γ m ), Rem. 5.18). Finally, we have expressed the law naturally in terms of the Finslerian unit normal (see (52)).

Conclusions
About the physical interpretation of T , §3:
(1) Heuristic interpretations from fluids, §3.1 and 3.2. Possible breakings of Lorentz-invariance lead to non-trivial transformations of coordinates between observers. Such transformations are still linear and permit a well-defined energy-momentum vector at each tangent space T p M , §3.1. However, the stress-energy-momentum T must not be regarded as a tensor on each T p M , but as an anisotropic tensor. This depends intrinsically on each observer u ∈ Σ and may vary with u in a nonlinear way. Indeed, the breaking of Lorentz invariance does not allow one to replicate fully the relativistic arguments leading to (isotropic) tensors on M , even though classical interpretations of the anisotropic T in terms of fluxes can be maintained, §3.2.
(2) Lagrangian viewpoint, §3.3. In principle, the interpretations of Special Relativity about the canonical energy-momentum tensor associated with the invariance by translations remain for Lorentz norms and, thus, in Very Special Relativity. In the case of Lorentz-Finsler metrics, some issues to be studied further appear:
(a) The canonical stress-energy tensor in Relativity δS matter /δg µν leads to different types of (anisotropic) tensors in the Finslerian setting (a scalar function δS matter /δL on A ⊆ TM in the Einstein-Hilbert setting, higher order tensors in Palatini's). Starting at such tensors, different alternatives to recover the heuristic physical interpretations in terms of a 2-tensor appear.
(b) In the particularly interesting case of a kinetic gas [14,16], the 1-PDF φ becomes naturally the matter source for the Euler-Lagrange equation of the Finslerian Einstein-Hilbert functional. However, the variational derivation of φ is obtained by means of a non-natural Lagrangian. This might be analyzed by sharpening the framework of variational completion for Finslerian Einstein equations [13].
About the divergence theorem for anisotropic vector fields Z, §4:
(a) It can be seen as a conservation law for Z measured by each observer field V , even if the conserved quantity depends on V .
(b) The computation of the boundary term is intrinsically expressed in terms of forms. However, several metric elements can be used to re-express it, in particular the normal vector field for: (i) the pseudo-Riemannian metric g V (Rund), or (ii) the pseudo-Finsler metric L, when L is defined on the whole TM (Minguzzi).
About the conservation of the stress-energy T , §5:
(1) §5.1 and 5.2: The computation of div(T ) privileges the Levi-Civita-Chern anisotropic connection, showing explicit equivalence with Rund's approach.
(2) Cors. 5.7 and 5.9: A vector field T (X) V on M is preserved assuming that some natural elements vanish on V for T , X and DV .

Appendix. Kinematics: observers and relative velocities
Here, we discuss a series of different possibilities for the notion of relative velocity between two observers, each one with a well-defined geometric construction. This is done as an academic exercise, because we do not discuss experimental issues (compare with [25,34]). However, it is worth emphasizing that all the possibilities studied here are intrinsic to the geometry of a flat model and, thus, to any Finsler spacetime.
Start with an affine space endowed with a Lorentz norm. Let u, u ′ ∈ Σ be two distinct observers and consider the plane Π := Span{u, u ′ } ⊂ V , which intersects C transversally and inherits a Lorentz-Finsler norm with indicatrix Σ Π := Π ∩ Σ. Recall that both tangent spaces T u Π and T u ′ Π inherit naturally a Lorentz scalar product by restricting the fundamental tensors g u and g u ′ , resp. Moreover, their (1-dimensional) restspaces l := T u Σ Π , l ′ := T u ′ Σ Π also inherit a positive definite metric. In what follows, only the geometry of Π will be relevant.
The Lorentz metric g Π up to a constant. Notice that Π∩C p is composed of two half-lines spanned by two C-lightlike directions w ± ; we will consider the orientation of Π provided by the choice (w + , w − ). One can determine a scalar product g Π in Π (which is unique up to a positive constant), regarding both w + and w − as g Π -lightlike in the same causal cone. It is easy to check that Σ must be a strongly convex curve which converges asymptotically to the vector lines spanned by w ± . This implies both that u ∈ Σ is timelike for g Π and that its restspace l is g Π -spacelike; we can assume also that the orientation l + in l is induced by the chosen w + .
Notice that g u (u, w ± ) ≤ 0 by the fundamental inequality, but w ± might be timelike or spacelike for g u (although g u (u, w ± ) → 0 as u → w ± ). This possibility might be regarded as a possible measurement of the speed of light with respect to u by the observers in Π, namely, this velocity is greater than 1 in the orientation l + when w + is g u -spacelike and smaller than 1 when it is timelike. However, a priori an operational way to carry out such a measurement is not clear. Moreover, such a measurement might be regarded as something intrinsic not to the speed of light but to the way of measuring it.
Nevertheless, as pointed out in [1, Section 6], there are several effects which might lead to a measurement of different speeds of light in different directions. So, we will consider that each Π has its own speeds of light c ± Π in each spacelike orientation l ± . Indeed, given u and an orientation l + , the speed of light c + Π will be defined as the supremum of the relative velocities between u and all the observers u ′ such that u ′ − u yields the orientation l + . Next, we will explain several possible meanings of these velocities. To avoid cluttering, next we will write c Π , assuming that the appropriate choice in c ± Π is done for each u ′ .
Simple relative velocity. As $g_u$ determines naturally a Lorentz metric on $V$, we can define the simple relative velocity $v^s_u(u')$ of $u'$ measured by $u$ as the usual $g_u$-relativistic velocity between $u, u'$ normalized to $c_\Pi$, i.e. $v^s_u(u') = c_\Pi \tanh(\theta)$ where $\cosh\theta = -g_u(u, u') > 1$ (the latter by the reversed fundamental inequality). Clearly, $v^s_{u'}(u) \neq v^s_u(u')$ in general, but this does not seem a drawback in the Finslerian setting.
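The definition can be illustrated with a short numerical sketch. It assumes the isotropic case only as a sanity check: `minkowski` is a hypothetical stand-in for $g_u$ (in a genuinely Finslerian example, $g_u$ would change with $u$), `boosted_observer` is a hypothetical helper producing a unit timelike vector, and $c_\Pi$ is set to 1.

```python
import math

def simple_relative_velocity(g_u, u, u_prime, c_pi=1.0):
    """v^s_u(u') = c_Pi * tanh(theta), with cosh(theta) = -g_u(u, u')."""
    cosh_theta = -g_u(u, u_prime)
    if cosh_theta < 1.0:
        raise ValueError("expected -g_u(u, u') >= 1 (reversed fundamental inequality)")
    return c_pi * math.tanh(math.acosh(cosh_theta))

def minkowski(a, b):
    """Flat Lorentzian product of signature (-, +); isotropic stand-in for g_u."""
    return -a[0] * b[0] + a[1] * b[1]

def boosted_observer(v):
    """Unit timelike vector moving with velocity v (|v| < 1) in the plane."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma, gamma * v)

u = (1.0, 0.0)
v_measured = simple_relative_velocity(minkowski, u, boosted_observer(0.6))
print(v_measured)
```

In this isotropic check the function returns the usual relativistic velocity and is symmetric in the two observers; the asymmetry mentioned in the text can only appear when $g_u$ and $g_{u'}$ differ.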
A support for the physical plausibility of this velocity is that one could expect that each observer u will work as in Special Relativity just choosing an orthonormal frame of g u . The possibility g u (v, v) = 1 might seem awkward from a dynamical viewpoint (see below), but it seems harmless as far as only kinematics is being considered. In principle, the comparison between the measurements of the two observers would be geometrically possible by using the unique isometry from (T u Π, g u ) to (T u ′ Π, g u ′ ) which maps u into u ′ and is consistent with orientations induced from Π. What is more, this isometry can also be extended to a natural isometry from (T u V, g u ) to (T u ′ V, g u ′ ), namely, regard (Σ, g) as a Riemannian metric and use the parallel transport from u to u ′ along the segment of the curve Π ∩ Σ from u to u ′ . However, the following fact might suggest exploring further possibilities.
Remark 7.1. Assume that $\Sigma$ is modified into the indicatrix $\tilde\Sigma$ of another Lorentz-Finsler norm so that (i) $\tilde\Sigma = \Sigma$ around $u$ and (ii) $u' \in \tilde\Sigma$ but its $\tilde\Sigma$-restspace $\tilde l'$ is different from $l'$. Then, the simple velocity would remain unaltered, i.e., $\tilde v^s_u(u') = v^s_u(u')$.
Velocity as a distance between observers. Notice that $\Sigma$ can be regarded as a Riemannian manifold with the restriction of the fundamental tensor $g$ and, then, $\Sigma \cap \Pi$ can be regarded as a curve whose length can be computed. Then, the observers' distance velocity is defined as:
\[
v^d(u, u') = c_\Pi \tanh\big( \mathrm{length}_g\{\text{segment of } \Sigma \cap \Pi \text{ from } u \text{ to } u'\} \big).
\]
Notice that this velocity is symmetric and it generalizes directly the one in Special Relativity providing a geometric interpretation for the addition of velocities. Recall that v d (u, u ′ ) has been defined essentially as a distance in Σ ∩ Π, where Π depends on each pair of observers, thus, one might have v d (u, u ′ ) + v d (u ′ , u ′′ ) < v d (u, u ′′ ) when n > 2. If one prefers to avoid such a possibility, it is enough to consider g-distance in the whole space of observers Σ (observers' space distance velocity), at least in the case that c Π is regarded as independent of Π.
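The link with the relativistic addition of velocities can be checked numerically. The sketch below is only an illustration under explicit assumptions: the indicatrix is the Minkowski hyperbola (parametrized by the hypothetical helper `hyperbola`), the fundamental tensor is the same flat product at every point, and $c_\Pi = 1$. The routine `curve_length` accepts a point-dependent tensor `g_at`, so an anisotropic example could be plugged in, but none is provided here.

```python
import math

def curve_length(point, g_at, s0, s1, steps=2000):
    """Approximate length of s -> point(s): sum of chord lengths, each chord
    measured with the fundamental tensor g_at(midpoint of the step)."""
    total = 0.0
    ds = (s1 - s0) / steps
    for i in range(steps):
        p = point(s0 + i * ds)
        q = point(s0 + (i + 1) * ds)
        mid = point(s0 + (i + 0.5) * ds)
        chord = [qi - pi for pi, qi in zip(p, q)]
        total += math.sqrt(g_at(mid)(chord, chord))
    return total

def minkowski(a, b):
    """Flat Lorentzian product of signature (-, +); positive on the hyperbola's tangents."""
    return -a[0] * b[0] + a[1] * b[1]

def hyperbola(s):
    """Minkowski indicatrix in the plane, parametrized by rapidity s."""
    return (math.cosh(s), math.sinh(s))

def distance_velocity(s0, s1, c_pi=1.0):
    """v^d = c_Pi * tanh(length of the indicatrix segment between the observers)."""
    length = curve_length(hyperbola, lambda x: minkowski, s0, s1)
    return c_pi * math.tanh(length)

theta1, theta2 = math.atanh(0.5), math.atanh(0.6)
v_total = distance_velocity(0.0, theta1 + theta2)
print(v_total, (0.5 + 0.6) / (1.0 + 0.5 * 0.6))
```

Since lengths along the curve add up, composing two observers' distance velocities in a fixed plane reproduces the relativistic addition law in this flat example.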
Remark 7.2. In the case studied in Remark 7.1, one would have $\tilde v^d(u, u') \neq v^d(u, u')$ in general. However, the relative position of the restspaces $l$ and $l'$ does not play any special role.
Length-contraction and velocity. Consider a segment $S$ of $l$ with $g_u$-length $\ell$ and the strip of $V$ obtained by translating $S$ in the direction of $u$. Let $S'$ be the intersection of this strip with $l'$, which will be a new segment of $g_{u'}$-length $\ell'$. Let $\lambda = \ell'/\ell$ be the length-contraction parameter. In the relativistic case, $\lambda < 1$ and $\lambda \to 0$ as $u' \to C_\Pi$. The former property does not hold for a general Lorentz norm but the latter does. So, whenever $\lambda < 1$ holds, we can define the length-contractive velocity $v^c_u(u')$ of $u'$ with respect to $u$ as: $v^c_u(u') = c_\Pi \sqrt{1 - \lambda^2}$. Again, this velocity is not symmetric. Because of the strong convexity of $\Sigma$, a different observer $u'$ will have a different restspace $l'$, but this does not imply a different length $\ell'$ nor velocity $v^c_u(u')$. However, this velocity gives a comparison between restspaces which was absent in the previous two velocities.
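As a sanity check, the relativistic case can be worked out explicitly. The computation below is a sketch under assumptions not made in the text: $\Pi$ is the Minkowski plane with coordinates $(t, x)$ and scalar product $-dt^2 + dx^2$, $c_\Pi = 1$, $u = (1, 0)$ and $u' = \gamma(1, v)$ with $\gamma = (1 - v^2)^{-1/2}$ and $|v| < 1$.

```latex
\begin{align*}
l &= \operatorname{Span}\{(0,1)\}, &
l' &= \operatorname{Span}\{(v,1)\}, \\
S &= \{(0,x) : 0 \le x \le \ell\}, &
S' &= \{(vx, x) : 0 \le x \le \ell\}, \\
\ell' &= \sqrt{-(v\ell)^2 + \ell^2} = \ell \sqrt{1 - v^2}, &
\lambda &= \frac{\ell'}{\ell} = \sqrt{1 - v^2} < 1, \\
v^c_u(u') &= \sqrt{1 - \lambda^2} = |v|. &&
\end{align*}
```

Here $S'$ is obtained by intersecting the strip $\{0 \le x \le \ell\}$ with $l'$, so $\lambda \to 0$ as $|v| \to 1$ and the length-contractive velocity reduces to the usual relative speed.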
Symmetric Lorentz velocities in $\Pi$. Let us consider the Lorentzian scalar product $g_\Pi$ on $\Pi$, unique up to a positive constant (which will be irrelevant for our purposes) introduced above. Recall that $u$ and $u'$ were timelike for $g_\Pi$ and, moreover, both $l$ and $l'$ were spacelike. Now, we can define two velocities between $u$ and $u'$: the simple Lorentz velocity,
\[
v^s(u, u') = c_\Pi \tanh(\theta) \quad \text{where} \quad \cosh\theta = -\frac{g_\Pi(u, u')}{\sqrt{g_\Pi(u, u)\, g_\Pi(u', u')}},
\]
and the length-contractive Lorentz velocity,
\[
v^c(u, u') = c_\Pi \tanh(\theta) \quad \text{where} \quad \cosh\theta = \frac{|g_\Pi(n, n')|}{\sqrt{g_\Pi(n, n)\, g_\Pi(n', n')}},
\]
where, in the latter, $n, n'$ are $g_\Pi$-timelike vectors orthogonal to $l, l'$, resp. Clearly, both velocities are symmetric. Their appearance might be physically sound because the intrinsic Lorentz metric $g_\Pi$ (up to a constant) can be regarded as an object available (or, at least, a compromise one) for all the observers, as it would depend directly on physical light rays.
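The simple Lorentz velocity can be computed directly in coordinates adapted to the lightlike directions. The sketch below rests on assumptions introduced only for illustration: vectors of $\Pi$ are written as pairs $(a, b)$ meaning $a\,w_+ + b\,w_-$, the scalar product is normalized by $g_\Pi(w_+, w_-) = -k$ with an arbitrary constant $k > 0$, and the function names are hypothetical.

```python
import math

def g_pi(x, y, k=1.0):
    """g_Pi in lightlike coordinates: x = (a, b) stands for a*w_plus + b*w_minus.
    Both w_plus and w_minus are null and g_Pi(w_plus, w_minus) = -k < 0."""
    return -k * (x[0] * y[1] + x[1] * y[0])

def simple_lorentz_velocity(u, u_prime, c_pi=1.0, k=1.0):
    """c_Pi * tanh(theta), cosh(theta) = -g(u, u') / sqrt(g(u, u) * g(u', u'))."""
    num = -g_pi(u, u_prime, k)
    den = math.sqrt(g_pi(u, u, k) * g_pi(u_prime, u_prime, k))
    theta = math.acosh(max(1.0, num / den))  # clamp guards against rounding below 1
    return c_pi * math.tanh(theta)

def closed_form(u, u_prime, c_pi=1.0):
    """Equivalent expression |a b' - a' b| / (a b' + a' b), valid for a, b, a', b' > 0."""
    a, b = u
    a2, b2 = u_prime
    return c_pi * abs(a * b2 - a2 * b) / (a * b2 + a2 * b)

# Minkowski check with w_plus = (1, 1), w_minus = (1, -1) in (t, x) coordinates:
u = (0.5, 0.5)           # the observer (1, 0)
u_prime = (1.0, 0.25)    # the observer gamma * (1, 0.6), with gamma = 1.25
print(simple_lorentz_velocity(u, u_prime), closed_form(u, u_prime))
```

The constant `k` cancels in the quotient, which reflects the remark that the positive constant in $g_\Pi$ is irrelevant, and exchanging the two observers leaves the value unchanged.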