Article

On the Role of Constraints and Degrees of Freedom in the Hamiltonian Formalism

Centre for Theoretical Physics, The British University in Egypt, El Sherouk City 11837, Egypt
Universe 2023, 9(2), 101; https://doi.org/10.3390/universe9020101
Submission received: 18 January 2023 / Revised: 31 January 2023 / Accepted: 15 February 2023 / Published: 16 February 2023
(This article belongs to the Collection Modified Theories of Gravity and Cosmological Applications)

Abstract

Unfortunately, the Hamiltonian mechanics of degenerate Lagrangian systems is usually presented as a mere recipe of Dirac, with no explanation as to how it works. It then comes to discussing conjectures of whether all primary constraints correspond to gauge symmetries, and it goes all the way to absolutely wrong claims such as the statement that electrodynamics or gravity only have two physical components each, with others being spurious. One has to be very careful because non-dynamical, or constrained, does not mean unphysical. In this article, I give a pedagogical introduction to the degenerate Hamiltonian systems, showing both very simple mechanical examples and general arguments about how it works. For the familiar field theory models, I explain why the gauge freedom there “hits twice” in the sense of producing twice as many first-class constraints as gauge symmetries, and why primary, and only primary, constraints should be put into the total Hamiltonian.

1. Introduction

Hamiltonian mechanics is a very nice formulation for the whole plethora of dynamical models, both for classical and quantum physics, as well as for mathematics, since it allows one to work naturally with a system of first-order differential equations. In the case of non-degenerate Lagrangians, we get unconstrained Hamiltonian systems which express the time derivatives of all their canonical variables as functions of the variables themselves. Therefore, we need one and only one Cauchy datum for every canonical variable in order to solve the equations, at least locally.
In every good textbook on classical mechanics, it is explained how the Hamiltonian formalism of non-degenerate systems works. However, when we face degenerate systems, we have to be more careful and to also introduce some constraints on the values of momenta. These are called primary constraints, and they reduce the number of needed Cauchy data because some combinations of canonical variables get to strictly vanish.
Here comes another advantage of the Hamiltonian analysis. Since all equations are of the first derivative order, in order to find the number of Cauchy data, or the degrees of dynamical freedom, we simply need to know how many combinations of canonical variables are subject to such restrictions. There are some limitations to this statement however. First, there might be a gauge freedom, which makes some variables simply not determined by the equations at all. We call them unphysical and care only about gauge-invariant quantities. Second, the restrictions might be more numerous than the primary ones. The equation of a primary constraint staying zero all the time could demand that some other combinations of variables also vanish, and we call those requirements secondary constraints. Third, Hamiltonian constraints do not always have a clear meaning in the Lagrangian setting. For example, if a variable was totally absent from the Lagrangian, it would have an arbitrary dependence on time, however its momentum would be constrained to be zero. I will discuss all of these issues.
Degenerate systems are very important in modern theoretical physics. Indeed, when we have a gauge symmetry, it certainly means that the Lagrangian does not depend on some combination(s) of the fundamental variables in any way, neither through coordinates nor through velocities. Therefore, the system is degenerate and we have primary constraint(s). Very commonly, and I will explain why, such a gauge-related constraint produces another constraint. For example, electrodynamics has one gauge freedom and two constraints, while gravity has four gauge freedoms and eight constraints.
Unfortunately, the procedure of working with such systems is usually explained as just a recipe which, as a matter of fact, works well. It works indeed. However, then comes a strange conjecture from the lectures of Dirac [1] that every “first-class”, to be defined below, constraint corresponds to a gauge freedom. It is taken very seriously in most of the literature, even though, if taken literally, it is simply wrong. In a sense, the familiar secondary first-class constraints are indeed the consequences of the gauge symmetry, however they do not increase the number of these symmetries, nor possible gauge choices. The number of gauge freedoms is the number of primary first-class constraints only, not more, and even this amount is not always realized as a usual gauge symmetry, due to some other types of singular behavior of Lagrangians, as I also explain below.
Following the wrong conjecture, as well as also putting the secondary constraints into the Hamiltonian, leads people to claim that there are only two physical variables in Maxwell electrodynamics, and the same in General Relativity. It is certainly true that those indeed have this number of dynamical modes, but it does not justify calling other non-gauge variables spurious. In doing so, they basically say that Coulomb forces and Newtonian potentials are unphysical, which is a bit too much to say. In this paper, I first show how the Hamiltonian analysis works in the simplest mechanical cases, although also mentioning some modern physics issues, then I explain why it works and how it must be used, and finally I review its application to electromagnetism and gravity in a bit more detail.

1.1. A Brief Review of Hamiltonian Mechanics

I will denote the dependence of a Lagrangian on coordinates and velocities as
$$L(x, y, z) = L(x, \dot{x}, y, \dot{y}, z, \dot{z})$$
which will mean that $L$ is taken as a Lagrangian for a system of all the mentioned variables, even if it does not depend on some or all of them. For every Lagrangian variable, the momentum is defined as
$$\pi_x \equiv \frac{\partial L}{\partial \dot{x}}. \tag{1}$$
In cases when the matrix $\frac{\partial^2 L}{\partial \dot{x}_i \partial \dot{x}_j}$ is non-degenerate, the Lagrangian system of $L(x_i)$ is called non-degenerate, too, and this definition (1) of $\pi = \pi(x, \dot{x})$ can be inverted to $\dot{x} = \dot{x}(\pi, x)$, at least locally.
Then we introduce the Hamiltonian function
$$H(x_i, \dot{x}_i, \pi_{x_i}) = \sum_i \pi_{x_i} \dot{x}_i - L(x_i). \tag{2}$$
For a non-degenerate system, and up to a possible non-uniqueness of solving for velocities in terms of momenta, we can finally define the canonical Hamiltonian as
$$H(x_i, \pi_{x_i}) = H(x_i, \dot{x}_i(\pi_{x_i}, x_i), \pi_{x_i}) = \left.\left(\sum_i \pi_{x_i} \dot{x}_i - L(x_i)\right)\right|_{\dot{x}_i:\ \pi_{x_i} = \frac{\partial L}{\partial \dot{x}_i}}. \tag{3}$$
One can easily find the differential equations satisfied by the Hamiltonian by simply looking at the variation of the Hamiltonian function (2) restricted to the surface defined by imposing the equation for momenta (1):
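$$\delta H = \delta \left.\left(\sum_i \pi_{x_i} \dot{x}_i - L(x_i)\right)\right|_{\pi_{x_i} = \frac{\partial L}{\partial \dot{x}_i}} = \sum_i \left(\dot{x}_i\, \delta \pi_{x_i} - \frac{\partial L}{\partial x_i}\, \delta x_i\right), \tag{4}$$
which implies
$$\frac{\partial H}{\partial \pi_{x_i}} = \dot{x}_i \quad \text{and} \quad \frac{\partial H}{\partial x_i} = -\frac{\partial L}{\partial x_i}.$$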
Of course, the first equation simply solves velocities in terms of momenta, while the second one can then be compared with the Lagrangian equation by observing that the latter tells us that
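$$\frac{\partial L}{\partial x_i} = \frac{d}{dt} \frac{\partial L}{\partial \dot{x}_i} = \dot{\pi}_{x_i},$$
and therefore the Hamiltonian shape of the equations of motion is
$$\dot{x}_i = \frac{\partial H}{\partial \pi_{x_i}} \quad \text{and} \quad \dot{\pi}_{x_i} = -\frac{\partial H}{\partial x_i}. \tag{5}$$
For a convex Lagrangian, it can be simply viewed as a new Legendre transform, which is involutive, and therefore brings us back to the Lagrangian with the variational principle written in terms of $\pi \dot{x} - H$.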
The equations of motion (5) can be presented in the beautiful language of symplectic geometry,
$$\dot{f}(x, \pi) = \{f, H\},$$
such that the canonical coordinates and momenta are the Darboux coordinates for the (anti-symmetric) Poisson bracket: $\{x_i, \pi_{x_j}\} = \delta_{ij}$, with brackets among coordinates or among momenta being zero. Therefore, we have a natural symplectic geometry on the phase space defined by the Poisson bracket of two functions $f(x, \pi)$ and $g(x, \pi)$:
$$\{f, g\} \equiv \sum_i \left(\frac{\partial f}{\partial x_i} \cdot \frac{\partial g}{\partial \pi_{x_i}} - \frac{\partial f}{\partial \pi_{x_i}} \cdot \frac{\partial g}{\partial x_i}\right). \tag{6}$$
Here I will not go into any more detail on that. There are many nice and rigorous expositions of the Hamiltonian mechanics for non-degenerate systems; see for example the book by Arnold [2].
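As a small illustration of the bracket (6) for one degree of freedom, here is a minimal numerical sketch. The helper `poisson_bracket`, the finite-difference step, and the oscillator Hamiltonian are assumptions made purely for this example; they are not part of the formalism above.

```python
def poisson_bracket(f, g, x, p, h=1e-5):
    """Central-difference estimate of {f, g} at the phase-space point (x, p)."""
    df_dx = (f(x + h, p) - f(x - h, p)) / (2 * h)
    df_dp = (f(x, p + h) - f(x, p - h)) / (2 * h)
    dg_dx = (g(x + h, p) - g(x - h, p)) / (2 * h)
    dg_dp = (g(x, p + h) - g(x, p - h)) / (2 * h)
    return df_dx * dg_dp - df_dp * dg_dx


def coordinate(x, p):
    return x


def momentum(x, p):
    return p


def oscillator(x, p):
    # Illustrative non-degenerate Hamiltonian H = (p^2 + x^2) / 2
    return 0.5 * (p**2 + x**2)


x0, p0 = 0.7, -1.3
bracket_xp = poisson_bracket(coordinate, momentum, x0, p0)  # canonical bracket {x, p}
x_dot = poisson_bracket(coordinate, oscillator, x0, p0)     # {x, H} = dH/dp
p_dot = poisson_bracket(momentum, oscillator, x0, p0)       # {p, H} = -dH/dx
```

For this choice of $H$, the two brackets with the Hamiltonian reproduce the right-hand sides of Equations (5), namely $p$ and $-x$ at the chosen point.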

1.1.1. Non-Quadratic Kinetic Lagrangians–Hamiltonians with Branches

Let me briefly mention that it all works with no surprise if the Lagrangian is a positive-definite quadratic form of velocities. Having higher powers of velocities brings its own issues. Of course, even if the Hessian $\frac{\partial^2 L}{\partial \dot{x}_i \partial \dot{x}_j}$ is non-degenerate, it might have some singular phase space points with a vanishing determinant. On top of that, even in the regular parts, the solution $\dot{x} = \dot{x}(\pi, x)$ is often non-unique.
As a very simple example, consider $L(x) = \frac{1}{3} \dot{x}^3$. The momentum is $\pi = \dot{x}^2$, and it is an even function of $\dot{x}$. The Hamiltonian is easily found to be $H = \pm \frac{2}{3} \pi^{3/2}$, and its two branches correspond to positive and negative velocities. This example is very trivial; however, there are other options. For instance, one can also play with time crystals [3].
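The two branches can be checked numerically. The sketch below is only an illustration of the cubic example; the function names are hypothetical and chosen for readability.

```python
import math


def lagrangian(v):
    """L = v^3 / 3 as a function of the velocity v."""
    return v**3 / 3.0


def momentum(v):
    """pi = dL/dv = v^2, an even function of the velocity."""
    return v**2


def hamiltonian_function(v):
    """The Hamiltonian function pi * v - L evaluated at velocity v."""
    return momentum(v) * v - lagrangian(v)


def hamiltonian_branch(pi, sign):
    """Canonical Hamiltonian on the branch v = sign * sqrt(pi)."""
    return sign * (2.0 / 3.0) * pi**1.5


for v in (0.5, 2.0, -0.5, -2.0):
    sign = 1.0 if v > 0 else -1.0
    pi = momentum(v)
    # Both signs of v share the same pi but land on different branches of H
    assert math.isclose(hamiltonian_function(v), hamiltonian_branch(pi, sign))
    # dH/dpi = sign * sqrt(pi) returns the velocity on each branch
    assert math.isclose(sign * math.sqrt(pi), v)
```

The velocities $\pm 2$ give the same momentum $\pi = 4$, yet the Hamiltonian takes opposite values on them, which is exactly the branching described above.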

1.1.2. Degenerate Systems: Dirac’s Recipe

Now I will turn to the main topic of this paper: Hamiltonian systems with constraints. We call a Lagrangian degenerate if its Hessian $\frac{\partial^2 L}{\partial \dot{x}_i \partial \dot{x}_j}$ has a vanishing determinant, and therefore it has a non-trivial kernel and cannot be inverted. Such systems cannot be written as $\ddot{x}_i = f_i(x, \dot{x})$.
In this case, the definition (1) of the momenta in terms of velocities is not invertible either. The variations of velocities are mapped into variations of momenta by action of the Hessian matrix. Assuming some smoothness and constancy of rank, the dimension of its kernel gives the number of independent velocity variations, which do not change the momenta at all. On the other hand, this dimension is the codimension of its image in the momentum space, and the equation of the image submanifold is the set of the primary constraints, in the number equal to the dimension of that kernel. Let us define the primary constraints $\Phi_a$ as equations
$$\Phi_a(\pi_{x_i}, x_i) = 0$$
to be put on top of the usual Hamiltonian equations, with the index $a$ running from 1 to the dimension of the Hessian’s kernel. In other words, the primary constraints are necessary and sufficient (again, at least locally) conditions for a set of momenta $\pi_{x_i}$ to be in the image of the mapping (1).
We can define the Hamiltonian function (2) the same way again. The sad news is that we cannot find the velocities as functions of momenta in order to substitute them there. However, the calculation of its variation (4) goes the same way as before, the only difference being that the assumption of Equation (1) put into this variation generically cannot work, for any velocity, unless we always put the momentum onto the primary constraint manifold. However, this is not something we can control when taking derivatives of the Hamiltonian once we have forgotten the origin of this function.
Moreover, we cannot even find a unique canonical Hamiltonian with the usual property (4) of the first variation. Indeed, if we change it as $H \to H + \sum_a \Phi_a \cdot F_a(x_i, \pi_{x_i})$ with arbitrary functions $F_a$, it does not change Equation (4) because in the case of imposing the condition (1), both on the background and on the variation, we always have $\Phi_a = 0$. Therefore, the next natural step is to take an arbitrary possible choice of the canonical Hamiltonian and define the total Hamiltonian
$$H_T = H + \sum_a \lambda_a \Phi_a \tag{7}$$
with arbitrary functions $\lambda_a$ called Lagrange multipliers.
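The way the multipliers absorb the non-uniqueness can be made explicit with a short worked computation. It is only a sketch under the smoothness assumptions stated above; the primed symbols $H'$ and $H'_T$ and the generic variable $z$ are labels introduced here for illustration.

```latex
% Shift the canonical Hamiltonian by a combination of primary constraints
H' = H + \sum_a \Phi_a F_a
\quad\Longrightarrow\quad
H'_T = H' + \sum_a \lambda_a \Phi_a
     = H + \sum_a \left(\lambda_a + F_a\right) \Phi_a .

% Derivative with respect to any canonical variable z \in \{x_i, \pi_{x_i}\},
% with the multipliers treated as given functions of time
\frac{\partial H'_T}{\partial z}
  = \frac{\partial H}{\partial z}
  + \sum_a \left(\lambda_a + F_a\right) \frac{\partial \Phi_a}{\partial z}
  + \sum_a \Phi_a \frac{\partial F_a}{\partial z},
\qquad
\text{where the last sum vanishes at } \Phi_a = 0 .
```

On the constraint surface the change of canonical Hamiltonian therefore amounts to the relabeling $\lambda_a \to \lambda_a + F_a$, and the shifted multipliers are just as arbitrary as the original ones.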
The main recipe is that the equations are then
$$\dot{x}_i = \frac{\partial H_T}{\partial \pi_{x_i}} \quad \text{and} \quad \dot{\pi}_{x_i} = -\frac{\partial H_T}{\partial x_i} \quad \text{and the constraints} \quad \Phi_a = 0. \tag{8}$$
Dirac preferred to call the last equality a weak one and to denote it as $\Phi_a \approx 0$. It is weak in the sense that it is valid on the constraint surface only, and not when one goes away from it, and in particular the first two equations do get non-trivial terms from differentiating the $\Phi$-s. For me, it is just yet another equation which I do not want to distinguish in any way. On the contrary, if I want to stress that something is identically zero, I use the sign of $\equiv 0$. After all, it is absolutely common that even when a function takes the zero value, its derivative might be non-vanishing. Interestingly enough, Dirac in his paper [4] denoted weak equalities with the usual equality sign, while for strong equalities he used the identical equality notation, even though his definition of a strong equality there was rather that of $\mathcal{O}(\epsilon^2)$ as opposed to $\mathcal{O}(\epsilon)$ type.
My claim about the system of Equations (8) is that, in every reasonable case, this is already the final answer, and nothing else is needed in order to have the equations identical to the Lagrangian ones. One way to look at this is that we are doing the variation (4) always respecting the condition that the momenta are only those which indeed fulfill the condition (1) and therefore do correspond to at least some velocity. I will later give more detailed explanations. However, intuitively one can see it as varying $(p \dot{x} - H + \lambda \Phi)$, with the last term both imposing the necessary requirements on the momenta and reproducing every possible canonical Hamiltonian.
Let me stress it once more. Equations (8) with primary, and only primary, constraints are all that we need. However, when we want to easily and reliably count the degrees of freedom, we need to be more careful.
All in all, when I take $\Phi_a = 0$ as an equation of motion, it immediately implies that $\dot{\Phi}_a = 0$, too. Taken together with other equations, it might or might not entail new constraints, $\tilde{\Phi}_m$, which are called secondary. Using the other equations for that can also be rephrased in terms of Poisson brackets (6):
$$0 = \dot{\Phi}_a = \{\Phi_a, H_T\}$$
which takes the form of a function of coordinates and momenta, with no time-derivatives of those. Then we can do the same with the secondary constraints, to possibly get tertiary ones, and so on. I will call all of them secondary, $\tilde{\Phi}_m$. At every step, if the Poisson bracket of a constraint with the total Hamiltonian is not either identically zero or expressed in terms of the constraints themselves (weakly zero in the language of Dirac), it means either a restriction on the Lagrange multipliers or yet another constraint.
Once all the constraints are found, they reduce the number of independent Cauchy data, while those Lagrange multipliers of primary constraints which have not been fixed in this process count unphysical variables in the system. Here comes the classification of the constraints. We can compute the whole matrix of Poisson brackets of all constraints with each other. If every element of this matrix is “weakly zero”, i.e., the constraints form a closed algebra with respect to the brackets, we say that they are first-class. If the matrix is, to the contrary, non-degenerate on the constraint surface, then they are second-class. If the matrix is neither of these, one can separate the constraints, maybe after some redefinitions, into first- and second-class ones.
If all the constraints are first-class, then the conditions of $\{\Phi_a, H_T\} = 0$ and $\{\tilde{\Phi}_m, H_T\} = 0$ cannot be satisfied by conditions on the Lagrange multipliers. Therefore, in this case every primary constraint goes with an arbitrary $\lambda$ in the Hamiltonian equations of motion (8), normally representing a gauge freedom. If a primary constraint led to a secondary one with which it is of second-class, then it seems that its Lagrange multiplier gets fixed by preservation of its secondary friend. Since every constraint kills a Cauchy datum, while first-class ones are also associated with gauge symmetries, the common lore says that each first-class constraint and each pair of second-class constraints reduce the number of degrees of freedom by one.
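The common-lore counting can be written as a one-line formula. The symbols $n$, $N_1$, $N_2$ and $N_{\mathrm{dof}}$ are labels introduced here for illustration, and the two applications use the constraint numbers quoted in the Introduction together with the usual component counts (four for the electromagnetic potential, ten for the metric), per spatial point.

```latex
% n   : number of Lagrangian variables (2n canonical variables)
% N_1 : number of first-class constraints
% N_2 : number of second-class constraints
N_{\mathrm{dof}} = \frac{1}{2}\left(2n - 2N_1 - N_2\right) = n - N_1 - \frac{N_2}{2}

% Electrodynamics: n = 4, N_1 = 2, N_2 = 0
N_{\mathrm{dof}} = 4 - 2 = 2

% General Relativity: n = 10, N_1 = 8, N_2 = 0
N_{\mathrm{dof}} = 10 - 8 = 2
```

This reproduces the two dynamical modes in each theory; it counts Cauchy data, and by itself says nothing about whether the remaining non-gauge variables are physical.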

1.1.3. My Comments

In this paper I argue that it is not that simple, and it is better to always check what happens in each particular case, even though the Hamiltonian formalism can sometimes be more transparent than the Lagrangian one, indeed. My main point is that the amount of gauge freedom is normally reflected in the number of only primary first-class constraints. Even this statement requires certain regularity of the Lagrangian. It can happen that some variables enter the Lagrangian only being multiplied by a factor that vanishes together with its first variation on every configuration in which other variables satisfy all the equations. Then the former variables will enjoy the full unphysical freedom of being arbitrary functions of time, at the same time without any off-shell symmetry of such Lagrangian. I will give an example below.
In what concerns the Dirac conjecture, the relation between first-class constraints and gauge symmetries is not trivial. In a sense, all of them are normally related to some gauge symmetry. However, in non-trivial cases when the “gauge hits twice”, not even the primary ones are the precise generators of the gauge transformations, with this role being played by some combination of them with the secondary one(s) [5]. If we take the conjecture as a statement that all first-class constraints are independent generators of gauge transformations, then it is simply wrong, in all the cases which have secondary first-class constraints.
There is one more point about that. It is claimed sometimes [6] that it would be better to add all the constraints to the Hamiltonian, not only primary but also secondary ones, in order to get what is called an extended Hamiltonian,
$$H_E = H_T + \sum_m \tilde{\lambda}_m \tilde{\Phi}_m = H + \sum_a \lambda_a \Phi_a + \sum_m \tilde{\lambda}_m \tilde{\Phi}_m. \tag{9}$$
First, it is not needed. The primary constraints are already enough to reproduce the Lagrangian equations of motion in the Hamiltonian language. Second, at least with the first-class constraints, it is simply wrong. In this case, adding a secondary constraint to the Hamiltonian also adds a new arbitrary Lagrange multiplier producing a fake effective gauge symmetry on top of the real one.

2. Very Simple Examples of Systems with Constraints

I will first illustrate the workings of constrained Hamiltonian formalism with very simple examples, then give some general mechanical statements, and finally discuss the usual field theories we have in physics. There are several good expositions of Hamiltonian mechanics with constraints [6,7,8]; however, they present it more as a recipe, and the important subtleties rarely receive enough attention.

2.1. Trivial Gauge Symmetry

Let me start from a very simple example
$$L(x) = 0$$
which has one variable $x(t)$ and a gauge symmetry which kills it. In the end, we have nothing physical: one gauge freedom and zero degrees of freedom. With the same success, I could have taken $L = 1$, for anyway its variation under arbitrary $\delta x$ is zero.

2.1.1. Failure of Forgetting to Put Primary Constraints into the Hamiltonian

We get the zero momentum $\pi_x = 0$, which is its only (first-class) constraint $\Phi \equiv \pi_x$, and the Hamiltonian function (2), $H = \pi_x \dot{x} - L$, is equal to zero on the constraint surface for any choice of $\dot{x}$. If we simply take
$$H = 0,$$
then the equations of motion (5) tell us that
$$\dot{x} = 0 \quad \text{and} \quad \dot{\pi}_x = 0.$$
It then gives more freedom to the momentum than in the initial Lagrangian formulation, but at the same time requires more from the variable $x$ which used to be absolutely free. Even adding the equation of $\pi_x = 0$ by hand would not help to restore the arbitrariness of $x(t)$. A primary constraint is not a mere selection of a surface inside a phase space; it genuinely changes the dynamics. Therefore, we must use the primary constraints in the Hamiltonian.
It is also a first sign of the need for being careful when calculating the number of degrees of freedom in the Hamiltonian formulation. If I assume that a zero Hamiltonian $H \equiv 0$ for $n$ variables has no constraints, it means $2n$ required Cauchy data, and works like $n$ degrees of freedom. However, half of these data simply set the momenta to arbitrary constant values which have no influence on the coordinates. For finding all the coordinates we need only $n$ Cauchy data, which set their otherwise arbitrary constant values, and then it goes like $n/2$ degrees of freedom. In particular, as we saw above, in a theory with one variable it gives $\frac{1}{2}$ degree of freedom.

2.1.2. Zero Hamiltonian with a Primary Constraint

Now, let us correctly take the primary constraint into account and construct the total Hamiltonian (7)
$$H_T = H + \lambda \Phi = 0 + \lambda \pi_x$$
with an arbitrary Lagrange multiplier $\lambda$. The new Equations (8) are
$$\dot{x} = \lambda \quad \text{and} \quad \dot{\pi}_x = 0 \quad \text{with the constraint} \quad \pi_x = 0.$$
Now we are totally back to the Lagrangian situation. The momentum is fully fixed by the primary constraint, and the full freedom in the behavior of the coordinate is restored by the Lagrange multiplier being an arbitrary function. The equation $\dot{x} = \lambda$ still requires one Cauchy datum to solve it for $x$ in terms of $\lambda$, but it makes no real sense since the Lagrange multiplier is assumed to be an absolutely arbitrary function of time, as no extra condition restricts it; and the same is finally true of $x$.
Note that we actually could have taken an absolutely different canonical Hamiltonian here. Of course, $H = 0$ is indeed equal to the function $H = \pi_x \dot{x} - L$ at the surface of $\pi_x = 0$ with $L = 0$. However, any canonical $H$ of the form $H = \pi_x \cdot f(x, \pi_x)$ would do the same. The total Hamiltonian $H_T = \pi_x \cdot f(x, \pi_x) + \lambda \pi_x$ then produces equivalent equations of motion. In other words, the canonical Hamiltonian (3) for a degenerate system is not unique. In fact, if it was not for the primary constraint put into $H_T$, these choices would have given different equations of motion.
If, the model being trivial, the choice of $H = 0$ looks like the only natural one above, we could also take a somewhat less trivial case of $L(x, y) = \frac{1}{2}\left(\dot{y}^2 - y^2\right)$ with the Hamiltonian $H_T = \frac{1}{2}\left(\pi_y^2 + y^2\right) + \lambda \pi_x$ yielding the usual harmonic oscillator of $y(t)$ and total freedom for $x(t)$, again the same way as before. It is still very natural not to add any contribution of $\pi_x$ to the canonical Hamiltonian. A bit more entertaining, however, is to take
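$$L(x, y) = \frac{1}{2}\left(\left(\dot{x} - \dot{y}\right)^2 - \left(x - y\right)^2\right).$$
In this case, the function $H = \frac{1}{2}\left(\left(\dot{x} - \dot{y}\right)^2 + \left(x - y\right)^2\right)$ can be represented as $H = \frac{1}{2}\left(\pi_x^2 + (x - y)^2\right)$, or $H = \frac{1}{2}\left(\pi_y^2 + (x - y)^2\right)$, with only symmetry considerations suggesting either $H = \frac{1}{4}\left(\pi_x^2 + \pi_y^2\right) + \frac{1}{2}(x - y)^2$ or $H = -\frac{1}{2}\pi_x \pi_y + \frac{1}{2}(x - y)^2$. Looking at the $\frac{x \pm y}{2}$ variables offers yet another symmetric choice of $H = \frac{1}{8}\left(\pi_x - \pi_y\right)^2 + \frac{1}{2}(x - y)^2$. Those are all the same at the surface of the primary constraint $\pi_x + \pi_y = 0$, with the same equations of motion produced by $H_T = H + \lambda\left(\pi_x + \pi_y\right)$.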
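That the listed canonical Hamiltonians coincide on the constraint surface $\pi_x + \pi_y = 0$, and only there, is easy to check numerically. The following is a minimal sketch; the helper names `candidates` and `spread` and the sample points are assumptions made for this illustration.

```python
import random


def candidates(x, y, px, py):
    """The five canonical Hamiltonians listed above at one phase-space point."""
    potential = 0.5 * (x - y) ** 2
    return [
        0.5 * px**2 + potential,
        0.5 * py**2 + potential,
        0.25 * (px**2 + py**2) + potential,
        -0.5 * px * py + potential,
        0.125 * (px - py) ** 2 + potential,
    ]


def spread(values):
    """Difference between the largest and the smallest value."""
    return max(values) - min(values)


random.seed(0)
max_spread_on_surface = 0.0
for _ in range(100):
    x, y, p = (random.uniform(-3.0, 3.0) for _ in range(3))
    # On the primary constraint surface pi_y = -pi_x
    max_spread_on_surface = max(max_spread_on_surface, spread(candidates(x, y, p, -p)))

# A point off the constraint surface: pi_x + pi_y = 1.5
spread_off_surface = spread(candidates(0.3, -1.2, 1.0, 0.5))
```

On the surface the five expressions agree up to rounding, while away from it they differ, which is why the term $\lambda(\pi_x + \pi_y)$ in $H_T$ is what makes the choice among them irrelevant.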

2.1.3. Adding Total Time Derivatives

Note that Lagrangians different by only a total time derivative are equivalent, at least at the level of equations of motion. Therefore, a gauge transformation might also be preserving a Lagrangian up to a total time derivative, or a Lagrangian density up to a total divergence, in field theories.
Therefore, another option to take the model with empty physical content is
$$L(x) = f(x)\, \dot{x}.$$
We then get a non-zero momentum $\pi_x = f(x)$, the natural choice of zero canonical Hamiltonian, and $H_T = \lambda\left(\pi_x - f(x)\right)$. The equations, together with the constraint, then read
$$\dot{x} = \lambda, \qquad \dot{\pi}_x = f' \cdot \lambda, \qquad \pi_x = f.$$
They are obviously satisfied by an arbitrary $x(t)$ and $\pi_x(t) = f(x(t))$.
For the Lagrangian variable $x(t)$, the result is of course the same as it was with the zero Lagrangian. However, the Hamiltonian description goes in a slightly different way, due to the different definition of the momentum, and therefore the constraint $\Phi = \pi - f$ instead of $\Phi = \pi$. It shows that if a gauge symmetry is realized only up to a total derivative term, the Hamiltonian representation of it might well go in a more elaborate style. One good field theory example of it is the Lorentz symmetry [9] of the Teleparallel Equivalent of General Relativity (TEGR).
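A short worked check shows that this constraint is preserved without producing anything new. It is only a sketch, assuming that $f$ is differentiable and that the multiplier $\lambda$ depends on time alone.

```latex
\Phi = \pi_x - f(x), \qquad H_T = \lambda \Phi

% Preservation via the Poisson bracket: a function always commutes with itself
\dot{\Phi} = \{\Phi, H_T\} = \lambda \{\Phi, \Phi\} = 0

% The same statement directly from the equations of motion
\frac{d}{dt}\left(\pi_x - f(x)\right)
  = \dot{\pi}_x - f'(x)\, \dot{x}
  = f'(x)\, \lambda - f'(x)\, \lambda = 0
```

There is no secondary constraint and $\lambda$ stays arbitrary, so in this model the gauge freedom produces one constraint only.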

2.2. A Very Simple Gauge Symmetry Can Also Hit Twice

There exists a simple slogan [6,10] that “a gauge symmetry hits twice”, meaning that it produces two Hamiltonian constraints. As we have seen in the trivial example, or as can be found in Refs. [6,10] as well, it is not always true. I will come to this later, but let me also mention now that another example is TEGR. If both its diffeomorphism and Lorentz symmetries hit twice, then using the usual naive counting, it would have $16 - 2 \cdot 4 - 2 \cdot 6 = -4$ degrees of freedom in four spacetime dimensions.
However, one can easily construct a simple mechanical model in which a gauge hits twice indeed:
$$L(x, y) = \frac{1}{2}\left(\dot{x} + y\right)^2.$$
We get the momenta $\pi_x = \dot{x} + y$ and $\pi_y = 0$, the latter being the primary constraint $\Phi = \pi_y$. Then the total Hamiltonian (7) can be written as
$$H_T = \frac{1}{2}\pi_x^2 - y\, \pi_x + \lambda \pi_y,$$
with all the possible additions of $\pi_y$ to the canonical part if one wants to write the same story in a slightly more complicated form. The equations of motion (8) are
$$\dot{x} = \pi_x - y, \quad \dot{y} = \lambda \quad \text{and} \quad \dot{\pi}_x = 0, \quad \dot{\pi}_y = \pi_x \quad \text{with the constraint} \quad \pi_y = 0.$$
These equations, or the formal “preservation” of the primary constraint, yield the secondary constraint $\tilde{\Phi} \equiv \pi_x = 0$ which in turn is trivially preserved due to one of the equations, or to the fact of $H_T$ not depending on $x$. Then what we get is $\dot{x} = -y$ with an arbitrary $y(t)$. It seems we have one pure gauge variable, two first-class constraints, and one Cauchy datum needed, therefore $\frac{1}{2}$ degrees of freedom. Note however that the same system of equations can be interpreted as an arbitrary $x$ and $y = -\dot{x}$ which then means no Cauchy data at all. The very naive counting looked like $\frac{1}{2}$, but in reality there are $0$ dynamical degrees of freedom, even though not both variables are pure gauge. The physical combination of variables is $\dot{x} + y$, and it is fully constrained to be equal to zero. The traditional counting of dynamical modes by subtracting the number of first-class constraints works well in this case.
The Lagrangian story is of course the same. The equations of motion are
$$\frac{d}{dt}\left(\dot{x} + y\right) = 0 \quad \text{and} \quad \dot{x} + y = 0.$$
The first one is a dynamical equation from variation of $x$, and the second one is a constraint, the only constraint of the Lagrangian formalism. The action has a gauge symmetry under
$$x \to x + u(t), \qquad y \to y - \dot{u}(t)$$
so that only one combination of the two variables, $\dot{x} + y$, is physical. However, the primary constraint $\Phi = \pi_y$ transforms only $y$, leaving $x$ intact, which is not a gauge transformation. This is the reason for the secondary constraint appearing. In other words, it is the same as what happens in electrodynamics: the gauge symmetry mixes a velocity (time derivative) with another coordinate (other spatial derivatives), producing two first-class constraints by only one gauge symmetry.
Note that the variation of $y(t)$ in the action has produced the only Lagrangian constraint $\dot{x} + y = 0$, with another combination of the two variables being pure gauge. This constraint corresponds to the secondary Hamiltonian constraint, while the primary one has no relation to the Lagrangian variables; it is simply $\pi_y = 0$, which in itself does not restrict any velocity. In the Lagrangian language, we have one gauge freedom, which strips one combination of variables of physical meaning, while the physical one is subject to a constraint. If we choose a gauge of $x = 0$, the physical equation (not a gauge choice!) tells us that $y = 0$, which means zero Cauchy data. At the same time, the gauge of $y = 0$ has a remaining symmetry of constant $u$, which is reflected in the solution of an arbitrary constant $x$, not contributing to the physical gauge-invariant variable $\dot{x} + y$. In the Hamiltonian language, we have both momenta constrained, while for the two coordinates there is one arbitrary Lagrange multiplier, so that naively the system needs one Cauchy datum. The origin of this number is in the fact that the Hamiltonian equations look in such a way as if taking the gauge in terms of $y$ was more natural.
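The constraint algebra of this model, and the combination of constraints that generates the gauge transformation, can be written out explicitly. The generator $G$ below is a standard construction added here as an illustration, with $u(t)$ the gauge parameter of the transformation above; it is a sketch, not a derivation taken from the text.

```latex
H_T = \frac{1}{2}\pi_x^2 - y\, \pi_x + \lambda \pi_y

% Preservation of the primary constraint gives the secondary one
\{\pi_y, H_T\} = -\frac{\partial H_T}{\partial y} = \pi_x
\quad\Longrightarrow\quad \tilde{\Phi} = \pi_x

% The secondary constraint is preserved identically, and the two commute
\{\pi_x, H_T\} = -\frac{\partial H_T}{\partial x} = 0,
\qquad \{\pi_y, \pi_x\} = 0

% One gauge parameter u(t), entering through both constraints
G = u(t)\, \pi_x - \dot{u}(t)\, \pi_y,
\qquad
\delta x = \{x, G\} = u,
\qquad
\delta y = \{y, G\} = -\dot{u}
```

Both constraints are first-class, yet they enter a single generator with one arbitrary function $u(t)$, so there is one gauge freedom and not two.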

Beware of Secondary Constraints Put into the Hamiltonian

It is often stated, for example in the classic book of Henneaux and Teitelboim [6], that it is more natural to take both primary and secondary constraints on equal footing. This is totally wrong! First, it is not natural even from the basic framework of the Hamiltonian formalism. The primary constraints are the restrictions on possible values of momenta. They might lead to some other restrictions on the variables, called secondary constraints, but those are not new restrictions, as they directly follow from the Hamiltonian equations and the primary constraints (8). Second, adding the secondary constraints to the total Hamiltonian (7) gives an extended Hamiltonian (9) which often changes the physics of the system.
In particular, for our model we get
$$H_E = H_T + \tilde{\lambda} \tilde{\Phi} = \frac{1}{2}\pi_x^2 - y\, \pi_x + \lambda \pi_y + \tilde{\lambda} \pi_x,$$
with the new equations of motion
$$\dot{x} = \pi_x - y + \tilde{\lambda}, \quad \dot{y} = \lambda \quad \text{and} \quad \dot{\pi}_x = 0, \quad \dot{\pi}_y = \pi_x \quad \text{with the constraints} \quad \pi_y = 0, \quad \pi_x = 0.$$
As in the correct approach, we have both momenta equal to zero; however, now the two independent Lagrange multipliers make both Lagrangian variables arbitrary functions of time. In other words, we have spoiled the correspondence with the Lagrange equations. The total Hamiltonian was equivalent to the initial Lagrangian system with one gauge freedom and one physical variable, although a constrained one. At the same time, the extended Hamiltonian introduced one more arbitrary Lagrange multiplier, which resulted in a new model with two unphysical freedoms and not a single physical variable, thus rather corresponding to $L(x, y) = 0$.
Note also that primary and secondary constraints are not the same, even if the number of Lagrange multipliers is kept correct. If we had put only $\tilde{\Phi}$ in our Hamiltonian, it would have produced constant $\pi_y$ but not necessarily zero, and constant $y$ with an arbitrary $x(t)$, a totally different system.
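The loss of the Lagrangian constraint $\dot{x} + y = 0$ can be seen in a toy numerical integration on the surface $\pi_x = \pi_y = 0$. This is only a sketch: the explicit Euler scheme, the initial data, and the particular multiplier functions are arbitrary choices made for illustration.

```python
import math


def integrate(lam, lam_tilde, steps=1000, dt=1e-3, x0=0.2, y0=-0.4):
    """Euler-integrate x_dot = -y + lam_tilde(t), y_dot = lam(t) with pi_x = pi_y = 0.

    lam_tilde identically zero corresponds to the total Hamiltonian; a non-zero
    lam_tilde corresponds to the extended one. Returns the largest value of
    |x_dot + y| met along the trajectory.
    """
    x, y, t = x0, y0, 0.0
    worst = 0.0
    for _ in range(steps):
        x_dot = -y + lam_tilde(t)  # pi_x = 0 on the constraint surface
        y_dot = lam(t)
        worst = max(worst, abs(x_dot + y))
        x, y, t = x + dt * x_dot, y + dt * y_dot, t + dt
    return worst


lam = math.cos  # an arbitrary illustrative choice of lambda(t)
violation_total = integrate(lam, lambda t: 0.0)
violation_extended = integrate(lam, lambda t: 1.0)
```

With the total Hamiltonian the combination $\dot{x} + y$ stays at zero for any $\lambda(t)$, while in the extended version it simply equals $\tilde{\lambda}$ and is no longer constrained.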

2.3. Trivial Second-Class Constraints

The simplest example of a second-class constraint pair would be
$$L(x) = -x^2.$$
It immediately has a constraint of $\pi_x = 0$ and the naturally possible canonical Hamiltonian $H = x^2$.
The total Hamiltonian is $H_T = x^2 + \lambda \pi_x$. Its equations $\dot{x} = \lambda$ and $\dot{\pi}_x = -2x$ together with the primary constraint $\pi_x = 0$ reproduce the Lagrangian equation $x = 0$, which is also the secondary constraint. In this case, the Lagrangian system has one single constraint $x = 0$ which, being of non-derivative character, is enough for reducing the number of Cauchy data to zero. The Hamiltonian representation does the same by having two second-class constraints. At the same time, its Lagrange multiplier is fixed to $\lambda = 0$.
Note that the incorrectness of adding the secondary constraint to the Hamiltonian in the previous example was due to another arbitrary Lagrange multiplier changing the level of predictability. It is not necessarily the case with the second-class constraints since all the multipliers are constrained then. For example, obviously $H_E = x^2 + \lambda \pi_x + \tilde{\lambda} x$ reproduces the same Lagrangian equation of $x = 0$ and demands $\lambda = \tilde{\lambda} = 0$.
It happened because both the coordinate and its momentum are strictly fixed by the pair of constraints, which cannot be changed by adding any new terms to the Hamiltonian. Related to that, and as a simple exercise about the non-uniqueness of the Hamiltonian, one can check that changing the canonical Hamiltonian (3) in this case from $x^2$ to, say, $\pi_x^2 + x^2$ would change neither the $H_T$ nor the $H_E$ results, while taking it as $\pi_x + x^2$ would shift the multiplier $\lambda$ by 1 with no change to the physics either.
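The second-class character of this pair is visible in the matrix of Poisson brackets. The matrix symbol $C$ is a label introduced here for illustration; the computation assumes the total Hamiltonian $H_T = x^2 + \lambda \pi_x$.

```latex
\Phi = \pi_x, \qquad
\{\Phi, H_T\} = -\frac{\partial H_T}{\partial x} = -2x
\quad\Longrightarrow\quad \tilde{\Phi} = x

% Matrix of brackets of the constraints with each other
C = \begin{pmatrix}
      \{\Phi, \Phi\} & \{\Phi, \tilde{\Phi}\} \\
      \{\tilde{\Phi}, \Phi\} & \{\tilde{\Phi}, \tilde{\Phi}\}
    \end{pmatrix}
  = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
\qquad \det C = 1 \neq 0

% Preservation of the secondary constraint fixes the multiplier
\{\tilde{\Phi}, H_T\} = \lambda \{x, \pi_x\} = \lambda = 0
```

Since $C$ is non-degenerate, preservation of the constraints determines the multiplier instead of generating further constraints.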
I will not discuss second-class constraints any further. Let me only mention that introducing Dirac brackets is often taken as a must for the Hamiltonian analysis. This is of course not true. Their purpose is different. If we want to canonically quantize a system with second-class constraints by promoting the Poisson brackets to commutators of operators, then we cannot impose all the constraints in terms of operators, not even as the physical subspace of the Hilbert space belonging to the kernels of all the constraints, unless there is no physics and this subspace is empty. Here come Dirac brackets, as a part of an algorithm for quantization, which we do not need in this paper.

2.4. First-Class without a Gauge

Now I take another Lagrangian
$$L(x, y) = \frac{1}{2}\, y\, \dot{x}^2$$
which gives the primary constraint of $\pi_y = 0$ and the total Hamiltonian (7)
$$H_T = \frac{\pi_x^2}{2y} + \lambda \pi_y .$$
The Lagrangian equations of motion are
$$y\ddot{x} + \dot{x}\dot{y} = 0 \quad \text{and} \quad \dot{x}^2 = 0,$$
implying an arbitrary constant value of $x$ and an absolutely arbitrary $y(t)$. The Hamiltonian equations (8) yield
$$\dot{x} = \frac{\pi_x}{y}, \quad \dot{y} = \lambda \quad \text{and} \quad \dot{\pi}_x = 0, \quad \dot{\pi}_y = \frac{\pi_x^2}{2y^2} \quad \text{with the constraint } \pi_y = 0,$$
with the same result for the Lagrangian variables while requiring both momenta to be zero.
Note that the secondary constraint of $\pi_x = 0$ followed from the primary constraint and the Hamiltonian equation for $\pi_y$. Of course, it is nothing but requiring that the primary constraint is preserved in time. In terms of required Cauchy data, it is one half of a degree of freedom, not zero as the traditional counting would suggest. At the same time, the set of constraints,
$$\Phi = \pi_y \quad \text{and} \quad \tilde{\Phi} = \pi_x ,$$
is a system of two first-class constraints, with no gauge freedom in the Lagrangian model.
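The statements above can be checked in a few lines. This is a minimal sketch, assuming the standard convention $\{q, p\} = 1$ and taking the "traditional counting" to mean half of (phase-space dimension minus twice the number of first-class constraints); that formula is an assumption added here for illustration.

```latex
\begin{align*}
\dot{\Phi} &= \{\pi_y, H_T\} = -\frac{\partial H_T}{\partial y} = \frac{\pi_x^2}{2y^2} \approx 0
  \quad\Longrightarrow\quad \tilde{\Phi} \equiv \pi_x \approx 0, \\
\dot{\tilde{\Phi}} &= \{\pi_x, H_T\} = -\frac{\partial H_T}{\partial x} = 0, \\
\{\Phi, \tilde{\Phi}\} &= \{\pi_y, \pi_x\} = 0, \\
\text{traditional counting:}\quad & \tfrac{1}{2}\left(2 \cdot 2 - 2 \cdot 2\right) = 0, \\
\text{actual Cauchy data:}\quad & x(0) \text{ only, i.e. } \tfrac{1}{2} \text{ of a degree of freedom.}
\end{align*}
```

Nothing ever fixes $\lambda$, so $y(t)$ stays arbitrary, while the constant value of $x$ still has to be supplied as initial data.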
This model was also considered in the form of $L = e^{y} \dot{x}^2$ in the book [6] by Henneaux and Teitelboim. Their claim is that it has a gauge freedom for $y$ but not for $x$, and therefore is a counterexample to Dirac’s conjecture. My point is that Dirac’s conjecture is actually never correct, so I do not argue against this conclusion. However, in this case there is simply no gauge freedom in the Lagrangian at all. It shows that sometimes a primary first-class constraint might be unrelated to any gauge freedom.
Why they call it a gauge system in the book [6] is because an arbitrary y ( t ) satisfies the equations of motion. However, in this case, it is not due to a symmetry of the Lagrangian, not even up to a total derivative. It is due to another kind of singularity here: the variable y enters the Lagrangian being multiplied by a square of something which is equal to zero when equations of motion are satisfied, and therefore its contribution to the equations does not identically disappear but rather gets multiplied by something which is zero due to another equation.
If the Lagrangian were of the form $L(x, y) = f(y)\, g^2(x, \dot{x})$ with arbitrary smooth functions $f$, with $f \neq 0$, and $g$, it would be precisely the same story, with the equations of motion satisfied by any function $y(t)$ and a function $x(t)$ which is a solution of the equation $g = 0$. Indeed, $y$ is then arbitrary. However, I will not call it a gauge symmetry. It shows that the unphysical nature of a variable, in the sense of not being determined by equations of motion, might sometimes not come from a gauge freedom.
On the other hand, this non-gauge freedom is indeed a kind of singularity, as I mentioned. If we had taken a Lagrangian $L = y\dot{x}$, it would have been a model with two constant in time Lagrangian variables governed by a zero canonical Hamiltonian and a pair of second-class constraints. From the Cauchy data counting, it has one degree of freedom, although it rather looks like two halves. An analogous result would be obtained with $L = y\left(\dot{x}^2 - x^2\right)$ when $x \neq 0$, or in a Lorentz-invariant field model such as the mimetic fluid [11,12] $\lambda\left((\partial_\mu \phi)(\partial^\mu \phi) - 1\right)$, for in these cases the extra variable is multiplied in the Lagrangian by something which does have a non-vanishing first variation around the physical trajectory. Needless to say, if one wants to have it singular again, one can always put an extra square as in the $f g^2$ model mentioned above.
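For the model $L = y\dot{x}$, the claim of a zero canonical Hamiltonian with a second-class pair can be sketched as follows. The labels $\Phi_1$, $\Phi_2$, $\lambda_1$, $\lambda_2$ are introduced here only for illustration, and the convention $\{q, p\} = 1$ is assumed.

```latex
\begin{align*}
\pi_x &= \frac{\partial L}{\partial \dot{x}} = y, \qquad
\pi_y = \frac{\partial L}{\partial \dot{y}} = 0
  \quad\Longrightarrow\quad
  \Phi_1 \equiv \pi_x - y \approx 0, \quad \Phi_2 \equiv \pi_y \approx 0, \\
H &= \pi_x \dot{x} + \pi_y \dot{y} - L = y\dot{x} - y\dot{x} = 0, \qquad
H_T = \lambda_1 \left(\pi_x - y\right) + \lambda_2 \pi_y, \\
\{\Phi_1, \Phi_2\} &= -\{y, \pi_y\} = -1 \neq 0, \\
\dot{\Phi}_1 &= \lambda_2 \{\Phi_1, \Phi_2\} = -\lambda_2 \approx 0, \qquad
\dot{\Phi}_2 = \lambda_1 \{\Phi_2, \Phi_1\} = \lambda_1 \approx 0, \\
\dot{x} &= \lambda_1 = 0, \qquad \dot{y} = \lambda_2 = 0 .
\end{align*}
```

Both multipliers are fixed to zero, so $x$ and $y$ are constants; the two initial values $x(0)$ and $y(0)$ are the two Cauchy data, i.e. one degree of freedom.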

2.4.1. On Bifurcations, New Mimetic, Teleparallel Craze, and All That

Most likely, the main reason why Henneaux and Teitelboim [6] took the previous example with $e^{y}$ instead of simply $y$ was to avoid a point with zero $\pi_x$. In fact, in the case of $L = y\dot{x}^2$ the variable $y$ being zero does not change anything in the equations of motion. The Hamiltonian looks singular when $y \to 0$; however, everything works fine if we simply ignore the fact that $1/y$ might become infinite. On the other hand, one could try to consider an extra primary constraint of $\pi_x = 0$ when $y = 0$. If applied naively, it gives a zero canonical Hamiltonian with both momenta being primary constraints, therefore producing an incorrect result of arbitrary $x(t)$.
In fact, the reason why nothing special happens at $y = 0$ is that the momentum $\pi_x$ as a function of $y$ had only a simple root, which was not enough for it to act as a fundamental constraint. If we had $L(x, y) = y^2 \dot{x}^2$, then the locus of zero $y$ would be a special place with a bifurcation: the function $x(t)$ is indeed absolutely arbitrary there. All the $y \neq 0$ regimes can then be described as above with substitution of $y$ by $y^2$, while the story at $y = 0$ goes like a zero Hamiltonian with constraints for both momenta and $y$ being set to zero.
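To make the bifurcation of $L = y^2 \dot{x}^2$ concrete, here is a short sketch of its equations; it only uses the Lagrangian quoted above and the standard Euler-Lagrange equations.

```latex
\begin{align*}
\pi_x &= 2 y^2 \dot{x}, \qquad \pi_y = 0, \\
\frac{d}{dt}\left(2 y^2 \dot{x}\right) &= 0, \qquad 2 y \dot{x}^2 = 0, \\
y \neq 0 &: \quad \dot{x} = 0, \quad y(t) \text{ arbitrary}, \\
y = 0 &: \quad \text{both equations hold identically}, \quad x(t) \text{ arbitrary}.
\end{align*}
```

The momentum $\pi_x$ has a double root in $y$ here, in contrast with the simple root of $\pi_x = y\dot{x}$ in the previous model, and this is what makes the locus $y = 0$ a genuinely separate branch.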
There is no general understanding of how to work with these bifurcations at the Hamiltonian level, although usually this happens only at a very specific point, which we try to avoid as pathological, for example a locus of $f' = 0$ in $f(R)$ gravity. It is more problematic that, even though the bifurcations appearing at the primary level are usually quite obvious right away, they can also come at later stages, and in a less immediately visible manner. Namely, the nature of constraints can change.
As yet another example of a non-gauge singular situation at the Lagrangian level, let us consider a Lagrangian
$$L(x, y) = \frac{1}{2}\left(\dot{x}^2 - x^2 (y - x)^2\right).$$
Its total Hamiltonian with the primary constraint $\Phi = \pi_y$
$$H_T = \frac{1}{2}\left(\pi_x^2 + x^2 (y - x)^2\right) + \lambda \pi_y$$
produces the secondary constraint $\tilde{\Phi} = x^2 (y - x)$. Their Poisson bracket is $\{\tilde{\Phi}, \Phi\} = x^2$. Therefore, at $x \neq 0$ we have a pair of second-class constraints, while at $x = 0$ it turns into two first-class constraints. The very obvious Lagrangian equations show that indeed these regimes are genuinely different. We either have $x = 0$ and totally arbitrary $y(t)$, or $y = x$ and the dynamical equation of $\ddot{x} = 0$.
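The change of the nature of the constraints can also be seen from the fate of the Lagrange multiplier. This is a minimal sketch assuming the convention $\{q, p\} = 1$; the overall sign of $\tilde{\Phi}$ is immaterial for a constraint.

```latex
\begin{align*}
\dot{\Phi} &= \{\pi_y, H_T\} = -\frac{\partial H_T}{\partial y} = -x^2 (y - x) \approx 0
  \quad\Longrightarrow\quad \tilde{\Phi} = x^2 (y - x) \approx 0, \\
\dot{\tilde{\Phi}} &= \{\tilde{\Phi}, H_T\}
  = \frac{\partial \tilde{\Phi}}{\partial x}\, \pi_x + \frac{\partial \tilde{\Phi}}{\partial y}\, \lambda
  = \left(2xy - 3x^2\right) \pi_x + x^2 \lambda, \\
x \neq 0,\ y = x &: \quad -x^2 \pi_x + x^2 \lambda = 0
  \quad\Longrightarrow\quad \lambda = \pi_x, \text{ i.e. } \dot{y} = \dot{x}, \\
x = 0 &: \quad \dot{\tilde{\Phi}} = 0 \text{ identically}, \quad \lambda \text{ arbitrary}.
\end{align*}
```

In the second-class regime the multiplier is fixed so that $y$ follows $x$, whereas in the first-class regime nothing determines it and $y(t)$ is arbitrary.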
A more familiar case of that is the Proca theory. When we have a vector field with the Lagrangian density $L(A) = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} + V(A_\mu A^\mu)$ with $F_{\mu\nu} \equiv \partial_\mu A_\nu - \partial_\nu A_\mu$, the usual $U(1)$ gauge symmetry is broken by the potential term $V$, and therefore all four components are physical, with three of them being dynamical and $A_0$ being constrained by a pair of second-class constraints. However, if there exists a point with $V' = 0$, then the gauge invariance is spontaneously restored for linear perturbations around it, and the Poisson bracket of the constraints vanishes. It effectively brings us back to the gauge-invariant case with only three physical components and two of them dynamical. Therefore, the number of degrees of freedom appears to not be well-defined.
Perhaps the most cautious attitude would be to simply avoid such situations. However, sometimes people want to use them on purpose. For example, mimetic gravity [11,12] was built by a metric transformation $\tilde{g}_{\mu\nu} = g^{\alpha\beta} (\partial_\alpha \phi)(\partial_\beta \phi)\, g_{\mu\nu}$ from General Relativity (GR). If the physical metric is $\tilde{g}_{\mu\nu}$ while $g_{\mu\nu}$ and $\phi$ are used as fundamental variables, then the role of the conformal mode is played by the scalar field which, due to more derivatives, gives more solutions in the shape of an effective ideal fluid [12]. Very many different modifications have been considered [13,14]. However, one can also stick to the simple initial idea and generalize it to disformal transformations which then produce the same mimetic gravity as long as the metric transformation is not invertible, otherwise it is pure GR with a spurious scalar field on top [15]. Recently, it was noticed [16,17] that in most cases the metric transformation is invertible, except at some singular loci which then produce extra solutions of mimetic type, too. Personally, I am skeptical about using such ill-posed theories [18], but it would be interesting to learn how to work with them.
Another amazing story of recent years is modified teleparallel gravity such as f ( T ) . It breaks the local Lorentz invariance in the space of tetrads (I assume its pure tetrad formulation), with however a rather chaotic zoo of remnant symmetries [19]. The structure of the constraint algebra is so difficult that there is even no consensus on the number of degrees of freedom [20]. What is almost fully clear is that there is some new dynamics on top of the usual two modes of GR [20,21]. However, the trivial Minkowski space is an obvious strong coupling regime with the Lorentz symmetry restoration, and therefore no new modes in the linear perturbations. What is much more surprising is that no new dynamical modes are found around the usual cosmologies either [22,23], even though the Lorentz symmetry is broken there and the new features of cosmological perturbation theory arise, such as non-zero gravitational slip [22]. It might be the case that such theories do require a genuinely novel insight into the Hamiltonian methods.

2.4.2. A Brief Remark on Proca-Nuevo De Nuevo

Let me also mention that, inspired by ghost-free massive gravity [24,25], yet another vector field theory has been proposed [26], with the Lagrangian built in terms of elementary symmetric polynomials of the eigenvalues of a square root of the matrix $\delta^\mu_\nu + \partial^\mu A_\nu + \partial_\nu A^\mu + \partial^\mu A^\alpha\, \partial_\nu A_\alpha$. It has a primary constraint, which sounds promising since dynamical temporal components are usually deadly. Very recently, a new investigation of the model appeared [27], although only in 2D. Lagrangian analysis of constraints [28] was used to claim that the primary constraint is the only one in the story. Then, assuming that there must be the same number of Hamiltonian constraints, and that it cannot be first-class in case of no gauge symmetry, the conclusion was that it is a case of a constraint which is second-class with itself, and therefore the model has $\frac{3}{2}$ degrees of freedom [27].
A self-second-class constraint is a rare and exciting situation, known for example in versions of Hořava gravity. In my opinion, it is a very interesting topic; however, given the subtleties of the dictionary between the Lagrangian and the Hamiltonian formalisms I reviewed above, it would be very important to do an explicit Hamiltonian analysis, as well as to get more Lagrangian details. Of course, the structure of the model is not very simple, but in the case of massive gravity the full Hamiltonian analysis was successfully done by the brute force approach [29,30], and some other methods of working with the square roots of matrices are also known [31,32].

3. Hamiltonian Mechanics with Constraints

Now let me discuss what happens in general, mainly for Lagrangians quadratic in velocities. My main goal here is to show that in general a total Hamiltonian is enough to reproduce the Lagrangian physics. For demonstrating the fact that insisting on extended Hamiltonians is incorrect, it is enough to have some explicit examples, in the previous and in the following Sections.
Let us take a Lagrangian
$$L(x, y) = \frac{1}{2}\dot{x}^2 + b(x, y)\,\dot{x} + c(x, y)\,\dot{y} + d(x, y) \tag{10}$$
which has the momenta $\pi_x = \dot{x} + b$ and $\pi_y = c$. It produces the primary constraint $\Phi = \pi_y - c(x, y)$, and the total Hamiltonian (7) can be chosen as
$$H_T = \frac{1}{2}\left(\pi_x - b\right)^2 - d + \lambda \left(\pi_y - c\right).$$
It is only for clarity of presentation that I take only two variables. It will not change much of what follows if I take $x$ as an $n$-dimensional vector and $y$ as an $m$-dimensional one.
The Lagrangian equations of motion are easily found to be
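$$\ddot{x} = \left(\frac{\partial c}{\partial x} - \frac{\partial b}{\partial y}\right) \cdot \dot{y} + \frac{\partial d}{\partial x}, \qquad 0 = \left(\frac{\partial b}{\partial y} - \frac{\partial c}{\partial x}\right) \cdot \dot{x} + \frac{\partial d}{\partial y}$$
while the Hamiltonian ones (8) are
$$\dot{x} = \pi_x - b, \quad \dot{y} = \lambda \quad \text{and} \quad \dot{\pi}_x = (\pi_x - b)\frac{\partial b}{\partial x} + \frac{\partial d}{\partial x} + \lambda \frac{\partial c}{\partial x}, \quad \dot{\pi}_y = (\pi_x - b)\frac{\partial b}{\partial y} + \frac{\partial d}{\partial y} + \lambda \frac{\partial c}{\partial y} \quad \text{with the constraint } \pi_y - c = 0 .$$
Differentiating $\dot{x} = \pi_x - b$, we get $\ddot{x} = \dot{\pi}_x - \frac{\partial b}{\partial x}\dot{x} - \frac{\partial b}{\partial y}\dot{y}$ which, upon substitution of the equation for $\dot{\pi}_x$ and taking into account $\pi_x - b = \dot{x}$ and $\lambda = \dot{y}$, gives the first Lagrangian equation. Analogously, differentiation of the constraint, $\dot{\pi}_y = \frac{\partial c}{\partial x}\dot{x} + \frac{\partial c}{\partial y}\dot{y}$, transforms the equation for $\dot{\pi}_y$ into the second Lagrangian equation. Therefore, the Hamiltonian method works well indeed.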
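The two substitutions just described can be spelled out line by line. In this sketch, subscripts denote partial derivatives ($b_x \equiv \partial b/\partial x$ and so on), a shorthand introduced here only for brevity.

```latex
\begin{align*}
\ddot{x} &= \dot{\pi}_x - b_x \dot{x} - b_y \dot{y}
  = \underbrace{(\pi_x - b)}_{\dot{x}}\, b_x + d_x + \underbrace{\lambda}_{\dot{y}}\, c_x
    - b_x \dot{x} - b_y \dot{y}
  = \left(c_x - b_y\right) \dot{y} + d_x, \\
0 &= \dot{\pi}_y - c_x \dot{x} - c_y \dot{y}
  = \underbrace{(\pi_x - b)}_{\dot{x}}\, b_y + d_y + \underbrace{\lambda}_{\dot{y}}\, c_y
    - c_x \dot{x} - c_y \dot{y}
  = \left(b_y - c_x\right) \dot{x} + d_y .
\end{align*}
```

The terms with $b_x$ cancel in the first line and those with $c_y$ in the second, which is why only the antisymmetric combination $b_y - c_x$ survives.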
What has happened here is the following. As we discussed in the Introduction, for non-degenerate systems the canonical Hamiltonian (3) has a very nice property (4) of $\frac{\partial H}{\partial \pi} = \dot{x}$ and $\frac{\partial H}{\partial x} = -\frac{\partial L}{\partial x}$; the former equation then means simply that $\pi = \frac{\partial L}{\partial \dot{x}}$, while $\dot{\pi} = -\frac{\partial H}{\partial x}$ comes back to the Lagrangian equation of $\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} = \frac{\partial L}{\partial x}$. Degenerate systems are more intricate. As long as we keep the momenta strictly equal to their Lagrangian definition, the variation of the canonical Hamiltonian is still given by $\delta H = \dot{x}\, \delta \pi - \frac{\partial L}{\partial x}\, \delta x$. However, once we have found some corresponding expression $H = H(x, \pi)$ and forgotten about $\dot{x}$, it fails to be unique and can contain arbitrary combinations of primary constraints.
In the current example, it corresponds to adding arbitrary combinations of $\pi_y - c$ to the Hamiltonian, which we explicitly took into account by adding the primary constraint term in $H_T$. Therefore, the equation for $\dot{x}$ has correctly found the velocity of $x$ in terms of its momentum, while $\dot{y}$ remains arbitrary, at least at this stage. The momentum $\pi_y$ did not depend on $\dot{y}$, which therefore cannot be found from the values of momenta, and the primary constraint term takes care of that by putting a Lagrange multiplier in place of the non-existing $\dot{y}(\pi_y)$. On the other hand, these additions also make the variations $\frac{\partial H}{\partial x}$ and $\frac{\partial H}{\partial y}$ different from $-\frac{\partial L}{\partial x}$ and $-\frac{\partial L}{\partial y}$, by the terms with derivatives of $c(x, y)$, which are taken care of by the primary constraint in $H_T$, too.
Therefore, different values of $\lambda$ in $H_T$ correspond to all possible definitions of the canonical Hamiltonian with the correct variation at the primary constraint surface. When we also impose the equation of the primary constraint, $\pi_y - c = 0$ in our case, it comes back to the desired variations. Either a variation normal to the constraint surface determines the only correct $\lambda$, which is then like the contact force acting on a particle from the surface on which it moves, and the variable $y$ appears to be physical but constrained, or we have it simply arbitrary, and this is then a case of gauge freedom and its induced primary first-class constraint.
Note that we have checked it without ever talking about the secondary constraints. The equations simply tell us everything about the Lagrangian variables, and looking at other consequences is needed only for counting the number of degrees of freedom. In particular, if $\frac{\partial b}{\partial y} \equiv \frac{\partial c}{\partial x}$, so that $b\dot{x} + c\dot{y}$ is a total time derivative, and $\frac{\partial d}{\partial y} \equiv 0$, then $y(t)$ is a purely gauge mode, the first Lagrangian equation is the only one and it is dynamical; at the same time the Hamiltonian formalism has an arbitrary Lagrange multiplier and one first-class constraint, just for the formal definition of the momentum of the unphysical variable.
Otherwise, both variables are physical but we are getting the secondary constraint expressed by the second Lagrangian equation. In this case there is a pair of second-class Hamiltonian constraints. The Lagrangian formalism immediately has a constraint for $\dot{x}$, and substituting it into the equation for $\ddot{x}$ we get another constraint for $\dot{y}$. In the Hamiltonian language, the restriction of $\dot{y}$ comes as a fixed value of the Lagrange multiplier $\lambda$ required by preservation of the secondary constraint. All in all, we have two Cauchy data and therefore one, or two halves, degree of freedom. A special case would be if $\frac{\partial b}{\partial y} \equiv \frac{\partial c}{\partial x}$ is still satisfied but with $\frac{\partial d}{\partial y}$ not identically zero. Then $y$ is fully constrained to an extremum of $d$ while $x(t)$ has a fully dynamical equation of motion.
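The constraint structure described in the last two paragraphs can be made explicit. This is a sketch under stated assumptions: the name $\Psi$ for the secondary constraint and the subscript notation for partial derivatives are introduced here for illustration, and the bracket convention is $\{q, p\} = 1$.

```latex
\begin{align*}
\dot{\Phi} &= \dot{\pi}_y - c_x \dot{x} - c_y \dot{y}
  = \left(b_y - c_x\right)\left(\pi_x - b\right) + d_y \equiv \Psi \approx 0, \\
\{\Psi, \Phi\} &= -\left(b_y - c_x\right)^2
  + \left(b_{yy} - c_{xy}\right)\left(\pi_x - b\right) + d_{yy}, \\
b_y \equiv c_x,\ d_y \equiv 0 &: \quad \Psi \equiv 0, \quad
  \text{no secondary constraint, } \Phi \text{ is first-class}, \\
b_y \equiv c_x,\ d_y \not\equiv 0 &: \quad \Psi = d_y, \quad \{\Psi, \Phi\} = d_{yy} .
\end{align*}
```

The $\lambda$-terms cancel in $\dot{\Phi}$, so the multiplier is only fixed at the next step, by preservation of $\Psi$, whenever the bracket $\{\Psi, \Phi\}$ does not vanish on the constraint surface.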
Note that if x and y, as well as b and c, were multidimensional, it could potentially cause a complicated constraint algebra, in terms of Poisson brackets, with many possible secondary constraints and so on. However, the formal demonstration that the Lagrangian and the Hamiltonian ( H T ) equations do tell us the same physics would go with no big change.

A Hint of More General Considerations

We have just discussed the Hamiltonian formalism for Lagrangian systems quadratic in velocities. Let me now give some abstract description of how it should work in general. Imagine we have a Lagrangian $L(x_i)$ with its equations
$$\frac{\partial^2 L}{\partial \dot{x}_i\, \partial \dot{x}_j}\, \ddot{x}_j + \frac{\partial^2 L}{\partial \dot{x}_i\, \partial x_j}\, \dot{x}_j - \frac{\partial L}{\partial x_i} = 0$$
and are trying to construct a Hamiltonian (3) for it.
Let us assume that, around a point of the configuration space we are looking at, there exists a matrix Q such that
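$$\frac{\partial^2 L}{\partial \dot{x}_i\, \partial \dot{x}_j} = Q \cdot \begin{pmatrix} \star & 0 \\ 0 & 0 \end{pmatrix} \cdot Q^{-1}$$
with $\star \neq 0$ and $Q^T = Q^{-1}$, with $T$ now meaning “transpose”. Only to make it less cumbersome have I taken it as two-dimensional; it can be thought of as two multidimensional blocks, with $\star$ then being a non-degenerate matrix.
From the definition of momenta (1), $\pi_{x_i} \equiv \frac{\partial L}{\partial \dot{x}_i}$, we get the following variations of them in terms of the Lagrangian velocities:
$$\delta \pi = Q \begin{pmatrix} \star & 0 \\ 0 & 0 \end{pmatrix} Q^{-1}\, \delta \dot{x} + \frac{\partial^2 L}{\partial \dot{x}\, \partial x}\, \delta x . \tag{11}$$
If I denote the projector to the second component as $P_2 \equiv \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$, then the primary constraint in terms of the variations acquires the form of
$$P_2\, Q^{-1} \left(\delta \pi - \frac{\partial^2 L}{\partial \dot{x}\, \partial x}\, \delta x\right) = 0 \tag{12}$$
showing that $Q P_2 Q^{-1}$ is the projector to the direction in which the momentum $\pi$ can only be changed due to a change of coordinates and does not feel the velocities. At the same time, in the complementary direction of $Q P_1 Q^{-1}$, with $P_1 \equiv 1 - P_2$ the projector to the first component, there is a fixed relation between the momentum and the velocity.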
Then, as we have discussed before, even in the degenerate case with the non-invertible relation (11) between variations, we can build a canonical Hamiltonian (3) trying to reproduce the desired variations (4) as
$$\delta H = \sum_i \left(\dot{x}_i\, \delta \pi_{x_i} - \frac{\partial L}{\partial x_i}\, \delta x_i\right).$$
What I have swept under the rug is the displeasing question of which precise velocity stays there. Even if we do only a variation allowed by the constraint, it can actually be any velocity that corresponds to the value of $\pi$; therefore, near our point it is, roughly speaking, something definite in the $Q P_1 Q^{-1}$ subspace, and it can be arbitrary in $Q P_2 Q^{-1}$.
What exactly we get depends on how we construct the canonical Hamiltonian. Recall that it is not unique. Our usual way was to avoid adding any terms proportional to the primary constraint. For example, in the calculations above with the Lagrangian (10) we could have added something proportional to $\pi_y - c$ to the canonical Hamiltonian. However, we did not. This was a way to avoid the unwanted velocity in the term $(\pi_y - c)\, \dot{y}$ by simply putting it to zero. This way, not only did we expel $\pi_y$ from the canonical Hamiltonian but also got rid of the function $c(x, y)$ in it.
In other words, we construct the canonical Hamiltonian from only the “$Q P_1 Q^{-1}$-part” of momenta which, together with keeping track of the variational relation (12), means that we are losing some of the $\delta x$ variations. Without trying to give a good mathematical meaning to all that, let me say that we basically get only the $Q P_1 Q^{-1}$ projection of $\dot{x}_i$ from the $\frac{\partial H}{\partial \pi_{x_i}}$ derivative, while in calculating $\frac{\partial L}{\partial x}$ we first subtract a term such as $\dot{x}^T \cdot Q P_2 Q^{-1} \frac{\partial L}{\partial \dot{x}}$ from the Lagrangian. In our previous example (10), we can check it by putting $\lambda = 0$ for now in its Hamiltonian equation. Indeed, the $\frac{\partial H}{\partial \pi}$ part gave only the non-trivial value of $\dot{x}$, having $\dot{y} = 0$, while the $\frac{\partial H}{\partial x}$ equations lack derivatives of $c(x, y)$ since it had dropped out of the canonical Hamiltonian.
What was removed from the equations by this restriction on the canonical Hamiltonian is restored by adding the $\lambda \Phi$ term to the total Hamiltonian $H_T$. Indeed, the variation of the constraint has the form of Equation (12), and then it gives an arbitrary $Q P_2 Q^{-1}$ component of the velocity and a $Q P_2 Q^{-1} \frac{\partial^2 L}{\partial \dot{x}\, \partial x}$ term to the equation for $\dot{\pi}$, therefore taking care of the change in variation of $x$ in the canonical piece. Note that this is precisely what happened with the $\lambda$-terms in the Hamiltonian equations for the model (10).
The important point is that, precisely as before, it has restored too much. We have got an arbitrary value of the velocity component corresponding to the kernel of the Hessian. It is expected that the multiplier λ either remains arbitrary, which usually shows a case of gauge symmetry, or it will be fixed by equations (and give a secondary second-class constraint).
Needless to say, what I have given here is not a mathematical proof of the Lagrangian–Hamiltonian correspondence beyond the quadratic-in-velocities case. For example, we would have to worry about the arguments of the coefficients given by partial derivatives of the Lagrangian, and so on. It is an interesting problem, but I only meant to provide a gist of how it should work.

4. Our Usual Field Theories

After having discussed the mechanical cases, which in my opinion are very important for understanding the theoretical foundations and the workings of the method, I turn to the field theories we often use in physics, namely electromagnetism and gravity.

4.1. Electrodynamics

I will consider Maxwell electrodynamics in vacuum. Therefore, the action is very simple,
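$$L(A) = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} \quad \text{with} \quad F_{\mu\nu} \equiv \partial_\mu A_\nu - \partial_\nu A_\mu \quad \text{by definition}.$$
The momenta (1) are easily found to be
$$\pi_0 = 0 \quad \text{and} \quad \pi_i = F_{0i} = \dot{A}_i - \partial_i A_0,$$
from which we immediately see the primary constraint $\Phi = \pi_0$ and the total Hamiltonian
$$H_T = \frac{1}{2}\pi_i^2 + \pi_i\, \partial_i A_0 + \frac{1}{4} F_{ij}^2 + \lambda \pi_0 .$$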
As before, we could have added an arbitrary function of π 0 to the canonical Hamiltonian, but it is anyway taken care of by the primary constraint term.
Note that the Lagrangian equations of motion are
$$\partial_\mu F^{\mu\nu} = 0 .$$
At the same time, the Lagrangian has a gauge symmetry of $A_\mu \to A_\mu + \partial_\mu \psi$; therefore, only three components out of four are physical. And still, not all of them are dynamical because one of the equations, the temporal one, is a constraint, $0 = \partial_i F_{0i} = \partial_i \dot{A}_i - \triangle A_0$. Naively, it looks like reducing the number of degrees of freedom by $\frac{1}{2}$. However, one possible gauge choice is $\partial_i A_i = 0$, which can be achieved by the gauge transformation with $\psi$ such that $\triangle \psi = -\partial_i A_i$, and then the result is a full strength constraint of $\triangle A_0 = 0$. We then have two dynamical variables and one more that is physical but constrained. Of course, the latter corresponds to Gauß’s law of $\nabla \cdot \vec{E} = 0$ in the absence of electric charges, with $\vec{E}$ being the electric field.
The Hamiltonian equations we get are
$$\dot{A}_0 = \lambda, \quad \dot{A}_i = \pi_i + \partial_i A_0 \quad \text{and} \quad \dot{\pi}_0 = \partial_i \pi_i, \quad \dot{\pi}_i = \partial_j F_{ji} \quad \text{with the constraint } \pi_0 = 0 . \tag{14}$$
Again, it fully reproduces the Lagrangian theory. Indeed, from the primary constraint and the equation for $\dot{\pi}_0$, which is also simply $\{\Phi, H_T\}$, we get the secondary constraint
$$\partial_i \pi_i = 0$$
which, using the equation for $\dot{A}_i$, immediately reproduces the Lagrangian constraint. We easily find the dynamical equations:
$$\partial_j F_{ji} = \dot{\pi}_i = \ddot{A}_i - \partial_i \dot{A}_0 = \dot{F}_{0i} .$$
Incidentally, we see that the full gauge invariance is there: only the combinations of $F_{\mu\nu}$ matter. Again, naively, the equation for $A_0$ looks like requiring one Cauchy datum. However, this is of course spurious since it makes no sense to worry about an initial value of something whose derivative is given by an arbitrary function. At a deeper level, this is irrelevant because, having fixed the value of $A_0$, we are still allowed to do another gauge transformation with a parameter depending on spatial coordinates only, while if we have fixed the value of $\dot{A}_0$, then we can still have another gauge parameter which is at most linear in time. This symmetry is not so clear from the Hamiltonian representation. However, it is also there because the vector potential $A_i$ comes only in combinations of $F_{0i}$ and $F_{ij}$, even in the Hamiltonian equations.
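As a short check that the chain of constraints stops at the secondary one, the following sketch uses only the Hamiltonian equations quoted above and the antisymmetry of $F_{ij}$; the notation $\tilde{\Phi}$ for the secondary constraint is borrowed from the mechanical examples and is an assumption here.

```latex
\begin{align*}
\tilde{\Phi} &\equiv \partial_i \pi_i \approx 0, \\
\dot{\tilde{\Phi}} &= \partial_i \dot{\pi}_i = \partial_i \partial_j F_{ji} = 0
  \quad \text{identically, since } F_{ji} = -F_{ij}, \\
\{\Phi, \tilde{\Phi}\} &= 0
  \quad \text{since both constraints depend on momenta only.}
\end{align*}
```

No tertiary constraint appears and nothing fixes $\lambda$, so the pair is first-class with a single arbitrary multiplier, in line with the single gauge function $\psi$ of the Lagrangian theory.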

On Incorrectness of Extended Hamiltonians Again

Let me state again that extended Hamiltonians are not correct descriptions of gauge theories. In this case we would get
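$$H_E = \frac{1}{2}\pi_i^2 + \pi_i\, \partial_i A_0 + \frac{1}{4} F_{ij}^2 + \lambda \pi_0 + \tilde{\lambda}\, \partial_i \pi_i$$
with an extra, and totally unjustified, arbitrary variable $\tilde{\lambda}$ in the equations. Indeed, the only modification is to one of the equations:
$$\dot{A}_i = \pi_i + \partial_i A_0 + \partial_i \tilde{\lambda} .$$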
Now we are allowed to change temporal and longitudinal fields independently, and this is totally wrong. It is not the usual electrodynamics! It is all right for perturbatively quantizing photons in vacuum, but it totally neglects Coulomb forces as if those were unphysical.
Indeed, if we accept both Lagrange multipliers, even the electric field is not physical. We can then produce whatever effective electric charges, $\partial_i F_{0i} = \triangle \tilde{\lambda}$, out of nothing. Note that the case with $\tilde{\lambda} = \tilde{\lambda}(t)$ changes nothing, and this is because of the symmetry of our Hamiltonian: $\tilde{\lambda}(t)\, \partial_i \pi_i \sim -\pi_i\, \partial_i \tilde{\lambda}(t) = 0$, up to a total spatial derivative. However, allowing here for two fully arbitrary Lagrange multipliers is wrong. It would be as if I could freely do gauge transformations for temporal and spatial components separately, $A_0 \to A_0 + \dot{\phi}$ and $A_i \to A_i + \partial_i \psi$ with independent $\phi = \int dt \int dt\, \lambda$ and $\psi = \int dt \int dt\, \lambda + \int dt\, \tilde{\lambda}$, which is not a symmetry of electrodynamics.
In particular, sometimes researchers state [10,33] that in electrodynamics we must fix two gauges, $A_0 = 0$ and $\partial_i A_i = 0$. No, this is not a gauge choice! As I showed above as well, we cannot arbitrarily change both temporal and longitudinal fields. This choice is possible only in electrodynamics without sources, and it is not only a gauge choice but also a solution of the Lagrangian constraint equation. Indeed, since we have Gauß’s law (for vanishing electric charge density)
$$\partial_i \dot{A}_i - \triangle A_0 = 0,$$
once we have chosen a gauge of $A_0 = 0$, we get $\frac{d}{dt}\left(\partial_i A_i\right) = 0$. Then it can be reduced to $\partial_i A_i = 0$ using a gauge parameter $\psi$ depending on spatial coordinates only, which is then a totally legitimate remnant symmetry. By no means is it a new gauge choice. It is simply the fact that $A_0 = 0$ does not fully fix a gauge.
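The argument can be laid out as a sequence of steps. This is only a sketch of the reasoning above, using the gauge transformation $A_\mu \to A_\mu + \partial_\mu \psi$ of the Lagrangian; the symbol $\triangle$ denotes the spatial Laplacian.

```latex
\begin{align*}
&\text{1. Temporal gauge:} && A_0 = 0 . \\
&\text{2. Gau\ss's law without charges:} && \partial_i \dot{A}_i - \triangle A_0 = 0
  \;\Longrightarrow\; \frac{d}{dt}\left(\partial_i A_i\right) = 0 . \\
&\text{3. Remnant symmetry preserving } A_0 = 0\text{:} && \dot{\psi} = 0
  \;\Longrightarrow\; \psi = \psi(\vec{x}) . \\
&\text{4. Use it once:} && \triangle \psi(\vec{x}) = -\partial_i A_i
  \;\Longrightarrow\; \partial_i A_i \to \partial_i A_i + \triangle \psi = 0 .
\end{align*}
```

Step 4 is possible with a time-independent $\psi$ precisely because step 2 has made $\partial_i A_i$ time-independent; with sources present, step 2 fails and so does the whole procedure.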
Out of pure curiosity, note also that our total Hamiltonian Equation (14) has reproduced the initial definition of the spatial momenta, $\pi_i = \dot{A}_i - \partial_i A_0$, which is not surprising since the definition of momenta was invertible in its spatial part. This success is totally destroyed by the introduction of the extended Hamiltonian. If we demand it by hand, however, then the extended Hamiltonian picture comes back to the real physics, and fixes the new Lagrange multiplier to be a function of time only, which makes it practically non-existent together with any incorrect effects it has produced. However, there is no natural reason for doing so on the Hamiltonian side.

4.2. General Relativity

Another important gauge theory we have is General Relativity. Its particularly convenient representation for the analysis is in the ADM variables [34]. Those are given in terms of the metric written as
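$$g_{\mu\nu}\, dx^\mu dx^\nu = \left(N^2 - N_i N_j \gamma^{ij}\right) dt^2 - 2 N_i\, dt\, dx^i - \gamma_{ij}\, dx^i dx^j .$$
Therefore, we use the spatial metric $\gamma_{ij} \equiv -g_{ij}$, and all the spatial indices are raised and lowered with this spatial metric, while all the other metric components are expressed in terms of new variables, the lapse $N$ and shift $N_i$.
Using the extrinsic curvatures of the constant time slices
$$K_{ij} = \frac{1}{2N}\left({}^{(3)}\nabla_i N_j + {}^{(3)}\nabla_j N_i - \dot{\gamma}_{ij}\right),$$
and after some very straightforward but relatively cumbersome algebra [35], one can find the curvature scalar
$$\sqrt{-g}\, R = \sqrt{\gamma}\, N \left(R^{(3)} + K^{ij} K_{ij} - K^i_i K^j_j\right) - 2\, \partial_0 \left(\sqrt{\gamma}\, K^i_i\right) + 2 \sqrt{\gamma}\;\, {}^{(3)}\nabla_j \left(K^i_i N^j\right) - 2 \sqrt{\gamma}\;\, {}^{(3)}\triangle N$$
in terms of that for the spatial metric. Neglecting the total derivatives, we come to the Lagrangian
$$L(N, N_i, \gamma_{ij}) = \sqrt{\gamma}\, N \left(R^{(3)} + K^{ij} K_{ij} - K^i_i K^j_j\right).$$
Then we see that $\pi_N = \pi_{N_i} = 0$ while $\pi^{ij} \equiv \pi_{\gamma_{ij}} = \sqrt{\gamma} \left(K^k_k \gamma^{ij} - K^{ij}\right)$, and the total Hamiltonian is
$$H_T = \int d^3 x \left(-\sqrt{\gamma}\, N \left(R^{(3)} + \frac{1}{\gamma}\left(\frac{1}{2}\left(\pi^j_j\right)^2 - \pi^{ik} \pi_{ik}\right)\right) + 2 N_i\; {}^{(3)}\nabla_k \pi^{ik} + \lambda \pi_N + \lambda_i \pi_{N_i}\right).$$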
The secondary constraints are directly obvious from the Hamiltonian. More work is required to check that all the constraints are first-class, which is quite natural however because all the four pairs again come from a gauge symmetry that mixes temporal and spatial derivatives. In other words, the secondary constraints come about because the gauge transformations are not simply changes of lapse and shift.
It is very common in the literature to omit the $\lambda$-terms, and to simply treat this Hamiltonian system as the one with zero canonical Hamiltonian, the secondary constraints in the role of primary ones, and the lapse and shift as Lagrange multipliers. In this case, it is in fact correct since the equations are not changed then, except for renaming some field variables into Lagrange multipliers. Indeed, if I change from $H_T = y f(x, \pi_x) + \lambda \pi_y$ to $H_T = \lambda f(x, \pi_x)$, the only difference is that the equations for $x$ and $\pi_x$, instead of the factor of $y$, which was arbitrary anyway as $\dot{y} = \lambda$, now simply get another arbitrary factor, that of $\lambda$ itself. The only point to remember then is that the diffeomorphisms become somewhat hidden since they now involve transformations of the new Lagrange multipliers, which is unusual.
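The toy comparison in the last paragraph can be written side by side. This is a minimal sketch with a generic function $f(x, \pi_x)$, exactly as in the text, and the standard Hamiltonian equations.

```latex
\begin{align*}
H_T = y f + \lambda \pi_y &: \quad
  \dot{x} = y\, \frac{\partial f}{\partial \pi_x}, \quad
  \dot{\pi}_x = -y\, \frac{\partial f}{\partial x}, \quad
  \dot{y} = \lambda, \quad
  \dot{\pi}_y = -f \approx 0, \quad \pi_y \approx 0, \\
H_T = \lambda f &: \quad
  \dot{x} = \lambda\, \frac{\partial f}{\partial \pi_x}, \quad
  \dot{\pi}_x = -\lambda\, \frac{\partial f}{\partial x}, \quad
  f \approx 0 .
\end{align*}
```

In the first form, $f \approx 0$ arises as a secondary constraint and $y(t)$ is an arbitrary function; in the second, $f \approx 0$ is imposed directly and the same arbitrary function is simply called $\lambda$.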
As in the case of electrodynamics, the naive counting of dynamical degrees of freedom works well. From the 10 variables, they subtract 8 first-class constraints and get 2 degrees of freedom. What is however wrong is to claim that there are only two physical combinations of the metric components in General Relativity. That would mean to recognize only gravitational waves (GWs) as physical and to neglect all the Newtonian forces. This is not how the world is built.
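For reference, the naive counting mentioned here can be written as a one-line formula. The formula itself (half of the phase-space dimension minus twice the number of first-class constraints, with no second-class constraints present) is the standard textbook rule and is an assumption added here, not something derived in this section.

```latex
\begin{align*}
\text{electrodynamics:} \quad & \tfrac{1}{2}\left(2 \cdot 4 - 2 \cdot 2\right) = 2, \\
\text{General Relativity:} \quad & \tfrac{1}{2}\left(2 \cdot 10 - 2 \cdot 8\right) = 2 .
\end{align*}
```

These numbers count the dynamical modes only; they say nothing about the constrained but physical components, such as $A_0$ obeying Gauß’s law.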
The best way to see that there are more physical components is to study the cosmological perturbation theory [36]. The coordinate change functions δ x μ are all the four independent gauge parameters, and they leave six variables physical. On top of the two graviton polarizations, there are two components of a physical vector and two physical scalars (Bardeen potentials). While the vectors are not very important for cosmology because they are decaying almost inevitably, the scalars play an instrumental role in calculating the CMB observables and the large scale structure. They are not dynamical by themselves, and the acoustic waves appear because of the matter contents of the Universe, but how these waves evolve and lead to structure formation is crucially governed by gravity, and not just in its two GW polarizations.

4.2.1. Gravity with Tetrads—Not Every Gauge Hits Twice

So, as we have seen, the familiar gauge symmetries normally “hit twice” because of mixing different derivatives of different components. At the same time, it is not difficult to construct examples of not hitting twice. In the case of electrodynamics, I could have put $A_\mu - B_\mu$ everywhere instead of only the simple vector $A_\mu$. This is a very trivial symmetry, and it would be associated with four new first-class constraints of $\pi_{A_\mu} + \pi_{B_\mu} = 0$, which would not “hit twice” and would really be the new gauge generators precisely by themselves.
However, in gravity we have a more appealing way to go. Namely, let us write the GR in terms of tetrads. At the same time, I will not introduce any spin connection. What I have in mind is teleparallel-type generalizations of general relativity rather than coupling fermions. We take
$$L(e^a_\mu) = \sqrt{-g}\, R(g) \quad \text{where} \quad g_{\mu\nu} \equiv \eta_{ab}\, e^a_\mu e^b_\nu$$
and get the gauge symmetry of Lorentz rotations $e^a_\mu \to \Lambda^a_b\, e^b_\mu$. Since the metric is invariant under such transformations, and with no derivatives of $\Lambda$ involved in them, it is very easy to see that there appear six Lorentz constraints, which do not hit twice. They simply subtract the six newly introduced variables (from 10 components of the metric to 16 of the tetrad), and do not do anything else.
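In terms of the number of variables, the statement reads as follows. This is a sketch of the bookkeeping only, assuming the standard rule that each first-class constraint removes one configuration-space variable from the dynamical count.

```latex
\begin{align*}
\text{metric GR:} \quad & 10 - \underbrace{2 \cdot 4}_{\text{diffeomorphisms hit twice}} = 2, \\
\text{tetrad GR:} \quad & 16 - \underbrace{2 \cdot 4}_{\text{diffeomorphisms hit twice}}
  - \underbrace{1 \cdot 6}_{\text{Lorentz hits once}} = 2 .
\end{align*}
```

The six Lorentz constraints are primary and first-class with no secondary partners, so they remove exactly the six extra tetrad components and nothing more.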
An interesting point is that I was not entirely honest in the paragraph above. It is that simple only if our Lagrangian is indeed written in terms of the metric. However, there is also the Teleparallel Equivalent of General Relativity (TEGR), which is the basis of modified teleparallel theories. The action of TEGR is not given in terms of the metric. Its Lagrangian is locally Lorentz invariant only up to a surface term (which is its difference from the usual Lagrangians of GR), and therefore the Lorentz behavior of the Hamiltonian formalism becomes much more difficult and interesting [9].
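For the simple case of a Lagrangian written purely in terms of the metric, both the invariance and the counting take one line each. The sketch below assumes only the defining property of a Lorentz matrix, $\eta_{ab} \Lambda^a_{\ c} \Lambda^b_{\ d} = \eta_{cd}$, with $\Lambda$ allowed to depend arbitrarily on the spacetime point.

```latex
\begin{align}
g_{\mu\nu} = \eta_{ab}\, e^a_\mu e^b_\nu
\;\longrightarrow\;
\eta_{ab}\, \Lambda^a_{\ c} \Lambda^b_{\ d}\, e^c_\mu e^d_\nu
&= \eta_{cd}\, e^c_\mu e^d_\nu = g_{\mu\nu}, \\
\underbrace{16}_{\text{components of } e^a_\mu}
- \underbrace{6}_{\text{Lorentz constraints}}
&= \underbrace{10}_{\text{components of } g_{\mu\nu}} .
\end{align}
```

No derivative of $\Lambda$ ever appears, so each of the six local parameters comes with one first-class constraint rather than two.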

4.2.2. On the Canonicity of ADM

The final topic I want to touch upon is the strange claims that appeared some time ago [37] and continue to be seen in the current literature, including this Topical Collection [38]. The main statement there is that the ADM formalism is not a canonical Hamiltonian treatment of general relativity, and is not equivalent to Dirac’s approach to Hamiltonian relativity. Of course, it simply cannot be true. The only difference between different approaches to Hamiltonian GR is in the boundary terms omitted in order to not have higher derivatives and to not bother with unnecessary complications of the Ostrogradsky procedure, and also in the choice of variables.
The ADM variables of course complicate the way diffeomorphisms look, simply because those are different variables. Nevertheless, all the symmetries are there, since the Lagrangian density is still a scalar density, up to a boundary term. Nothing is done there, except rewriting it in less obviously symmetric variables. Castellani [5], for example, was able to trace the gauge transformations there. It does not matter if we have to parametrize the gauge transformations in a more difficult way for finding them in one procedure or another. One can add any function of canonical variables multiplied by a primary constraint to the canonical Hamiltonian. It will not change any physics, but will reparametrize the Lagrange multiplier. The only real question is whether the symmetry is there—and it is.
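The reparametrization of the multiplier can be spelled out in one line. As a sketch, let $\phi$ be a primary constraint, $\lambda$ its Lagrange multiplier, $H_c$ a canonical Hamiltonian, and $f(q,p)$ an arbitrary function of the canonical variables; the primed symbols are notation introduced here.

```latex
\begin{align}
H_T &= H_c + \lambda\, \phi, \qquad H_c' \equiv H_c + f(q,p)\, \phi, \\
H_T &= H_c' + \bigl( \lambda - f(q,p) \bigr)\, \phi \equiv H_c' + \lambda'\, \phi .
\end{align}
```

The total Hamiltonian is the same function on the phase space in both forms; only the name of the multiplier has changed.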
However, one point I would like to specifically mention is their claim that ADM variables are not canonical variables for gravity [37]. First of all, there is nothing sacred about being canonical. If I do a non-canonical transformation, it simply means that my Poisson bracket will be changed, and my Hamiltonian equations will look less nice. Moreover, one and the same theory can be written with different choices of canonical variables, simply by having different Hamiltonians. One way to get that is by using boundary terms. For example, the Lagrangians $L(x,y) = \frac{1}{2}\left(\dot x^2 + \dot y^2\right)$ and $L(x,y) = \frac{1}{2}\left(\dot x^2 + \dot y^2\right) + x\dot y + y\dot x$ are equivalent. However, they have different definitions of momenta. Therefore, canonical variables of one of them are not canonical for the other one. The difference has no physical implications, however, because the Hamiltonians are also different and the final equations for $x$ and $y$ are equivalent.
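A short worked version of this example, as a sketch: I denote the two Lagrangians by $L_1$ and $L_2$, the momenta of the first by $p_x, p_y$ and those of the second by $P_x, P_y$ (labels introduced here), and use the fact that the extra term is the total derivative $\frac{d}{dt}(xy)$.

```latex
\begin{align}
L_1 &= \tfrac12 \left( \dot x^2 + \dot y^2 \right): &
p_x &= \dot x, \quad p_y = \dot y, &
H_1 &= \tfrac12 \left( p_x^2 + p_y^2 \right), \\
L_2 &= L_1 + \frac{d}{dt}(xy): &
P_x &= \dot x + y, \quad P_y = \dot y + x, &
H_2 &= \tfrac12 \left[ (P_x - y)^2 + (P_y - x)^2 \right].
\end{align}
\begin{equation}
\dot x = \frac{\partial H_2}{\partial P_x} = P_x - y, \qquad
\dot P_x = -\frac{\partial H_2}{\partial x} = P_y - x = \dot y
\quad \Longrightarrow \quad
\ddot x = \dot P_x - \dot y = 0,
\end{equation}
```

and in the same way $\ddot y = 0$. The same state of motion carries different momentum labels in the two descriptions, yet both reproduce the free equations $\ddot x = \ddot y = 0$.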
Therefore, different Hamiltonian formulations of GR might indeed have different symplectic structures, and it does not mean that only one of them is the correct one. On the other hand, they argue so by saying that the change of variables from metric to ADM is not canonical [37]. This is not true either. If two theories are obtained from each other by only a change of Lagrangian variables, then their canonical variables are related by a canonical transformation.
It is very simple to explain what went wrong with their calculation of $\{N, \pi^{ij}\}$, in which they got a non-zero result [37]. If we go from one set of variables to another, the momenta are also changed, even for those variables that are not changed. In particular, $\dot g_{00} = 2N\dot N - 2N^i \dot N^j \gamma_{ij} + N^i N^j \dot g_{ij}$. Therefore, if the new variables are $N$, $N^i$ and $g_{ij}$, then
$$\pi^{ij}_{\mathrm{ADM}} = \pi^{ij}_{\mathrm{metric}} + \pi^{00}_{\mathrm{metric}}\, N^i N^j = \pi^{ij}_{\mathrm{metric}} + \pi^{00}_{\mathrm{metric}}\, \frac{g^{0i} g^{0j}}{(g^{00})^2},$$
and therefore
$$\left\{ N, \pi^{ij}_{\mathrm{ADM}} \right\} = \left\{ \frac{1}{\sqrt{g^{00}}},\; \pi^{ij}_{\mathrm{metric}} + \pi^{00}_{\mathrm{metric}}\, \frac{g^{0i} g^{0j}}{(g^{00})^2} \right\} = 0.$$
Of course, $\pi^{00} = 0$ in the Dirac approach, so that it would mean that $\pi^{ij}_{\mathrm{ADM}} = \pi^{ij}_{\mathrm{metric}}$ on shell, or in terms of Lagrangian velocities, so that a direct and naive implementation of the change of variables in the degenerate Lagrangian would not be canonical in this sense. However, barring the possibility of different neglected boundary terms, the transition from metric to ADM variables is a canonical transformation if we employ transformations such as $\pi^{ij}_{\mathrm{ADM}} = \pi^{ij}_{\mathrm{metric}} + \pi^{00}_{\mathrm{metric}}\, N^i N^j$, not changing anything except indeed reparametrizing the primary Lagrange multipliers in terms of canonical variables. This is yet another manifestation of the canonical Hamiltonian (3) not being unique for a degenerate system.

5. Conclusions

My main conclusion is that we should start being more serious about the constrained Hamiltonian analysis and stop treating it as some sacred prescription to be followed with no deep thought. Even the simplest examples can demonstrate very interesting behavior. Moreover, a very important point is that the adjectives “physical” and “dynamical” do not mean the same thing. Both Maxwell electrodynamics and general relativity have only two dynamical modes each, but it is incorrect to say the same about their physical modes.
When it comes to our usual theories, they are well understood and the Hamiltonian formalism indeed works well even in the most naive ways. However, nowadays we are also studying many new and exotic field theories, often going as far as classifying all possible models in a certain big class. In these situations, and with much less physical intuition currently available, we must exercise more care than ever before.

Funding

This research received no external funding.

Data Availability Statement

There are no new data related to this study.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Dirac, P.A.M. Lectures on Quantum Mechanics; Dover Publications: New York, NY, USA, 2001.
  2. Arnold, V.I. Mathematical Methods of Classical Mechanics; Springer: Berlin/Heidelberg, Germany, 1989.
  3. Shapere, A.; Wilczek, F. Classical Time Crystals. Phys. Rev. Lett. 2012, 109, 160402.
  4. Dirac, P.A.M. Generalized Hamiltonian Dynamics. Can. J. Math. 1950, 2, 129.
  5. Castellani, L. Symmetries in constrained Hamiltonian systems. Ann. Phys. 1982, 143, 357.
  6. Henneaux, M.; Teitelboim, C. Quantization of Gauge Systems; Princeton University Press: Princeton, NJ, USA, 1994.
  7. Gitman, D.M.; Tyutin, I.V. Quantization of Fields with Constraints; Springer: Berlin/Heidelberg, Germany, 1990.
  8. Rothe, H.J.; Rothe, K.D. Classical and Quantum Dynamics of Constrained Hamiltonian Systems; World Scientific: Singapore, 2010.
  9. Golovnev, A.; Guzmán, M.J. Lorentz symmetries and primary constraints in covariant teleparallel gravity. Phys. Rev. D 2021, 104, 124074.
  10. Valenzuela, M. Quantization of a pseudoclassical system with gauge and time-reparametrization invariance. arXiv 2022, arXiv:2212.02414.
  11. Chamseddine, A.H.; Mukhanov, V. Mimetic Dark Matter. J. High Energy Phys. 2013, 135, 1–5.
  12. Golovnev, A. On the recently proposed Mimetic Dark Matter. Phys. Lett. B 2014, 728, 39.
  13. Sebastiani, L.; Vagnozzi, S.; Myrzakulov, R. Mimetic gravity: A review of recent developments and applications to cosmology and astrophysics. Adv. High Energy Phys. 2017, 2017, 3156915.
  14. Chamseddine, A.H.; Mukhanov, V. Resolving Cosmological Singularities. J. Cosmol. Astropart. Phys. 2017, 2007, 009.
  15. Deruelle, N.; Rua, J. Disformal Transformations, Veiled General Relativity and Mimetic Gravity. J. Cosmol. Astropart. Phys. 2014, 2014, 002.
  16. Jiroušek, P.; Shimada, K.; Vikman, A.; Yamaguchi, M. Disforming to Conformal Symmetry. J. Cosmol. Astropart. Phys. 2022, 2022, 019.
  17. Jiroušek, P.; Shimada, K.; Vikman, A.; Yamaguchi, M. New Dynamical Degrees of Freedom from Invertible Transformations. arXiv 2022, arXiv:2208.05951.
  18. Golovnev, A. The Variational principle, Conformal and Disformal transformations, and the degrees of freedom. J. Math. Phys. 2023, 64, 012501.
  19. Ferraro, R.; Fiorini, F. Remnant group of local Lorentz transformations in f(T) theories. Phys. Rev. D 2015, 91, 064019.
  20. Golovnev, A.; Guzmán, M.J. Foundational issues in f(T) gravity theory. Int. J. Geom. Methods Mod. Phys. 2021, 18, 2140007.
  21. Golovnev, A.; Guzmán, M.J. Non-trivial Minkowski backgrounds in f(T) gravity. Phys. Rev. D 2021, 103, 044009.
  22. Golovnev, A.; Koivisto, T. Cosmological perturbations in modified teleparallel gravity models. J. Cosmol. Astropart. Phys. 2018, 2018, 012.
  23. Bahamonde, S.; Dialektopoulos, K.F.; Hohmann, M.; Said, J.L.; Pfeifer, C.; Saridakis, E.N. Perturbations in Non-Flat Cosmology for f(T) gravity. arXiv 2022, arXiv:2203.00619.
  24. de Rham, C.; Gabadadze, G.; Tolley, A.J. Resummation of Massive Gravity. Phys. Rev. Lett. 2011, 106, 231101.
  25. de Rham, C. Massive gravity. Living Rev. Relativ. 2014, 17, 7.
  26. de Rham, C.; Pozsgay, V. New class of Proca interactions. Phys. Rev. D 2020, 102, 083508.
  27. Errasti Díez, V. (Extended) Proca-Nuevo under the two-dimensional loupe. arXiv 2022, arXiv:2212.02549.
  28. Errasti Díez, V.; Maier, M.; Méndez-Zavaleta, J.A.; Tehrani, M.T. A Lagrangian constraint analysis of first order classical field theories with an application to gravity. Phys. Rev. D 2020, 102, 065015.
  29. Hassan, S.F.; Rosen, R.A. Resolving the Ghost Problem in non-Linear Massive Gravity. Phys. Rev. Lett. 2012, 108, 041101.
  30. Hassan, S.F.; Rosen, R.A. Confirmation of the Secondary Constraint and Absence of Ghost in Massive Gravity and Bimetric Gravity. J. High Energy Phys. 2012, 2012, 123.
  31. Golovnev, A. On the Hamiltonian analysis of non-linear massive gravity. Phys. Lett. B 2012, 707, 404.
  32. Golovnev, A.; Smirnov, F. Dealing with ghost-free massive gravity without explicit square roots of matrices. Phys. Lett. B 2017, 770, 209.
  33. Ferraro, R. Noether’s second theorem in teleparallel gravity. arXiv 2022, arXiv:2210.13541.
  34. Arnowitt, R.; Deser, S.; Misner, C.W. Republication of: The dynamics of general relativity. Gen. Relativ. Gravit. 2008, 40, 1997.
  35. Golovnev, A. ADM analysis and massive gravity. In Proceedings of the 7th Mathematical Physics Meeting: Summer School and Conference on Modern Mathematical Physics, Belgrade, Serbia, 9–19 September 2012.
  36. Mukhanov, V.F.; Feldman, H.A.; Brandenberger, R.H. Theory of cosmological perturbations. Phys. Rep. 1992, 215, 203.
  37. Kiriushcheva, N.; Kuzmin, S.V. The Hamiltonian formulation of General Relativity: Myths and reality. Cent. Eur. J. Phys. 2011, 9, 576.
  38. Frolov, A.M. Metric Gravity in the Hamiltonian Form—Canonical Transformations—Dirac’s Modifications of the Hamilton Method and Integral Invariants of the Metric Gravity. Universe 2022, 8, 533.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Golovnev, A. On the Role of Constraints and Degrees of Freedom in the Hamiltonian Formalism. Universe 2023, 9, 101. https://doi.org/10.3390/universe9020101
