Review

Quantum Mechanics and Quantum Field Theory: Algebraic and Geometric Approaches

1 Department of Mathematics, Moscow Engineering Physics Institute (MEPhI), Kashirskoe Shosse 31, 115409 Moscow, Russia
2 Department of Mathematics, University of California, Davis, CA 95616, USA
* Author to whom correspondence should be addressed.
Universe 2023, 9(7), 337; https://doi.org/10.3390/universe9070337
Submission received: 19 June 2023 / Revised: 13 July 2023 / Accepted: 14 July 2023 / Published: 17 July 2023
(This article belongs to the Special Issue Universe: Feature Papers 2023—Field Theory)

Abstract

This is a non-standard exposition of the main notions of quantum mechanics and quantum field theory, including recent results. It is based on the algebraic approach, in which the starting point is a star-algebra, and on the geometric approach, in which the starting point is a convex set of states. Standard formulas for quantum probabilities are derived from decoherence. This derivation allows us to go beyond quantum theory in the geometric approach. Particles are defined as elementary excitations of the ground state (and quasiparticles as elementary excitations of any translation-invariant state). The conventional scattering matrix does not work for quasiparticles (or even for particles if the theory does not have a particle interpretation). The analysis of scattering in these cases is based on the notion of an inclusive scattering matrix, which is closely related to inclusive cross-sections. It is proven that the conventional scattering matrix can be expressed in terms of Green functions (LSZ formula) and that the inclusive scattering matrix can be expressed in terms of the generalized Green functions that appear in the Keldysh formalism of non-equilibrium statistical physics. The derivation of the expression of the evolution operator and other physical quantities in terms of functional integrals is based on the notion of the symbol of an operator; these arguments can be applied in the geometric approach as well. In particular, this result can be used to provide a simple derivation of the diagram technique for generalized Green functions. The notion of an inclusive scattering matrix makes sense in the geometric approach, although it seems that a definition of the conventional scattering matrix cannot be provided in this situation. The geometric approach is used to show that quantum mechanics and its generalizations can be considered as classical theories where our devices are able to measure only a part of the observables.

1. Lecture 1

1.1. Introduction

In the usual exposition of quantum mechanics, we live in Hilbert space and consider operators in this space; self-adjoint operators correspond to observables. This is the approach that physicists use almost always, but it has its drawbacks, and we will talk about other approaches. The first is the algebraic approach, where the starting point is an algebra of observables: an associative algebra with involution whose self-adjoint elements are the observables. This approach is almost as old as quantum mechanics itself. In addition, we will talk about the geometric approach, in which the starting point is a set of states [1,2,3,4,5,6,7]. This approach was proposed a couple of years ago, and it is much more general than the algebraic approach.
The main thing, which in our opinion is not emphasized enough in the usual presentation of quantum mechanics (a little more is said in quantum field theory), is that the notion of a particle is not a primary notion in quantum theory; it is a secondary notion. Particles are elementary excitations of the ground state. Quasiparticles (another important notion) are elementary excitations of any translation-invariant state. The basic notion that we have in the physics of elementary particles is the notion of scattering; this will be our main topic. We talk not only about the conventional scattering matrix (related to scattering cross-sections) but also about the notion of an inclusive scattering matrix, which is closely related to the notion of inclusive scattering cross-sections. The scattering matrices can be expressed in terms of Green's functions by the well-known formula due to Lehmann, Symanzik, and Zimmermann [8], and the inclusive scattering matrices can be expressed in terms of generalized Green's functions, which first appeared in nonequilibrium statistical physics in the Keldysh formalism (see, for example, [9]).
This text is based on the first ten lectures of the course taught by Albert Schwarz in 2022. It will be expanded to include, in particular, the BRST formalism and elements of the BV formalism with applications to gauge theories and string theory. However, even the expanded version will contain only a small part of quantum field theory; a more complete exposition can be found in the books [10,11,12,13,14,15,16,17,18,19,20,21,22,23].

1.1.1. Convex Sets

Before turning to physics, we want to say a few words about convex sets, which will be used many times.
A convex set C is a subset of a vector space that, together with every two points, contains the segment connecting these points. The important thing is that in a convex set one can consider a mixture of points of the set. If we take some points $e_i \in C$ and ascribe a non-negative number to each point such that the numbers sum to one, then the sum of these points with coefficients $p_i \geq 0$, $\sum_i p_i = 1$, belongs to the convex set as well. The sum $\sum_i p_i e_i \in C$ is called the mixture of the points $e_i$ with probabilities $p_i$. One can consider the numbers $p_i$ as weights, in which case this sum represents the center of gravity. Another important notion is that of an extreme point of a convex set. An extreme point is a point that does not lie inside any segment with ends belonging to the set. The extreme points of a polyhedron are its vertices; for a ball, the extreme points lie on the boundary sphere.
We always assume that the vector space we are considering has some topology, so that there is a notion of limit and hence a notion of a closed set. We will assume that all convex sets we consider are closed; then one can consider not only a mixture of a finite number of points but also a mixture of a countable number of points, in which case the sum becomes an infinite series.
It is possible to consider a mixture of the points of any subset of a convex set if the subset is equipped with a probability distribution. If, say, on a sphere bounding a ball we have a probability distribution with probability density $\rho(\lambda)$, then we can consider the mixture of the states on the sphere by simply taking an integral instead of a sum: if λ are parameters describing points on the sphere and $\omega(\lambda)$ are the corresponding points, the mixture is the integral $\int \omega(\lambda)\,\rho(\lambda)\,d\lambda$. If the convex set is compact, then each point is a mixture of its extreme points.
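To make the notion of a mixture concrete, here is a minimal numerical sketch (our illustration, not part of the original text): a finite mixture of the extreme points of a triangle, and a continuous mixture over a circle approximated by a Riemann sum.

```python
import numpy as np

# A finite mixture: points e_i in R^2 with probabilities p_i.
e = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # vertices of a triangle (extreme points)
p = np.array([0.2, 0.5, 0.3])                       # p_i >= 0, sum p_i = 1

mixture = p @ e          # the sum of p_i e_i, i.e., the center of gravity
print(mixture)           # (0.5, 0.3): a point inside the triangle

# A continuous mixture over the unit circle with probability density rho(lambda):
# the integral of omega(lambda) rho(lambda) d(lambda), here as a Riemann sum.
lam = np.linspace(0.0, 2.0 * np.pi, 1000, endpoint=False)
omega = np.stack([np.cos(lam), np.sin(lam)], axis=1)  # points omega(lambda) on the circle
rho = (1.0 + np.cos(lam)) / (2.0 * np.pi)             # a probability density on the circle
print((omega * rho[:, None]).sum(axis=0) * (lam[1] - lam[0]))  # ~(0.5, 0): inside the disk
```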

1.1.2. Quantum Theory: Geometric Approach

Let us repeat what we need from quantum mechanics. For us, a state in quantum mechanics is a density matrix. A density matrix is a self-adjoint operator K that is positive definite and has a trace equal to one: $\operatorname{Tr} K = 1$. The set of density matrices is convex, and its extreme points are called pure states. These pure states correspond to vectors in Hilbert space. Each normalized vector $\Psi$ corresponds to a density matrix, defined as the orthogonal projection on this vector:
$$K_\Psi(x) = \langle x, \Psi \rangle\, \Psi.$$
Note that if two vectors are proportional ($\Psi' = \lambda\Psi$), then the corresponding density matrices coincide: $K_{\Psi'} = K_\Psi$.
The density matrix K has a basis of eigenvectors $e_i$ with non-negative eigenvalues $p_i$, which sum to 1. We can say that each of these vectors corresponds to a pure state and that the density matrix is a mixture of these pure states. To check this, we can use a representation in which the matrix K is diagonal; the diagonal elements are then equal to $p_i$. As the trace is equal to 1, their sum equals one: $\sum_i p_i = 1$. Because the matrix is positive definite, the diagonal elements are non-negative: $p_i \geq 0$. Consequently, we can say that the density matrix is a mixture of pure states with probabilities $p_i$.
In textbook quantum mechanics, all this is told in reverse order, starting with pure states. Density matrices are defined as mixed states.
We have discussed one way of representing the density matrix as a mixture of pure states. In fact, it can be done in an infinite number of different ways.
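As a numerical illustration of the last two paragraphs (our sketch, with hypothetical data): the eigendecomposition of a density matrix exhibits it as one particular mixture of pure states.

```python
import numpy as np

# A density matrix K as a mixture of pure states: K = sum_i p_i |e_i><e_i|.
rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
K = A @ A.conj().T
K /= np.trace(K).real            # self-adjoint, positive, Tr K = 1

p, e = np.linalg.eigh(K)         # eigenvalues p_i >= 0 summing to 1
print(p, p.sum())                # the probabilities; the sum is 1.0

# Reassemble K as the mixture of the projections onto the eigenvectors.
K_mix = sum(p[i] * np.outer(e[:, i], e[:, i].conj()) for i in range(3))
print(np.allclose(K, K_mix))     # True
```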
In the geometric approach, the starting point is the set of states. We assume that the set of states is a bounded closed convex subset of a topological linear space. It turns out that these assumptions are sufficient to develop a meaningful theory.

1.1.3. Algebraic Approach to Quantum Theory

Whereas in the geometric approach the starting point is the space of states, in the algebraic approach the starting point is the algebra of observables $\mathcal{A}$. Recall that an algebra is a vector space in which one can multiply elements with a distributive law; we further require that the algebra be associative and equipped with an involution (such an algebra is called a *-algebra). A typical example of a *-algebra (for our purposes, the basic one) is the algebra of bounded operators in a Hilbert space. In this algebra, there is an involution $A \mapsto A^*$, which corresponds to the transition to the adjoint operator. It has the property that if we pass twice to the adjoint operator, we return to the original one ($A^{**} = A$). The adjoint of a product is again the product, but in reverse order: $(AB)^* = B^*A^*$. In addition, the involution is antilinear. In the operator algebra these are simply properties; however, for an arbitrary associative algebra with involution (a *-algebra) they are axioms.
We always assume that there is some topology in the algebra in which all operations are continuous. We usually do not talk about these topologies, first because it takes time, and second because different topologies can be equally sensible. Sometimes it is necessary to have a norm in which the algebra is a Banach space, in which case the inequality $\|AB\| \leq \|A\| \cdot \|B\|$ is required (this is the definition of a Banach algebra). Sometimes something else is required, such as when the algebra is a $C^*$-algebra (this means that the norm of $A^*A$ is equal to the square of the norm of A, that is, $\|A^*A\| = \|A\|^2$). We usually will not specify which topology is chosen. We want to emphasize that when considering a homomorphism or an automorphism of a *-algebra, we always assume that it is continuous and agrees with the involution.
If there is an algebra with involution, the self-adjoint elements ($A = A^*$) correspond to physical quantities. Self-adjoint elements themselves do not form a subalgebra: the product of self-adjoint elements is not necessarily self-adjoint. This disadvantage of the algebraic approach was noticed as early as the 1930s, leading to the notion of a Jordan algebra. Jordan noticed that although the product of self-adjoint elements is not self-adjoint, the anticommutator $A \circ B = AB + BA$, where A and B are self-adjoint, is again self-adjoint. He axiomatized this operation. The theory of Jordan algebras was constructed in the 1930s, mainly in the famous works of Jordan, Wigner, and von Neumann [24]. Although a Jordan algebra is a very beautiful and genuinely useful object in many parts of mathematics, it has not yet had much use in physics. Now it has naturally appeared in the geometric approach, and may come back to physics again.
If we start from a *-algebra (an associative algebra with involution), we can define the notion of state as follows: a state is a linear functional ω on an algebra $\mathcal{A}$ that satisfies the non-negativity condition on elements of the form $A^*A$:
$$\omega(A^*A) \geq 0.$$
We say that linear functionals corresponding to states are positive functionals.
States differing by a numerical factor are identified. It is often convenient to consider only states satisfying the condition $\omega(1) = 1$ (i.e., normalized states).
Now, we can define the notion of the expectation value of a physical quantity in a given state. For a function f(A) of $A \in \mathcal{A}$, the mathematical expectation (the average) is provided by the formula
$$\langle f(A) \rangle_\omega = \omega(f(A)).$$
For a $C^*$-algebra, we can define f(A) for any continuous function f of a self-adjoint element A; then, knowing the expectation values of all continuous functions, we can define the notion of the probability distribution of a physical observable in a normalized state (a state for which $\omega(1) = 1$).
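Here is a minimal sketch (ours; the 2×2 state and observable are hypothetical) of how expectation values ω(f(A)) determine the probability distribution of an observable, using the matrix algebra where a state has the form ω(A) = Tr KA:

```python
import numpy as np

K = np.array([[0.7, 0.0], [0.0, 0.3]])   # a density matrix, Tr K = 1
A = np.array([[0.0, 1.0], [1.0, 0.0]])   # a self-adjoint observable

def expectation(f):
    # omega(f(A)) = Tr(K f(A)), with f(A) defined via the spectral decomposition of A
    vals, vecs = np.linalg.eigh(A)
    fA = vecs @ np.diag(f(vals)) @ vecs.conj().T
    return np.trace(K @ fA).real

print(expectation(lambda x: x))        # the mean of A in the state omega
print(expectation(lambda x: x ** 2))   # the second moment

# The induced probability distribution: the weights of the eigenvalues of A.
vals, vecs = np.linalg.eigh(A)
probs = [(vecs[:, i].conj() @ K @ vecs[:, i]).real for i in range(len(vals))]
print(dict(zip(vals, probs)))          # {-1.0: 0.5, 1.0: 0.5}
```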
The notion of state alone is not enough, however; the notion of evolution is needed as well, because the goal of physics, as of any science, is to make predictions. A physicist, first of all, considers the following problem: if an initial state is known, what is the best way to predict what will happen afterward?
In the algebraic approach, in order to define the notion of evolution one should first consider the group $\mathrm{Aut}(\mathcal{A})$ of automorphisms of the *-algebra $\mathcal{A}$. Recall that automorphisms of $\mathcal{A}$ must always commute with the involution; hence, the group of automorphisms, acting naturally on linear functionals, transforms positive functionals into positive ones (states into states). In any approach to quantum theory, states must depend on time. There must be an evolution operator U(t) that transforms a state at the initial moment into a state ω(t) at some other time t; in other words, $\omega(t) = U(t)\,\omega(0)$.
In the algebraic approach, we can assume that the operators U(t) come from automorphisms of the algebra $\mathcal{A}$, denoted by the same symbol. As in textbook quantum mechanics, we have the Schrödinger picture, in which the state evolves, and the Heisenberg picture, in which the operator evolves. These two pictures are equivalent:
$$\omega(t)(A) = \omega(A(t)).$$
Note that observing the dynamics of a state ω when an algebra element A is fixed is the same as observing the dynamics of an algebra element when the state does not change.
In physics, the evolution operator is usually calculated from the equation of motion describing the same evolution, except over infinitesimal time. If there is invariance with respect to time shifts, then it can be argued that the operator describing a change over an infinitesimal time interval is itself independent of time. It has already been said that a change over finite time must be an automorphism of the algebra; thus, changes over infinitesimal time intervals are infinitesimal automorphisms. Knowing an infinitesimal automorphism H, we can solve the equation of motion $dU/dt = HU$. The solution obeying $U(0) = 1$ can be written in the form $U(t) = e^{Ht}$. As a result, we obtain a one-parameter group of automorphisms consisting of transformations of the form $e^{Ht}$ (evolution operators). In quantum mechanics textbooks, the imaginary unit appears in the exponent; we do not write it, though of course this is irrelevant. The state $\omega(t) = U(t)\,\omega(0)$ obeys the equation of motion
$$\frac{d\omega}{dt} = H\,\omega(t).$$
The operator H is analogous to a Hamiltonian in quantum mechanics; thus, we say that H is a “Hamiltonian”.
We have not provided a formal definition of an infinitesimal automorphism. One possible formal definition is as follows: an infinitesimal automorphism is a tangent vector to a curve in the automorphism group at the unit element of this group. We require a little more, namely, that this curve be a one-parameter subgroup.
It is important to note that an infinitesimal automorphism is a derivation. This means that it must satisfy the Leibniz rule: applying it to a product xy, one must first apply it to the first factor while leaving the second unchanged, then to the second factor while leaving the first unchanged: $A(xy) = (Ax)y + x(Ay)$. This follows from the very definitions of automorphism and infinitesimal automorphism. If A is an infinitesimal automorphism, then $1 + tA$ for small t is already an automorphism (more precisely, $1 + tA$ plus something of higher order in t is an automorphism). Applying the definition of an automorphism, we immediately obtain the Leibniz rule.
Conversely, if A is a derivation, that is, if the Leibniz rule is satisfied, and if A is consistent with the involution, that is, the condition $(Ax)^* = A(x^*)$ is satisfied, then we can hope that A is an infinitesimal automorphism. In order to check whether this is true, we need to write the equation
$$\frac{dU}{dt} = AU,$$
where U(t) is an element of the algebra $\mathcal{A}$. If this equation has a solution with the initial condition $U(0) = 1$, then A is an infinitesimal automorphism. It can play the role of a "Hamiltonian" (as in textbook quantum mechanics, where any self-adjoint operator can play the role of a Hamiltonian). If the algebra is finite-dimensional, we can apply the existence theorem for solutions of differential equations; in this case, the notions of derivation and infinitesimal automorphism are equivalent. Because the algebras arising in physics are infinite-dimensional, in our situation not every derivation defines an infinitesimal automorphism: it is necessary for the equation $dU/dt = AU$ to have a solution.
It is easy to check that derivations form a Lie algebra. The same is true for derivations that agree with the involution. One can say that derivations consistent with the involution form the Lie algebra of the group of automorphisms $\mathrm{Aut}(\mathcal{A})$. For infinite-dimensional groups the notion of a Lie algebra is not very well defined; nevertheless, it is an important notion that works in many cases.
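In the finite-dimensional case, the passage from a derivation to a one-parameter group of automorphisms can be checked directly. A sketch (ours) for the inner derivation $D(x) = [h, x]$ on the algebra of 2×2 matrices, which exponentiates to the automorphisms $x \mapsto e^{th}\,x\,e^{-th}$:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
h = rng.normal(size=(2, 2))
x = rng.normal(size=(2, 2))
y = rng.normal(size=(2, 2))

def D(x):
    return h @ x - x @ h                   # the derivation D(x) = [h, x]

# The Leibniz rule D(xy) = D(x) y + x D(y):
print(np.allclose(D(x @ y), D(x) @ y + x @ D(y)))       # True

t = 0.3
def alpha(x):
    # the automorphism obtained by exponentiating the derivation (U(t) = exp(th))
    return expm(t * h) @ x @ expm(-t * h)

# alpha is multiplicative, i.e., an algebra automorphism:
print(np.allclose(alpha(x @ y), alpha(x) @ alpha(y)))   # True
```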
We may consider the case when the equation of motion does not depend on time; however, this is not necessary. The “Hamiltonian” may depend on time, in which case the equation of motion for the evolution operators has the following form:
$$\frac{dU}{dt} = H(t)\,U(t).$$
If the operator H(t) does not depend on t, then the evolution operators form a one-parameter group:
$$U(t + \tau) = U(t)\,U(\tau).$$
In textbook quantum mechanics, a density matrix K corresponds to the linear functional $\omega(A) = \operatorname{Tr} KA$ on the algebra of bounded operators; this functional satisfies the condition $\omega(A^*A) \geq 0$. It is easy to see that this condition follows from the positive definiteness of the operator K. The evolution of the density matrix is described by an equation in which the right-hand side is a commutator with a self-adjoint operator (up to a constant factor). This equation has the form $dK/dt = H(K)$, where $H(K) = [\hat H, K]/i$, $\hat H$ is a Hamiltonian as in textbook quantum mechanics, and H is a "Hamiltonian".
Here, we introduce the following notations: operators in Hilbert space are operators with a hat, and operators acting on density matrices are operators without a hat. According to Stone’s theorem, self-adjoint operators (not necessarily bounded) in Hilbert space correspond to one-parameter subgroups of the group of unitary operators. In Stone’s theorem, the subgroups are continuous in the strong sense. We do not explain what this is here, as we will not need it. If a self-adjoint operator is bounded, then the corresponding one-parameter subgroup is differentiable in the sense of norm convergence. In what follows, we do not pay attention to these subtleties.
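A quick sketch of this evolution equation (ours, with a hypothetical 2×2 Hamiltonian): the solution of $dK/dt = [\hat H, K]/i$ is $K(t) = e^{-i\hat Ht}\,K(0)\,e^{i\hat Ht}$, which preserves the trace.

```python
import numpy as np
from scipy.linalg import expm

H = np.array([[1.0, 0.5], [0.5, -1.0]])   # self-adjoint Hamiltonian (hat H)
K0 = np.array([[0.6, 0.2], [0.2, 0.4]])   # density matrix, Tr K0 = 1

def K(t):
    U = expm(-1j * H * t)
    return U @ K0 @ U.conj().T             # K(t) = exp(-iHt) K(0) exp(iHt)

# Check dK/dt = [H, K]/i at t = 0 by a symmetric finite difference:
eps = 1e-6
dK = (K(eps) - K(-eps)) / (2 * eps)
print(np.allclose(dK, (H @ K0 - K0 @ H) / 1j, atol=1e-5))  # True
print(np.trace(K(1.7)).real)                               # 1.0: the trace is preserved
```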
We want to say a few more words about the relation between the algebraic approach and the standard approach based on Hilbert spaces, and to explain why the algebraic approach is better. Suppose that we have an involution-preserving representation of the algebra $\mathcal{A}$ by operators in a Hilbert space $\mathcal{H}$; in other words, consider an involution-preserving homomorphism of the algebra $\mathcal{A}$ into an algebra of operators. Let us denote the operator corresponding to the element A of the algebra $\mathcal{A}$ by $\hat A$; then, each normalized vector $\Phi \in \mathcal{H}$ specifies a normalized state ω of the algebra $\mathcal{A}$ by the formula
$$\omega(A) = \langle \hat A \Phi, \Phi \rangle.$$
Moreover, each density matrix K specifies a state according to the formula $\omega(A) = \operatorname{Tr}(K\hat A)$. In other words, it is possible to obtain states from vectors in Hilbert space. A natural question arises as to whether all states can be obtained this way. The answer is positive: every state can be represented by a vector in a Hilbert space, and this is the reason why physicists are able to work in Hilbert space all the time.
This is inconvenient in many cases, as for the same algebra of observables it is necessary to consider different Hilbert spaces. For example, in statistical physics we consider equilibrium states. Each equilibrium state lies in its own Hilbert space. This is not always convenient.
One Hilbert space, as a rule, is sufficient in quantum field theory, because there we usually consider a Hilbert space in which the ground state lies. Its elements correspond to excitations of the ground state. In quantum field theory, we usually consider only excitations of the ground state. However, it is impossible to use only one Hilbert space in quantum electrodynamics.
Now, we are going to prove that every state of an algebra with involution is represented by a vector from a Hilbert space. We construct a pre-Hilbert space E for each algebra $\mathcal{A}$ and state ω (here it is convenient to work with pre-Hilbert spaces). We construct a representation $A \mapsto \hat A$ of the algebra by operators in the pre-Hilbert space E in such a way that some cyclic vector $\theta \in E$ corresponds to the state ω. This means that $\omega(A) = \langle \hat A\theta, \theta \rangle$. The fact that the vector is cyclic means that any other vector can be obtained from it using operators from the algebra (all vectors have the form $\hat A\theta$, where $A \in \mathcal{A}$).
The construction we are going to explain is unambiguous (up to equivalence), as will be seen from the proof. Let us assume that we already have such a representation. We can define a scalar product in $\mathcal{A}$ by the formula
$$\langle A, B \rangle = \omega(B^*A).$$
Knowing this scalar product in the algebra, we can calculate the scalar product of the vectors $\hat A\theta$ and $\hat B\theta$:
$$\langle \hat A\theta, \hat B\theta \rangle = \langle \hat B^* \hat A\theta, \theta \rangle = \langle \widehat{B^*A}\,\theta, \theta \rangle = \omega(B^*A).$$
Because the vector θ is cyclic, each vector of the space has the form $\hat A\theta$. Now, we can say that there is a mapping $\nu: \mathcal{A} \to E$ that transforms A into $\hat A\theta$, and this mapping is surjective. It follows that E is obtained from $\mathcal{A}$ by factorization: we must quotient by all vectors whose scalar product with every vector is zero (null vectors).
Now, we can answer the question of how to construct E from the algebra $\mathcal{A}$ and the state ω. First, we should take the algebra $\mathcal{A}$, introduce the scalar product $\omega(B^*A)$ in it, and factorize with respect to the null vectors. We obtain a pre-Hilbert space (the scalar product in $\mathcal{A}$ descends to a scalar product in the quotient space). We did two things here: first, we built a pre-Hilbert space, and second, we proved that our construction is essentially unique; nothing else can be done. We derived this from cyclicity. This reasoning (the Gelfand–Naimark–Segal or GNS construction) is the most important element of this lecture, and we will use it many times. Instead of the pre-Hilbert space, we can consider its completion, i.e., the Hilbert space $\bar E$ (in this case, the vector θ is cyclic in a weaker sense: vectors of the form $\hat A\theta$ are dense in $\bar E$).
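To make the GNS construction tangible, here is a sketch (ours, with a hypothetical state) for the algebra of 2×2 matrices with ω(A) = Tr KA: the algebra itself, with the scalar product ⟨A, B⟩ = ω(B*A), serves as the representation space, the representation is left multiplication, and the unit is the cyclic vector θ.

```python
import numpy as np

K = np.diag([0.75, 0.25])           # this state is faithful, so there are no null vectors

def omega(A):
    return np.trace(K @ A)

def inner(A, B):                    # <A, B> = omega(B* A)
    return omega(B.conj().T @ A)

theta = np.eye(2)                   # the cyclic vector: the unit of the algebra
A = np.array([[0.0, 1.0], [2.0, 0.0]])
B = np.array([[1.0, 0.0], [0.0, -1.0]])

# The representation hat(A) acts by left multiplication: hat(A) X = A X.
print(np.isclose(omega(A), inner(A @ theta, theta)))         # omega(A) = <A theta, theta>
print(np.isclose(inner(A @ theta, B @ theta),
                 omega(B.conj().T @ A)))                     # <A theta, B theta> = omega(B* A)
```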
To illustrate, let us take some stationary state (a state that does not change during evolution) and apply the GNS construction to it. Then, we obtain some Hilbert space and a cyclic vector in it which will be stationary (i.e., will not depend on time).
Assertion: if we start with a stationary state, then the evolution operators U(t) descend to the Hilbert space.
This is very easy to understand. In the GNS construction, we used $\omega(B^*A)$ as the scalar product. However, this scalar product is invariant with respect to the operators U(t), because ω is invariant (that is, it is not changed by the evolution operators). Because the scalar product is invariant, the operators U(t) descend to unitary operators $\hat U(t)$. The operators $\hat U(t)$ form a one-parameter group. It has a generator (infinitesimal automorphism) $\hat H$, and this is what in physics is called the Hamiltonian (actually, this is not precisely true, because in physics the Hamiltonian is assumed to be a self-adjoint operator; therefore, an imaginary unit is needed in the definition: $\hat U(t) = e^{i\hat H t}$).
We say that ω is a ground state if the spectrum of the operator $\hat H$ is non-negative. Note that the ground state has zero energy under this definition. This is consistent with the standard definition: if we apply the GNS construction to the algebra of bounded operators and the state corresponding to an eigenvector of the Hamiltonian with eigenvalue E, then the generator of the group $\hat U(t)$ obtained from the GNS construction is $\hat H - E$. In quantum field theory, we always say that we count the energy from the ground state and ignore the infinite contribution to this energy; the algebraic GNS construction does this automatically.
We have been considering the algebraic approach to quantum mechanics. This approach works perfectly well in classical mechanics as well. To show this, we will work in the Hamiltonian formalism and repeat the same reasoning. A pure state is described by generalized momenta $p = (p_1, \dots, p_n)$ and generalized coordinates $q = (q^1, \dots, q^n)$ representing points in a $2n$-dimensional space, which is called the phase space. This is a pure state; however, just as in quantum mechanics, mixed states can be considered.
A mixed state is a probability distribution on a phase space or a positive measure on a phase space (the measure of the whole space is assumed to be equal to 1). All these probability distributions form a convex set. Pure states are the extreme points of this set. A pure state is a probability distribution that is supported at exactly one point. It is described by a probability density, which is a delta function. Any function is a superposition of delta functions (in other words, any function can be represented as an integral of delta functions). This means that any probability distribution on a phase space corresponds to a probability distribution on pure states, and pure states can be identified with extreme points of the space of all states.
In classical mechanics, every state can be represented in a single way as a mixture of pure states. This distinguishes classical mechanics from quantum mechanics, where a state can be represented as a mixture of pure states in many different ways. Later, we will explain how quantum mechanics can be derived from classical mechanics by restricting the set of observables. It is natural from a physical point of view to assume that our devices can measure only a part of the observables. We show that in such a situation classical mechanics can lead to quantum mechanics (Section 10.2).
Next, we recall the well-known Hamilton equations of motion
$$\frac{dq}{dt} = \frac{\partial H}{\partial p}, \qquad \frac{dp}{dt} = -\frac{\partial H}{\partial q}$$
and the Liouville equation for the probability density, which is written in terms of Poisson brackets as
$$\frac{d}{dt}\,\rho(p, q, t) = \{H, \rho(p, q, t)\}.$$
The Poisson bracket in this case is defined by the formula
$$\{f, H\} = -\frac{\partial f}{\partial p}\frac{\partial H}{\partial q} + \frac{\partial f}{\partial q}\frac{\partial H}{\partial p}.$$
In order to determine the evolution operator U(t), it is necessary to solve the equation $\frac{d}{dt}U(t) = LU(t)$, where $L\rho = \{H, \rho\}$. This equation is equivalent to the Liouville equation. To verify this, we must check that for pure states it reduces to Hamilton's equations. In classical mechanics, just as in quantum mechanics, it is possible to follow how observables evolve instead of states. The observables are real functions f(p, q) on the phase space. It can easily be deduced from Hamilton's equations that the evolution of observables is governed by the equation
$$\frac{d}{dt}\,f(p(t), q(t)) = -\frac{\partial f}{\partial p}\frac{\partial H}{\partial q} + \frac{\partial f}{\partial q}\frac{\partial H}{\partial p}$$
or
$$\frac{d}{dt}\,f(p(t), q(t)) = \{f, H\}.$$
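A short numerical sketch (ours, for the oscillator Hamiltonian $H = \frac{1}{2}(p^2 + q^2)$): integrating Hamilton's equations one step and checking that an observable evolves according to $df/dt = \{f, H\}$.

```python
import numpy as np

def step(p, q, dt):
    # one leapfrog step for dq/dt = dH/dp = p, dp/dt = -dH/dq = -q
    p -= 0.5 * dt * q
    q += dt * p
    p -= 0.5 * dt * q
    return p, q

p, q, dt = 1.0, 0.0, 1e-3
f = lambda p, q: p * q                  # an observable on the phase space
f0 = f(p, q)
p1, q1 = step(p, q, dt)

# {f, H} = -(df/dp)(dH/dq) + (df/dq)(dH/dp) = -q*q + p*p at the initial point
poisson = -q * q + p * p
print((f(p1, q1) - f0) / dt, poisson)   # both are close to 1.0
```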
Now, we see that we are in the same situation as in quantum mechanics, or conversely, that quantum mechanics is in the same situation as classical mechanics. We have observables that can be multiplied; they form an algebra $\mathcal{A}$, and there is a notion of involution (involution being simply complex conjugation). Each state corresponds to a linear functional ω on the algebra of observables $\mathcal{A}$, namely, the integral of the function with respect to the probability distribution:
$$\omega(f) = \int f \, d\omega.$$
This functional satisfies the positivity condition ($\omega(f) \geq 0$ if $f \geq 0$). Functions of the form $A^*A$ are certainly non-negative (the square of a modulus); hence, the functional ω is a state in the sense of the algebraic approach. We see that classical mechanics enters into the algebraic approach to quantum mechanics as a small piece, with the difference that in classical mechanics the algebra of observables is commutative.

2. Lecture 2

2.1. Quantum Mechanics as a Deformation of Classical Mechanics: Weyl Algebra

Now, we will try to explain how a mathematician could derive quantum mechanics from classical mechanics. He knows that quantum mechanics reduces to classical mechanics in the limit of small Planck constant. This means that quantum mechanics is obtained as a deformation, a small modification, of classical mechanics. We should have a family of algebras $\mathcal{A}_\hbar$ that depends on the Planck constant ℏ. If the Planck constant is equal to zero, we should obtain classical mechanics, i.e., a commutative algebra with a product $A \cdot B$. Let us assume that all these algebras are defined on the same vector space, i.e., addition and multiplication by a number are independent of the Planck constant, while the multiplication $A \cdot_\hbar B$ of elements of the algebra $\mathcal{A}_\hbar$ depends on it. Now, consider the commutator in this algebra as a function of the Planck constant:
$$[A, B]_\hbar = A \cdot_\hbar B - B \cdot_\hbar A.$$
Our main requirement is that this commutator tend to zero as the Planck constant tends to zero: $\hbar \to 0$. Let us assume that the dependence on the Planck constant is smooth. This means that the commutator can be represented as an expression linear in ℏ plus something of higher order:
$$[A, B]_\hbar = i\hbar\,\{A, B\} + O(\hbar^2).$$
It is easy to prove that the linear part has the same properties as the Poisson bracket. This means that the operation {A, B} is a derivation with respect to both arguments (it satisfies the Leibniz rule):
$$\{A \cdot B, C\} = \{A, C\} \cdot B + A \cdot \{B, C\}$$
and, in addition, it satisfies the axioms of a Lie algebra. To prove this, we use the following properties of the commutator in an associative algebra:
$$[A, B] = -[B, A],$$
$$[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0,$$
$$[AB, C] = [A, C]\,B + A\,[B, C].$$
These equations must be satisfied for each value of the Planck constant ℏ. Let us expand all of them in powers of the Planck constant: in the second equality it is necessary to expand to second order, and in the others to first order. Equating the leading terms in ℏ, we obtain the desired properties.
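These properties are easy to verify numerically; a quick sketch (ours) with random matrices standing in for elements of an associative algebra:

```python
import numpy as np

rng = np.random.default_rng(3)
A, B, C = (rng.normal(size=(3, 3)) for _ in range(3))

comm = lambda X, Y: X @ Y - Y @ X

print(np.allclose(comm(A, B), -comm(B, A)))                          # antisymmetry
print(np.allclose(comm(A, comm(B, C)) + comm(B, comm(C, A))
                  + comm(C, comm(A, B)), 0))                         # Jacobi identity
print(np.allclose(comm(A @ B, C), comm(A, C) @ B + A @ comm(B, C)))  # Leibniz rule
```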
We have proven that in the limit $\hbar \to 0$ we obtain classical mechanics from quantum mechanics; in this limit, the commutator is related to the Poisson bracket.
Let us see whether it is possible to go the other way around. We have seen how quantum mechanics turns into classical mechanics; now, we ask whether we can obtain quantum mechanics from classical mechanics. To do this, we first describe all possible Poisson brackets in the case when the algebra $\mathcal{A}$ is an algebra of polynomial functions on some vector space with coordinates $(u^1, \dots, u^n)$. In order to calculate how the Poisson bracket works, it is only necessary to know the Poisson brackets of the coordinates. This is because we are dealing with polynomials and we have a property that allows calculation of the Poisson bracket of products. A polynomial is a linear combination of products of the coordinates; hence, the bracket of two polynomials can be calculated. The result of the calculation is as follows:
$$\{A, B\} = \frac{1}{2}\,\sigma^{kl}(u)\,\frac{\partial A}{\partial u^k}\,\frac{\partial B}{\partial u^l},$$
where $\sigma^{kl}(u)$ denotes the Poisson bracket of the coordinates $u^k, u^l$.
If σ is antisymmetric and independent of u, one can check that this expression satisfies the conditions imposed on the Poisson bracket. This is exactly the situation that arises when we deal with the standard Poisson bracket on phase space.
Now, we can ask how to deform the Poisson bracket to obtain a family of associative algebras. This problem is not easy, and it was solved only quite recently by Kontsevich [25]. In the case when the Poisson bracket of two coordinates does not depend on u, answering this question is much easier. We define the algebra $\mathcal{A}_\hbar$ as an associative algebra with generators $\hat u^k$ obeying the relation
$$\hat u^k \hat u^l - \hat u^l \hat u^k = i\hbar\,\sigma^{kl}.$$
Of course, this could also be done when σ depends on u, in which case we would not know whether we would obtain an associative algebra. If σ does not depend on u, then we obtain an associative algebra, which is called a Weyl algebra. If we start with polynomials, this is the only way to deform the Poisson bracket. We can introduce an involution on a Weyl algebra by assuming that the generators $\hat u^k$ are self-adjoint.
We have obtained commutation relations which, in a slightly different form, are well known from textbook quantum mechanics. To show this, we require that the matrix σ be non-degenerate. An antisymmetric matrix cannot literally be diagonalized, but it can be written in a suitable basis as a block-diagonal matrix consisting of two-dimensional blocks. If we take advantage of this, we can reduce the commutation relations in the Weyl algebra to the commutation relations
$$\hat p_k \hat p_l = \hat p_l \hat p_k, \qquad \hat q^k \hat q^l = \hat q^l \hat q^k, \qquad \hat p_k \hat q^l - \hat q^l \hat p_k = \frac{\hbar}{i}\,\delta_k^l.$$
These are commutation relations for coordinates and momenta in the standard exposition of quantum mechanics. They are called canonical commutation relations (CCR).
Instead of self-adjoint generators, one can take other generators that are not self-adjoint but are adjoint to each other and satisfy the relations $\hat a_k \hat a_l = \hat a_l \hat a_k$, $\hat a_k^* \hat a_l^* = \hat a_l^* \hat a_k^*$, $\hat a_k \hat a_l^* - \hat a_l^* \hat a_k = \hbar\,\delta_{kl}$. For example, one can take $\hat a_k = \frac{1}{\sqrt 2}(\hat q^k + i\hat p_k)$, $\hat a_k^* = \frac{1}{\sqrt 2}(\hat q^k - i\hat p_k)$.
This is how the creation and annihilation operators are denoted, though at this point they are only formal mathematical objects: we have introduced them formally and written commutation relations for them. These commutation relations are also called canonical commutation relations.
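As a sanity check (ours, setting ℏ = 1), the canonical commutation relations can be realized approximately by truncated oscillator matrices; the relation $[\hat a, \hat a^*] = 1$ then holds exactly except in the last diagonal entry, an artifact of the truncation.

```python
import numpy as np

N = 6
a = np.diag(np.sqrt(np.arange(1, N)), k=1)   # annihilation operator (truncated)
a_star = a.T                                 # creation operator, the adjoint of a

print(np.round(a @ a_star - a_star @ a, 10))
# The identity matrix except for the (N-1, N-1) entry: a truncation artifact.

# q = (a + a*)/sqrt(2) and p = (a - a*)/(i sqrt(2)) recover [p, q] = (1/i) delta
q = (a + a_star) / np.sqrt(2)
p = (a - a_star) / (1j * np.sqrt(2))
print(np.round((p @ q - q @ p)[:N - 1, :N - 1], 10))  # (1/i) times the identity
```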
Can we now say that the resulting algebra $\mathcal{A}_\hbar$ is a deformation of the commutative algebra? Formally, we cannot: when we defined the notion of deformation, we required that all the algebras be defined on the same space, as otherwise it would be difficult to consider them simultaneously. While our commutative algebra consisted of polynomials, the space of the new algebra is so far unspecified. It can be made to consist of polynomials very simply, however.
The algebra $\mathcal{A}_\hbar$ is generated by the elements $\hat q^k$ and $\hat p_k$; that is, its elements are sums of monomials composed of the generators $\hat q^k$ and $\hat p_k$. Because of the commutation relations, we can shift all the generators $\hat q^k$ to the left and the $\hat p_k$ to the right (or vice versa), and then remove the hats. We then obtain an ordinary polynomial; we can say that the element of our algebra is represented by a polynomial, which we call its q-p symbol. Now, the algebra is defined on the space of polynomials. This is not a very good representation because it is not consistent with the involution; nevertheless, it is very useful in many cases.
If we start with the generators $\hat a_k, \hat a_k^*$, we can use the same idea: shift the $\hat a_k^*$ to the left and the $\hat a_k$ to the right to obtain what is called the normal form of a Weyl algebra element. Removing the hats, we obtain a polynomial, which is called the Wick symbol. Physicists do not use the term "Wick symbol"; they use the words "normal form". The Wick symbol agrees with the involution (involution in the algebra corresponds to complex conjugation of polynomials).
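A tiny symbolic sketch of normal ordering (ours, one generator, ℏ = 1): repeatedly applying the relation $\hat a\,\hat a^* = \hat a^*\hat a + 1$ brings any word in the generators to its normal form.

```python
from collections import defaultdict

# 'A' stands for the creation operator a*, 'a' for the annihilation operator.
def normal_order(word, coeff=1, acc=None):
    # Returns {normally ordered word: coefficient} using a A -> A a + 1.
    if acc is None:
        acc = defaultdict(int)
    i = word.find('aA')
    if i < 0:
        acc[word] += coeff                         # all 'A' already left of all 'a'
        return acc
    normal_order(word[:i] + 'Aa' + word[i + 2:], coeff, acc)  # the swapped term
    normal_order(word[:i] + word[i + 2:], coeff, acc)         # the commutator term (+1)
    return acc

# Example: a a A A = A A a a + 4 A a + 2; removing the hats gives the Wick symbol.
print(dict(normal_order('aaAA')))   # {'AAaa': 1, 'Aa': 4, '': 2}
```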
We can consider a Weyl algebra with an infinite number of generators. Thus far, we have considered the parameter k to be discrete (although the number of the $u^k$ or $a_k$ with index k could be infinite); alternatively, we can consider this parameter to be continuous. For example, consider an algebra with generators $\hat a(k)$, $\hat a^*(l)$ and relations
$$\hat a(k)\,\hat a(l) = \hat a(l)\,\hat a(k), \qquad \hat a^*(k)\,\hat a^*(l) = \hat a^*(l)\,\hat a^*(k),$$
$$\hat a(k)\,\hat a^*(l) - \hat a^*(l)\,\hat a(k) = \delta(k, l).$$
In this case, instead of the Kronecker symbol we use its continuous counterpart, the δ-function. Because δ(k, l) is a generalized function, the generators $\hat a(k)$, $\hat a^*(l)$ must be treated as generalized functions as well. A generalized function is a function that only makes sense under the integral sign. Only the elements $\hat a(f) = \int f(k)\,\hat a(k)\,dk$ and $\hat a^*(g) = \int g(l)\,\hat a^*(l)\,dl$, representing formal integrals, are meaningful. These elements depend linearly on f and g, respectively, and satisfy the commutation relations $\hat a(f)\,\hat a(g) = \hat a(g)\,\hat a(f)$, $\hat a^*(f)\,\hat a^*(g) = \hat a^*(g)\,\hat a^*(f)$, $\hat a(f)\,\hat a^*(g) - \hat a^*(g)\,\hat a(f) = \langle f, \bar g \rangle$. In order for these relations to make sense, the scalar product $\langle f, \bar g \rangle$ must be defined. Because the scalar product depends antilinearly on its second argument, we take $\bar g$ in the last formula. For simplicity, we usually assume that the index k is discrete.
The transition to symbols is an operation that is closely related to the operation of quantization. What is quantization? Starting with a classical Hamiltonian, we want to obtain its quantum counterpart. If the Hamiltonian depends on the $u^k$, then simply replacing u by $\hat u$ creates the problem of what order to put these generators in. In the classical approach, the order is not important, as $u^1 u^2$ and $u^2 u^1$ are the same; however, in quantum mechanics, with a Weyl algebra, the result depends on the order. This is what is called the "ordering ambiguity", meaning that there is no unambiguous quantization procedure. It can be made unambiguous by choosing a notion of symbol; however, there are many symbols.
There are such cases where the quantum Hamiltonian has a natural definition. This is, for example, a standard situation in classical mechanics when the Hamiltonian is represented as a sum of kinetic and potential energies. The kinetic energy depends only on the momenta, and the potential energy depends only on coordinates; then, one can use hats and the construction will be absolutely unambiguous because quantum momenta commute with each other and quantum coordinates commute with each other.
There is a standard way to write the equation of motion in the Heisenberg picture: taking the classical equations of motion, in place of Poisson brackets one writes commutators:
$$\frac{d\hat u^k}{dt} = i\,[\hat H, \hat u^k].$$
In this formula, we take ℏ = 1. In what follows, the Planck constant is always taken equal to one unless otherwise stated.
The equation of motion is meaningful if the Hamiltonian $\hat H$ is an element of a Weyl algebra. We have already explained that the equation of motion must contain an operation that satisfies the Leibniz rule; such an operation is called a derivation. A commutator of the form $D_h(a) = [h, a]$ satisfies the Leibniz rule $[h, ab] = [h, a]b + a[h, b]$ in any algebra, so as long as $\hat H$ is an element of a Weyl algebra everything is fine. The only problem is that $\hat H$ is very often not an element of a Weyl algebra in the case of an infinite number of degrees of freedom. A typical example is a Hamiltonian of the form
$$\hat H = \sum_k \epsilon_k\, \hat a_k^* \hat a_k.$$
When the number of indices is infinite, this is an infinite sum, and the Hamiltonian does not belong to the Weyl algebra. Nevertheless, one can formally take the commutator of the Hamiltonian $\hat H$ with $\hat a_k$. We obtain the equations of motion, which we will encounter more than once:
$$\frac{d\hat a_k}{dt} = -i\epsilon_k\, \hat a_k, \qquad \frac{d\hat a_k^*}{dt} = i\epsilon_k\, \hat a_k^*.$$
Thus, in the case of an infinite number of degrees of freedom, the Hamiltonian is simply a formal expression of the form
$$\hat H = \sum \Gamma_{m,n}(k_1, \dots, k_m, l_1, \dots, l_n)\; \hat a_{k_1}^* \cdots \hat a_{k_m}^*\, \hat a_{l_1} \cdots \hat a_{l_n},$$
which by itself does not have the meaning of an operator or of an element of a Weyl algebra, but nevertheless makes sense under the commutator sign in the equations of motion. This does not always happen, but there are very simple conditions under which it does. When a commutator of $\hat a_k$ or $\hat a_k^*$ with a product is taken, one should commute it with each factor of that product. Due to the Kronecker symbol in the CCR, the commutator receives contributions only from coefficients having one of their indices equal to k. If every index appears in only a finite number of nonzero coefficients of the Hamiltonian, the equation of motion makes sense.

2.2. Quadratic Hamiltonians

Let us consider quadratic Hamiltonians of the form
$$H(u) = \frac{1}{2}\, H_{kl}\, u^k u^l.$$
The ordering plays no role here, as by changing the order we obtain an irrelevant constant (the Hamiltonian is used only under the commutator sign, where the constant disappears). Thus, the classical equations of motion and quantum equations of motion are exactly the same. Moreover, if we know how to solve the classical equations of motion, we immediately know how to solve the quantum equations of motion, as all equations of motion are linear and the difference between classical and quantum mechanics arises only when the operators are multiplied.
The same problem can be made even simpler, i.e., it is possible to simplify the Hamiltonian. If we assume that the Hamiltonian is positive definite and the matrix H k l is nondegenerate, we can represent the Hamiltonian as a sum of squares.
Further, we can simplify the matrix σ , preserving the representation of the Hamiltonian as a sum of squares. In order to preserve this property, only orthogonal transformations should be taken. Note that σ is an antisymmetric matrix. If an antisymmetric matrix is multiplied by the imaginary unit, we obtain a matrix that corresponds to a self-adjoint operator, i.e., it can be diagonalized. Though this diagonalization occurs in the complex domain, we can say that we have a complex conjugate eigenvector along with each eigenvector. We can consider two complex conjugate eigenvectors, take the real and imaginary parts, and obtain a representation of the matrix σ k l in block-diagonal form with two-dimensional blocks. These two-dimensional blocks will be antisymmetric matrices; thus, the Hamiltonian will take the form of a sum of Hamiltonians of the form
$$\hat H = \frac{1}{2}\left(\hat p^2 + \epsilon^2 \hat q^2\right),$$
where $[\hat p, \hat q] = \frac{1}{i}$. This is an extremely important simplification which can always be made for a positive quadratic Hamiltonian. We have discussed this in the case of a finite number of degrees of freedom. It is important to note that there exists a similar simplification in the case of an infinite number of degrees of freedom; the only difference is that when a self-adjoint operator is diagonalized, a continuous spectrum may appear in addition to a discrete spectrum. We will come back to this.
Let us consider the Hamiltonian $\hat H = \frac{1}{2}\sum_k\left(\hat p_k^2 + \epsilon_k^2\,(\hat q^k)^2\right)$, assuming that $\epsilon_k > 0$, and solve the corresponding equations of motion:
$$\frac{d\hat p_k}{dt} = -\epsilon_k^2\, \hat q^k, \qquad \frac{d\hat q^k}{dt} = \hat p_k.$$
These are the standard equations of motion of a harmonic oscillator, exactly as in classical mechanics. While they can be solved in dozens of ways, the simplest way is to introduce new variables:
$$\hat a_k = \frac{1}{\sqrt 2}\left(\sqrt{\epsilon_k}\,\hat q^k + \frac{i\hat p_k}{\sqrt{\epsilon_k}}\right), \qquad \hat a_k^* = \frac{1}{\sqrt 2}\left(\sqrt{\epsilon_k}\,\hat q^k - \frac{i\hat p_k}{\sqrt{\epsilon_k}}\right).$$
In these variables the equations of motion
$$\frac{d\hat a_k}{dt} = -i\epsilon_k\, \hat a_k, \qquad \frac{d\hat a_k^*}{dt} = i\epsilon_k\, \hat a_k^*,$$
correspond to the Hamiltonian
$$\hat H = \sum_k \epsilon_k\, \hat a_k^* \hat a_k.$$
They have a very simple solution:
$$\hat a_k(t) = e^{-it\epsilon_k}\, \hat a_k(0), \qquad \hat a_k^*(t) = e^{it\epsilon_k}\, \hat a_k^*(0).$$
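A numerical check of this solution (ours, ℏ = 1, truncated oscillator matrices): conjugation by $e^{i\hat Ht}$ indeed multiplies $\hat a$ by $e^{-it\epsilon}$.

```python
import numpy as np
from scipy.linalg import expm

N, eps, t = 12, 1.7, 0.4
a = np.diag(np.sqrt(np.arange(1, N)), k=1)       # truncated annihilation operator
H = eps * (a.T @ a)                              # H = eps a* a, diagonal in this basis

a_t = expm(1j * H * t) @ a @ expm(-1j * H * t)   # the Heisenberg-picture operator a(t)
print(np.allclose(a_t, np.exp(-1j * eps * t) * a))  # True
```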
In the case of an infinite number of degrees of freedom, by virtue of the spectral theorem one can assume that everything is diagonal; in this case, instead of the sum we obtain an integral, and the Hamiltonian has the following form:
$$\hat H = \int d\lambda\; \epsilon(\lambda)\, \hat a^*(\lambda)\, \hat a(\lambda).$$
In physics, λ usually consists of continuous and discrete indices, and the integral involves both integration and summation over the discrete index. In the special case when the theory is translation-invariant, we assume that the operators $\hat a^*(x)$ and $\hat a(x)$ depend on coordinates x, which can be shifted without changing the Hamiltonian. This means that the Hamiltonian has the form
$$\hat H = \int dx\, dy\; \epsilon(x - y)\, \hat a^*(x)\, \hat a(y),$$
where the coefficient depends only on the difference x y . There may be discrete indices in the expression in question, in which case we should sum over these indices.
It is possible to pass to the momentum representation (i.e., take the Fourier transform). Then, the Hamiltonian takes the form
$$\hat H = \int dk\; \epsilon(k)\, \hat a^*(k)\, \hat a(k).$$
We will consider translation-invariant Hamiltonians all the time, and this formula will be essential.

2.3. Stationary States

Now, let us briefly discuss stationary (time-independent) states. If the evolution operators are denoted by U(t), then a stationary state ω obeys $U(t)\,\omega = \omega$.
If we work in the formalism of density matrices K, using the fact that the equations of motion for the density matrix are written as a commutator with the Hamiltonian $\hat H$, then we can conclude that the density matrix represents a stationary state if it commutes with the Hamiltonian. In particular, if the density matrix is a function of the operator $\hat H$, then the state is stationary.
When speaking about a pure state represented by a vector Ψ in Hilbert space, a stationary state satisfies the condition $\hat H\Psi = E\Psi$, i.e., it is an eigenvector of the Hamiltonian. The Hamiltonian can be interpreted as the energy operator, and E is the energy level. While the vector Ψ changes with time, the state does not change:
$$\Psi(t) = \hat U(t)\,\Psi = e^{itE}\,\Psi.$$
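A small sketch of stationarity (ours, hypothetical 2×2 data, textbook sign convention $\hat U(t) = e^{-i\hat Ht}$): a density matrix that is a function of $\hat H$ commutes with $\hat H$ and does not evolve, while an eigenvector changes only by a phase.

```python
import numpy as np
from scipy.linalg import expm

H = np.array([[1.0, 0.3], [0.3, 2.0]])   # self-adjoint Hamiltonian
K = expm(-H)
K /= np.trace(K)                         # a Gibbs-like density matrix, a function of H

t = 0.9
U = expm(-1j * H * t)
print(np.allclose(U @ K @ U.conj().T, K))    # True: K commutes with H, so the state is stationary

E, V = np.linalg.eigh(H)
psi = V[:, 0]                                # an eigenvector of H with eigenvalue E[0]
print(np.allclose(U @ psi, np.exp(-1j * E[0] * t) * psi))  # True: only the phase changes
```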
In the algebraic approach, the Hamiltonian can be a formal expression; however, it is possible to apply the GNS construction to a stationary state ω and obtain a Hilbert space in which there are unitary operators describing the evolution (time shift). The generator of the time translation group has the meaning of the Hamiltonian (energy operator). Its eigenvalues can be interpreted as energy levels of excitations of the state ω. More precisely, such an interpretation is perfectly correct when ω itself is stationary and translation-invariant (i.e., invariant with respect to both spatial and time translations).
This can be illustrated as follows: a translation-invariant state can be represented as a horizontal line. Excitation must be perceived as a bump concentrated in a finite region on this horizontal line. While the energy of the translation-invariant state is infinite, the difference between the energy of the bump and the energy of the translation-invariant state can be finite.
An important and simple remark, which is explained in textbook quantum mechanics for a less general situation, is as follows. Consider a classical Hamiltonian H ( u ) which has a minimum at a non-degenerate critical point. This means that the quadratic part in the Taylor expansion is positive definite; there are no zero modes. The quantum Hamiltonian is quadratic in the first approximation; in appropriate coordinates, it will have the form:
$$\hat H = \int d\lambda\; \epsilon(\lambda)\, \hat a^*(\lambda)\, \hat a(\lambda) + \dots,$$
where the terms denoted by … start with terms cubic in $\hat a^*, \hat a$ (there can be no linear terms, as we are at a critical point). The higher-order terms in $\hat a^*, \hat a$ are also of higher order in the Planck constant. At least in the semi-classical approximation, we can neglect these terms; if we are not working in a semi-classical approximation, we can use perturbation theory with respect to the omitted terms.

2.4. Fock Space

Let us consider representations of the Weyl algebra W (or, what is the same thing, representations of the canonical commutation relations). Among these representations there is a particularly remarkable one, the simplest one, which is called the Fock representation; the space in which it acts is called the Fock space. The Fock representation is defined simply: there exists a cyclic vector $|0\rangle$ that is annihilated by all the operators $\hat a_k$:
$$\hat a_k\, |0\rangle = 0.$$
We denote by $\hat A$ the operator corresponding to the element $A \in W$. This condition unambiguously defines the representation (up to equivalence).
Let us describe the Fock representation more explicitly. The vector $|0\rangle$ is cyclic. The definition of cyclicity depends on whether we are in a pre-Hilbert or a Hilbert space. If we are in a pre-Hilbert space, we should find all vectors by applying algebra elements to the cyclic vector; in other words, the space is the smallest set containing the cyclic vector that is invariant with respect to all operators $\hat A$. In a Hilbert space, we should find all vectors by applying algebra elements to the cyclic vector and taking limits; in other words, the Hilbert space is the smallest closed set that is invariant with respect to the operators $\hat A$ and contains the cyclic vector.
Let us act on $|0\rangle$ by applying the creation operators $\hat a_k^*$:
$$(\hat a_{k_1}^*)^{n_1} \cdots (\hat a_{k_s}^*)^{n_s}\, |0\rangle. \qquad (2)$$
Here, creation operators are applied many times (a finite number of times, as only finite combinations of these operators exist in the Weyl algebra). We take all linear combinations of the vectors (2).
Can we obtain something new if we apply the operator $\hat a_k$ to these expressions? The answer is no, because it is possible to move the operator $\hat a_k$ to the right using the commutation relations until it acts on $|0\rangle$; then, the operator $\hat a_k$ disappears due to the condition $\hat a_k|0\rangle = 0$. Therefore, only expressions of the form (2) and their linear combinations belong to the Fock space if we use the definition of cyclicity appropriate for pre-Hilbert spaces.
Further, it should be noted that the states (2) are eigenvectors of any Hamiltonian of the form $\hat H = \sum_k \epsilon_k\, \hat a_k^* \hat a_k$, with eigenvalues equal to $\sum_k n_k \epsilon_k$.
We can check this easily by calculating the commutator of the Hamiltonian $\hat H = \sum_k \epsilon_k\, \hat a_k^* \hat a_k$ with the operator $\hat a_l^*$:
$$\hat H\, \hat a_l^* = \hat a_l^*\, \hat H + \epsilon_l\, \hat a_l^*.$$
To calculate the action of the Hamiltonian $\hat H$ on the vector (2), it is sufficient to use this relation to move the Hamiltonian to the right. A somewhat simpler formal argument is as follows: it is known how the operators $\hat a_k^*$ change with time (they are multiplied by a numerical factor); it follows that the vector (2) is multiplied by a numerical factor, and hence represents a stationary state.
We have obtained an orthogonal (though not orthonormal) basis of Fock space which consists of eigenvectors (2). To check the orthogonality, we can use the fact that the representation is compatible with involution.
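A numerical sketch of this statement (ours, two modes truncated to N levels): the vector $(\hat a_1^*)^2\,\hat a_2^*\,|0\rangle$ is an eigenvector of $\hat H = \epsilon_1\hat a_1^*\hat a_1 + \epsilon_2\hat a_2^*\hat a_2$ with eigenvalue $2\epsilon_1 + \epsilon_2$.

```python
import numpy as np

N = 5
a = np.diag(np.sqrt(np.arange(1, N)), k=1)
I = np.eye(N)
a1, a2 = np.kron(a, I), np.kron(I, a)         # annihilation operators of the two modes
eps1, eps2 = 1.0, 2.5
H = eps1 * a1.T @ a1 + eps2 * a2.T @ a2

vac = np.zeros(N * N)
vac[0] = 1.0                                  # the vector |0>, annihilated by a1 and a2
v = a1.T @ a1.T @ a2.T @ vac                  # (a1*)^2 (a2*) |0>
print(np.allclose(H @ v, (2 * eps1 + eps2) * v))  # True: eigenvalue n1*eps1 + n2*eps2
```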
All these formulas can be applied to the case of a multidimensional harmonic oscillator. Here, the operators $\hat a_k, \hat a_k^*$ are called operators of creation and annihilation of quanta. In a crystal, atoms somehow interact with each other; however, the crystal itself is in a stationary state that is close to the ground state, and at least in the first approximation the crystal is described by a quadratic Hamiltonian. For quanta in this situation, there is another name: phonons, or quanta of sound. In the general case, we are dealing with a system of non-interacting bosons. The operators $\hat a_k, \hat a_k^*$ are called particle creation and annihilation operators, and the numbers $n_k$ in the formula $\sum_k n_k \epsilon_k$ for the energy levels are called occupation numbers.
If we want the Fock space to be a Hilbert space, we have to take a completion.
There are advantages to both approaches. In pre-Hilbert space, the operators a ^ k , a ^ k * are defined everywhere. In Hilbert space, these are unbounded operators defined on a dense subset. However, certain important states do not belong to the pre-Hilbert space.
A pre-Hilbert Fock space can be represented as a space of polynomials. Indeed, we have the following formula for the basis:
$$(\hat a_{k_1}^*)^{n_1} \cdots (\hat a_{k_s}^*)^{n_s}\, |0\rangle.$$
To obtain a polynomial, we delete $|0\rangle$ and remove the hats; we then find a monomial in the variables $a_k^*$.
A linear combination of such monomials is a polynomial; that is, each element of the Fock space (which we consider to be a pre-Hilbert space) can be represented by a polynomial. It is easy to calculate that the scalar product in this representation by polynomials is provided by the formula
$$\langle F, G \rangle = \int da^*\, da\; F(a^*)\, G(a^*)^*\, e^{-a^*a}, \qquad (3)$$
where $F(a^*)$ and $G(a^*)$ are the polynomials corresponding to the vectors. Note that $G(a^*)$ appears with a star; the star applied to a twice gives a again, hence $G(a^*)^*$ is a polynomial in a.
Let us now check that the scalar product can be written in the form (3). This can be verified by calculation, but we can provide a simpler proof. The Fock space is uniquely defined by the existence of a cyclic vector which is annihilated by all annihilation operators, provided that the creation and annihilation operators satisfy the required commutation relations. Let us consider the space of polynomials in the $a_k^*$. We define the operator $\hat a_k^*$ in this space as multiplication by $a_k^*$ and the operator $\hat a_k$ as differentiation with respect to $a_k^*$. It is easy to see that multiplication and differentiation satisfy the necessary commutation relations (if we multiply by $a_k^*$, then differentiate and apply the Leibniz rule, we obtain exactly what we need). Thus, we have the commutation relations, and there is a cyclic vector, which equals 1. It remains to check that $\hat a_k$ and $\hat a_k^*$ are adjoint to each other (this is necessary for the involution to work correctly). Indeed, in Formula (3) one can integrate by parts to make sure that multiplication and differentiation are adjoint to each other with respect to this scalar product. As a result, all properties of the Fock representation are satisfied, and it is not necessary to compare the original scalar product with the new one, as they necessarily coincide (up to a numerical factor).
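A symbolic sketch of this representation (ours, one degree of freedom): with $\hat a^*$ acting as multiplication by $z = a^*$ and $\hat a$ as $d/dz$, the commutation relation holds on polynomials, and the Gaussian scalar product (3) makes the monomials orthogonal.

```python
import sympy as sp

z = sp.symbols('z')                        # z stands for a*
f = z**3 + 2 * z                           # an arbitrary polynomial vector

a_star = lambda F: z * F                   # multiplication by a*
a = lambda F: sp.diff(F, z)                # differentiation with respect to a*
print(sp.expand(a(a_star(f)) - a_star(a(f))))   # returns f: the relation [a, a*] = 1

# In polar coordinates z = r e^{it}, the measure e^{-|z|^2} da* da gives
# <z^m, z^n> proportional to n! delta_{mn}; check m = n = 3:
r, t = sp.symbols('r t', positive=True)
m, n = 3, 3
integrand = r * sp.exp(-r**2) * r**m * sp.exp(sp.I * m * t) * r**n * sp.exp(-sp.I * n * t)
print(sp.integrate(integrand, (r, 0, sp.oo), (t, 0, 2 * sp.pi)) / sp.pi)  # 6 = 3!
```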
Thus far, we have dealt with polynomials. As we still want to be able to work in Hilbert space, we should take a completion; it consists of holomorphic functions with respect to a k * . However, only holomorphic functions having a finite norm in our scalar product belong to this space.
It is important to note that all this reasoning applies to an infinite number of degrees of freedom as well. Although the integral is infinite-dimensional there, it turns out to be well-defined.
There is another way to describe the Fock space. Polynomials are related to symmetric functions of discrete arguments. We can always assume that the coefficients of a quadratic form are symmetric with respect to the indices; for a cubic polynomial, the coefficients depend on three indices, and again we can impose the symmetry condition (this has to be imposed if we want an unambiguous representation); for polynomials of higher degree, the situation is similar. Therefore, we can assume that every element of a pre-Hilbert Fock space has a unique representation of the following form:
$$\sum_n \sum_{k_1, \dots, k_n} f_n(k_1, \dots, k_n)\; \hat a_{k_1}^* \cdots \hat a_{k_n}^*\, |0\rangle,$$
where $n = 0, 1, 2, \dots$ and the coefficients $f_n$ are symmetric with respect to the indices. This means that the Fock space is represented as a space of sequences of symmetric functions depending on an increasing number of discrete variables:
$$f_0,\quad f_1(k),\quad f_2(k_1, k_2),\quad f_3(k_1, k_2, k_3),\quad \dots$$
When we are working with polynomials, only a finite number of the symmetric functions $f_n(k_1, \dots, k_n)$ can be nonzero. We can take a completion, in which case we have to consider infinite sequences of functions $f_0, f_1(k), \dots$ satisfying the condition that the norm of the sequence be finite. Usually, this sequence is written as a column (called a Fock column). This construction makes sense when k is a continuous parameter as well. In quantum mechanics textbooks, Fock space is usually defined as a space consisting of columns of symmetric functions.

2.5. Hamiltonians Preserving the Number of Particles

In quantum theory, the operator
$$\hat N = \sum_k \hat a_k^* \hat a_k = \int d\lambda\; \hat a^*(\lambda)\, \hat a(\lambda),$$
which is a sum over all k (or an integral if the index is continuous), plays an important role. This operator has the physical meaning of the number of particles, or the number of quanta when dealing with oscillators. If X is an eigenvector of $\hat N$ with some eigenvalue N, then the operators $\hat a_k^*$ acting on X increase the eigenvalue by 1, while the operators $\hat a_k$ decrease it by 1 (creation or annihilation of a particle). This means that an operator preserving the number of particles must contain creation and annihilation operators in equal numbers.
Let us consider quadratic Hamiltonians possessing this property.
The quadratic Hamiltonians preserving the number of particles are very important, as near the minimum energy state (ground state) they play the main role. Let us take a quadratic Hamiltonian which conserves the number of particles:
$$\hat H = \int dx\, dy\; A(x, y)\, \hat a^*(x)\, \hat a(y) = \langle a^*, A a \rangle.$$
It contains products of type a * a , while products of type a * a * , a a do not appear. If the operator A has only a discrete spectrum, we can write the Hamiltonian in the form
$$\hat H = \sum_k \epsilon_k\, \hat a_k^* \hat a_k,$$
where $\hat a_k = \int dx\, \phi_k(x)\, \hat a(x)$ corresponds to the eigenfunction $\phi_k(x)$ of the operator A.
Let us now take x and y to be vectors in Euclidean space and assume that A is written in the form $A = -\frac{1}{2m}\Delta + \hat U$, which appears in the Schrödinger equation. This is the operator obtained by quantization of the classical Hamiltonian of the form $\frac{p^2}{2m} + U(x)$. In this case, the Hamiltonian $\hat H$ describes a system of non-interacting nonrelativistic identical bosons. If we add nonquadratic terms conserving the number of particles, we obtain the Hamiltonian of a system of interacting nonrelativistic identical bosons.
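As a concrete illustration of where the one-particle levels $\epsilon_k$ come from, here is a small numerical sketch (assuming NumPy; the grid parameters and the potential are hypothetical choices, not from the text): we discretize $A = -\frac{1}{2m}\Delta + U$ in one dimension and diagonalize it:

```python
# A sketch (assuming NumPy): discretize A = -(1/2m) d^2/dx^2 + U(x) on a grid
# and diagonalize to obtain the one-particle levels eps_k entering
# H = sum_k eps_k a_k^* a_k.
import numpy as np

m, n, L = 1.0, 400, 20.0
x = np.linspace(-L / 2, L / 2, n)
dx = x[1] - x[0]

U = 0.5 * x**2                     # hypothetical potential (harmonic well)
lap = (np.diag(np.full(n - 1, 1.0), 1) + np.diag(np.full(n - 1, 1.0), -1)
       - 2.0 * np.eye(n)) / dx**2
A = -lap / (2.0 * m) + np.diag(U)

eps = np.linalg.eigvalsh(A)[:4]
print(eps)   # close to 0.5, 1.5, 2.5, 3.5 for this potential
```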
It should be pointed out here that the content of this lecture is in a certain sense an explanation of how a mathematician could have guessed quantum mechanics, but did not. The physicists guessed instead, of course. Look at the logic involved. The mathematician knows that the observables in classical theory are simply functions on phase space. He knows that quantum mechanics is a somewhat deformed classical mechanics: while everything obeys classical laws, in certain situations there must be corrections. On this basis, he says: let us deform the commutative algebra of functions; we will keep associativity, but introduce noncommutativity. The simplest deformation that we have is a Weyl algebra. We know that the deformation is governed by a Poisson bracket, and the Weyl algebra corresponds to the Poisson bracket with constant coefficients. The mathematician guesses that one should use a Weyl algebra, and after that realizes that in the Weyl algebra one should consider the simplest Hamiltonian, which describes the situation near the ground state. This is the quadratic Hamiltonian. The energy levels of this Hamiltonian are provided by the formula
$$\sum_k \epsilon_k\, n_k,$$
where $n_k = 0, 1, 2, \dots$ are occupation numbers. This formula describes the energy levels of a system of non-interacting identical particles. There is no mystery in the appearance of identical particles. From the mathematician's point of view, they must appear. While non-identical particles do not necessarily exist, identical ones always exist, because the simplest quadratic Hamiltonian already describes identical particles.
In the nonrelativistic case, the one-particle energy levels $\epsilon_k$ must be obtained by quantization of the classical one-particle Hamiltonian $\frac{p^2}{2m} + U(x)$. By adding the non-quadratic terms, we obtain the Hamiltonian of the system of interacting nonrelativistic identical bosons.

2.6. Representations of Weyl Algebra

We have studied Fock representations of Weyl algebras. Are there any other representations of this algebra? First of all, let us note that if we take a direct sum of two Fock representations, we obtain a new representation. Because the Fock representation, as can easily be seen, is irreducible (i.e., there is no other representation inside it), the main question is what the irreducible representations are.
This question is poorly formulated. In Hilbert space, operators a ^ k * , a ^ k are defined on a dense domain; however, this is not the whole Hilbert space. One can change the domain while leaving the operator the same. Thus, we can ask whether this is the same representation, or a different one. From a formal mathematical point of view, it is different. In fact, of course, it is the same. This problem always arises when dealing with unbounded operators. While it can be solved, it is better to deal with bounded operators acting in Hilbert spaces.
In a representation of Weyl algebra, one can consider the operators
$$V_\alpha = e^{i \alpha^k \hat u_k},$$
where the exponent is a linear combination of self-adjoint operators u ^ k satisfying the commutation relations
$$\hat u_k \hat u_l - \hat u_l \hat u_k = i\, \sigma_{k,l}.$$
When the coefficients $\alpha^k$ in the exponent are chosen to be real, the exponent is a self-adjoint operator multiplied by i. We obtain a unitary operator $V_\alpha$, and a unitary operator is bounded; thus, it can be extended to the whole Hilbert space. It is convenient to use these operators instead of the unbounded operators $\hat u_k$. The relation
$$V_\alpha V_\beta = e^{-\frac{i}{2}\, \alpha \sigma \beta}\; V_{\alpha + \beta}$$
is easy to check using the formula
$$e^X e^Y = e^{X+Y}\, e^{\frac{1}{2} C},$$
which is true if the commutator of operators X and Y is a number, denoted here by the letter C:
$$[X, Y] = C.$$
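As a quick numerical check (a sketch assuming NumPy and SciPy; the matrices are our illustrative choice, not from the text), one can verify this identity for 3×3 Heisenberg matrices, for which the commutator C is central and all higher commutators vanish:

```python
# Numerical check (assuming NumPy/SciPy) of e^X e^Y = e^{X+Y} e^{C/2}
# when C = [X, Y] commutes with X and Y (here C is central).
import numpy as np
from scipy.linalg import expm

X = np.array([[0., 1., 0.], [0., 0., 0.], [0., 0., 0.]])
Y = np.array([[0., 0., 0.], [0., 0., 1.], [0., 0., 0.]])
C = X @ Y - Y @ X                    # commutes with both X and Y

lhs = expm(X) @ expm(Y)
rhs = expm(X + Y) @ expm(0.5 * C)
print(np.allclose(lhs, rhs))         # True
```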
If we do not want to deal with unbounded operators, we can work with these unitary operators and with the exponential form of commutation relations (4). This is the exponential form of Weyl algebra. From the point of view of the physicist, it is the same algebra we dealt with previously; from the point of view of the mathematician, however, it is not quite the same.
There is a single irreducible representation of a Weyl algebra in the case of a finite number of degrees of freedom (finite number of generators). We provide a proof of this fact without using the exponential form of Weyl algebra, instead working with Hilbert spaces. Though it is not rigorous, from the point of view of a physicist it is acceptable.
The reasoning is as follows: consider the already-mentioned particle number operator $\hat N = \sum_k \hat a_k^* \hat a_k$. In the case of a finite number of generators, it is a well-defined operator. Let us take an eigenvector of this operator and start applying annihilation operators to it. A mathematician might ask how we know that such an eigenvector exists, but a physicist probably will not. If all annihilation operators give zero when acting on this eigenvector, we can say that it is exactly the Fock vacuum we need. If some annihilation operator does not give zero, then we repeatedly apply annihilation operators until we obtain a vector $|0\rangle$ such that all annihilation operators give zero when acting on it (i.e., $a_k |0\rangle = 0$).
Such a vector necessarily exists because the particle number operator is positive definite and the annihilation operators decrease the number of particles. After that, we take the subrepresentation containing the vector $|0\rangle$: we repeatedly apply the operators $\hat a_k^*$ to $|0\rangle$ and take linear combinations of the resulting expressions, obtaining a space that is invariant with respect to all creation and annihilation operators. Taking the closure of this space, we obtain a subrepresentation of our representation in a Hilbert space, which will be a Fock representation because it contains a cyclic vector that is annihilated by all $\hat a_k$. Thus, our representation contains a Fock representation; because it was assumed irreducible, it coincides with the Fock representation.
In the case of an infinite number of generators, this reasoning does not apply. We now explain how to construct an example of a representation that is not equivalent to the Fock representation. We construct new operators, which we denote by $\hat A_k$, in the Fock space. These are the annihilation operators shifted by numbers:
$$\hat A_k = \hat a_k - f_k, \qquad \hat A_k^* = \hat a_k^* - \bar f_k$$
(i.e., for each a ^ k we subtract a different number). The commutation relations that we need in Weyl algebra are satisfied. Thus, we again have a representation of canonical commutation relations, a representation of Weyl algebra.
Now, we will try to solve the equation $\hat A_k \Theta = 0$. Its solutions are eigenvectors of the operators $\hat a_k$ (with eigenvalues $f_k$). We will encounter these vectors many times; they are sometimes called Poisson vectors. It is easy to understand that the answer can be represented in the form
$$\Theta = e^{f \hat a^*}\, |0\rangle,$$
where $f \hat a^* = \sum_k f_k\, \hat a_k^*$, or as the function $e^{f a^*}$ if we represent elements of Fock space as functions of $a^*$.
We can calculate the norm of a Poisson vector and the scalar product of two Poisson vectors. The scalar product is provided by an integral; this integral is Gaussian and easy to calculate. The norm of Θ is finite if the sum $\sum_k |f_k|^2$ is finite. If the norm is finite, then the vector Θ belongs to the Fock space and the representation is equivalent to the Fock representation, while if the norm is infinite there is no vector in the Hilbert Fock space which is annihilated by all the operators $\hat A_k$. Hence, this representation is not equivalent to the Fock representation. Moreover, it cannot contain a subrepresentation equivalent to the Fock representation. In the case of a finite number of degrees of freedom, $\sum_k |f_k|^2$ is always finite, and this reasoning provides nothing.
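These statements are easy to illustrate numerically. The following sketch (assuming NumPy/SciPy; the truncation level and the number f are arbitrary choices) checks, for a single mode in a truncated Fock space, that $\Theta = e^{f \hat a^*}|0\rangle$ is an eigenvector of $\hat a$ and that its squared norm is $e^{|f|^2}$:

```python
# A sketch (assuming NumPy/SciPy): the Poisson (coherent) vector
# Theta = e^{f a^*}|0> satisfies a Theta = f Theta, and |Theta|^2 = e^{|f|^2}.
import numpy as np
from scipy.linalg import expm

N, f = 40, 0.7                                # truncation level, the number f
a = np.diag(np.sqrt(np.arange(1, N)), k=1)    # annihilation operator
adag = a.conj().T
vac = np.zeros(N); vac[0] = 1.0

theta = expm(f * adag) @ vac
print(np.allclose(a @ theta, f * theta, atol=1e-6))   # eigenvector of a
print(np.vdot(theta, theta), np.exp(abs(f)**2))       # both ~ 1.6323
```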
Now, we want to generalize this construction. We will consider the operators A ^ k defined slightly differently. Previously, we simply added f k ; here, we will take linear combinations as well:
$$\hat A_k = \Phi_{kl}\, \hat a_l + \Psi_{kl}\, \hat a_l^* + f_k,$$
$$\hat A_k^* = \bar\Psi_{kl}\, \hat a_l + \bar\Phi_{kl}\, \hat a_l^* + \bar f_k,$$
in such a way that the new operators A ^ k , A ^ k * satisfy the canonical commutation relations. This imposes conditions on the coefficients. The transition to A ^ k , A ^ k * is called a linear canonical transformation. The operators A ^ k , A ^ k * define a new representation of the Weyl algebra and, again, if the vector Θ which is the solution of the equation A ^ k Θ = 0 belongs to Fock space, then the new representation of CCR is equivalent to the Fock representation. If this condition is not met, then the new representation is not equivalent to the Fock representation (see [26] for details).
Linear canonical transformations are often useful; they and their analogs in the Fermionic case are sometimes called Bogoliubov transformations.

3. Lecture 3

3.1. Clifford Algebra and Grassmann Algebra

Weyl algebra has close relatives that are defined by the same formulas as Weyl algebra, only instead of commutators we consider anticommutators:
$$u_k u_l + u_l u_k = 2 h_{kl},$$
where h k l is an invertible matrix. A unital associative algebra with generators obeying these relations is called a Clifford algebra.
A unital associative algebra with generators obeying
$$u_k u_l + u_l u_k = 0$$
is called a Grassmann algebra.
The Grassmann algebra is not a special case of the Clifford algebra; here, we have zero on the right-hand side instead of an invertible matrix. The elements of the Grassmann algebra can be considered as polynomials in anticommuting variables.
Let us start with the Grassmann algebra with anticommuting generators ϵ 1 , , ϵ n :
$$\epsilon_i \epsilon_j = -\epsilon_j \epsilon_i.$$
It is denoted by Λ n . Every element of a Grassmann algebra can be represented as a sum of monomials with respect to ϵ i :
$$\omega = \alpha + \sum_i \alpha_i\, \epsilon_i + \sum_{i,j} \alpha_{ij}\, \epsilon_i \epsilon_j + \dots + \sum_{i_1, \dots, i_k} \alpha_{i_1, \dots, i_k}\, \epsilon_{i_1} \cdots \epsilon_{i_k} + \dots$$
This representation is not unique; however, we can obtain a unique representation by requiring antisymmetry of the coefficients $\alpha_{i_1, \dots, i_k}$. Another standard representation of ω is based on the remark that each element of a Grassmann algebra can be uniquely written as a sum of monomials in such a way that in each monomial the indices increase:
$$\omega = \alpha + \sum_i \alpha_i\, \epsilon_i + \sum_{i<j} \alpha_{ij}\, \epsilon_i \epsilon_j + \dots + \sum_{i_1 < \dots < i_k} \alpha_{i_1, \dots, i_k}\, \epsilon_{i_1} \cdots \epsilon_{i_k} + \dots + \alpha_{1, \dots, n}\, \epsilon_1 \cdots \epsilon_n.$$
This is obvious: we can use the anticommutation relations to move the smaller indices to the left, and there cannot be two matching indices, because taking i = j in these relations shows that the square of a generator is zero.
The Grassmann algebra, like the usual algebra of polynomials, is $\mathbb{Z}$-graded: $\Lambda_n = \bigoplus_k \Lambda_n^k$. This means that there is a notion of degree; the degree of each monomial is the number of generators in that monomial, and clearly the degrees add up when monomials are multiplied, just as for ordinary polynomials.
We have already said that there is a notion of degree ($\mathbb{Z}$-grading). From this $\mathbb{Z}$-grading, we can obtain a $\mathbb{Z}_2$-grading by saying that there are even and odd elements: $\Lambda_n^{\mathrm{even}} = \bigoplus_{k \geq 0} \Lambda_n^{2k}$ and $\Lambda_n^{\mathrm{odd}} = \bigoplus_{k \geq 0} \Lambda_n^{2k+1}$. The $\mathbb{Z}_2$-grading is more important because it governs multiplication. An even element commutes with everything. Two odd elements anticommute.
We can say that an element of a Grassmann algebra is a function of anticommuting variables; it is automatically a polynomial because there are only a finite number of monomials when we have a finite number of anticommuting variables. The analogy with functions is an important idea, as it suggests that there must be an analysis in Grassmann algebra, and indeed there is. One can define differentiation $\partial_i = \partial/\partial\epsilon_i$ with respect to a variable $\epsilon_i$. To differentiate, one has to delete that variable. If there is no variable $\epsilon_i$ in the monomial, then the derivative is zero. For anticommuting variables, we have the notions of left derivative and right derivative. For definiteness, we consider the left derivative; this means that before deleting the corresponding variable, it should be moved to the left:
$$\partial_i\, (\epsilon_i\, \epsilon_{i_1} \cdots \epsilon_{i_n}) = \epsilon_{i_1} \cdots \epsilon_{i_n},$$
$$\partial_i\, (\epsilon_{i_1} \cdots \epsilon_{i_n}) = 0 \quad \text{if } i \neq i_k \text{ for all } k.$$
The notion of the derivative is related to the Leibniz rule. Here, too, there is a modification of Leibniz rule,
$$\partial_i\, (\omega \rho) = (\partial_i\, \omega)\, \rho + (-1)^{\bar\omega}\, \omega\, (\partial_i\, \rho),$$
where ω , ρ Λ n and ω has parity ω ¯ , that is, ω is either even ( ω ¯ = 0 ) or odd ( ω ¯ = 1 ). This is called a graded Leibniz rule. If this rule is satisfied, we speak of odd derivation; if the regular Leibniz rule is satisfied, we speak of even derivation.
There is an additional notion of integration $\int : \Lambda_n \to \mathbb{C}$. The integral of any monomial of non-maximal degree gives zero, and the integral of a monomial of maximal degree gives plus or minus one:
$$\int \epsilon_{i_1} \cdots \epsilon_{i_k}\; d^n\epsilon = 0 \quad \text{if } k < n,$$
$$\int \epsilon_1 \cdots \epsilon_n\; d^n\epsilon = 1.$$
In our notation, plus one is obtained when the generators are ordered in ascending order. It is easy to understand that the integral of a derivative is always zero:
$$\int (\partial_i\, \omega)\; d^n\epsilon = 0.$$
This is because a derivative cannot contain a term of maximal degree. From this and the Leibniz rule, we can derive the rule of integration by parts:
$$\int (\partial_i\, \omega)\, \rho\; d^n\epsilon = -(-1)^{\bar\omega} \int \omega\, (\partial_i\, \rho)\; d^n\epsilon.$$
We would like to emphasize that when we defined the Grassmann algebra $\Lambda_n$, we fixed a system of generators $\epsilon_1, \dots, \epsilon_n$. Naturally, it is possible to take another system of generators (this is analogous to a change of variables), in which case everything changes, both the notion of differentiation and the notion of integration. For a change of variables, there is an analog of the chain rule and an analog of the Jacobian.
We want to consider only the special case where the change of variables is linear:
$$\tilde\epsilon_i = A_j^i\, \epsilon_j.$$
It is easy to check that one can obtain the integral with respect to the new variables from the integral with respect to the old variables by multiplying the latter by $(\det A)^{-1}$, where A stands for the matrix $A_j^i$. In conventional calculus, we would multiply by $\det A$.
The next observation is that, given a smooth function f of a real variable $x \in \mathbb{R}$, one can substitute an even element ω of the Grassmann algebra for x. To define $f(\omega)$, we represent ω in the form $\omega = a + \nu$, where a is a number and ν is a nilpotent element (i.e., $\nu^k = 0$ for some k). Take the Taylor series expansion of the function $f(a + \nu)$ with respect to the nilpotent part:
$$f(\omega) = f(a) + \frac{f'(a)}{1!}\, \nu + \dots + \frac{f^{(l)}(a)}{l!}\, \nu^l + \dots$$
and note that because of the nilpotency this Taylor series has a finite number of terms.
In particular, we can consider $e^\omega$. As an example, consider the exponent of a quadratic expression:
$$e^{\lambda_1 \epsilon_1 \epsilon_2 + \lambda_2 \epsilon_3 \epsilon_4 + \dots + \lambda_k \epsilon_{2k-1} \epsilon_{2k}} = e^{\lambda_1 \epsilon_1 \epsilon_2} \cdots e^{\lambda_k \epsilon_{2k-1} \epsilon_{2k}} = (1 + \lambda_1 \epsilon_1 \epsilon_2) \cdots (1 + \lambda_k \epsilon_{2k-1} \epsilon_{2k});$$
it follows that
$$\int e^{\lambda_1 \epsilon_1 \epsilon_2 + \dots + \lambda_k \epsilon_{2k-1} \epsilon_{2k}}\; d^{2k}\epsilon = \lambda_1 \cdots \lambda_k.$$
In the more general case where $\omega = \frac{1}{2} \sum a_{ij}\, \epsilon_i \epsilon_j$, it is possible to reduce the antisymmetric nonsingular matrix a to block-diagonal form by a change of variables. This allows us to calculate the Gaussian integral:
$$\int e^{\omega}\; d^n\epsilon = (\det a)^{1/2}.$$
The answer is almost the same as in the usual (bosonic) case, where we had $(\det a)^{-1/2}$ instead of $(\det a)^{1/2}$.
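These rules are easy to implement directly. The following self-contained toy implementation (a sketch; all the names are ours, not from the text) represents an element of $\Lambda_n$ as a dictionary mapping sorted tuples of generator indices to coefficients, and verifies the formula above for k = 2:

```python
# A toy Grassmann algebra: monomials are sorted index tuples; multiplication
# sorts the concatenated indices, flipping the sign on each transposition.
# We check that the Berezin integral of e^{l1 e1 e2 + l2 e3 e4} equals l1*l2.

def mul(x, y):
    out = {}
    for mx, cx in x.items():
        for my, cy in y.items():
            if set(mx) & set(my):
                continue                     # repeated generator squares to 0
            m, sign = list(mx + my), 1
            for i in range(len(m)):          # bubble sort; one sign per swap
                for j in range(len(m) - 1):
                    if m[j] > m[j + 1]:
                        m[j], m[j + 1] = m[j + 1], m[j]
                        sign = -sign
            out[tuple(m)] = out.get(tuple(m), 0) + sign * cx * cy
    return out

def exp_even(x, nmax):                       # Taylor series; x is nilpotent
    result, term = {(): 1.0}, {(): 1.0}
    for n in range(1, nmax + 1):
        term = {k: v / n for k, v in mul(term, x).items()}
        for k, v in term.items():
            result[k] = result.get(k, 0) + v
    return result

def berezin(x, n):                           # coefficient of e1 e2 ... en
    return x.get(tuple(range(1, n + 1)), 0)

omega = {(1, 2): 2.0, (3, 4): 3.0}           # l1 e1 e2 + l2 e3 e4
print(berezin(exp_even(omega, 4), 4))        # prints 6.0 = l1 * l2
```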
We can consider functions that depend both on commuting and anticommuting variables. This simply means that in the expression (5) for a general element of the Grassmann algebra, the coefficients can be considered as functions of the commuting variables. Either polynomial functions or smooth functions can be considered. By definition, functions of m commuting and n anticommuting variables are functions on the superspace $\mathbb{R}^{m,n}$.
One can say that polynomial or smooth functions of commuting variables $x_1, \dots, x_m$ and anticommuting variables $\epsilon_1, \dots, \epsilon_n$ are elements of the algebra $\mathbb{C}[x_1, \dots, x_m] \otimes \Lambda_n$ or of the algebra $C^\infty(\mathbb{R}^m) \otimes \Lambda_n$, respectively.
We can define supervarieties by assuming that commuting and anticommuting variables obey some equations. It is important that all summands in these equations are of the same parity. For example, we can impose the condition x 1 + ϵ 1 ϵ 2 = 0 . If the equations are polynomial, we obtain algebraic supervarieties. This is a generalization of the ordinary algebraic varieties specified by polynomial equations in ordinary linear space.
Now, we would like to discuss the notion of a point in a supervariety.
Let us start with algebraic varieties determined by polynomial equations with integer coefficients. In this case, we can think of the variables x i as elements of any commutative ring. This means that we can consider the same variety over different fields or over different rings. We define a notion of a point of an algebraic variety over some ring as a solution for defining equations with variables belonging to the ring. For example, if we consider, say, a circle of imaginary radius x 2 + y 2 + 1 = 0 , such an object has no real points, but has points over complex numbers.
When we consider supervarieties, we have a similar situation: a Λ-point of a supervariety is a solution of the defining equations with variables belonging to a Grassmann algebra Λ. It is necessary to substitute an even element of the Grassmann algebra for an even variable and an odd element for an odd variable. In other words, it is necessary to preserve parity when introducing Λ-points.
The notion of the Λ -point is very convenient. For example, it allows us to provide a very simple definition of the notion of a Lie superalgebra. A Lie superalgebra is a Z 2 -graded algebra with some generalization of the Jacobi rule. It is not necessary to memorize this rule; we just need to say that the Λ -points of a Lie superalgebra should constitute an ordinary Lie algebra.
We do not talk about this in detail here. What we have discussed is a small piece of what is called supermathematics.

3.2. Representations of Clifford Algebra

The following reasoning, as promised, exactly repeats the reasoning of the previous lecture, only instead of “Weyl” we say “Clifford” and instead of “commutator” we say “anti-commutator”.
According to the definition of Clifford algebra, the anticommutator of generators is a symmetric nonsingular matrix h k l = h l k :
$$\hat u_k \hat u_l + \hat u_l \hat u_k = 2 h_{kl}.$$
The same relations appear in the Dirac equation; gamma matrices are generators of a Clifford algebra.
Let us consider a Clifford algebra with involution, assuming that the generators a ^ k , a ^ k * are obtained from each other by involution and satisfy the following equations:
$$\hat a_k \hat a_l + \hat a_l \hat a_k = 0, \qquad \hat a_k^* \hat a_l^* + \hat a_l^* \hat a_k^* = 0, \qquad \hat a_k \hat a_l^* + \hat a_l^* \hat a_k = \delta_{kl},$$
which are called canonical anticommutation relations (CAR). They are obtained from the canonical commutation relations by replacing commutators with anticommutators. We note that the anticommutation relations presented here are exactly the anticommutation relations satisfied by differentiation and multiplication with respect to the generators of a Grassmann algebra.
As in the case of Weyl algebras, we can consider Clifford algebras with an infinite number of generators or with generators a ^ ( k ) , a ^ ( k ) * that depend on a continuous parameter and satisfy the relations:
$$\hat a(k)\, \hat a(l) = -\hat a(l)\, \hat a(k), \qquad \hat a^*(k)\, \hat a^*(l) = -\hat a^*(l)\, \hat a^*(k),$$
$$\hat a(k)\, \hat a^*(l) + \hat a^*(l)\, \hat a(k) = \delta(k, l).$$
In Clifford algebra with generators a ^ k , a ^ k * , we have the notion of a normal form. Just as in the case of Weyl algebra, the generators a ^ k * should be moved to the left and a ^ k to the right; however, there is a small peculiarity. Previously, starting with the normal form, we could define the Wick symbol simply by removing the hats and treating a k , a k * as complex variables. We cannot do that here, as we get zero; instead, we can define the Wick symbol by saying that by removing hats we obtain anticommuting variables. In other words, from an element of Clifford algebra we can obtain a polynomial of anticommuting variables.
Consider the formal Hamiltonian
$$\hat H = \sum \Gamma_{m,n}(k_1, \dots, k_m, l_1, \dots, l_n)\; \hat a_{k_1}^* \cdots \hat a_{k_m}^*\, \hat a_{l_1} \cdots \hat a_{l_n}.$$
The Hamiltonian must always be considered an even element. When the number of variables is infinite, the Hamiltonian is usually not an element of a Clifford algebra; however, commutators with generators make sense if the same conditions are imposed as for a Weyl algebra.
The definition of the Fock representation of the Clifford algebra exactly repeats the one for the Weyl algebra; we require the existence of a cyclic vector $|0\rangle$ for which the condition $\hat a_k |0\rangle = 0$ is satisfied. Due to the cyclicity condition, one can obtain a basis by applying the operators $\hat a_k^*$ to the vector $|0\rangle$. All elements of the pre-Hilbert Fock space will be linear combinations of monomials of the form $\prod_k (\hat a_k^*)^{n_k}\, |0\rangle$.
The only difference is that the numbers $n_k$ (occupation numbers) in these monomials can only be zero or one ($n_k = 0, 1$) because $(\hat a_k^*)^2 = 0$. This is what is called the Pauli principle (hence, we are dealing with fermions).
Again, the elements of this basis are eigenvectors of any Hamiltonian of the form
$$\sum_k \epsilon_k\, \hat a_k^* \hat a_k$$
and the eigenvalues are provided by exactly the same formula as in the bosonic case: $\sum_k n_k\, \epsilon_k$.
The operators a ^ k * are creation operators and a ^ k are annihilation operators for the same reasons as in the bosonic case. One can say that the Hamiltonian ϵ k a ^ k * a ^ k describes non-interacting fermions.
Recall that in the bosonic case, in order to obtain a representation of elements of the Fock space by polynomials we removed the hats in the monomials and obtained a polynomial of complex variables. Now, we want to do a similar thing, namely, to remove the hats and obtain a polynomial of anticommuting variables. The form of the scalar product is exactly the same, except that the integration is over anticommuting variables. As in the case of Weyl algebra, in this representation the operator a ^ k * acts as multiplication and a ^ k acts as differentiation; this provides the correct anticommutation relations.
The only thing left now is to check whether multiplication and differentiation are adjoint operators in the scalar product
$$\langle F, G \rangle = \int da^*\, da\; F(a^*)\, G(a^*)^*\, e^{-a^* a}.$$
This can be done by applying integration by parts.
If we consider polynomials alone, we obtain a pre-Hilbert Fock space, and can take a completion to obtain a Hilbert space. In the case of a finite number of degrees of freedom, it is not necessary to take the completion, as there are only polynomials; however, in the case of an infinite number of degrees of freedom, the completion is necessary if we want to work in a Hilbert space.
In Fock space, a vector can be represented as a sum of monomials with antisymmetric coefficients:
$$\sum_n \sum_{k_1, \dots, k_n} f_n(k_1, \dots, k_n)\; \hat a_{k_1}^* \cdots \hat a_{k_n}^*\, |0\rangle,$$
while for the Weyl algebra the coefficients are symmetric. If we are working in Hilbert space, these sums can be infinite. In other words, a point in fermionic Fock space can be considered as a sequence of antisymmetric functions, whereas in Fock representation of a Weyl algebra it is a sequence of symmetric functions. This is the standard representation from textbook quantum mechanics. This representation works when k is a continuous parameter.
We can again consider the operator
$$\hat N = \sum_k \hat a_k^* \hat a_k = \int d\lambda\; \hat a^*(\lambda)\, \hat a(\lambda)$$
(number of particles). Again, a ^ k * increases the number of particles by one, while a ^ k decreases it.
Let us now move to the consideration of operators which conserve the number of particles, as in nonrelativistic quantum mechanics. The simplest such Hamiltonian is the quadratic Hamiltonian
$$\hat H = \int dx\, dy\; A(x, y)\, \hat a^*(x)\, \hat a(y) = \langle a^*, A a \rangle.$$
If the operator A has a discrete spectrum, then by diagonalizing it we obtain the operator
$$\hat H = \sum_k \epsilon_k\, \hat a_k^* \hat a_k,$$
where ϵ k are eigenvalues, ϕ k ( x ) are eigenfunctions of the operator A, and
$$\hat a_k = \int dx\; \phi_k(x)\, \hat a(x).$$
In the last lecture, we said that by taking $A = -\frac{1}{2m}\Delta + \hat U$ we obtain a system of non-interacting nonrelativistic bosons. With the same operator, but now using the canonical anticommutation relations, we obtain a system of non-interacting nonrelativistic fermions. The only difference is that we have to assume that this operator acts on multicomponent functions of $x \in \mathbb{R}^3$. While it has exactly the same form, we should add summation over the discrete indices.
Which canonical relations should be taken depends on how the group of rotations of three-dimensional space acts on the wave functions. The action of this group on discrete indices determines the spin of the particle. The case of half-integer spin corresponds to fermions, in which case we must quantize using Clifford algebra, while in the case of integer spin we obtain bosons, and should use Weyl algebra. Put another way, if the representation is irreducible then everything is determined by the number of indices. If the number is odd, we are dealing with Weyl algebra, and if it is even with Clifford algebra.
In order to describe interacting particles in nonrelativistic quantum mechanics, it is necessary to add higher-order terms with an equal number of creation and annihilation operators; these terms preserve the number of particles.
In the case of a finite number of degrees of freedom, there exists a single irreducible representation of the Clifford algebra, which is isomorphic to the Fock representation. The proof of the irreducibility of the Fock representation and its uniqueness is the same as in the case of a Weyl algebra, with the difference being that in the case of Clifford algebra the proof is rigorous. In the previous case, it was not rigorous because the operators a ^ k and a ^ k * were unbounded operators; here, all these operators are bounded. This follows from the relation a ^ k a ^ k * + a ^ k * a ^ k = 1 , in which both summands are positive definite. Their sum is equal to one; hence, each of the operators is bounded.
If we have an infinite number of degrees of freedom, the number of operators a ^ k and a ^ k * is infinite. We can consider the canonical transformation by introducing new generators that satisfy the anticommutation relations. It is very easy to construct examples of canonical transformations. The anticommutativity conditions are symmetric with respect to a ^ k and a ^ k * . If we swap the creation and annihilation operators, the anticommutator does not change.
In Fock space, we take operators defined by the formula $\hat A_k = \hat a_k$ for $k \notin I$ and $\hat A_k = \hat a_k^*$ for $k \in I$. The operators $\hat A_k$, $\hat A_k^*$ obey the canonical anticommutation relations; hence, they define a representation of the Clifford algebra. For this representation to be a Fock representation, we must have a cyclic vector Θ which obeys $\hat A_k \Theta = 0$.
Finding a solution to the equation for Θ is very easy. We define Θ by acting on $|0\rangle$ with those operators $\hat a_k^*$ that were swapped ($k \in I$). When acting on Θ with any operator $\hat A_k = \hat a_k^*$ with $k \in I$, we get zero because the operator $\hat a_k^*$ appears twice with the same index. The operators $\hat A_k = \hat a_k$ with $k \notin I$ also give zero, as they can be moved through to $|0\rangle$. Thus, we have a monomial Θ which satisfies the relation $\hat A_k \Theta = 0$ for all k.
If we change only a finite number of operators, then Θ is a finite monomial, and belongs to Fock space. If, however, we change an infinite number of operators, we obtain something that does not belong to Fock space at all, namely, a monomial of infinite degree. The new representation is not equivalent to the Fock representation, as it does not have the cyclic vector that we need. Thus, we have examples of non-equivalent representations. This construction goes back to Dirac (the famous Dirac sea).
Let us consider linear canonical transformations of the form
$$\hat A_k = \Phi_{kl}\, \hat a_l + \Psi_{kl}\, \hat a_l^*,$$
$$\hat A_k^* = \bar\Psi_{kl}\, \hat a_l + \bar\Phi_{kl}\, \hat a_l^*.$$
Unlike the Weyl case, we cannot add numerical terms here, though everything else is the same. If we require that new operators satisfy canonical anticommutation relations, we obtain a new representation of the Clifford algebra (see [26]).
That is more or less all we wished to say about Clifford algebra.
In conclusion, we want to add that the group of automorphisms of a Clifford algebra is isomorphic to the orthogonal group. From this, we have what is called a spinor representation of the orthogonal group.

3.3. Statistical Physics

Turning to a new topic, let us recall some notions from statistical physics.
In both classical and quantum statistical mechanics there is the notion of an equilibrium state, and in both cases it is defined as a state of maximum entropy under given conditions. The expression “conditions” may have different meanings in different situations. If a state is represented by a density matrix, the entropy of the state is provided by the formula
$$S = -\operatorname{Tr} K \log K.$$
If the density matrix is diagonal, then the diagonal elements $p_i$ of the matrix are interpreted as probabilities, and this formula provides the usual expression for the entropy of a probability distribution:
$$S = -\sum_i p_i \log p_i.$$
If the Hamiltonian $\hat H$ is given, we can fix the average energy (the expectation value of the energy). It is possible to fix an admissible energy interval, in which case we obtain a microcanonical distribution; if we fix the average energy $E = \operatorname{Tr} \hat H K$, then we have a canonical distribution. Maximizing the entropy for a given average energy, we obtain the density matrix
$$\frac{e^{-\beta \hat H}}{Z}.$$
To calculate the constant Z, note that by definition the density matrix must have trace equal to 1; therefore, this constant must be equal to the trace of the operator $e^{-\beta \hat H}$:
$$Z = \operatorname{Tr} e^{-\beta \hat H}.$$
This expression is called the statistical sum or partition function.
If the spectrum of $\hat H$ is discrete with eigenvalues $E_i$, then
$$Z = \sum_i e^{-\beta E_i}.$$
The physical meaning of β is the inverse temperature: $\beta = \frac{1}{T}$. The expression for the statistical sum shows that for $\beta \to \infty$ or, which is the same thing, $T \to 0$, only the term with minimum energy (the term corresponding to the ground state) contributes to this expression (here, we assume that the ground state is non-degenerate). We can see that in the case of zero temperature the equilibrium state is a pure state (the ground state).
The derivation of Formula (6) is based on the method of Lagrange multipliers. We assume that the average energy $\operatorname{Tr} \hat H K = E$ is fixed; in addition, we know that the trace of a density matrix is equal to one: $\operatorname{Tr} K = 1$. Introducing the Lagrange multipliers β and ζ, we see that we should calculate the stationary points of the expression
$$L = -\operatorname{Tr} K \log K - \beta\, (\operatorname{Tr} \hat H K - E) - \zeta\, (\operatorname{Tr} K - 1).$$
These points obey the equation
$$-\log K - 1 - \beta \hat H - \zeta = 0.$$
To verify this we must use the formula
$$\delta \operatorname{Tr} \phi(K) = \operatorname{Tr} \phi'(K)\, \delta K$$
for the variation of the trace of a function of K.
It is sufficient to check this formula for the case ϕ ( K ) = K n . In the variation of the function K n , we have n terms due to noncommutativity; when we take the trace, all these terms become identical and we obtain an answer agreeing with (7).
Although the statistical sum Z itself has no direct physical meaning, many physical quantities can be expressed in terms of Z . In particular, we can calculate the average energy
$$E = \bar H = -\frac{\partial \log Z}{\partial \beta}$$
and entropy
$$S = \beta E + \log Z.$$
Free energy is defined by the formula $F = E - TS$, and is expressed in terms of the statistical sum as follows:
$$F = -T \log Z.$$
Free energy is convenient because instead of calculating the maximum entropy one can search for the minimum free energy. This immediately follows from the method of Lagrange multipliers.
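These relations are easy to check numerically for a system with finitely many levels. The following sketch (assuming NumPy; the spectrum is a hypothetical example) verifies $E = -\partial \log Z / \partial \beta$, $S = \beta E + \log Z$, and $F = -T \log Z = E - TS$:

```python
# A sketch (assuming NumPy): partition function, average energy, entropy,
# and free energy of a finite-level system, with the identities checked.
import numpy as np

E_levels = np.array([0.0, 1.0, 1.5, 3.0])    # hypothetical spectrum of H
beta, d = 2.0, 1e-6                          # inverse temperature, step size

Z = lambda b: np.sum(np.exp(-b * E_levels))
p = np.exp(-beta * E_levels) / Z(beta)       # Gibbs probabilities

E = np.sum(p * E_levels)                     # average energy
S = -np.sum(p * np.log(p))                   # entropy
F = -np.log(Z(beta)) / beta                  # free energy, T = 1/beta

print(np.isclose(E, -(np.log(Z(beta + d)) - np.log(Z(beta - d))) / (2 * d)))
print(np.isclose(S, beta * E + np.log(Z(beta))))
print(np.isclose(F, E - S / beta))           # all three print True
```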
As a rule, the physical quantities can be obtained as follows: we take some Hamiltonian, add something to it, and see what we get in the limit when the added term tends to zero. In particular, if the Hamiltonian H ^ changes a little, as follows:
$$\hat H(\lambda) = \hat H + \lambda A + \dots,$$
then the new value of the statistical sum is
$$Z(\lambda) = Z + (-\beta)\, \lambda \operatorname{Tr} A\, e^{-\beta \hat H} + \dots$$
The derivative of log Z with respect to λ at λ = 0 (and hence the derivative of free energy at this point) is controlled by the average value of the added term
$$\bar A = \frac{\operatorname{Tr} A\, e^{-\beta \hat H}}{Z}.$$
We will use the alternative notation $\langle A \rangle$ for this expression.
Namely,
$$\bar A = -T\, \frac{\partial \log Z}{\partial \lambda} = \frac{\partial F}{\partial \lambda}$$
with the derivatives calculated at the point λ = 0 .
Now let us move on to our main goal, correlation functions. A correlation function is simply the average of a product of some physical quantities $A_1 \cdots A_n$. We can assume that these physical quantities are time-dependent, i.e., that they are Heisenberg operators and satisfy the Heisenberg equations. Then, the expression $\langle A_1(t_1) \cdots A_n(t_n) \rangle$ is called a correlation function. We can consider correlation functions for any state, not necessarily an equilibrium state; however, if it is an equilibrium state, then we write the inverse temperature as an index:
$$\langle A_1(t_1) \cdots A_n(t_n) \rangle_\beta.$$
If the Hamiltonian depends linearly on a set of parameters $\lambda_1, \dots, \lambda_k$, we can calculate the correlation functions by differentiating the free energy F. For example, if $\hat H = \hat H_0 + \lambda_1 A_1 + \dots + \lambda_k A_k$, then we have
$$\frac{\partial^2 F}{\partial \lambda_i\, \partial \lambda_j} = \langle A_i A_j \rangle - \langle A_i \rangle \langle A_j \rangle$$
The derivatives are calculated at the point λ i = 0 . The RHS of (8) is called a truncated correlation function. Higher truncated correlation functions can be defined as higher derivatives of F . These will be important later.
We want to note that all statements about the statistical sum and related quantities are very often applicable in the case of a finite number of degrees of freedom, while in the case of an infinite number of degrees of freedom they usually are not applicable. In nonrelativistic quantum mechanics, they are applicable in the case of a finite volume. In an infinite volume, they do not work, as the statistical sum is not well defined and the maximum entropy is infinite. One should consider the statistical sum and correlation functions first in a finite volume, and after that take the limit of the correlation functions. It is not possible to work directly in an infinite volume.
We should note in passing that for an infinite volume one usually takes the limit not with a fixed number of particles, but with a fixed density of particles. In other words, when passing to the limit, the number of particles changes in proportion to the volume so that the density of particles remains constant.
Suppose now that the set of correlation functions in the infinite volume is obtained; what should be done with it? There is no Hilbert space in which the operators A entering the definition of the correlation functions are defined in infinite volume, but there are correlation functions. Usually, in such a case one can construct a Hilbert space from these correlation functions by applying some analog of the GNS construction. Physicists usually do this implicitly. They simply say, “We have an equilibrium state; it can be represented by a vector in a Hilbert space or a density matrix in a Hilbert space, and these operators A act there”. In fact, it is necessary to apply a construction which in axiomatic quantum field theory is called the “reconstruction theorem”. There, the role of the correlation functions is played by the Wightman functions.
Now let us turn to the question of how one can deal with equilibrium states in the algebraic approach in a situation when it is impossible to use the maximum entropy principle. Here, we can apply what is called the Kubo–Martin–Schwinger condition (KMS):
$$\langle A(t)\, B \rangle_\beta = \langle B\, A(t + i\beta) \rangle_\beta.$$
This is a condition on the correlation functions of the observables A and B in the equilibrium state. It is easy to derive in the case of a finite-dimensional Hilbert space, because in this case everything is well-defined. In such a case, there is a Heisenberg operator which involves $e^{iHt}$ in its definition, and there is a density matrix in which $e^{-\beta H}$ appears. We can assume that the time t in the expression for the evolution operator is a complex number; when it is purely imaginary with $\operatorname{Im}(t) > 0$, we obtain the equilibrium state density matrix (up to a constant factor). This is an important observation: in some sense, we obtain statistical physics from quantum dynamics in imaginary time. This is what is called “Wick rotation”. In the finite-dimensional case, we can consider any complex time; if the dimension is infinite, this is not true. However, if the correlation function $\langle B A(t) \rangle_\beta$ can be continued analytically into the strip $0 \leq \operatorname{Im}(t) \leq \beta$, the KMS condition makes sense.
The KMS condition does not use the notion of entropy; one only needs to know the correlation functions. It works in an infinite volume. One can consider the KMS condition as a definition of an equilibrium state. The equilibrium state as we defined it earlier is almost always unique; with the KMS definition, there is room for the equilibrium state to be non-unique. This non-uniqueness of the equilibrium state is related to the presence of phase transitions. The KMS condition is a replacement for the maximum entropy condition in the framework of algebraic quantum theory.
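In the finite-dimensional case, the KMS condition can be verified numerically. Here is a sketch (assuming NumPy/SciPy; the Hamiltonian and observables are random examples) in which the Heisenberg operator is continued to complex time:

```python
# A finite-dimensional check (assuming NumPy/SciPy) of the KMS condition
# <A(t) B>_beta = <B A(t + i*beta)>_beta, with A(t) = e^{iHt} A e^{-iHt}.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
n, beta, t = 4, 1.3, 0.7
H = rng.standard_normal((n, n)); H = (H + H.T) / 2    # Hermitian Hamiltonian
A = rng.standard_normal((n, n)); B = rng.standard_normal((n, n))

rho = expm(-beta * H); rho /= np.trace(rho)           # equilibrium state

def heis(X, z):   # Heisenberg operator at (possibly complex) time z
    return expm(1j * H * z) @ X @ expm(-1j * H * z)

lhs = np.trace(rho @ heis(A, t) @ B)
rhs = np.trace(rho @ B @ heis(A, t + 1j * beta))
print(np.allclose(lhs, rhs))                          # True
```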
Now let us look at some examples.
The simplest example is the quadratic Hamiltonian.
A positive definite quadratic Hamiltonian can be reduced to the form
$$\hat H = \sum_k \epsilon_k\, \hat a_k^* \hat a_k,$$
describing non-interacting bosons. From the formal point of view, non-interacting bosons are the same as a multi-dimensional harmonic oscillator, for which one can easily calculate the equilibrium density matrix, statistical sum, average energy, etc. The statistical sum is equal to a product of statistical sums for different values of the index k. For each k, one can sum the geometric progression and then take the product:
$$Z = \prod_k \frac{1}{1 - e^{-\beta \epsilon_k}}.$$
The average energy is calculated by the formula
$$E = \bar H = \sum_i \epsilon_i\, \bar n_i,$$
where $\bar n_i = (e^{\beta \epsilon_i} - 1)^{-1}$ are the average occupation numbers.
For the case of fermions, there is no significant difference:
$$Z = \prod_i (1 + e^{-\beta \epsilon_i}),$$
$$\bar H = \sum_i \epsilon_i\, \bar n_i,$$
where $\bar n_i = (e^{\beta \epsilon_i} + 1)^{-1}$. The main difference is that in the expression for the occupation numbers we have a plus rather than a minus, meaning that the average occupation numbers are always less than one.
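The following sketch (assuming NumPy; the parameters are arbitrary) compares these closed-form answers for a single mode with direct sums over occupation numbers, with the bosonic geometric series truncated at a large cutoff:

```python
# A sketch (assuming NumPy): Bose and Fermi statistical sums and average
# occupation numbers for a single mode of energy eps, checked against
# direct sums over occupation numbers.
import numpy as np

beta, eps, cutoff = 1.2, 0.8, 200
ns = np.arange(cutoff)
w = np.exp(-beta * eps * ns)                 # Boltzmann weights, n quanta

Z_bose = w.sum()                             # truncated geometric series
n_bose = (ns * w).sum() / Z_bose
print(np.isclose(Z_bose, 1 / (1 - np.exp(-beta * eps))))
print(np.isclose(n_bose, 1 / (np.exp(beta * eps) - 1)))

Z_fermi = 1 + np.exp(-beta * eps)            # n = 0, 1 only (Pauli principle)
n_fermi = np.exp(-beta * eps) / Z_fermi
print(np.isclose(n_fermi, 1 / (np.exp(beta * eps) + 1)))   # all True
```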

4. Lecture 4

4.1. Adiabatic Approximation: Decoherence

We begin this lecture by explaining what happens when the Hamiltonian H ^ ( t ) depends on time but changes slowly (adiabatically). We assume that all energy levels E n ( t ) of the Hamiltonian H ^ ( t ) where t is fixed are different and depend on t continuously and even smoothly. We denote the corresponding eigenvectors by ϕ n ( t ) . We assume that the time-dependent vector ϕ n ( t ) changes slowly and its derivative over t can be neglected. We will show that if we start with the eigenvector of the Hamiltonian H ^ ( t = 0 ) , then during the evolution controlled by the slowly changing Hamiltonian H ^ ( t ) it remains an eigenvector, but will be an eigenvector of another Hamiltonian (of the Hamiltonian H ^ ( t ) , where t is fixed). We verify this in the adiabatic approximation
$$\hat U(t)\, \phi_n(0) = e^{-i \alpha_n(t)}\, \phi_n(t),$$
where the phase factor $e^{-i \alpha_n(t)}$ is defined by the equation
$$\frac{d \alpha_n(t)}{dt} = E_n(t).$$
To check this, we differentiate (9) and neglect the derivative of ϕ n ( t ) . This reasoning is not completely accurate, as we assume that ϕ n ( t ) changes slowly over time, which is not obvious because the eigenvector is defined only up to a constant factor.
Let us carry out a more careful consideration. Let us assume that the Hamiltonian $\hat H(g)$ depends on one or many parameters, which we denote by g. Suppose that the eigenvectors $\phi_n(g)$ and the eigenvalues $E_n(g)$ depend smoothly on g. Let us assume that the parameter g depends on time in such a way that the derivative of g with respect to t can be neglected. The standard choice is as follows: we fix a function g(t) and construct a family $g_a(t) = g(at)$. The corresponding Hamiltonians $\hat H_a(t) = \hat H(g(at))$ and their eigenvectors $\phi_n(g(at))$ vary slowly for small a. Obviously, the derivative of these eigenvectors with respect to t vanishes in the limit $a \to 0$; this remark allows us to justify the above reasoning.
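The adiabatic picture is easy to test numerically. The following sketch (assuming NumPy/SciPy; the two-level family H(g) is a hypothetical example) evolves a state under $\hat H(g(at))$ for small a and checks that it stays on the instantaneous eigenvector up to a phase:

```python
# A sketch (assuming NumPy/SciPy): adiabatic evolution of a two-level system.
# For small a, the state follows the instantaneous eigenvector of H(g(at)).
import numpy as np
from scipy.integrate import solve_ivp

def H(g):   # a hypothetical smooth family of Hamiltonians
    return np.array([[1.0 + g, 0.3 * g], [0.3 * g, -1.0 - g]])

a = 0.01                                     # adiabatic slowness parameter
T = 1.0 / a

def rhs(t, psi):                             # Schrodinger equation
    return -1j * H(np.sin(a * t)) @ psi

psi0 = np.linalg.eigh(H(0.0))[1][:, 0]       # lowest eigenvector at t = 0
sol = solve_ivp(rhs, (0, T), psi0.astype(complex), rtol=1e-9, atol=1e-9)
psiT = sol.y[:, -1]

phiT = np.linalg.eigh(H(np.sin(a * T)))[1][:, 0]   # instantaneous eigenvector
print(abs(np.vdot(phiT, psiT)))              # close to 1: only a phase gained
```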
In order to generalize this reasoning to density matrices, we should note that the dependence of the density matrix K ( t ) on time is determined by the equation
$$\frac{dK}{dt} = H(t)\, K(t) = \frac{1}{i}\, [\hat H(t), K(t)],$$
where H ( t ) is a commutator with the Hamiltonian H ^ ( t ) (up to a constant factor). Now, we have to consider the eigenvectors ψ m n ( t ) of the operator H ( t ) . They can be expressed in terms of the eigenvectors of the Hamiltonian H ^ ( t ) , which we have denoted as ϕ n ( t ) :
$$\psi_{mn}(t)\, x = \langle x, \phi_n(t) \rangle\, \phi_m(t).$$
In the representation where the operator H ^ ( t ) is diagonal, these will be matrices having only one non-zero entry equal to 1. Exactly the same reasoning as above determines the evolution of these eigenvectors:
$$U(t)\, \psi_{mn}(0) = e^{-i \beta_{mn}(t)}\, \psi_{mn}(t), \qquad \frac{d \beta_{mn}(t)}{dt} = E_m(t) - E_n(t).$$
We differentiate with respect to t and neglect the derivative of the vector ψ m n ( t ) . A very important remark here is that when m = n , we can assume that the phase is zero: β m m = 0 .
Let us represent the same in a slightly different notation, namely, let us take the density matrix and write it as a sum over the eigenvectors with some coefficients k m n :
$$K = \sum_{m,n} k_{mn}\, \psi_{mn}.$$
Instead of considering the evolution of the eigenvectors ψ m n , we can consider the evolution of the coefficients k m n ( t ) . The formulas are the same: the coefficients k m n ( t ) receive phase factors
$$k_{mn}(t) = e^{-i \beta_{mn}(t)}\, k_{mn}, \qquad \beta_{mn}(t) = \int_0^t \big(E_m(\tau) - E_n(\tau)\big)\, d\tau.$$
If the adiabatic Hamiltonian H ^ ( t ) is such that at time T it returns to what it was at time zero ( H ^ ( T ) = H ^ ( 0 ) ), then the diagonal elements of matrix K do not change, while the non-diagonal elements do, as they are multiplied by a phase factor.
Let us now fix the Hamiltonian H ^ , describing a molecule or an atom or something bigger, i.e., any quantum system. Let us assume that the interaction with the environment changes the Hamiltonian; while the new Hamiltonian H ^ ( t ) may depend on time, we can assume that it changes slowly. We can imagine that a cosmic particle flies not very close to our molecule. The particle generates an electric field; this means that the Hamiltonian governing the molecule changes. If this particle flies far enough, we can assume that the change is adiabatic.
My favorite example is this: you are doing an experiment, and in the next room someone turns on a microwave. Your experiment is now affected by an adiabatic electric field. Another example: we know that we live in a world where there is microwave cosmic radiation, which generates an electromagnetic field. It is very small, but is there nonetheless.
While we do not know these adiabatic perturbations, we know that the diagonal elements of the density matrix are not affected by adiabatic perturbation, and the non-diagonal elements of the density matrix acquire phase factors which we, of course, do not know, because we do not know the perturbation.
What we have said above can be interpreted in a different way. One can consider linear combinations of the form $\alpha_0 \phi_0 + \alpha_1 \phi_1$ of two or more eigenvectors of the Hamiltonian $\hat H = \hat H(0)$. In the evolution of this state, phase factors will appear: $\alpha_k(t) = e^{-i E_k t}\, \alpha_k$. These phase factors always appear and are predictable; however, if an adiabatic perturbation is imposed, then the phase factors become unpredictable (the absolute values of the coefficients $\alpha_k(t)$ remain constant). Before, the two eigenvectors were changing coherently over time, while now this coherence has disappeared. This is what is called decoherence.
Now, we want to explain how one can obtain the standard recipe for probabilities in quantum theory from these very simple considerations. Let us assume that the same molecule interacts with the environment and there is a random adiabatic perturbation $\hat H(t)$ of the Hamiltonian $\hat H$. This means that there is a Hamiltonian that depends on some parameters $\lambda \in \Lambda$ and there is some probability distribution on Λ. Let us assume that the adiabatic perturbation acts in the time period from 0 to T. Then, as we said before, the entries $k_{mn}(\lambda, T)$ of the density matrix $K_\lambda(T)$ in the $\hat H$-representation receive phase factors $C_{mn}(\lambda, T)$. The phase factors $C_{mn}(\lambda, T)$ are equal to 1 for the diagonal entries and non-trivial for the other entries. Because the Hamiltonian is random, the density matrix should be averaged over the perturbation; that is, the phase factors for the non-diagonal entries should be averaged. It is quite clear that averaging these phase factors results in something less than 1 in absolute value. Under suitable conditions, it is easy to check that the average of the non-diagonal matrix entries is equal to zero.
The formal proof is as follows. We have already said that we can include our Hamiltonian in some family of Hamiltonians $\hat H(g)$, where g belongs to some parameter set denoted by Λ ($g \in \Lambda$). Let us assume that all these perturbations are such that $g(0) = 0$, $g(1) = 0$ and that the dependence of the Hamiltonian on time is defined by the formula $\hat H(g(t))$. We can define the adiabatic Hamiltonian as follows:
$$\hat H_\alpha(t) = \hat H(g(\alpha t)),$$
where $\alpha \to 0$. This will stretch time: if it varied from zero to one before, it now varies from zero to $T = \alpha^{-1}$. If we denote the eigenvalues of the Hamiltonian $\hat H(g)$ by $E_n(g)$, the values of the phase factors $e^{-i \beta_{mn}(t)}$ at $t = T$ will be determined by the following formula:
$$\beta_{mn} = \int_0^T d\tau\; \big(E_m(g(\alpha\tau)) - E_n(g(\alpha\tau))\big).$$
By substituting $\tau' = \alpha\tau$, we obtain
$$\beta_{mn} = \frac{1}{\alpha} \int_0^1 d\tau'\; \big(E_m(g(\tau')) - E_n(g(\tau'))\big).$$
Now, we can use the Riemann–Lebesgue lemma:
$$\int e^{ikx}\, \rho(x)\, dx \to 0$$
as $k \to \infty$, if $\rho(x)$ is absolutely integrable.
If the probability distribution on the set Λ is not too bad (i.e., it has a decent probability density), then the coefficients $k_{mn}$ of the averaged density matrix vanish for $m \neq n$. As a result, the density matrix K becomes diagonal in the $\hat H$-representation due to the interaction with the random adiabatic perturbation. If the initial density matrix corresponds to a pure state, this effect is known as the collapse of the wave function.
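The averaging argument is easy to illustrate numerically. In the sketch below (assuming NumPy; the initial density matrix and the distribution of the accumulated phases are hypothetical), the off-diagonal entries are multiplied by the average of random phase factors $e^{-i\beta/\alpha}$ and are suppressed, while the diagonal entries survive:

```python
# A sketch (assuming NumPy) of decoherence: averaging random adiabatic phase
# factors over the perturbation kills the off-diagonal density-matrix entries.
import numpy as np

rng = np.random.default_rng(1)
alpha = 1e-3                                 # adiabatic parameter, T = 1/alpha
betas = rng.standard_normal(100000)          # random accumulated phases

K = np.array([[0.6, 0.3], [0.3, 0.4]], dtype=complex)  # initial density matrix
phase = np.exp(-1j * betas / alpha).mean()   # averaged phase factor

K_avg = K.copy()
K_avg[0, 1] *= phase
K_avg[1, 0] *= np.conj(phase)
print(abs(K_avg[0, 1]), K_avg[0, 0].real, K_avg[1, 1].real)
# off-diagonal entry ~ 1e-3 (Monte Carlo noise); diagonal entries unchanged
```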
Let us now denote the diagonal matrix elements by the letters p n . In the usual approach, p n are the probabilities of different states. In our case, the averaged density matrix is a mixture of pure states with probabilities p n . This is the usual formula for the probabilities of different pure stationary states (i.e., of eigenstates of the Hamiltonian H ^ ) in a given mixed state. In particular, if we started with a pure state we obtain the standard formulas of the theory of measurements for the probabilities of different energy levels.
Note that decoherence and the collapse of the wave function are usually derived from interaction with a macroscopic classical system. Here, the same statements are derived from the random adiabatic interaction. The Planck constant and classical systems are not used in the proof at all.

4.2. Geometric Approach to Quantum Theory: Decoherence

Thus far, we have considered ordinary quantum mechanics; now, we will consider the geometric approach to quantum theory. We have already said that there is an algebraic approach, where the starting point is an algebra of observables, i.e., an associative algebra with involution. Self-adjoint elements of the algebra are physical observables. States are defined as positive linear functionals on this algebra, that is, the notion of state is secondary. Let us assume that the notion of state is primary. We start with some set of states. Let us ask ourselves what should we require of it.
The first thing we need is the notion of a mixture of states (not a mixed state, but a mixture of states), such that any states can be mixed with some weights (probabilities). This is an absolutely necessary requirement, and is not specific to physics. We can, for example, consider mixed strategies in economics or in game theory. In order to be mixable, the set must be convex. Let us assume that the convex set is a subset of some vector space; then, the notion of mixing is defined in an obvious way: if $x_i$ are points of the convex set and $p_i$ are non-negative numbers whose sum is equal to 1, then the mixture of these points is provided by the formula $\bar x = \sum_i p_i x_i$, where the numbers $p_i$ are treated as weights or probabilities.
What else is required? If we mix a finite number of states, then nothing else is needed. If we want to mix an infinite number of states, however, then we need the notion of a limit. In other words, we need a topology in the vector space L in which the convex set C 0 lies. This must be a topological vector space. For simplicity, we assume that it is a Banach space (i.e., a complete normed space). We assume that C 0 is a closed convex subset of L ; then, we can mix any number of states.
In addition, we require that the set of states be bounded. This is essentially the only requirement necessary for the development of the theory.
We need the notion of an evolution operator σ ( t ) , which transforms a state at time zero to a state at some time t; after all, what is the most important thing not just in physics, but in science? We wish to be able to predict the future. The evolution operator should map the set of states into itself. The time t can be negative; hence, the evolution operator should be an invertible transformation of the set of states.
Let us assume that the evolution operator is a linear transformation. More precisely, we assume that it can be extended as a linear operator to a vector space L containing the set C 0 . Let us introduce the notion of a group of automorphisms of the set of states C 0 . This is a group of invertible linear operators in the ambient space L which map the set C 0 onto itself. It is natural to assume that the evolution operator will be an automorphism. Sometimes, we require that evolution operators belong to some subgroup V of the group of automorphisms. This is necessary, for example, in classical mechanics.
Next, we should write the equation of motion. Usually, we think that we know the change in the system over an infinitesimal time. This is called the equation of motion. In the most general form, the equation of motion can be written as follows:
$$\frac{d\sigma}{dt} = H(t)\, \sigma(t),$$
where H ( t ) is a linear operator.
Formally, we can say that H(t) is determined from this equation; in physics, we usually assume that the operator H(t) is known and that we need to find the evolution operator. Let us call the operator H(t) the “Hamiltonian” (in quotation marks). If the “Hamiltonian” does not depend on time, the evolution operator is an exponent of the “Hamiltonian” ($\sigma(t) = \exp(Ht)$). Everything is just as it is in ordinary quantum mechanics; we simply did not write the imaginary unit (for a time-independent operator H in a Banach space, the exponent can be defined as a solution of Equation (10)).
From Equation (10), it follows that the “Hamiltonian” belongs to the Lie algebra of the group V (it is the tangent vector of the group V in the unit element of the group); the group V is, generally speaking, infinite-dimensional, meaning that the notion of a tangent vector depends on the choice of topology in the group, though we do not pay attention to these subtleties here. It is reasonable to require that Equation (10) has a solution for a time-independent “Hamiltonian”.
Now, we have the notion of states and have the equations of motion; we additionally need the notion of an observable. Before, we were moving from observables to states; now, we want to move from states to observables. What is an observable? First of all, we must have an operator A satisfying the same conditions as on the “Hamiltonian”. In other words, the exponent σ A ( t ) = exp ( A t ) , which can be treated as a one-parameter subgroup in V , should be well defined. In addition, we need a functional a which is invariant with respect to the operators σ A ( t ) = exp ( A t ) . This condition is equivalent to the condition a ( A x ) = 0 . The function a determines the expected value of the observable.
In particular, in the case of ordinary quantum mechanics, $\mathcal{V}$ is a group of unitary operators. It acts on density matrices by the formula $U(K) = \hat U K \hat U^{-1}$, where $\hat U$ is a unitary operator. If $\hat A$ is a self-adjoint operator (not necessarily bounded), then the exponent $e^{i \hat A t}$ can be treated as a one-parameter subgroup of the group $\mathcal{V}$. In this way, we obtain all one-parameter subgroups continuous in the strong topology (Stone's theorem). This means that we can identify self-adjoint operators with elements of the Lie algebra of the group of unitary operators. There are small difficulties here due to the fact that the commutator of unbounded self-adjoint operators is not always well-defined. These difficulties always arise and have no relation to the geometric approach. To overcome these problems, we assume that in a topological Lie algebra the commutator is defined only on a dense subset.
The “Hamiltonian” is expressed in terms of the self-adjoint operator H ^ by the formula
$$H(K) = \frac{1}{i}\, [\hat H, K],$$
where K is the density matrix; we assume that $\hbar = 1$.
Similarly, for any self-adjoint operator A ^ we define an operator on density matrices by the formula
$$A(K) = \frac{1}{i}\, [\hat A, K].$$
An observable is a pair $(\hat A, a)$, where $\hat A$ is a self-adjoint operator acting on the density matrices by Formula (11) and a is the functional on the density matrices provided by the formula $a(K) = \operatorname{Tr}(\hat A K)$.
In the algebraic approach, the group V consists of automorphisms of the associative algebra A with involution (*-algebra); in particular, a self-adjoint element A of the algebra A defines an infinitesimal automorphism (a Lie algebra element of the group V ) by the formula α A ( X ) = i [ A , X ] . Recall that a linear functional ω on algebra A is called positive if ω ( X * X ) 0 for all X A ; we denote the set of all positive functionals by C .
The set of states C 0 consists of normalized positive functionals (positive functionals satisfying the condition ω ( 1 ) = 1 ). This is a bounded convex set, and the group V naturally acts on it. An observable is a pair ( A , a ) , where A is a self-adjoint element of the algebra A treated as an infinitesimal automorphism and a is a linear functional on C 0 mapping the state ω to the number a ( ω ) = ω ( A ) .
In the geometric approach, we also have decoherence. The proof of this is pretty much the same as before. We start with a time-independent “Hamiltonian” denoted by the letter H. Then, the evolution operator is an exponent σ ( t ) = e t H . We assume that the operator H is diagonalizable, that is, there exists a basis ( ψ j ) consisting of eigenvectors of the operator H:
$$H\, \psi_j = \epsilon_j\, \psi_j.$$
We have assumed that the evolution operator maps the set of states into itself and that the set of states is bounded. Under these conditions, the operators σ ( t ) are uniformly bounded (the norms of these operators are bounded by the same number).
The eigenvalues of the operator H are purely imaginary, as a function of the form $e^{\epsilon t}$ is bounded only if ϵ is purely imaginary. If there were a Jordan cell of size greater than one, then the exponential $\sigma(t) = e^{tH}$ would not be bounded; hence, such Jordan cells cannot appear in H.
In the finite-dimensional case, these statements imply that H is diagonalizable. In the infinite-dimensional case, they are not sufficient for diagonalizability; nevertheless, one should expect that the operator H is diagonalizable. This is what we assume here.
Next, we repeat our reasoning. We include our “Hamiltonian” in a family H(g) of “Hamiltonians” with eigenvalues $(\epsilon_j(g))$ and eigenvectors $(\psi_j(g))$ that smoothly depend on g:
$$H(g)\,\psi_j(g) = \epsilon_j(g)\,\psi_j(g).$$
We assume that these eigenvectors constitute a basis that coincides with the basis $(\psi_j)$ of eigenvectors of $H = H(0)$ for $g = 0$.
Further, we say that $\psi_j$ is a robust zero mode if $\epsilon_j(g) \equiv 0$, that is, if the eigenvalue is zero for every g.
Now, we want to do the same thing as before; we assume that the interaction with the environment is determined by a random “Hamiltonian” H(g(t)), where g depends on t, and assume that these random “Hamiltonians” are adiabatic (they change slowly enough that the derivative of g with respect to t can be neglected). Then, the vector $\sigma(t)\psi_j$ is an eigenvector of the “Hamiltonian” H(g(t)) where t is fixed:
$$\sigma(t)\,\psi_j = e^{\rho_j(t)}\,\psi_j(g(t)),$$
where $\frac{d\rho_j}{dt} = \epsilon_j(g(t))$. To prove this, we differentiate the right-hand side of this expression with respect to t, applying the equations of motion and neglecting the derivative $\dot{g}(t)$. We do not write the imaginary unit in these formulas, but $\epsilon_j(g)$ itself is purely imaginary.
The phase factor does not appear for the robust zero modes. If an adiabatic perturbation acts for some finite time, then at the end the robust zero modes are unchanged, while all other modes acquire phase factors.
Everything is very similar to ordinary quantum mechanics. For a non-robust zero mode, taking the average of the phase factors with respect to the random perturbation we obtain zero, while for the robust zero modes nothing changes.
In ordinary quantum mechanics, the robust zero modes are the diagonal elements of the density matrix in the $\hat{H}$-representation. They are zero modes because diagonal matrices commute with each other, and it is easy to check that they are robust.
In the general situation, an arbitrary state $x \in \mathcal{C}_0$ can be represented as a linear combination of eigenvectors of the operator H; the interaction with the environment kills all modes except the robust zero modes. We denote by P the operator killing all but the robust zero modes; we can say that all observable physics lies in its image (i.e., in the projection onto the robust zero modes). Next, we must represent the state $Px \in \mathcal{C}_0$ as a mixture of pure robust zero modes, $Px = \sum_i p_iu_i$; note that this representation is not necessarily unique. The coefficients $p_i$ in this expansion can be interpreted as probabilities. In ordinary quantum mechanics, the usual probabilities are obtained this way. In the general case, the coefficient $p_i$ is the probability of the pure robust zero mode $u_i$ in the state x.
The Hamiltonian H corresponds to the physical quantity (H, h), where the functional h must be identified with the energy. The coefficient $p_i$ should be interpreted as the probability of finding the energy $h(u_i)$ in the state x. It is assumed that all the numbers $h(u_i)$ are different; if some of them coincide, then to find the probability of the energy being h we must sum all the coefficients $p_i$ for which $h(u_i) = h$.
If all zero modes are robust, we can write a simple formula for this operator, namely, $P = P'$, where $P'$ is the operator that kills all nonzero modes:
$$P' = \lim_{T\to\infty}\frac{1}{T}\int_0^T \sigma(t)\,dt.$$
If we decompose σ(t) with respect to the eigenvectors in this integral, then non-trivial phase factors appear for the nonzero modes; averaging them over time, we obtain zero.
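This time-averaging argument is easy to see numerically. A minimal sketch (an illustration of ours; the generator is taken diagonal purely for transparency, and the frequencies are arbitrary):

```python
import numpy as np

# Generator with purely imaginary eigenvalues, H = i * diag(omega);
# the zero entries of omega are the zero modes.
omega = np.array([0.0, 0.0, 1.3, -2.7])

def sigma(t):
    return np.diag(np.exp(1j * omega * t))   # sigma(t) = exp(tH)

# P' = lim_{T -> inf} (1/T) int_0^T sigma(t) dt, via a Riemann sum.
T, n = 2000.0, 20001
ts = np.linspace(0.0, T, n)
P_prime = np.mean([sigma(t) for t in ts], axis=0)

print(np.round(np.abs(P_prime), 3))
# ~ diag(1, 1, 0, 0): the phases of the nonzero modes average to zero,
# while the zero modes survive.
```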
We have explained how the probabilities for “Hamiltonians” appear; they are interpreted as probabilities of different energy levels. One can repeat the same reasoning for other observables represented as pairs (A, a), where A is an element of the Lie algebra of $\mathcal{V}$ and a is a functional obeying $a(Ax) = 0$. It is possible to define the notion of a robust zero mode for any observable.
We can provide a slightly different and more general definition of a robust zero mode. When we say that x is a robust zero mode, first of all, it is necessary to require that it is a zero mode ($Ax = 0$). If A is slightly changed, i.e., replaced by a close element $A'$, then we must require that $A'$ has a zero mode $x'$ close to the original zero mode x.
Everything that was said for the energy can be repeated for an arbitrary observable. It is necessary to consider the projection $P_A$ onto the space of zero modes of A:
$$P_A = \lim_{T\to\infty}\frac{1}{T}\int_0^T dt\; e^{At}.$$
We have assumed here that all zero modes are robust. Afterwards, it is necessary to represent the projection on the zero modes as a mixture of extreme points:
$$P_A(x) = \sum_i p_iu_i.$$
The coefficients $p_i$ are interpreted as the probabilities of the values $a(u_i)$ in the state x.

4.3. L-Functionals

We have now explained that there is a formalism in which the starting point is a set of states. There is a question that a physicist should ask, namely, whether this formalism is convenient, that is, whether it is convenient to calculate in it. We answer this question now. We will work in the framework of the algebraic approach, assuming that the algebra is a Weyl algebra with generators $\hat{u}_k$ and relations
$$\hat{u}_k\hat{u}_l - \hat{u}_l\hat{u}_k = i\sigma_{k,l}.$$
We want to introduce the notion of an L-functional corresponding to the density matrix K in any representation of the Weyl algebra by the formula
$$L_K(\alpha) = \mathrm{Tr}\; e^{i\alpha^k\hat{u}_k}K = \mathrm{Tr}\; V_\alpha K,$$
where $V_\alpha = e^{i\alpha^k\hat{u}_k} = e^{i\alpha\hat{u}}$ are the operators we have already considered and the $\alpha^k$ are real numbers corresponding to the generators of the Weyl algebra. Recall that we can assume that the Weyl algebra is generated by the operators $V_\alpha$. As the $\hat{u}_k$ are self-adjoint and the $\alpha^k$ are real, these generators are unitary and satisfy the relations
$$V_\alpha V_\beta = e^{\frac{i}{2}\alpha\sigma\beta}\,V_{\alpha+\beta},$$
where $\alpha\sigma\beta = \alpha^k\sigma_{k,l}\beta^l$.
This is the so-called exponential form of the Weyl algebra.
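These relations can be tested numerically on a truncated Fock space. In the sketch below (our illustration; the truncation size and the values of α, β are arbitrary) we take a single degree of freedom, $\hat{u} = (\hat{q}, \hat{p})$ with $[\hat{q},\hat{p}] = i$; with this choice the product picks up the phase $e^{-\frac{i}{2}(\alpha_1\beta_2 - \alpha_2\beta_1)}$, and the overall sign of $\alpha\sigma\beta$ depends on the convention chosen for σ:

```python
import numpy as np
from scipy.linalg import expm

N = 80                                    # Fock truncation (illustrative)
a = np.diag(np.sqrt(np.arange(1, N)), 1)  # annihilation operator
q = (a + a.T) / np.sqrt(2)                # [q, p] = i below the truncation
p = (a - a.T) / (1j * np.sqrt(2))

def V(a1, a2):
    """V_alpha = exp(i (alpha_1 q + alpha_2 p))."""
    return expm(1j * (a1 * q + a2 * p))

al, be = (0.3, -0.2), (0.1, 0.4)
lhs = V(*al) @ V(*be)
phase = np.exp(-0.5j * (al[0] * be[1] - al[1] * be[0]))
rhs = phase * V(al[0] + be[0], al[1] + be[1])

# Compare matrix elements far from the truncation edge.
print(np.max(np.abs((lhs - rhs)[:10, :10])))   # ~1e-12 for this truncation
```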
An important property of L-functionals is that they unite all representations of the Weyl algebra. The problem of non-equivalence of representations of the Weyl algebra in the L-functional formalism completely disappears. In the formula for an L-functional, a unitary operator is multiplied by an operator from the trace class; therefore, the trace is well-defined.
We can introduce the space $\mathcal{L}$ of all linear functionals on the Weyl algebra. Every L-functional defines a linear functional on the Weyl algebra, $L_K \in \mathcal{L}$; this follows from the observation that every element expressed in terms of the generators $V_\alpha$ by a finite number of addition and multiplication operations is a linear combination of the generators $V_\alpha$. It is easy to check that the functional $L_K$ is positive and normalized (on the unit element of the algebra it is equal to one). In what follows, we identify L-functionals with positive functionals on the Weyl algebra.
Let us define operations on the space $\mathcal{L}$ under consideration. These operations are defined on the space of linear functionals on any *-algebra.
For every element A of the algebra, there are two operations on linear functionals. One operation on a functional ω(x), where x is an element of the algebra, is obtained by multiplying x by A from the right; the other is obtained by multiplying x by $A^*$ from the left. We denote the first of these by the same letter A and the other by $\tilde{A}$:
$$(A\omega)(x) = \omega(xA),\qquad (\tilde{A}\omega)(x) = \omega(A^*x).$$
Let us now ask whether the functional ω remains positive when acted on by these operators. The answer is no; however, if we apply the combination $\tilde{A}A$, then positive functionals are mapped into positive functionals (which are not necessarily normalized). Recall that positive functionals must be non-negative on elements of the form $x = B^*B$. It is easy to check that the operation $\tilde{A}A$ transforms ω(x) into $\omega(A^*xA)$; for $x = B^*B$ we have $A^*xA = (BA)^*(BA)$, so positivity is preserved.
This is an important statement. We denote by $\mathcal{C}$ the space of all positive functionals that are not necessarily normalized. The operator $\tilde{A}A$ acts on it.
The next observation is that operators of the form $\tilde{A}$ always commute with operators of the form B: one multiplies from the left and the other from the right.
Further, if an operator has the form $A = e^{itH}$, then $\tilde{A} = e^{-it\tilde{H}}$. If the equations of motion are written as
$$i\,d\sigma/dt = (\tilde{H} - H)\sigma,$$
then the evolution operator has the form $e^{-it(\tilde{H} - H)}$. According to the previous observation, this expression can be represented in the form $\tilde{A}A$; that is, if the equations of motion are written in the form (13), then σ acts on the cone of positive functionals (the cone of not necessarily normalized states). We will use this fact in what follows.
Let us now turn to the question of the form of the equations of motion in the case of the Weyl algebra. It is easy to calculate the action of the operators $V_\beta$ and $\tilde{V}_\beta$ on the space $\mathcal{L}$:
$$(V_\beta L)(\alpha) = e^{\frac{i}{2}\alpha\sigma\beta}\,L(\alpha+\beta),\qquad (\tilde{V}_\beta L)(\alpha) = e^{\frac{i}{2}\alpha\sigma\beta}\,L(\alpha-\beta).$$
Let us assume that $\hat{H}$ is represented as an integral
$$\hat{H} = \int d\beta\; h(\beta)\,V_\beta.$$
This operator is self-adjoint if $h(-\beta) = h(\beta)^*$. When introducing the Planck constant, we should replace $\hat{H}$ with $\frac{1}{\hbar}\hat{H}$. The equation of motion in which $\hat{H}$ plays the role of the Hamiltonian takes the form:
$$i\frac{d\sigma}{dt} = (\tilde{\hat{H}} - \hat{H})\sigma.$$
The equation for L-functionals can be written in the following form:
$$i\frac{dL}{dt} = \int d\beta\; h(\beta)\,e^{\frac{i}{2}\alpha\sigma\beta}\,L(\alpha-\beta) - \int d\beta\; h(\beta)\,e^{\frac{i}{2}\alpha\sigma\beta}\,L(\alpha+\beta) = \int d\beta\; h(\beta)\left(e^{\frac{i}{2}\alpha\sigma\beta} - e^{-\frac{i}{2}\alpha\sigma\beta}\right)L(\alpha+\beta).$$
As a result, restoring the Planck constant as above, we arrive at the following formula:
$$\frac{dL}{dt} = \int d\beta\; h(\beta)\,\frac{2}{\hbar}\sin\left(\frac{\hbar}{2}\,\alpha\sigma\beta\right)L(\alpha+\beta).$$
It is clear from this formula that the equation of motion for L-functionals has a limit as $\hbar \to 0$.
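A two-line symbolic check of this limit, where s stands for the number $\alpha\sigma\beta$:

```python
import sympy as sp

hbar, s = sp.symbols('hbar s', positive=True)

# The kernel of the right-hand side once hbar is restored:
kernel = (2 / hbar) * sp.sin(hbar * s / 2)

print(sp.series(kernel, hbar, 0, 4))
# s - hbar**2*s**3/24 + O(hbar**4): as hbar -> 0 the kernel tends to s,
# so the equation of motion has a well-defined classical limit.
```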

5. Lecture 5

5.1. Functional Integrals

This lecture is devoted, first of all, to functional integrals, which are widely used in quantum mechanics. We use Berezin's idea [27,28], which allows us to apply functional integrals not only in the conventional approach to quantum mechanics but also in the geometric approach and the L-functional formalism.
In quantum mechanics, a physical quantity is represented as a functional integral, that is, an infinite-dimensional integral whose integrand includes the exponential of the action functional multiplied by something. The action depends on a curve (more precisely, on a function q(τ), the graph of which we consider as a curve):
$$S[q(\tau)] = \int_0^t d\tau\; L(q(\tau), \dot{q}(\tau)).$$
In this expression, q(τ) is in square brackets to emphasize that S is not a function but a functional that depends on a curve (as opposed to a point of a curve).
The matrix element $\langle q_2|\hat{U}(t)|q_1\rangle$ of the evolution operator $\hat{U}(t) = e^{-it\hat{H}}$ in the coordinate representation is expressed in terms of a functional integral with the integrand
$$e^{iS[q(\tau)]}.$$
We integrate over the set of curves q(τ) with a given starting point and end point: $q(0) = q_1$, $q(t) = q_2$.
What do the words “functional integral” mean? An integral of the type we have described can be approximated by finite-dimensional integrals: under the sign of the functional integral, we can replace the action integral with, say, integral sums and then take the limit. The problem is that, unlike in ordinary calculus, where the approximation scheme is irrelevant, the value of a functional integral depends on the choice of approximation. Moreover, the limit is usually either infinite or zero, and we should do something to obtain a finite answer.
What we are about to discuss can be made rigorous up to a point. Rigorous statements can be proven for Gaussian integrals (integrals of exponentials of expressions quadratic in q, possibly multiplied by a polynomial). To calculate such integrals in the finite-dimensional case, one can use the formula:
$$\int \exp\left(\frac{i}{2}\langle Ax, x\rangle + i\langle b, x\rangle\right)dx = (\det A)^{-\frac{1}{2}}\exp\left(-\frac{i}{2}\langle A^{-1}b, b\rangle\right).$$
This formula must contain a constant factor, which we do not write here, having included it in the definition of the measure. The operator A can be complex, but one should impose some conditions to guarantee that the integral makes sense. By differentiating this expression with respect to b, we can conclude that an integral of the form
$$\int P(x)\exp\left(\frac{i}{2}\langle Ax, x\rangle\right)dx,$$
where P(x) is a polynomial, can be expressed in terms of determinants; at least for the case of elliptic operators, there is quite a good theory for computing such determinants. Again, the determinant of an elliptic operator must be approximated by something, and then the infinite terms of the expansion of the logarithm of the determinant must be discarded.
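A one-dimensional numerical instance of the Gaussian formula may be helpful. In the sketch below (our illustration), a small positive imaginary part is added to A to make the oscillatory integral absolutely convergent, and the constant factor, written out explicitly in one dimension, is $(2\pi i)^{1/2}$:

```python
import numpy as np
from scipy.integrate import quad

# One-dimensional instance: A has a small positive imaginary part so
# that the oscillatory integral converges absolutely.
A, b = 1.0 + 0.2j, 0.6

f = lambda x: np.exp(0.5j * A * x**2 + 1j * b * x)
re, _ = quad(lambda x: f(x).real, -30, 30, limit=500)
im, _ = quad(lambda x: f(x).imag, -30, 30, limit=500)
numeric = re + 1j * im

# (det A)^(-1/2) exp(-(i/2) <A^{-1} b, b>), with the one-dimensional
# constant factor (2 pi i)^(1/2) written out explicitly.
closed = np.sqrt(2j * np.pi / A) * np.exp(-0.5j * b**2 / A)
print(abs(numeric - closed))   # ~1e-8: the finite-dimensional formula holds
```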
If $W(x)$ has the form $W(x) = Q(x) + gV(x)$, where Q(x) is a quadratic expression and the constant g is considered to be small, then the normalized functional integral
$$\frac{\int e^{W(x)}\,dx}{\int e^{Q(x)}\,dx}$$
can be calculated in the framework of perturbation theory (as a series with respect to g). Every term of the series is represented as a sum of Feynman diagrams. This is true in a more general case as well, where W(x) is represented as an expression quadratic in $x - x_0$ plus terms containing only monomials of higher degree in $x - x_0$.
After these preliminary words, we move on to the construction of functional integrals. We will construct a functional integral for the exponential of an operator acting in a Banach (not necessarily Hilbert) space; more generally, we will solve the equation of motion
$$\frac{d\sigma}{dt} = H(t)\,\sigma(t)$$
in terms of functional integrals. The considerations below generalize the approach suggested by F. Berezin [27,28].
The operator H(t) in the equation of motion can be a function of t; here, to simplify the formulas, we assume that H does not depend on t. In ordinary quantum mechanics, the equation for the evolution operator is exactly the same, except that in that case H is a self-adjoint operator up to multiplication by the imaginary unit.
Let us denote the space in which the operator H acts by the letter $\mathcal{L}$. The main tool is the notion of the symbol of an operator. While we have already considered some symbols, we now provide a very general definition.
A symbol of an operator is a function defined on some measure space (or, more generally, on a set where there is a notion of integration). The symbol of an operator A is denoted by $\underline{A}$. We impose the following conditions on symbols: the symbol of the operator A = 1 is equal to 1; the symbol $\underline{A}$ depends linearly on the operator A; and the product of operators C = AB corresponds to some operation on symbols, which we denote by $*$, that is, $\underline{C} = \underline{A} * \underline{B}$.
The following simple argument leads immediately to the functional integral. We will use the standard formula for the exponential:
$$\sigma(t) = e^{tH} = \lim_{N\to\infty}\left(1 + \frac{tH}{N}\right)^N.$$
For large N:
$$\underline{1 + \frac{tH}{N}} = e^{\frac{t}{N}\underline{H}} + O(N^{-2}).$$
The error is of the order $\frac{1}{N^2}$; hence, for $N\to\infty$ this correction term can be neglected. We obtain
$$\underline{\sigma(t)} = \lim_{N\to\infty} I_N(t),$$
$$I_N(t) = e^{\frac{t}{N}\underline{H}} * \cdots * e^{\frac{t}{N}\underline{H}},$$
with N factors on the right-hand side.
Our claim is that we have derived a representation of the evolution operator as a functional integral. We need only give examples of symbols and decipher the meaning of “$*$”.
We want to emphasize a point here that is, in our opinion, important and typically underestimated by physicists. Thus far, we have provided trivial arguments that can be made rigorous. It is not necessary to talk about functional integrals; it is sufficient to investigate the expressions $I_N(t)$. One can, for example, apply the stationary phase method and obtain more or less the same results as in the language of functional integrals.
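For instance, the limit formula for the exponential itself is easy to check numerically (a minimal sketch of ours; the matrix is random and the sizes are arbitrary):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
H = rng.normal(size=(5, 5))     # an arbitrary generator of the evolution
t = 1.0

exact = expm(t * H)
for N in (10, 100, 1000):
    approx = np.linalg.matrix_power(np.eye(5) + t * H / N, N)
    print(N, np.max(np.abs(approx - exact)))   # the error falls off like 1/N
```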
Later, we introduce a large class of operators for which the operation $*$ is written in a simple form. This class of operators includes q-p-symbols and Wick symbols. We can think of a symbol as a function of two variables $\underline{A}(\alpha,\beta)$. For this large class of symbols, the expression for the symbol of a product has the following form:
$$\underline{C}(\alpha,\beta) = \int d\gamma\, d\gamma'\; \underline{A}(\alpha,\gamma)\,\underline{B}(\gamma',\beta)\, e^{c(\alpha,\gamma) + c(\gamma',\beta) - c(\alpha,\beta) - r(\gamma',\gamma)},$$
where $c(\alpha,\beta)$ and $r(\alpha,\beta)$ are functions.
Later, we will construct a large class of symbols obeying (14); for now, we simply postulate that there is such a formula for the symbol of the product. Knowing the formula for the product of two operators, we can write the formula for the symbol of the product of operators $A_1, \ldots, A_N$ as follows:
$$\underline{C}(\alpha,\beta) = \int d\gamma_1\, d\gamma_1'\cdots d\gamma_{N-1}\, d\gamma_{N-1}'\; \underline{A}_1(\alpha,\gamma_1)\,\underline{A}_2(\gamma_1',\gamma_2)\cdots\underline{A}_N(\gamma_{N-1}',\beta)\, e^{\rho_N},$$
where
$$\rho_N = c(\alpha,\gamma_1) + c(\gamma_1',\gamma_2) + \cdots + c(\gamma_{N-1}',\beta) - c(\alpha,\beta) - r(\gamma_1',\gamma_1) - \cdots - r(\gamma_{N-1}',\gamma_{N-1}).$$
The result is the product of the symbols multiplied by the exponential of some expression, which is denoted by $\rho_N$. Returning to the expression $I_N(t)$ that approximates the evolution operator, we can represent it as
$$I_N(t) = \int d\gamma_1\, d\gamma_1'\cdots d\gamma_{N-1}\, d\gamma_{N-1}'\; e^{\rho_N}\exp\left(\frac{t}{N}\left(\underline{H}(\alpha,\gamma_1) + \underline{H}(\gamma_1',\gamma_2) + \cdots + \underline{H}(\gamma_{N-1}',\beta)\right)\right).$$
What we have described here is a very broad scheme. There is one very concrete example: the q-p-symbol. If we consider the kernel of the unit operator in the sense of mathematics (or the matrix of the unit operator, as physicists say), it is a δ-function: in the coordinate representation $\langle q_2|1|q_1\rangle = \delta(q_1 - q_2)$, and in the momentum representation $\langle p_2|1|p_1\rangle = \delta(p_1 - p_2)$. We want the symbol of the unit operator to be equal to one, not to the δ-function. This is very easy to achieve. The Fourier transform of the δ-function is a constant; thus, in order to obtain a symbol equal to 1 for the unit operator, we must take a Fourier transform and multiply it by a constant factor. As the matrix of the unit operator in the coordinate representation is a δ-function of $q_1 - q_2$, we take the Fourier transform of the matrix element $\langle q_2|A|q_1\rangle$ with respect to $q_1 - q_2$:
$$\underline{A}_{qp}(q,p) = \int dy\; \langle y|A|q\rangle\, e^{ip(q-y)}.$$
Hence, we can define the q-p-symbol as the Fourier transform of a matrix element with respect to the difference between the arguments. Obviously, one can take the inverse Fourier transform and express the matrix elements $\langle q_2|A|q_1\rangle$ in terms of the q-p-symbol. As we know how to express the matrix element of the product in terms of the matrix elements of the factors, we can calculate the q-p-symbol of the product of two operators. The answer is provided by Formula (14), where the functions $c(\alpha,\beta)$ and $r(\alpha,\beta)$ are scalar products (up to a constant factor):
$$c(q,p) = r(q,p) = ipq.$$
The definition of the q-p-symbol that has just been provided differs from the definition stated earlier. This definition applies to any operator as long as the integral makes sense. If A is a differential operator, then it is easy to see that the definition of the q-p-symbol which we have just provided agrees with the previous one. Let us recall the old definition: a differential operator with polynomial coefficients can be written as a polynomial in the coordinate operators $q^j$ and the momentum operators $\hat{p}_j = \frac{1}{i}\frac{\partial}{\partial q^j}$ (with the coordinate operators on the left and the momentum operators on the right). Then, the hats should be removed from the operators, and we obtain a polynomial function called the q-p-symbol of the operator.
Note that we have considered the Planck constant to be equal to 1. Sometimes it is more convenient to keep it in the formulas.
Turning to ordinary quantum mechanics, we find that the formula for $I_N$ takes the form
$$I_N(q,p,t) = \int\prod_{\alpha=1}^{N-1}dq_\alpha\, dp_\alpha\; \exp\left(i\sum_{\alpha=1}^N p_\alpha(q_\alpha - q_{\alpha-1}) - \frac{it}{N}\sum_{\alpha=1}^N\underline{H}(p_\alpha, q_{\alpha-1})\right),$$
where $p_N = p$ and $q_0 = q_N = q$.
Thus, we have obtained the evolution operator as a limit of finite-dimensional integrals.
It would be quite reasonable to stop and study this representation; however, it is possible to say the words “functional integral” here instead. To do this, note that the expression in the exponent is an integral sum for some integral. This integral is well known to physicists: it is the action functional
$$S[p(\tau), q(\tau)] = \int_0^t\left(p(\tau)\frac{d}{d\tau}q(\tau) - \underline{H}(p(\tau), q(\tau))\right)d\tau.$$
Everything is already very close to what we want; let us now get even closer. Above, we have written the q-p-symbol of the evolution operator. We can say that this q-p-symbol is the functional integral of the exponential
$$e^{iS[p(\tau), q(\tau)]},$$
where the functional S depends on a pair of functions p(τ), q(τ) satisfying the boundary conditions
$$p(0) = p(t) = p,\qquad q(0) = q(t) = q$$
as τ changes from 0 to t.
We already know that the q-p-symbol is a Fourier transform of a matrix element; hence, we can pass to matrix elements by performing the inverse Fourier transform. The result is the same integral with different boundary conditions:
$$q(0) = q_1,\qquad q(t) = q_2.$$
To arrive at the formula we mentioned above, consider the special case where the symbol $\underline{H}(p,q)$ is the sum of a quadratic function of p (kinetic energy) and some function V(q) (potential energy). Then, it is easy to integrate over p, as the integral is Gaussian. As a result, the matrix element of the evolution operator can be represented as a functional integral with an integrand of the form
$$e^{iS[q(\tau)]} = e^{i\int_0^t d\tau\,\left(T(\dot{q}(\tau)) - V(q(\tau))\right)}.$$
Thus, we have derived the functional integral with which we began in an even more general form.
Note that in the expression (16) for $I_N$ there should be a constant factor tending to zero as $N\to\infty$; this factor comes from the constant we threw out when we wrote the expression for the Gaussian integral. This zero factor is neglected here; this is done all the time with functional integrals, as a good object is actually a quotient of two functional integrals.
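The time-slicing construction can be illustrated numerically in the Euclidean (imaginary-time) setting, where the oscillatory factors become decaying ones and the free kernel carries its own normalization, so no vanishing constant needs to be discarded. The sketch below (our illustration; the grid, the number of slices, and the choice of the harmonic oscillator are all arbitrary) multiplies N short-time kernels and compares the result with the exact Mehler kernel:

```python
import numpy as np

# Euclidean version of the sliced integral: the short-time factorization
# exp(-dt H) ~ exp(-dt V/2) K_free exp(-dt V/2) is multiplied N times and
# compared with the exact harmonic-oscillator kernel for H = (p^2 + q^2)/2.
n = 401
x = np.linspace(-8.0, 8.0, n)
dx = x[1] - x[0]
tau, N = 1.0, 200
dt = tau / N

V = 0.5 * x**2
K_free = np.exp(-(x[:, None] - x[None, :])**2 / (2 * dt)) / np.sqrt(2 * np.pi * dt)
half = np.diag(np.exp(-0.5 * dt * V))
step = half @ (K_free * dx) @ half           # one time slice
U = np.linalg.matrix_power(step, N)          # N slices ~ kernel * dx

s, c = np.sinh(tau), np.cosh(tau)            # Mehler formula
exact = np.exp(-((x[:, None]**2 + x[None, :]**2) * c
                 - 2 * x[:, None] * x[None, :]) / (2 * s)) / np.sqrt(2 * np.pi * s)

print(np.max(np.abs(U / dx - exact)))        # small (~1e-5); shrinks as N grows
```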
We want to construct a large number of examples. These examples generalize what Berezin called covariant symbols.
We first take two Banach spaces $\mathcal{L}$ and $\mathcal{L}'$ and fix a nondegenerate scalar product (pairing) between them (we assume that the scalar product is linear with respect to the first argument and antilinear with respect to the second argument to facilitate comparison with Hilbert spaces).
Thus, we have two Banach spaces that are almost dual to each other in the sense that there is a nondegenerate scalar product between them. Then, we take two systems of vectors in these spaces: $e_\alpha \in \mathcal{L}$, where $\alpha \in M$, and $e'_\beta \in \mathcal{L}'$, where $\beta \in M'$. They do not have to be linearly independent; they are not bases, but they should be overcomplete. This means that any vector can be expressed in terms of these vectors as a limit of linear combinations (Poisson vectors are an example of such an overcomplete system).
Now, we would like to insert unity into the scalar product. This means that a scalar product $\langle l, l'\rangle$ should be expressed in terms of the scalar products $\langle l, e'_\mu\rangle$ and $\langle e_\lambda, l'\rangle$ by some integration. This can always be done, even in different ways; we assume here that such a way is fixed:
$$\langle l, l'\rangle = \int \langle l, e'_\mu\rangle\,\langle e_\lambda, l'\rangle\, e^{-r(\lambda,\mu)}\, d\lambda\, d\mu,$$
where $r(\lambda,\mu)$ is a function on the space $M\times M'$. We assume that $M\times M'$ is a measure space with the measure $d\lambda\, d\mu$. This assumption can be weakened; it is sufficient to suppose that some functions can be integrated over $M\times M'$.
Next, we want to define the covariant symbol $\underline{A}(\alpha,\beta)$ of an operator A acting in $\mathcal{L}$ (one may consider an operator in $\mathcal{L}'$ with the same success). This symbol is defined by the formula
$$\underline{A}(\alpha,\beta) = \frac{\langle Ae_\alpha, e'_\beta\rangle}{\langle e_\alpha, e'_\beta\rangle}.$$
The main condition is satisfied, as the symbol of the unit operator is equal to 1. It is easy to calculate the symbol $\underline{C}$ of the product of operators C = AB using the relation
$$\langle ABe_\alpha, e'_\beta\rangle = \langle Be_\alpha, A^*e'_\beta\rangle$$
and Formula (17) for $\langle l, l'\rangle$, where $l = Be_\alpha$, $l' = A^*e'_\beta$. Introducing the notation $\langle e_\alpha, e'_\beta\rangle = e^{c(\alpha,\beta)}$, we obtain the following expression:
$$\underline{C}(\alpha,\beta) = \int d\lambda\, d\mu\; \underline{B}(\alpha,\mu)\,\underline{A}(\lambda,\beta)\exp\left(-r(\lambda,\mu) - c(\alpha,\beta) + c(\alpha,\mu) + c(\lambda,\beta)\right),$$
which agrees with Formula (14).
Note that the above construction is incredibly general. The vectors $e_\alpha$ and $e'_\beta$ can be chosen almost arbitrarily; the only requirement is that they constitute overcomplete systems. However, it is important to have simple expressions for the scalar product $\langle e_\alpha, e'_\beta\rangle$ and for the function $r(\lambda,\mu)$.
If $\mathcal{L} = \mathcal{L}'$ is the Fock space, one can take as $e_\alpha$ the Poisson vectors $e_\alpha = e^{\alpha\hat{a}^*}|0\rangle$, which we have already considered. In this case, $c(\alpha,\beta) = r(\alpha,\beta) = \langle\alpha,\beta\rangle$. This is easy to calculate because all the integrals are Gaussian.
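A numerical check of this example on a truncated Fock space (our illustration; the truncation and the values of α, β are arbitrary):

```python
import numpy as np
from scipy.linalg import expm

N = 40                                     # Fock truncation (illustrative)
a = np.diag(np.sqrt(np.arange(1, N)), 1)   # annihilation operator
vac = np.zeros(N); vac[0] = 1.0

def e_(alpha):
    """Poisson vector e_alpha = exp(alpha a*) |0>."""
    return expm(alpha * a.conj().T) @ vac

def symbol(A, alpha, beta):
    """Covariant symbol <A e_alpha, e_beta> / <e_alpha, e_beta>."""
    ea, eb = e_(alpha), e_(beta)
    return np.vdot(eb, A @ ea) / np.vdot(eb, ea)

alpha, beta = 0.7, 0.4
print(np.vdot(e_(beta), e_(alpha)), np.exp(alpha * np.conj(beta)))
# both ~ 1.323: <e_alpha, e_beta> = e^{<alpha, beta>}, i.e., c = <alpha, beta>
print(symbol(a.conj().T @ a, alpha, beta), alpha * np.conj(beta))
# both ~ 0.28: the covariant (Wick) symbol of a*a is alpha beta*
```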

5.2. L-Functionals and Functional Integrals

Let us now turn to the L-functionals. We defined them in the preceding lecture and identified them with positive functionals on the Weyl algebra represented in the exponential form (12). With this definition, nice formulas can be obtained by taking the Weyl algebra with the overcomplete system $e_\alpha = V_\alpha$ as $\mathcal{L}$ and the space of linear functionals on the Weyl algebra with the overcomplete system $e'_f$ of linear functionals obeying $e'_f(V_g) = \exp\left(\frac{i}{2}(f,g)\right)$ as $\mathcal{L}'$.
In the present section, we write the canonical commutation relations in the form
$$[a(k), a^+(k')] = \delta(k,k'),\qquad [a(k), a(k')] = [a^+(k), a^+(k')] = 0,$$
where $k, k' \in M$. We assume that M is a measure space. Recall that in these relations we are dealing with generalized functions; in other words, we should work with formal expressions $fa = \int dk\, f(k)a(k)$, $ga^+ = \int dk\, g(k)a^+(k)$. In the discrete case, the integral is understood as a sum and the δ-function as the Kronecker symbol.
In this section, we define the L-functional corresponding to a density matrix K in a representation of CCR by the formula
$$L_K(\alpha^*, \alpha) = \mathrm{Tr}\; e^{-\alpha a^+}e^{\alpha^* a}K.$$
An easy formal calculation shows that
$$W_\alpha W_\beta = e^{-(\alpha^*,\beta)}\,W_{\alpha+\beta},$$
where
$$W_\alpha = e^{-\alpha a^+}e^{\alpha^* a}.$$
In the preceding section, we wrote the exponential of a linear expression in the definition of the L-functional corresponding to a density matrix in a representation of CCR. That exponential can be considered as a unitary operator in the representation space; hence, the L-functional is well-defined. Here, we are writing a product of exponentials. This difference is not significant (a numerical factor); however, with the new definition we can say that the L-functional is simply a generating functional for correlation functions.
When α is a square-integrable function, the expression (19) is well-defined because the numerical factor is finite. However, we do not assume that α is square-integrable (this is important for applications to string theory). We slightly modify the definition of $\mathcal{W}$ by assuming that it is generated by $a(k)$, $a^*(k)$, and $W_\alpha$ obeying (18) and (20). For an appropriate topology in $\mathcal{W}$, the multiplication is defined on a dense subset, and we can consider $\mathcal{W}$ as a version of the Weyl algebra.
Let us denote by $\mathcal{L}$ the space of continuous linear functionals on $\mathcal{W}$. Such a functional is completely determined by its values on the generators $W_\alpha$; we denote these values by $L(\alpha^*, \alpha)$. In other words, a linear functional L can be represented by a nonlinear functional $L(\alpha^*, \alpha)$.
The action of the Weyl algebra $\mathcal{W}$ on the space $\mathcal{L}$ is realized by the operators b and $b^+$, whose action on the functionals $L_K$ corresponds to multiplication of the density matrix by the operators $a^+$ and a from the right:
$$b(k)L_K = L_{Ka^+(k)},\qquad b^+(k)L_K = L_{Ka(k)}.$$
It is easy to check that these operators satisfy the canonical commutation relations; they can be represented in the following form:
$$b^+(k) = -c_2^+(k) + c_1(k),\qquad b(k) = -c_2(k),$$
where the $c_i^+(k)$ are the operators of multiplication by $\alpha^*(k)$ (for i = 1) and by $\alpha(k)$ (for i = 2), and the $c_i(k)$ are the derivatives with respect to $\alpha^*(k)$ and $\alpha(k)$, respectively.
An alternative action of $\mathcal{W}$ on $\mathcal{L}$ is realized by operators whose action on the functionals $L_K$ corresponds to multiplication of the density matrix by the operators a and $a^+$ from the left:
$$\tilde{b}(k)L_K = L_{a(k)K},\qquad \tilde{b}^+(k)L_K = L_{a^+(k)K}.$$
These operators also satisfy the canonical commutation relations; they can be represented in the form
$$\tilde{b}^+(k) = c_1^+(k) - c_2(k),\qquad \tilde{b}(k) = c_1(k).$$
Thus, there are two commuting actions of the Weyl algebra on $\mathcal{L}$ (in physics terminology, we have a doubling of fields).
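The commutation relations of the two actions can be verified symbolically. The sketch below uses the one-mode version of the operators with the signs as reconstructed above (α and α* are treated as independent variables, as usual in the holomorphic representation):

```python
import sympy as sp

z, zs = sp.symbols('alpha alpha_star')   # alpha and alpha^*, treated as
F = sp.Function('L')(z, zs)              # independent variables

# One-mode versions of the operators above, with the signs as reconstructed:
b   = lambda G: -sp.diff(G, z)                # b    = -c_2
bp  = lambda G: -z * G + sp.diff(G, zs)       # b^+  = -c_2^+ + c_1
bt  = lambda G: sp.diff(G, zs)                # b~   = c_1
btp = lambda G: zs * G - sp.diff(G, z)        # b~^+ = c_1^+ - c_2

comm = lambda X, Y: sp.simplify(X(Y(F)) - Y(X(F)))

print(comm(b, bp), comm(bt, btp))
# L(...), L(...): [b, b^+] = [b~, b~^+] = 1 (acting as the identity)
print(comm(b, bt), comm(b, btp), comm(bp, btp))
# 0 0 0: the two actions of the Weyl algebra commute
```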
Consider now the formal Hamiltonian $\hat{H}$:
$$\hat{H} = \sum_{m,n}\int dk_1\cdots dk_m\, dl_1\cdots dl_n\; H_{m,n}(k_1,\ldots,k_m|l_1,\ldots,l_n)\, a^+(k_1)\cdots a^+(k_m)\, a(l_1)\cdots a(l_n)$$
expressed in terms of creation and annihilation operators and presented in the normal form (i.e., all the creation operators are moved to the left). Recall that, in the algebraic approach, the corresponding equations of motion can make sense even when the formal Hamiltonian does not make sense as an operator. The Hamiltonian $\hat{H}$ induces two formal operators acting on $\mathcal{L}$:
$$\hat{H} = \sum_{m,n}\int dk_1\cdots dk_m\, dl_1\cdots dl_n\; H_{m,n}(k_1,\ldots,k_m|l_1,\ldots,l_n)\, b^+(k_1)\cdots b^+(k_m)\, b(l_1)\cdots b(l_n),$$
$$\tilde{H} = \sum_{m,n}\int dk_1\cdots dk_m\, dl_1\cdots dl_n\; H_{m,n}(k_1,\ldots,k_m|l_1,\ldots,l_n)\,\tilde{b}^+(k_1)\cdots\tilde{b}^+(k_m)\,\tilde{b}(l_1)\cdots\tilde{b}(l_n).$$
One of them is denoted by the same symbol, while the other is marked with a tilde. We can now write the equation of motion for the L-functional $L(\alpha^*, \alpha)$:
$$i\frac{dL}{dt} = HL = \tilde{H}L - \hat{H}L,$$
where we introduce the notation $H = \tilde{H} - \hat{H}$.
If we consider a translation-invariant Hamiltonian, then in the momentum representation the coefficients $H_{m,n}$ contain δ-functions $\delta(k_1 + \cdots + k_m - l_1 - \cdots - l_n)$ (i.e., they express the momentum conservation law).
The equations for the L-functional can make sense even when the equation of motion in Fock space is ill-defined. The problem of non-equivalence of different representations of canonical commutation relations disappears for L-functionals.
The standard perturbation theory suffers from divergences when we are in an infinite volume; sometimes, they are characterized as trivial volume divergences. If there are no UV divergences, then the formalism of L-functionals leads to a well-defined perturbation theory even in infinite volume.
Returning to the expressions for H, we find ourselves in a familiar setting. Indeed, we have representations of two Weyl algebras (we may consider them as a representation of one larger Weyl algebra). In textbook quantum field theory, everything is usually done in the framework of perturbation theory: one writes the evolution operator in the interaction representation as a T-exponential of the interaction Hamiltonian and, applying Wick's lemma, derives the diagram technique. There are no essential changes in the case of L-functionals: we have the same commutation relations and can apply exactly the same techniques.
The only thing that has changed is that the number of fields has doubled, and now we are not working in Hilbert space. However, as the notion of Hilbert space was not used anywhere in the derivation of the diagram techniques, all standard techniques from quantum field theory work in the formalism of L-functionals. Thus, from the computational point of view, the formalism of L-functionals is not worse than the usual one; in fact, it is better. As we explained previously, it solves problems with trivial volume divergences. It is significantly better if we consider the adiabatic approximation. This is true not only for L-functionals, but for the geometric approach to quantum theory in general.
Let us consider a family of “Hamiltonians” H(g) in this framework; for example, the family $H(g) = H_0 + gV$ that is used in perturbation theory. We argue that if we consider the state ω(g(t)), which is the stationary state for the “Hamiltonian” H(g(t)) for fixed t, and if g(t) varies slowly (adiabatically), then ω(g(t)) is a solution of the equation of motion for the nonstationary “Hamiltonian” H(g(t)). This is a trivial fact; in the adiabatic approximation, we can neglect the derivative $\dot{g}(t)$ in the equations of motion.
Using this observation, we can express the stationary states of the “Hamiltonian” $H(g) = H_0 + gV$ in terms of stationary states of the “Hamiltonian” $H_0$.
Let us consider the Hamiltonian $H_0 + g\,e^{-\alpha|t|}V$ and the corresponding evolution operator $\sigma_\alpha(t, -\infty)$. Applying the adiabatic approximation in the limit $\alpha\to0$ and assuming that ω(0) is a stationary state of $H_0$, we can check that
$$\omega(g) = \lim_{\alpha\to0}\sigma_\alpha(0, -\infty)\,\omega(0)$$
is a stationary state of the “Hamiltonian” $H(g) = H_0 + gV$.
A similar formula (with a phase factor) is true in ordinary quantum mechanics, where it is usually applied to obtain the ground state of the Hamiltonian with coupling constant g from the ground state of $H_0$. In our approach, it is possible to apply this formula in much more general situations; for example, if the Hamiltonian is translation-invariant, one can apply it to any translation-invariant stationary state of the free Hamiltonian $H_0$.
This approach is very natural in both equilibrium and nonequilibrium statistical physics. It is possible to take as ω(0), say, an equilibrium state at some temperature, and then apply this process to obtain an equilibrium state that, while it might have a different temperature, has the same entropy, because an adiabatic process does not change the entropy.
The Keldysh formalism of non-equilibrium statistical physics is closely related to the formalism of L-functionals. We can say that the formalism of L-functionals provides a justification of Keldysh formalism.
Along with the operator $\sigma_\alpha(0, -\infty)$, we can consider the evolution operator $\sigma_\alpha(+\infty, -\infty)$ (the adiabatic S-matrix). As the adiabatic parameter α tends to zero, the adiabatic S-matrix in the formalism of L-functionals, multiplied by some factors (which we do not describe here), tends towards an operator that we call the inclusive scattering matrix. The inclusive scattering matrix can be used to calculate inclusive cross-sections. We will discuss the inclusive scattering matrix and inclusive cross-sections later; here, we only say that the conventional scattering matrix of quantum mechanics can also be obtained from the adiabatic evolution operator: taking the limit $\alpha\to0$ in the adiabatic S-matrix for bare particles, we obtain the scattering matrix of the physical (dressed) particles. The masses and wave functions are renormalized in this process, while the coupling constant is not.
Next, we want to explain some things that will be discussed later from a different perspective.
It is well known (and we will prove this later) that the scattering matrix is expressed in terms of Green’s functions. This is what is called the Lehmann–Symanzik–Zimmermann (LSZ) formula [8]. In order to calculate an inclusive scattering matrix, we have to consider what can be called generalized Green’s functions.
The standard Green’s function is defined as a chronological product
$$M = T(B_1^*(x_1, t_1)\cdots B_m^*(x_m, t_m)) = T(B^*)$$
averaged over some state; in other words, a Green's function is the expectation value of the chronological product in this state. In a chronological product, the times decrease from left to right. One can consider the anti-chronological product as well, in which the times increase:
$$N = T^{opp}(B_1(x_1, t_1)\cdots B_n(x_n, t_n)) = T^{opp}(B).$$
To calculate the generalized Green’s functions, we take the chronological product of some operators, multiply it by the anti-chronological product of other operators, and take the average (expectation value) in some state ω :
$$G_{m,n} = \omega(MN).$$
This is the generalized Green’s function in a given state. Such Green’s functions appear in the Keldysh formalism and, as will now be explained, in the formalism of L-functionals and in the algebraic approach in general.
Recall that for any *-algebra $\mathcal{A}$, any element $B \in \mathcal{A}$ specifies two operators on the space $\mathcal{L}$ of continuous linear functionals on $\mathcal{A}$. One of them transforms a linear functional ω(A) into the functional ω(AB), while the second transforms ω(A) into the functional $\omega(B^*A)$. The first of these operators is denoted by the symbol B, while the second is denoted by $\tilde{B}$.
If $B = B_1B_2$, then $(B\omega)(A) = \omega(AB_1B_2)$; hence,
$$(B_1B_2)\,\omega = B_2(B_1\omega)$$
(here, the order is reversed because B acts from the right).
It follows that
$$(T(\tilde{B}B)\,\omega)(x) = \omega\left(T(B^*)\; x\; T^{opp}(B)\right) = \omega(MxN).$$
Taking x = 1 , we find that the generalized Green’s functions can be expressed in terms of an analog of the usual Green’s functions in the formalism of L-functionals. This means that we can apply the technique of calculating ordinary Green’s functions to compute generalized Green’s functions. This results in Feynman diagrams for generalized Green’s functions.
One can solve the equation of motion for L-functionals in terms of functional integrals by applying the methods of Section 5.1 and assuming that $\mathcal{L}' = \mathcal{W}$. Here, we define covariant symbols of operators acting in $\mathcal{L}$ using a system of vectors $e_f \in \mathcal{L}$ and a system of vectors $e'_{f'} \in \mathcal{L}'$, defined in the following way. We assume that $e_f \in \mathcal{L}$ corresponds to the nonlinear functional $e_f(W_\alpha) = \exp\bigl(i\bigl((f, \alpha^*) + (\alpha, f^*)\bigr)\bigr)$ and that $e'_{f'} = W_{f'}$. It follows that $\langle e_f, e'_{f'}\rangle = \exp\bigl(i\bigl((f, f'^*) + (f', f^*)\bigr)\bigr)$.
It is easy to calculate the covariant symbol of the operator H:
$$\underline{H}(f, f') = i\sum_{m,n}\int\prod_{1\le i\le m}\prod_{1\le j\le n}(dk_i\, dl_j)\; H_{m,n}(k_1,\ldots,k_m|l_1,\ldots,l_n)\times\Bigl(f^*(k_1)\cdots f^*(k_m)\, f(l_1)\cdots f(l_n) - f'^*(k_1)\cdots f'^*(k_m)\, f'(l_1)\cdots f'(l_n)\Bigr).$$
This allows us to find a representation of the symbol of the evolution operator in terms of functional integrals.

6. Lecture 6

6.1. Solitons as Classical Analogs of Quantum Particles

We are now going to define quantum particles and quasiparticles. The basic statement here is that the notion of a particle is secondary. We define a particle as an elementary excitation of the ground state. One can consider an elementary excitation of any stationary translation-invariant state, in which case the elementary excitation is a quasiparticle.
First of all, we want to discuss the classical analogs of these notions, specifically, the notions of a soliton and a generalized soliton. Consider a translation-invariant Hamiltonian in an infinite-dimensional phase space consisting of vector-valued functions f(x), where $x \in \mathbb{R}^d$ are spatial coordinates. We assume that spatial translations act as shifts of these coordinates and that the time translations are specified by a Hamiltonian which is invariant with respect to spatial translations. Suppose that the corresponding equation of motion has the form
$$\frac{\partial f}{\partial t} = Af + B(f),$$
where A is a linear operator and B stands for the nonlinear part. We assume that the nonlinear part is at least quadratic; then, the linear part dominates for small f. In particular, we can say that $f \equiv 0$ is a solution and that in its neighborhood we can neglect the nonlinear part.
We define a soliton as a solution of the form $s(x - vt)$. If we represent the solution $f \equiv 0$ as a horizontal straight line, the soliton (solitary wave) is a bump moving uniformly without changing its shape. In addition, there is the notion of a generalized soliton: a bump that moves with a constant average speed while possibly changing its shape (pulsating). We do not discuss this notion in detail here.
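The classic example is the one-soliton solution of the KdV equation $u_t + 6uu_x + u_{xxx} = 0$ (an illustration of ours; KdV is not singled out in the text). A short symbolic check that the bump $s(x - vt)$ indeed solves the equation:

```python
import sympy as sp

x, t = sp.symbols('x t')
v = sp.symbols('v', positive=True)

# One-soliton solution of KdV: u = (v/2) sech^2( sqrt(v)/2 * (x - v t) )
u = v / 2 * sp.sech(sp.sqrt(v) / 2 * (x - v * t))**2

residual = sp.diff(u, t) + 6 * u * sp.diff(u, x) + sp.diff(u, x, 3)
print(sp.simplify(residual.rewrite(sp.exp)))
# 0: a bump moving with velocity v without changing its shape
```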
If a theory is Lorentz-invariant, one can apply a Lorentz transformation to a soliton and again obtain a soliton; hence, solitons come in families of solitons with different velocities. The same is true for Galilean invariance and Galilean transformations. In these cases, we have a family of functions $s_p(x - a)$ that is invariant with respect to temporal and spatial translations (here, p denotes the momentum of the soliton). This family can be considered as a symplectic manifold. A family of generalized solitons can likewise be considered as a symplectic manifold invariant with respect to temporal and spatial translations; the coordinates on this manifold are the data characterizing a (generalized) soliton.
We require the soliton to have finite energy. For a translation-invariant solution, the notion of total energy is meaningless, though we can talk about the energy density; here, we set this energy equal to zero and count the energy of the soliton from the energy of the translation-invariant state. The fact that the energy is finite means that, roughly speaking, the soliton is more or less concentrated in some finite region.
In a paper with Fateev and Tyupkin [29] written almost fifty years ago, we conjectured that the behavior of a solution for times tending to plus or minus infinity can be described in the following way for very many systems and for almost all initial conditions with finite energy. If there are no solitons in the theory, then asymptotically the solution obeys a linear equation. In general, we obtain a few solitons plus something else which satisfies a linear equation, at least approximately. This is well known for integrable systems in the case d = 1 (see, for example, [30,31]); we conjectured that it is true without the assumption of integrability in any dimension. While we do not think any mathematician has read our paper, this hypothesis has been expressed in other papers as well: Soffer [32,33] calls it a “grand conjecture”, while Tao [34] calls it the “soliton resolution conjecture”. Nevertheless, the hypothesis remains hypothetical and lies outside the limits of existing mathematics, except for the case where solitons do not exist (see, for example, [35]).
This conjecture can be justified as follows. Let us take as an initial condition a field concentrated in some domain. In this case, we should expect what in old quantum mechanics textbooks was called the spreading of wave packets. That is, if the initial data were concentrated somewhere, the solution later spreads to a larger domain. As the energy is conserved, this spreading causes the amplitude of the wave to decrease. If it really does decrease all the time, then, as we said, in the case of small amplitudes the nonlinear part can be neglected and the solution of the nonlinear equation approaches the solution of the linear one.
Of course, this does not happen if there is a soliton in the theory: the height of the soliton bump remains the same. We can expect that in the end we will have solitons or generalized solitons plus a tail that approximately satisfies the linear equation. Of course, this reasoning is not a proof, but it is convincing.
There is no doubt that we should impose conditions to prove the above conjecture. In particular, it should be required that the translation-invariant state $f \equiv 0$ is stable and that the solitons are stable; otherwise, the solution can blow up. However, it is natural to think that the conjecture is true in many cases.
In this picture, one must consider the notion of soliton scattering. For solvable models of dimension 1 + 1 (one space dimension and one time dimension), the statement above is a theorem. When two solitons collide, we first see something that does not resemble any soliton (“a mess”), and then the same solitons arise again; this is specific to integrable models. The standard situation in non-integrable models is somewhat different: after the collision, we obtain solitons plus a “tail” that asymptotically behaves as a solution of a linear equation, and the solitons that we obtain after the collision generally do not coincide with the original ones.
Next, we will try to provide some formal definitions. Let us first denote the space of possible initial data by the letter R. We conjecture that for a dense set of initial data we can define a mapping $D^+(t): R \to R_{as}$ of initial data at the moment t to asymptotic data at $t\to+\infty$. The asymptotic data characterize the solitons and the asymptotic behavior of the tail. We can additionally consider the asymptotic data at $t\to-\infty$ to obtain a mapping $D^-(t): R \to R_{as}$.
Now, we assume that there is an inverse mapping, i.e., that one can find the initial conditions from the asymptotics, or at least prove that the given asymptotics is obtained from some initial conditions; in other words, we want to consider the inverse operators $S(t, +\infty) = (D^+(t))^{-1}$ and $S(t, -\infty) = (D^-(t))^{-1}$.
To find these operators, we should use the asymptotic data to determine the solutions of the equation, and hence the initial data. While this does not seem to be difficult, nobody has done it. We think that it is an interesting and not very difficult task to construct a solution from asymptotic data. In the quantum case, the solution of this problem is well known and is called the Haag–Ruelle scattering theory; we will explain a generalization of this theory (see [36] for the version of Haag–Ruelle theory that is closest to our approach and [37,38] for more information about scattering theory).
First, we can define what should be called a nonlinear scattering matrix:
$$S = S(0, +\infty)^{-1}S(0, -\infty): R_{as}\to R_{as}.$$
Roughly speaking, we set the initial conditions at minus infinity and watch the asymptotics at plus infinity. We hope to obtain the nonlinear scattering matrix from the quantum scattering matrix in the limit $\hbar\to0$. More precisely, we expect that the inclusive scattering matrix has a limit as $\hbar\to0$ and that this limit is related to the nonlinear scattering matrix.
To conclude this discussion, we want to say that the classical soliton should be considered as a model of a quantum particle. In quantum field theory, the notion of a particle is an asymptotic notion; if two particles collide, we obtain “a mess”, which then disintegrates into particles. Note that the analogy with solitons makes it obvious that the existence of identical particles is not surprising.
The following reasoning further emphasizes the analogy between solitons and quantum particles. Consider a phase space and a Hamiltonian; that is, we have a symplectic manifold M (which can be identified with the space of initial data R) and an evolution operator. Assume that spatial translations act on M and that temporal translations are described by a translation-invariant Hamiltonian. Formally, this means that on the symplectic manifold M we have an action of the commutative group $\mathcal{T}$ of spatial and temporal translations. Now, we can take a stationary translation-invariant point $m \in M$ of this symplectic manifold.
In the previous picture, such a point was the solution $f \equiv 0$, which is translation-invariant and stationary.
Let us now define an excitation of a translation-invariant stationary state as a state with finite energy (recall that we assume that the energy of a translation-invariant state is equal to zero).
We define an elementary symplectic manifold E as a symplectic manifold on which, in Darboux coordinates (p, x), the spatial translations act simply as shifts $x\to x+a$ while p remains unchanged. We consider a Hamiltonian that is invariant with respect to these translations, meaning that it depends only on p; we denote it by ϵ(p). Then, the time translations are the transformations $x\to x + v(p)t$, where v is calculated by the formula $v(p) = \nabla\epsilon(p)$.
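For instance (an illustration of ours; the relativistic dispersion law is not tied to any specific model above), the formula $v(p) = \nabla\epsilon(p)$ gives the usual group velocity:

```python
import sympy as sp

p1, p2, p3, m = sp.symbols('p1 p2 p3 m', positive=True)
p = sp.Matrix([p1, p2, p3])

# Illustrative relativistic dispersion law: epsilon(p) = sqrt(p^2 + m^2).
eps = sp.sqrt(p.dot(p) + m**2)

v = sp.Matrix([sp.diff(eps, q) for q in p])   # v(p) = grad epsilon(p)
print(sp.simplify(v.T))   # p / sqrt(p^2 + m^2): the usual group velocity
```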
Suppose now that M is realized as a space of vector-valued functions f(x), where $x \in \mathbb{R}^d$, and the spatial translations act as shifts $x\to x+a$. If we take a symplectic embedding of the elementary symplectic space E into the set of excitations in M and require that the embedding commute with the space-time translations, then we obtain a family of solitons. This is very simple to explain. The symplectic embedding of the elementary symplectic space E maps the point (p, 0) into some function $s_p(x)$ depending on p. Because the embedding $E\to M$ commutes with spatial translations, the point (p, a) maps into the shifted function $s_p(x + a)$. The condition that the mapping $E\to M$ commutes with time shifts means that the function $s_p(x - v(p)t)$ satisfies the equation of motion.

6.2. Particles and Quasiparticles

Now, we introduce the notion of a particle and the more general notion of a quasiparticle. The only difference is that a particle is an excitation of the ground state, while a quasiparticle is an excitation of any translation-invariant stationary state. In order to define the notion of a particle, we need the notions of spatial and temporal translations.
In ordinary quantum mechanics, if we consider evolution, we need the notion of time translations $T_\tau$. To define the notion of a particle, we also need spatial translations $T_a$ that act on states and commute with the temporal translations. In the geometric approach, the space of states is the basic object; here, however, it is convenient to consider non-normalized states. Recall that in the algebraic approach the states are positive functionals normalized by the condition ω(1) = 1. Discarding the normalization condition, we obtain a cone, which we denote by $\mathcal{C}$; a state is now defined only up to a numerical factor. We will subsequently talk about this cone of non-normalized states all the time. Space-time translations must act on this cone.
Let us now denote the commutative group of space-time translations by $\mathcal{T}$. In the algebraic approach, this group should act by automorphisms of the algebra $\mathcal{A}$. The group of automorphisms of an algebra (and hence the group $\mathcal{T}$) acts on $\mathcal{C}$; recall here that we always assume that automorphisms agree with the involution.
We use the standard notation $A(\tau, x) = T_\tau T_xA$ for an element $A \in \mathcal{A}$ shifted in time and space. A translation-invariant stationary state ω in the algebraic approach satisfies the condition $\omega(A(\tau, x)) = \omega(A)$. Standard examples of such states are ground states and equilibrium states.
In particular, we can consider the Weyl algebra $\mathcal{A}$ with generators $\hat{a}^*(x)$, $\hat{a}(x)$ obeying CCR; we can assume that the spatial translations simply shift the argument, while the temporal translations are defined by a formal Hamiltonian expressed in terms of $\hat{a}^*(x)$, $\hat{a}(x)$ with coefficient functions depending only on the differences $x_i - x_j$, which ensures translational invariance. We additionally require that the coefficient functions decrease rapidly; in this way, the equation of motion makes sense.
We can now perform a Fourier transformation and pass to the momentum representation. Now the argument is denoted by k, and a spatial translation is realized as multiplication by $\exp(ika)$. The time translations are again determined by the Hamiltonian. The condition that the functions in the coordinate representation depend only on the differences leads to δ-functions expressing momentum conservation, while the requirement that the coefficient functions decrease rapidly means that the functions in the momentum representation are smooth after the δ-function is omitted.
In the geometric approach, when the group of space-time translations acts on the cone of states, we define a translation-invariant stationary state as a state that does not change under spatial and temporal shifts. This will be the basic object for us here.
Now, we want to define the notion of excitation of a translation-invariant stationary state as an analog of the previously introduced notion of a state with finite energy. When a soliton goes to infinity, we stop seeing it. Formalizing this observation, we say that a state σ is an excitation of the translation-invariant state $\omega \in \mathcal{C}$ if $(T_a\sigma)(A)$ tends to $\mathrm{const}\cdot\omega(A)$ in the limit $a\to\infty$.
The constant appears here because the state is defined only up to a numerical factor.
The notion of excitation is a general notion that can be applied in both geometric and algebraic approaches.
In the algebraic approach, a pre-Hilbert space $\mathcal{H}$ (it is convenient to work with a pre-Hilbert space here) can be constructed from a translation-invariant stationary state using the GNS construction. In this space, there is a cyclic vector θ corresponding to the state ω; recall that this means that ω is the positive functional
$$\omega(A) = \langle\hat{A}\theta, \theta\rangle,$$
where $\hat{A}$ is the operator that corresponds to A in the representation space of the *-algebra $\mathcal{A}$.
The spatial and temporal translations $T_a$ and $T_\tau$ descend to the pre-Hilbert space $\mathcal{H}$ as unitary operators: translations act on the algebra $\mathcal{A}$ as automorphisms, and we have constructed the pre-Hilbert space by factorizing the algebra, which allows us to define these operators on $\mathcal{H}$ as unitary operators. Next, we define the energy and momentum operators as the infinitesimal generators of translations in time and space:
$$T_a = e^{i\hat{P}a},\qquad T_\tau = e^{-i\hat{H}\tau}.$$
In the algebraic approach, the elements of the pre-Hilbert space $\mathcal{H}$ can be identified with the excitations of the state ω. The physical meaning of the GNS construction is that, starting with some translation-invariant state, we can construct the space $\mathcal{H}$ in which the excitations live. This, in fact, explains why this construction is so important in physics.
Note that what we have claimed above is not always true; we should additionally require the cluster property in order to justify our claim.
Let us imagine a ferromagnet. If the spin has some direction at the origin of the coordinate system, then statistically the spin has the same direction everywhere; this is the case when there is no decay of correlations. In a more standard situation, at large distances the spin no longer remembers what the spin was at the origin. This is what is called the cluster property.
Mathematically, this can be formulated as follows. Let us take $\omega(A(\tau, x)B)$, where A, B are two elements of the algebra. Then, the cluster property implies that
$$\lim_{x\to\infty}\omega(A(\tau, x)B) = \omega(A)\,\omega(B).$$
This is the simplest form of the cluster property; later, we will formulate it in a more general way. At this point, we need only the following generalization. Let us take three elements $B'$, A, and B. If the element A is shifted to infinity, then
$$\lim_{x\to\infty}\omega(B'A(\tau, x)B) = \omega(A)\,\omega(B'B).$$
Any element of $\mathcal{H}$ can be represented as $B\theta$, where $B \in \mathcal{A}$. Then, for the state σ corresponding to the vector $B\theta$ we have
$$\sigma(A) = \langle\hat{A}\hat{B}\theta, \hat{B}\theta\rangle = \omega(B^*AB).$$
It follows from (25) that
$$(T_x\sigma)(A) = \sigma(A(0, x)) = \omega(B^*A(0, x)B) \to \omega(A)\,\omega(B^*B)$$
as $x\to\infty$. This means that all elements of the pre-Hilbert space $\mathcal{H}$ correspond to excitations. In the algebraic approach we consider only such excitations. In fact, we could have started here, defining the notion of excitation this way: take ω, apply the GNS construction, and take the elements of the pre-Hilbert space $\mathcal{H}$. This construction provides the excitations in the algebraic approach.
We now define the notion of elementary excitation of a translation-invariant state. An elementary excitation of the ground state is what is called a particle in quantum field theory, while elementary excitations of a translation-invariant state are called quasiparticles. As we will be considering both cases, we will speak about elementary excitations, but may also use the terms “particle” or “quasiparticle”.
In the algebraic approach we are in a Hilbert space, which we obtain through the GNS construction. We want to understand what should be called a particle in this situation. First of all, we should note that it is necessary to be able to talk about a particle having momentum p . A particle could have other quantum numbers; these will appear as discrete indices, which do not bother us in any way. We will denote the vector describing a particle with momentum p by Φ ( p ) . This means that
$$\hat{P}\,\Phi(p) = p\,\Phi(p).$$
The energy of this state is some function ϵ(p), which is called the dispersion law:
$$\hat{H}\,\Phi(p) = \epsilon(p)\,\Phi(p).$$
Note that (27) and (28) can be rewritten as
$$T_a\,\Phi(p) = e^{ipa}\,\Phi(p),$$
$$T_\tau\,\Phi(p) = e^{-i\epsilon(p)\tau}\,\Phi(p).$$
It is important to note that Φ(p) is not an element of the Hilbert space (it has infinite norm) but a generalized vector function. To obtain an element of the Hilbert space, we must consider an integral of Φ(p) with some test function ϕ(p):
$$\Phi(\phi) = \int dp\,\phi(p)\,\Phi(p).$$
This will be a well-defined vector. It is convenient (though not necessary) to impose the normalization condition
$$\langle\Phi(p), \Phi(p')\rangle = \delta(p - p')$$
(normalization to a δ-function); for the vectors Φ(ϕ), this normalization condition implies that
$$\langle\Phi(\phi), \Phi(\phi')\rangle = \langle\phi, \phi'\rangle.$$
Let us define an elementary space 𝔥 as a subspace of the space of square-integrable functions ϕ(x) taking values in the space $\mathbb{C}^r$. The elements of this space can be considered as test functions; generalized functions are then linear functionals on 𝔥. In what follows, we assume for definiteness that 𝔥 consists of smooth functions decreasing faster than any power, in other words, that 𝔥 is the Schwartz space $\mathcal{S}$.
We define the action of spatial and temporal translations on this space. The action of spatial translations on the test functions in the x-representation is a shift of the argument; in the momentum representation, this action is multiplication by the exponential factor $e^{ika}$. We can deduce the formula for the time shift from the requirement that time translations commute with spatial translations. In the momentum representation, the time translation $T_\tau$ is represented as multiplication by the exponential factor $e^{-iE(k)\tau}$, where E(k) is a Hermitian matrix of dimension r × r. We can diagonalize this matrix, in which case we have multiplication by scalar phase factors. This means that we can always restrict ourselves to the case of r = 1.
The elementary excitation of a translation-invariant stationary state ω can be defined as an isometric mapping σ of the elementary space 𝔥 to the set of excitations. This map should commute with spatial and temporal translations.
It is important to note that for the scalar case r = 1 this definition is equivalent to the definition above. Indeed, elementary excitation has been defined as a function Φ ( p ) which is an eigenvector for momentum and energy (27) and (28). There are normalization conditions (32) as well. It follows from these conditions that Φ ( ϕ ) is a mapping of the space of test functions to the excitation space. The fact that this mapping is an isometry follows from the normalization condition. Formulas (27) and (28) ensure that translations of the vector Φ ( p ) correspond to translations of the function ϕ ( p ) in both space and time. Thus, in the algebraic approach for r = 1 we can consider the space of test functions as elementary space and define σ ( ϕ ) as Φ ( ϕ ) .
In the geometric approach, we have to consider cones as sets of states. If we start with the theory in the algebraic approach, then the function ϕ from the elementary space is mapped to the state
$(\sigma(\phi))(A) = \langle \hat{A}\,\Phi(\phi), \Phi(\phi) \rangle.$
Here, σ is a quadratic (or rather, hermitian) mapping of the elementary space to the cone of states that commutes with all translations.
This remark suggests that in the geometric approach one should define the elementary excitation as a mapping of the elementary space to the cone of states that commutes with spatial and temporal translations.
If Φ ( ϕ ) in the algebraic approach belongs to the pre-Hilbert space H and θ is a cyclic vector in this space, then Φ ( ϕ ) is obtained by applying some element from the algebra A to the cyclic vector: Φ ( ϕ ) = B ( ϕ ) θ . Then, one can easily verify the formula
σ ( ϕ ) = L ( ϕ ) ω ,
where $L(\phi) = \tilde{B}(\phi)\,B(\phi)$. Recall that in the algebraic approach an element B ∈ A specifies two operators on the space of functionals: one multiplies the argument by B* from the left (it is denoted by B̃), while the other multiplies the argument by B from the right (it is denoted by the same letter B).
It is convenient to include the existence of the operator L ( ϕ ) satisfying the relation (34) in the definition of elementary excitation in the geometric approach.
We have already said that only translational invariance is important in the definition of scattering; however, if we are dealing with, say, a Lorentz-invariant theory, it is natural to assume that in the algebraic approach the vector θ is Lorentz-invariant. Then, the entire Poincaré group acts in the space H . The elementary space should carry a representation of the Poincaré group, and the mapping of the elementary space into H should agree with representations of this group. In local quantum field theory, Lorentz-invariant particles are defined as irreducible representations of the Poincaré group. This definition agrees with the above definition.
Let us now make the following observation. Consider a translation-invariant Hamiltonian in nonrelativistic quantum mechanics. In this case, the Hamiltonian is invariant under Galilean transformations, and the energy of an elementary excitation is provided by the usual formula $\epsilon(p) = p^2/2m + \mathrm{const}$.
If we take the operator a ^ * ( p ) and apply it to the translation-invariant Fock vacuum | 0 , we obtain an elementary excitation of the Fock vacuum:
$\Phi(p) = \hat{a}^*(p)\,|0\rangle.$
This is a particle; however, in addition to it, there are other particles that satisfy the same conditions. These are called bound states.
What is a bound state? The Hamiltonian acts on states with any number of particles and preserves this number. Take n particles and separate the motion of the center of inertia. The Hamiltonian in this space can have normalizable eigenstates. These are called bound states.
Equivalently, we can try to solve Equations (27) and (28) for p = 0. Then, the solution will contain a δ-function of the sum of the momenta:
$\int dp_1 \cdots dp_n\, \Psi(p_1, \ldots, p_n)\, \delta(p_1 + \cdots + p_n)\, \hat{a}^*(p_1) \cdots \hat{a}^*(p_n)\,|0\rangle.$
If the function Ψ(p₁, …, pₙ) is square-integrable, we obtain a bound state. It is easy to understand that the generalized function
$\Phi(p) = \int dp_1 \cdots dp_n\, \Psi(p_1, \ldots, p_n)\, \delta(p - p_1 - \cdots - p_n)\, \hat{a}^*(p_1) \cdots \hat{a}^*(p_n)\,|0\rangle$
can be regarded as an elementary excitation. From our perspective, such bound states (composite particles) are no worse than elementary excitations with n = 1 . The general theory, which we will present, provides a description of the scattering of composite particles. It can be proved that non-relativistic quantum mechanics has this interpretation in terms of particles in the sense of Section 7.2 (see for example [39]).
In the present section, we have defined the notion of a stable elementary excitation. Note, however, that particles can be unstable, and quasiparticles are almost always unstable. This means that our requirements are satisfied only approximately. The theory of the inclusive scattering matrix developed in the next lectures can be applied to unstable (quasi)particles if the lifetime of the colliding (quasi)particles is much greater than the collision time. Note that the conventional scattering matrix does not make sense for quasiparticles.

6.3. Asymptotic Behavior of Solutions of Linear Equations

Let us now consider solutions of the translation-invariant linear equation
$\frac{\partial f}{\partial t} = A f,$
discarding the nonlinear part in (24).
We assume that for fixed t the function f ( x , t ) is defined on R d and takes values in C r .
It follows from translation invariance that, after a Fourier transform with respect to x (i.e., in the momentum representation), the operator A acts as multiplication of f(p), considered as a column vector, by an r × r matrix A(p). If the operator A is local (represented as a polynomial in derivatives), then the matrix A(p) is a polynomial. While we do not assume locality, we suppose that A(p) is a smooth function of p; in this sense, we can say that A is quasi-local.
The solution to (35) in the momentum representation has the form $e^{tA(p)} f(p)$.
Let us assume that the matrix A(p) is diagonalizable and has purely imaginary eigenvalues. Then, the solution f ≡ 0 is stable. This means that, for an appropriate definition of the norm, the evolution operators $T_\tau$ are uniformly bounded: $\|T_\tau\| \le C$ for all τ. In particular, if f is small at some moment, it remains small as τ → ±∞.
Let us consider the solution to Equation (35) in the coordinate representation
$f(x,t) = \int dp\, e^{ipx}\, e^{tA(p)}\, f(p).$
Note that the same formula describes the behavior of a test function in the coordinate representation if we take E ( p ) = i A ( p ) .
We are interested in the behavior of the function (36) as t → ±∞. By diagonalizing the matrix A(p), we can reduce this problem to the case r = 1. Following the notation of the preceding section, we assume that A(p) = −iϵ(p) and analyze the behavior of the function
$(T_\tau f)(x) = f(x,\tau) = \int dp\, e^{ipx - i\tau\epsilon(p)}\, f(p).$
Note that for large |τ| the phase is large, and we can use the stationary phase method; the stationary points are determined by the condition
$\frac{x}{\tau} = \nabla\epsilon(p).$
Clearly, we must consider only the situation where p ∈ supp f. Recall that supp f, the support of the function f in momentum space, is defined as the closure of the set of points where f(p) ≠ 0. We assume that the set supp f is compact.
Now, we define the set U as a neighborhood of the set of points x satisfying the condition (38) with p ∈ supp f and τ = 1. Equation (38) has no solutions outside the set τU; thus, the function $(T_\tau f)(x)$ is very small for x ∉ τU.
We can say that τU is the essential support of the function $T_\tau f$ in the coordinate representation for large |τ|. It can be proved that for x ∉ τU we have
$|(T_\tau f)(x)| < C_n\,(1 + |x|^2 + \tau^2)^{-n}$
for any integer n. The proof can be based on a generalization of the Riemann–Lebesgue lemma. Recall that it follows from this lemma that the function $\int dk\, g(k)\, e^{ih(k)t}$, where g(k) is a smooth function with compact support and h(k) is a linear function, tends to zero faster than any power of t; to prove this statement, we can repeatedly integrate by parts. The statement can easily be generalized to the case when, instead of being linear, the function h(k) is smooth and has no stationary points: such a function can be made locally linear by means of a change of variables. By using a partition of unity (a representation of unity as a finite sum of functions $g_a(k)$ that vanish outside small sets $U_a$ covering the support of g(k)) and linearizing h(k) on the sets $U_a$, we obtain the generalization we need. The Riemann–Lebesgue lemma and its generalization can be proven in the multi-dimensional case as well, which allows us to verify the above estimate.
When we apply the stationary phase method to the calculation of the integral (37), we obtain a factor $(\det(\tau\,\mathrm{Hess}))^{-1/2} = \tau^{-d/2}(\det \mathrm{Hess})^{-1/2}$, where Hess denotes the matrix of second derivatives. This allows us to conjecture that
$|f(x,\tau)| \le C\,|\tau|^{-d/2}.$
One can prove this conjecture by imposing some conditions on ϵ ( p ) (see for example [40]).
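This asymptotic picture is easy to reproduce numerically. The following sketch (with the illustrative dispersion law ϵ(p) = p²/2 in d = 1; none of the numerical choices come from the text) propagates a wave packet according to (37) using the FFT; the peak of the packet follows the group velocity, so it stays inside the set τU, and the amplitude decays roughly as |τ|^(−1/2), in agreement with (40).

```python
import numpy as np

# Propagate a wave packet by (37) and watch its essential support and decay.
N, L = 4096, 800.0
dx = L / N
x = np.arange(N) * dx                      # coordinate grid [0, L)
p = 2 * np.pi * np.fft.fftfreq(N, d=dx)    # dual momentum grid

eps = 0.5 * p**2        # illustrative dispersion law eps(p) = p^2/2
p0, width = 1.0, 0.2
f_p = np.exp(-((p - p0) / width) ** 2)     # f(p), essentially supported near p0

for tau in (50.0, 100.0, 200.0):
    f_x = np.fft.ifft(f_p * np.exp(-1j * eps * tau))   # (T_tau f)(x)
    amp = np.abs(f_x)
    print(f"tau={tau:6.1f}  peak at x={x[np.argmax(amp)]:7.1f}  "
          f"tau*eps'(p0)={tau * p0:7.1f}  max|f|={amp.max():.4f}")
# The peak moves with the group velocity eps'(p0) = p0 (the set tau*U),
# and max|f| shrinks roughly like |tau|^{-1/2}, i.e., (40) with d = 1.
```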
The estimate (40) can be used to analyze the problem of the existence of a solution of the nonlinear Equation (24). To find a solution that behaves as $T_t g$ when t → −∞, we should find a fixed point of the nonlinear operator
$B(f) = T_t g + \int_{-\infty}^{t} T_{t-\tau}\,(Bf)(\tau)\,d\tau,$
where f is considered as a function of t taking values in an appropriate space of functions of d variables. We can use the contraction principle to prove the existence of the fixed point of B under certain conditions [35].
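To illustrate the contraction principle at work, here is a toy one-dimensional analogue of this fixed-point problem, f(t) = g(t) + ∫₀ᵗ K(t−s) f(s)² ds, solved by Picard iteration on a grid; the kernel, the quadratic nonlinearity, and the smallness of g are illustrative assumptions, not taken from the text. For small data the map is a contraction, and the iterates converge geometrically.

```python
import numpy as np

# Toy Volterra analogue of the Duhamel fixed-point problem:
#   f(t) = g(t) + int_0^t K(t - s) f(s)^2 ds
# (hypothetical kernel and nonlinearity, chosen only to illustrate
# the contraction principle).
T, n = 1.0, 1000
t = np.linspace(0.0, T, n)

g = 0.1 * np.exp(-t)              # small "free" solution, the analogue of T_t g
K = lambda u: np.exp(-u)          # decaying kernel, the analogue of T_{t-s}

f = g.copy()
for it in range(100):
    conv = np.array([np.trapz(K(s - t[: i + 1]) * f[: i + 1] ** 2, t[: i + 1])
                     for i, s in enumerate(t)])
    f_new = g + conv
    err = np.max(np.abs(f_new - f))
    f = f_new
    if err < 1e-12:
        break
print(f"fixed point reached after {it + 1} iterations, last update {err:.2e}")
```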

7. Lecture 7

7.1. Multi-Particle States

In Section 6.2, we defined the elementary space 𝔥 as the space of test functions ϕ a ( x ) in which spatial translations act by shifting the argument. Test functions take values in C r . For definiteness, we assume that the test functions belong to the space S of smooth fast-decreasing functions.
In the momentum representation, spatial translations act as multiplication by $e^{ika}$ and temporal translations as multiplication by $e^{-iE(k)\tau}$; this follows from the assumption that time shifts are unitary operators commuting with spatial translations. Here, E(k) denotes a Hermitian r × r matrix. By diagonalizing the matrix E(k), we can reduce the general case to the case r = 1.
Let us now recall the notions of excitation and elementary excitation.
An elementary excitation of a translation-invariant stationary state ω (a quasiparticle) is defined by a mapping σ from the elementary space 𝔥 into the set of excitations. This mapping must commute with both spatial and temporal translations.
In the algebraic approach, the set of excitations is the pre-Hilbert space H , which is obtained using the Gelfand–Naimark–Segal (GNS) construction applied to a stationary translation-invariant state ω . The state ω is represented by a cyclic vector denoted by θ .
The map σ transforms ϕ 𝔥 into a vector σ ( ϕ ) , denoted by Φ ( ϕ ) in Section 6.2. This is a mapping into the space H , which means that there exists an element B ( ϕ ) from the algebra A which transforms the cyclic vector θ into our vector
$\sigma(\phi) = B(\phi)\,\theta.$
We additionally assume that the mapping σ is isometric.
We want to emphasize here that while the operator B ( ϕ ) exists, it is not unique and must be chosen somehow. Here, we impose conditions on it, which will allow us to develop the scattering theory. In particular, we will require that it be linear in ϕ . As explained previously, each vector in the representation space of the algebra A corresponds to a state (that is, to a positive linear functional on A ). The state corresponding to σ ( ϕ ) can be represented by the formula
$(\sigma(\phi))(A) = \langle \hat{A}\,\sigma(\phi), \sigma(\phi) \rangle.$
If a vector σ ( ϕ ) is represented in the form (41), we have the following formula:
σ ( ϕ ) = L ( ϕ ) ω ,
where
L ( ϕ ) = B ˜ ( ϕ ) B ( ϕ ) .
Thus, in the algebraic approach, we have a mapping $\sigma : \mathfrak{h} \to C$ acting according to Formula (42).
In the geometric approach, one must forget about algebra, though there remains a cone of all states C . By definition, the mapping σ of an elementary space 𝔥 to the cone C defines an elementary excitation if it commutes with both spatial and temporal translations.
We postulate here that, as in the algebraic case, the state σ(ϕ) is obtained according to (42) by the action of some operator L(ϕ) on a translation-invariant stationary state ω. While the vector-valued mapping σ considered in the algebraic situation was linear, the mapping into the cone of states is not linear at all. Indeed, from Formula (43) it follows that in the algebraic approach L(ϕ) is a quadratic expression, more precisely a hermitian expression, because it is linear in one variable and anti-linear in the other (an expression f is called hermitian if it can be represented in the form f(x) = F(x, x̄), where F(x, y) is linear in the first argument and antilinear in the second). It is natural to require that L(ϕ) satisfy the same conditions in the geometric approach. In what follows, we will use the word “quadratic” instead of “hermitian”; however, it should be understood that the expression is not really quadratic.
If one prefers to work with linear mappings, this can be achieved using the following general algebraic construction. For each complex linear space E, one can construct, inside the tensor product E ⊗ Ē of this space with its complex conjugate, a cone C(E) defined as the minimal cone containing all elements of the form e ⊗ ē (the bar stands for complex conjugation). We call the cone C(𝔥) the elementary cone. It corresponds to the elementary space 𝔥, and σ can be viewed as a linear mapping σ : C(𝔥) → C of the elementary cone into the cone of states.
To simplify the notation, we consider the case in which the elementary space consists of scalar functions ( r = 1 ).
Suppose that the support supp(ϕ) of the function ϕ(p) in momentum space is compact. In this case, it is possible to find a bounded set $U_\phi$ for which all points of the form $\nabla\epsilon(p)$ (where p belongs to supp(ϕ)) are interior points; here, the function ϵ(p) is assumed to be smooth.
Then, for large | τ | and x τ U ϕ , the function
$(T_\tau \phi)(x) = \int dk\, e^{ikx - i\epsilon(k)\tau}\, \phi(k)$
obeys
$|(T_\tau \phi)(x)| < C_n\,(1 + |x|^2 + \tau^2)^{-n},$
for any integer n (see (39) in Section 6.3).
In other words, for large | τ | the function ( T τ ϕ ) ( x ) is small outside the set τ U ϕ , which we call the essential support of the function ( T τ ϕ ) ( x ) .
Let us now return to the general case where the elementary space 𝔥 consists of vector-valued functions. We say that the set τ U ϕ is an essential support of the function
$(T_\tau \phi)(x) = \int dk\, e^{ikx - iE(k)\tau}\, \phi(k)$
if
$\|(T_\tau \phi)(x)\| < C_n\,(1 + |x|^2 + \tau^2)^{-n}$
at large |τ| and x ∉ $\tau U_\phi$.
We say that the functions ϕ and ϕ′ do not overlap if the distance between the sets $U_\phi$ and $U_{\phi'}$ is positive; then, the corresponding essential supports do not overlap, and moreover, at large τ they are far from each other. We say that ϕ₁, …, ϕₙ is a non-overlapping family of functions if $\phi_i$ does not overlap with $\phi_j$ for i ≠ j. We will always assume that there are many non-overlapping families of functions; more precisely, linear combinations of non-overlapping families of functions should be dense in the space of families of functions we are interested in. When r = 1, this is satisfied, for example, when the function ϵ(p) is strictly convex.
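For a strictly convex dispersion law, non-overlapping families are easy to construct: packets with disjoint momentum supports have disjoint velocity sets. A small sketch (with the illustrative choice ϵ(p) = √(1 + p²) in one dimension; not taken from the text):

```python
import numpy as np

def velocity_range(p_lo, p_hi, deps=lambda p: p / np.sqrt(1 + p**2)):
    # U_phi: the set of group velocities eps'(p) for p in supp(phi).
    # For strictly convex eps, eps' is increasing, so the set is an interval.
    return deps(p_lo), deps(p_hi)

def non_overlapping(supports):
    # True if the velocity intervals are pairwise disjoint (positive distance).
    ranges = sorted(velocity_range(a, b) for a, b in supports)
    return all(hi < lo for (_, hi), (lo, _) in zip(ranges, ranges[1:]))

# Momentum supports of three wave packets (illustrative numbers):
supports = [(-2.0, -1.5), (0.1, 0.4), (1.0, 1.3)]
print(non_overlapping(supports))   # True: the sets U_phi are pairwise disjoint
```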
What should we call a two-particle state in the algebraic approach? We want to note here that, while we only needed spatial and temporal shifts when defining a one-particle space, now we need more than shifts. Before, we used the representation in (43) to describe a one-particle state with a wave function ϕ. When there are two particles, however, B must be applied twice: B(ϕ)B(ϕ′)θ. When ϕ and ϕ′ have supports that are far apart in coordinate space, one can at least say that this vector describes a state of two distant particles. We must require that B(ϕ) and B(ϕ′) either almost commute with each other, in which case the particles are bosons, or almost anticommute, in which case the particles are fermions. This definition is given in terms of states described by vectors, although it is possible to give a definition in terms of states described by positive functionals on the algebra A as well. For this purpose, we note that the state corresponding to the vector B(ϕ)B(ϕ′)θ can be written in the form L(ϕ)L(ϕ′)ω, where
$L(\phi) = \tilde{B}(\phi)\,B(\phi), \qquad L(\phi') = \tilde{B}(\phi')\,B(\phi').$
In all cases, L(ϕ) almost commutes with L(ϕ′).
In the geometric approach, a two-particle state is written as L(ϕ)L(ϕ′)ω, where L(ϕ) almost commutes with L(ϕ′).
Thus, the distinction between bosons and fermions is smoothed out in the geometric approach.
In the following we will talk about bosons frequently; however, the transition to fermions is trivial, as it is only necessary to replace commutators with anticommutators.

7.2. Scattering: In-States and Out-States

We would first like to quote a wonderful statement by Bertrand Russell:
“The axiomatic method has many advantages over honest work.”
Everything in this section will be very simple; unfortunately, this simplicity is achieved at the expense of working exclusively in the axiomatic approach. We want to remind readers that the axioms of local quantum field theory were first proposed by Wightman in the 1950s, and to date there is no known example of a non-trivial theory in our three-dimensional space for which all the Wightman axioms have been verified. A major step forward occurred when many such theories were constructed in the formalism of conformal field theory in one-dimensional space; in three-dimensional space, however, the Wightman axioms have been verified only within the framework of perturbation theory. In our approach, the situation is slightly better; nevertheless, the verification of the necessary axioms remains a substantial problem. This will be discussed in the next lecture. Fortunately, at least in perturbation theory everything is fine.
Let us consider the scattering of elementary excitations in algebraic and geometric approaches. In the algebraic approach, we assume that the mapping σ of the elementary space 𝔥 to the space H defining a particle or quasiparticle can be written in the form
σ ( f ) = B ( f ) θ .
In the geometric approach, we assume the existence of a mapping $L : \mathfrak{h} \to \mathrm{End}(L)$, where
$\sigma(f) = L(f)\,\omega.$
In both approaches, the mapping σ should commute with translations.
In both approaches, we define states describing the scattering process.
In the algebraic approach, we define the operator B ( f , τ ) by the formula
$B(f,\tau) = T_\tau\, B(T_{-\tau} f)\, T_{-\tau}.$
It must be remembered here that the time shift acts on an operator as conjugation with the operator $T_\tau$.
In the geometric approach, we define the operator L(f, τ) by a similar formula:
$L(f,\tau) = T_\tau\, L(T_{-\tau} f)\, T_{-\tau}.$
It is easy to verify that L ( f , τ ) ω does not depend on τ . To confirm this, note that
$L(f,\tau)\,\omega = T_\tau\, L(T_{-\tau} f)\,\omega = T_\tau\, \sigma(T_{-\tau} f) = \sigma(f).$
Here, we have used the invariance of ω with respect to time translations, Formula (45), and the fact that σ commutes with time translations.
Using exactly the same reasoning, we can show that B ( f , τ ) θ does not depend on τ .
We find that
L ˙ ( f , τ ) ω = 0 ,
B ˙ ( f , τ ) θ = 0 ,
where the dot at the top indicates the derivative with respect to τ . This will be our main tool.
In the case of many particles, by analogy with the definition of a single-particle state, we first apply B ( f i , τ ) many times with different f i to θ :
$\Psi(f_1, \ldots, f_n | \tau) = B(f_1, \tau) \cdots B(f_n, \tau)\,\theta,$
obtaining a multi-particle state. Then, we take the limit of the resulting expression as τ → −∞:
$\Psi(f_1, \ldots, f_n | -\infty) = \lim_{\tau \to -\infty} \Psi(f_1, \ldots, f_n | \tau).$
We call this limit, which lies in the Hilbert space H̄ (the completion of the space H), the in-state. Later on, we will explain its physical meaning.
In the geometric approach, instead of B ( f , τ ) we take L ( f , τ ) :
$\Lambda(f_1, \ldots, f_n | \tau) = L(f_1, \tau) \cdots L(f_n, \tau)\,\omega,$
$\Lambda(f_1, \ldots, f_n | -\infty) = \lim_{\tau \to -\infty} \Lambda(f_1, \ldots, f_n | \tau).$
We obtain an in-state lying in L. In the algebraic approach, this corresponds to the vector
$\Psi(f_1, \ldots, f_n | -\infty).$
Applying the operator $T_{\tau'}$ to L(f, τ) leads to a time shift in both arguments:
$T_{\tau'}(L(f,\tau)) = T_{\tau+\tau'}\, L(T_{-\tau} f)\, T_{-\tau-\tau'} = L(T_{\tau'} f, \tau + \tau').$
This is a purely formal calculation.
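Spelled out, using the definition of L(f, τ) and the group property of the translations $T_\tau$, the computation is a regrouping of operators:
$T_{\tau'}\, L(f, \tau)\, T_{-\tau'} = T_{\tau'} T_{\tau}\, L(T_{-\tau} f)\, T_{-\tau} T_{-\tau'} = T_{\tau+\tau'}\, L\bigl(T_{-(\tau+\tau')}(T_{\tau'} f)\bigr)\, T_{-(\tau+\tau')} = L(T_{\tau'} f, \tau + \tau'),$
where we used $T_{-\tau} f = T_{-(\tau+\tau')}(T_{\tau'} f)$.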
Formula (49) implies that
$T_{\tau'}\,\Lambda(f_1, \ldots, f_n | -\infty) = \Lambda(T_{\tau'} f_1, \ldots, T_{\tau'} f_n | -\infty).$
If the functions f₁, …, fₙ do not overlap, the essential supports of the functions $T_\tau f_i$ are far apart in the limit τ → −∞, and it follows from (50) that in this limit the evolution of the in-state Λ describes the process of scattering.
Usually, one considers the scattering of particles with definite momenta. It is inconvenient to work with definite momenta in our approach, however, as in such cases the wave functions are non-normalizable. We can consider the situation when the momentum lies in some narrow range, i.e., where the support of the wave function is a small piece of the momentum space. We can say that the state $T_\tau \Lambda(f_1, \ldots, f_n | -\infty)$ describes a collision of particles with wave functions f₁, …, fₙ if these functions do not overlap. In this case, we assume that the corresponding operators $L(f_i, \tau)$ almost commute as τ → −∞, i.e., their commutator vanishes in this limit:
$\lim_{\tau \to -\infty} \|[L(f_i, \tau), L(f_j, \tau)]\| = 0.$
Why do we assume this? When $f_i$ and $f_j$ do not overlap, in the limit τ → −∞ the essential supports of the functions $T_\tau f_i$ and $T_\tau f_j$ are far apart. In this case, from the point of view of physics, it is natural to think that the corresponding operators almost commute.
The condition (51) can be derived from the requirement that the commutator of two operators L corresponding to the functions $\phi_a(x)$ and $\psi_a(x)$ satisfies the inequality
$\|[L(\phi), L(\psi)]\| \le \int dx\, dx'\, D_{ab}(x - x')\, |\phi_a(x)| \cdot |\psi_b(x')|,$
where $D_{ab}(x)$ tends to zero faster than any power as x → ∞. Under these conditions, if the sets $U_{f_i}$ are pairwise disjoint, then the commutators in Formula (51) are close to zero. This means that we can permute the operators L in Formula (47) for the in-state. It follows that in-states are symmetric, i.e., they do not change when the arguments $f_i$ are rearranged.
Let us now prove that the limit in question exists. To do this, we additionally impose a smallness condition on the commutator $[\dot{L}(f_i, \tau), L(f_j, \tau)]$ as τ → −∞, where the functions $f_i, f_j$, i ≠ j, do not overlap. This is another axiom. More precisely, we impose the condition
$\|[\dot{L}(f_i, \tau), L(f_j, \tau)]\| \le c(\tau),$
where c(τ) is summable:
$\int |c(\tau)|\, d\tau < \infty.$
We can assume, for example, that $c(\tau) \sim 1/|\tau|^a$, where a > 1.
Next, we provide a very simple proof that the in-state does exist (i.e., the expression $\Lambda(\tau) = \Lambda(f_1, \ldots, f_n | \tau)$ has a limit as τ → −∞). We prove that $\dot\Lambda(\tau)$ is summable; hence, the expression
$\Lambda(\tau_2) - \Lambda(\tau_1) = \int_{\tau_1}^{\tau_2} \dot\Lambda(\tau)\, d\tau$
tends to zero as τ₁, τ₂ → −∞. If (54) tends to zero, then Λ(τ) has a limit; this follows from the completeness of the space in which this vector lies.
Next, we need to prove that $\dot\Lambda(\tau)$ is small. Recall that in the definition of Λ(τ) (Formula (47)) we repeatedly applied the operator L to the state ω. We differentiate this expression with respect to τ by applying the Leibniz rule, obtaining several summands, each of which contains the derivative of one of the factors L. In each summand we move $\dot{L}$ to the right, eventually moving it to the very last place; when it reaches the very end, we use the equality $\dot{L}\omega = 0$. As a result, $\dot\Lambda(\tau)$ is a summable function of τ, as the commutators $[\dot{L}(f_i, \tau), L(f_j, \tau)]$ are summable.
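Schematically (assuming, in addition, that the factors $L(f_j, \tau)$ are uniformly bounded in τ, which is not stated explicitly above), the estimate reads
$\dot\Lambda(\tau) = \sum_{i=1}^{n} L(f_1, \tau) \cdots \dot{L}(f_i, \tau) \cdots L(f_n, \tau)\,\omega,$
and moving each $\dot{L}(f_i, \tau)$ to the rightmost position generates commutators $[\dot{L}(f_i, \tau), L(f_j, \tau)]$ with j > i, while the boundary term vanishes due to $\dot{L}(f_i, \tau)\,\omega = 0$; hence $\|\dot\Lambda(\tau)\| \le \mathrm{const}\cdot c(\tau)$, which is summable by (53).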
Thus, we have derived the existence of limits from (53). This is very important, as it proves that we can consider the scattering of particles in our picture. While our requirements were very simple, our axioms are sufficient to prove the existence of a limit, which in turn proves that there is a notion of scattering.
The conditions we imposed on L in the case of the geometric approach are axioms; in the case of the algebraic approach, similar conditions can be obtained as a consequence of more physical requirements, for example, from the asymptotic commutativity of the algebra A . All of the above reasoning is valid in the algebraic approach as well. In the algebraic approach, we can impose conditions
$\|[\dot{B}(f_i, \tau), B(f_j, \tau)]\| \le c(\tau),$
where c(τ) is a summable function. The vector
$\Psi(\tau) = B(f_1, \tau) \cdots B(f_n, \tau)\,\theta$
will then have a limit in the Hilbert space H̄ as τ → −∞. We can prove this directly by the same method or derive it from the existence of the limit in (47). Here, we should work in the completion of the space H in order to apply the convergence condition.
We want to generalize the above statement a little bit. One can argue that the vector
$\Psi(f_1, \tau_1, \ldots, f_n, \tau_n) = B(f_1, \tau_1) \cdots B(f_n, \tau_n)\,\theta$
has a limit in H̄, denoted
$\Psi(f_1, \ldots, f_n | -\infty),$
as $\tau_j \to -\infty$. Previously, we proved this statement for the case in which all times $\tau_j$ are equal.
The proof can be based on the assumption
$\|[\dot{B}(\phi), B(\psi)]\| \le \int dx\, dx'\, D_{ab}(x - x')\, |\phi_a(x)| \cdot |\psi_b(x')|,$
where $D_{ab}(x)$ tends to zero faster than any power as x → ∞ (this condition is similar to condition (51)).
Analogous statements can be proved if commutators are replaced by anticommutators in this assumption or in (55).
Now, we will define the notion of an asymptotic bosonic Fock space $H_{as}$, assuming that the operators B commute at large distances. We define the asymptotic bosonic Fock space $H_{as}$ as a Fock representation of the canonical commutation relations
$[b(\rho), b(\rho')] = [b^+(\rho), b^+(\rho')] = 0, \qquad [b(\rho), b^+(\rho')] = \langle \rho, \rho' \rangle,$
where ρ, ρ′ ∈ 𝔥.
In the case in which we consider anticommutators instead of commutators, the bosonic Fock space must be replaced by a fermionic Fock space via Fock representation of canonical anticommutation relations.
The action of spatial and temporal translations on the elementary space 𝔥 can be extended to the Fock space. This is clear because the n-particle part of the asymptotic space is the n-th symmetric or antisymmetric power of 𝔥 . The corresponding infinitesimal automorphisms (the asymptotic Hamiltonian and asymptotic momentum operator) are quadratic with respect to creation and annihilation operators, and coincide with the Hamiltonian and momentum operator on 𝔥, considered as the one-particle subspace of Fock space. The joint spectrum of the asymptotic Hamiltonian and the momentum operator coincides with the spectrum of non-interacting bosons or fermions.
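As a minimal finite-dimensional sketch of this structure (a single mode, i.e., ρ = ρ′ a fixed unit vector; the occupation-number cutoff is an artifact of the sketch, not of the theory), one can represent b and b⁺ by matrices and check the canonical commutation relation directly:

```python
import numpy as np

def annihilator(nmax):
    # Matrix of b in the occupation basis |0>, ..., |nmax>: b|n> = sqrt(n)|n-1>
    return np.diag(np.sqrt(np.arange(1, nmax + 1)), k=1)

nmax = 6
b = annihilator(nmax)
bp = b.conj().T                     # creation operator b^+
comm = b @ bp - bp @ b              # should be the identity for one mode
# The truncation spoils the relation only in the top corner:
print(np.allclose(comm[:-1, :-1], np.eye(nmax)))   # True
print(comm[-1, -1])                                # -nmax, a truncation artifact
```

Several orthonormal modes are handled by tensor products, reproducing $[b(\rho), b^+(\rho')] = \langle \rho, \rho' \rangle$ on the span of the modes.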
Now, we will define the Møller matrices (each is half of the scattering matrix). The Møller matrix $S_\pm$ transforms a vector $b^+(f_1) \cdots b^+(f_n)|0\rangle$ of the bosonic or fermionic Fock space into the state $\Psi(f_1, \ldots, f_n | \pm\infty)$. It is important here that the in-state is symmetric or antisymmetric: the fact that one can rearrange the $f_i$ is essential, as otherwise this definition would make no sense, because the n-particle subspace of Fock space is a symmetric or antisymmetric tensor power of the space 𝔥. The Møller matrices are defined on a dense subset of Fock space. In the next lecture, we will prove that it follows from the cluster property that $S_-$ and $S_+$ are isometric embeddings of $H_{as}$ into H̄; hence, they can be extended to the Fock space considered as a Hilbert space. If both Møller matrices are not only isometric but also unitary, that is, if they are surjective mappings of the Fock space onto the entire H̄, then we say that the theory has an interpretation in terms of particles. This means that almost every state is an in-state (i.e., linear combinations of in-states are dense everywhere); in other words, almost every state in the limit τ → −∞ evolves into a collection of distant particles, and a similar statement holds for τ → +∞.
In this way, we obtain a picture similar to the so-called soliton resolution conjecture in classical theory (see Section 6.1).
Møller matrices commute with translations. This follows from Formula (50), which implies that a time shift of the in-state corresponds to a time shift of the arguments. The time shift of the arguments corresponds to a time shift in Fock space; hence, Formula (50) states that a time shift in Fock space corresponds to a time shift in the Hilbert space H̄. The fact that Møller matrices commute with spatial translations is even easier to prove.
If a theory has a particle interpretation, then the Møller matrix $S_-$ specifies a unitary equivalence between the Hamiltonian and momentum operators in H̄ and the corresponding operators in the asymptotic Fock space. The same is true for the Møller matrix $S_+$.
The scattering matrix (S-matrix) can be defined by the formula
$S = S_+^{-1}\, S_-.$
We provided a similar formula in the soliton picture (Section 6.1). The scattering matrix is the main object in quantum field theory.
Now, we will define the in-operators $a_{in}^+$ using the limit of the operators B(f, τ) as τ → −∞:
$a_{in}^+(f) = \lim_{\tau \to -\infty} B(f, \tau).$
Why is this a legitimate definition? Let us turn to Formula (56), where the operators $B(f_i, \tau_i)$ are taken at different times; this means that for one of the arguments we can pass to the limit earlier than for the others. It should be emphasized that the in-operator is not always defined; however, if in Formula (57) all the functions f₁, …, fₙ do not overlap, the in-operator $a_{in}^+(f_1)$ is defined on the vector $\Psi(f_2, \ldots, f_n | -\infty)$ and maps it onto the vector $\Psi(f_1, \ldots, f_n | -\infty)$.
In our definition, i n -operators depend linearly on the functions f. These operators can be regarded as generalized functions. We introduce the following notation:
$a_{in}^+(f) = \int dp\, f^k(p)\, a_{in,k}^+(p),$
where a i n , k + ( p ) is a generalized function and the index k specifies the particle type.
We define out-operators in the same way, except that τ must tend to plus infinity:
$a_{out}^+(f) = \lim_{\tau \to +\infty} B(f, \tau).$
Notice that in-operators are related to operators in the asymptotic space by the formulas
$a_{in}^+(\rho)\, S_- = S_-\, b^+(\rho), \qquad S_-\,|0\rangle = \theta.$
These formulas can be considered as an alternative definition of in-operators. In the same way, one can define the operators $a_{in}$ and $a_{out}$ associated with the annihilation operators in Fock space:
$a_{in}(\rho)\, S_- = S_-\, b(\rho), \qquad a_{out}(\rho)\, S_+ = S_+\, b(\rho).$
There is an obvious connection between the definitions in the geometric and algebraic approaches. If the geometric approach is considered within the algebraic approach, then the operator L ( f , τ ) in the space of states corresponds to the operator B ( f , τ ) in H ¯ according to the formula L ( f , τ ) = B ˜ ( f , τ ) B ( f , τ ) .
The state $\Lambda(f_1, \ldots, f_n | \tau)$ corresponds to the vector $\Psi(f_1, \ldots, f_n | \tau)$, while the state $\Lambda(f_1, \ldots, f_n | -\infty)$ (the in-state) corresponds to $\Psi(f_1, \ldots, f_n | -\infty)$.
The analog of the Møller matrix in the geometric approach is denoted by S̃. While the Møller matrix S is a linear operator, S̃ is a nonlinear operator. For theories that can be formulated algebraically, S maps a symmetric power of 𝔥, treated as a subspace of Fock space, into H̄. Taking the composition of this mapping with the natural mapping of H̄ to the cone of states C, we obtain S̃.
When L is quadratic or hermitian, this induces a multilinear mapping of the symmetric power of the cone C ( 𝔥 ) corresponding to 𝔥 into the cone C .
The scattering matrix describes a collision of particles. The connection of the scattering cross-section with the scattering matrix is explained in general courses in quantum mechanics and quantum field theory. Here, we explain the connection with the inclusive cross-section, which is simpler.
First, we must introduce the notion of the inclusive cross-section. The scattering cross-section is related to the probability of a transition of, say, a pair of particles into n particles, $(M, N) \to (Q_1, \ldots, Q_n)$. We can also consider a process in which at the end we obtain the particles $(Q_1, \ldots, Q_m)$ plus something else:
$(M, N) \to (Q_1, \ldots, Q_m, R_1, \ldots, R_n).$
The inclusive cross-section is defined as the probability of such a process. One can obtain it from the usual cross-section by summing (more precisely, integrating) over $R_1, \ldots, R_n$. This makes sense in a theory having an interpretation in terms of particles (which means that everything decays into particles); however, we can define the inclusive cross-section even if there is no such interpretation, as we can consider a process $(M, N) \to (Q_1, \ldots, Q_n) + \mathrm{something}$ even if we do not know what the something is.
In the geometric approach, only the inclusive cross-section makes sense. In the algebraic approach, we can work with the usual cross-section, though it is possible (and sometimes easier) to work with the inclusive cross-section.
Considering an arbitrary state ν , we can write the following formula for the probability density:
$\nu\bigl(a_{out,k_1}^+(p_1)\, a_{out,k_1}(p_1) \cdots a_{out,k_m}^+(p_m)\, a_{out,k_m}(p_m)\bigr).$
The expressions $a_{out,k_i}^+(p_i)\, a_{out,k_i}(p_i)$ are in fact the numbers of particles with momentum $p_i$; thus, Formula (58) represents the probability density in momentum space of finding m outgoing particles of types $k_1, \ldots, k_m$ with momenta $p_1, \ldots, p_m$. We do not look at the other particles.
Thus far, ν has been an arbitrary state; we now take an in-state as ν:
$\nu = \Lambda(g_1, \ldots, g_n | -\infty) = \lim_{\tau \to -\infty} L(g_1, \tau) \cdots L(g_n, \tau)\,\omega.$
The in-state is determined by the incoming particles. When we defined the inclusive cross-section, we collided two particles; it is possible to collide several. If we measure the numbers of outgoing particles in an in-state, as in Formula (58), by definition we obtain the inclusive cross-section. Thus, if we calculate the expression (58) when ν is an in-state, we obtain an inclusive cross-section.
We now represent the answer in a different form. Consider the following expression:
$\langle 1 |\, L(g_1', \tau') \cdots L(g_n', \tau')\, L(g_1, \tau) \cdots L(g_n, \tau)\, | \omega \rangle,$
where we assume that $g_i'$ and $g_j'$ do not overlap for i ≠ j and that the times tend to infinity: τ′ → +∞, τ → −∞. Acting by the operators L on the state ω, we obtain a linear functional on the algebra, and we can calculate its value on the unit element of the algebra. Note that here we have used what are called bra-ket notations, where on the left and on the right we have elements of dual spaces. Consider the expression (58) and denote its limit by Q as τ′ → +∞, τ → −∞. Taking the limit τ → −∞ first, we obtain
$Q = \lim_{\tau' \to +\infty} \langle 1 |\, L(g_1', \tau') \cdots L(g_n', \tau')\, \nu \rangle,$
where $\nu = \Lambda(g_1, \ldots, g_n | -\infty)$. As we have assumed that the functions do not overlap, all commutators tend to zero, and Q does not change when rearranging $g_1', \ldots, g_n'$.
Now, let us look at the same formulas in the algebraic approach. We have $L(g, \tau) = \tilde{B}(g, \tau) B(g, \tau)$ and the formula $(\tilde{M} N \nu)(X) = \nu(M^* X N)$: the operator M̃ multiplies the argument by M* from the left, while the operator N multiplies the argument by N from the right. In Formula (59), these operators are applied to ν, and they simply change the argument of ν. In addition, $\langle 1 | \sigma \rangle = \sigma(1)$. The result is the expression
$Q = \lim_{\tau \to +\infty} \nu\bigl(B^*(g_n', \tau) \cdots B^*(g_1', \tau)\, B(g_1', \tau) \cdots B(g_n', \tau)\bigr).$
In the limit τ → +∞, the operators B and B* tend to out-operators:
$\lim_{\tau \to +\infty} B(g, \tau) = a_{out}^+(g), \qquad \lim_{\tau \to +\infty} B^*(g, \tau) = a_{out}(g).$
Using this fact and the fact that in this limit all of the operators commute, we obtain the following expression for Q:
$Q = \nu\bigl(a_{out}(g_n') \cdots a_{out}(g_1')\, a_{out}^+(g_1') \cdots a_{out}^+(g_n')\bigr).$
We will call this expression
$Q = Q(g_1', \ldots, g_n', g_1, \ldots, g_n)$
the inclusive scattering matrix. This expression is quadratic with respect to its arguments. We can switch from quadratic expressions to bilinear expressions, in which case the number of arguments doubles; the resulting expression will also be called the inclusive scattering matrix. An inclusive cross-section can be obtained from it, though this is not quite a trivial process. The problem is that, in our definition, the inclusive scattering matrix was considered as a functional on non-overlapping families of functions. This functional is linear or antilinear in each argument, and as such can be regarded as a generalized function; however, the arguments of the generalized function (the momenta) must be different. In the expression for the inclusive cross-section, the momenta can coincide; hence, we should take certain limits in the matrix elements of the inclusive scattering matrix to obtain the inclusive cross-section.
In the geometric approach, we can define an inclusive scattering matrix by taking, along with ω ∈ L, some translation-invariant state α ∈ L*:
$\lim_{\tau' \to +\infty,\ \tau \to -\infty} \langle \alpha |\, L(g_1, \tau') \cdots L(g_m, \tau')\, L(f_1, \tau) \cdots L(f_n, \tau)\, | \omega \rangle.$
Such a formula can be applied in the algebraic situation as well, as there the states α and ω enter symmetrically. It can be formulated in the following way: Formula (60) provides the scalar product of the out-state in L* and the in-state in L; in other words, the same formula provides both the inclusive scattering matrix of elementary excitations of the state ω and the inclusive scattering matrix of elementary excitations of the state α. This is a kind of duality which is, in our opinion, absolutely mysterious. In the algebraic approach, one can also consider this duality.
It is important to note that we can hope to find an interpretation in terms of particles only for elementary excitations of the ground state, as the conventional scattering matrix only makes sense in this case. However, the inclusive scattering matrix and inclusive cross-section make sense for almost-stable quasiparticles, and indeed almost-stable excitations of any translation-invariant stationary state.

8. Lecture 8

8.1. Link to Local Quantum Field Theory: The Cluster Property

What we have discussed thus far is very different from what is usually presented in textbooks on relativistic quantum field theory, which consider local theories. The main idea of the subsequent presentation is to emphasize that locality is not essential in most cases, and that the fields themselves are irrelevant. We do not know what should be called “fields” in the approach we are discussing here, even though all of quantum field theory is present.
We want to begin by establishing a connection between what we are currently discussing and what is commonly referred to as local relativistic quantum field theory.
In the axiomatic approach to local theory, there are different systems of axioms, starting with Wightman’s axioms, where local fields which are generalized operator functions represent the main objects. This is not very convenient, as while these fields are local, they are generalized functions; if we integrate them, we obtain ordinary operators. Then they are no longer local, and in a sense are quasilocal, i.e., concentrated in certain domains. We do not discuss Wightman’s axioms further.
We now discuss the system of axioms belonging to Araki, Haag, and Kastler. This system considers fields concentrated in some open subset of Minkowski space. It is assumed that such fields form an algebra of operators acting in a Hilbert space; the algebra should be closed with respect to weak convergence, though this is not essential. These operators should act in the representation space E of a unitary representation of the Poincaré group P .
It is assumed that for each bounded domain (bounded open subset) O of Minkowski space we have an algebra of operators A ( O ) acting in Hilbert space E such that:
  • When the domain becomes larger ( O 1 O 2 ), the algebra becomes larger ( A ( O 1 ) A ( O 2 ) );
  • The action of the Poincaré group P on algebras A ( O ) agrees with the action on domains A ( g O ) = g A ( O ) g 1 if g P ;
  • If the space–time interval between the points of domains O 1 and O 2 is space-like, then the operators belonging to the algebra A ( O 1 ) commute with operators belonging to the algebra A ( O 2 ) (roughly speaking, this means that we cannot have a causal relation between observables separated by a space-like interval);
  • The ground state θ of the energy operator is invariant with respect to the Poincaré group (with the Poincaré group representation, we can consider the energy operator (Hamiltonian) and momentum operators as infinitesimal generators of temporal and spatial translations, respectively);
  • The vector corresponding to the ground state is cyclic with respect to the union A of all algebras A ( O ) .
This is the axiomatic relativistic local quantum field theory.
In the above, a particle is defined as an irreducible subrepresentation of the Poincaré group as represented in the space E .
Let us return now to the definition of scattering in the algebraic approach. The consideration in Lecture 7 was based on axioms which are not easy to check. Here, we will impose requirements that are much easier to check. In particular, these requirements are fulfilled in relativistic local theory.
The starting point, as before, is an associative algebra A with involution (a *-algebra). Space–time translations are automorphisms of this algebra.
Recall that non-normalized states correspond to positive linear functionals on the algebra A and form a cone C. We will work here with non-normalized states. A translation-invariant stationary state will always be denoted by ω ∈ C. Excitations of the state ω are elements of the pre-Hilbert space H, which is constructed from ω using the GNS construction.
In the algebra A with which we started there is no norm; however, because it is represented in the pre-Hilbert space H and its completion (the Hilbert space H ¯ ), we can consider a normed algebra A ( ω ) consisting of operators A ^ . Moreover, we can work with the completion of the algebra A ( ω ) with respect to this norm, though this is not necessary.
Now let us take an element A of the algebra A which is represented by a bounded operator A ^ in Hilbert space H ¯ . We can consider both temporal and spatial translations of this operator. The result will be denoted by A ^ ( x , τ ) . Moreover, we can average such an operator with a smooth and fast decreasing function α ( x , τ ) :
$B = \int d\tau\, dx\, \alpha(x, \tau)\, \hat{A}(x, \tau).$
It is possible to shift the operator B in time and space:
$B(x, \tau) = \int d\tau'\, dx'\, \alpha(x - x', \tau - \tau')\, \hat{A}(x', \tau').$
One can differentiate under the sign of the integral. As the function α ( x , τ ) is assumed to be smooth, one can differentiate as many times as desired. Here, we always work with operators of the form (60), and we call them smooth.
We now consider asymptotically commutative algebras; in other words, we require that the commutator of a shifted operator with another operator become small at large spatial shifts. This can be formalized in different ways. We do so here in such a way that it is instantly clear that our condition is satisfied in the axiomatics of Araki, Haag, and Kastler. Specifically, we require that the norm $\|[B_1(x, \tau), B_2]\|$ of the commutator of a shifted operator with another operator corresponding to an element of the algebra A decrease faster than any power of ‖x‖ as x → ∞. We impose the same condition on $\|[\dot{B}_1(x, \tau), B_2]\|$, where the dot denotes the time derivative. All operators are assumed to be smooth.
In the axiomatics of Araki, Haag, and Kastler, this is always fulfilled, as after a large spatial shift the space-time interval between the corresponding domains becomes space-like; therefore, in this case we can say that, starting from some point, the commutator under consideration is equal to zero (and as such decreases faster than any power).
Another definition of asymptotic commutativity is the condition
$\|[B_1(x, \tau), B_2]\| \le \frac{C_n(\tau)}{1 + \|x\|^n},$
where C n ( τ ) is a polynomial and n is arbitrary (strong asymptotic commutativity). This condition is satisfied in the Araki, Haag, and Kastler axiomatics if the mass spectrum is bounded from below by a positive number.
In addition to asymptotic commutativity, we want to impose the cluster property on the state ω . In its simplest form, this means that
ω ( A ( x , t ) B ) = ω ( A ) ω ( B ) + ρ ( x , t ) ,
where ρ ( x , t ) is small for large x .
To formulate the cluster property in a more general form we require the notion of a correlation function, which is a generalization of the Wightman function from relativistic quantum field theory.
For this, we take some elements $A_1, \ldots, A_n \in A$ and shift them in both space and time. By multiplying them, we obtain an element of the algebra, after which we apply ω or, equivalently, take the average (the expectation value) of this product in the state ω. The result is
$w_n(x_1, t_1, \ldots, x_n, t_n) = \omega(A_1(x_1, t_1) \cdots A_n(x_n, t_n)) = \langle A_1(x_1, t_1) \cdots A_n(x_n, t_n) \rangle,$
which is a correlation function.
It is useful to define the notion of a truncated correlation function
$w_n^T(x_1, t_1, \ldots, x_n, t_n) = \langle A_1(x_1, t_1) \cdots A_n(x_n, t_n) \rangle^T.$
This is done somewhat formally using an inductive formula linking truncated correlation functions to regular correlation functions:
$w_n(x_1, \tau_1, k_1, \ldots, x_n, \tau_n, k_n) = \sum_{s=1}^{n} \sum_{\pi \in R_s} w_{\alpha_1}^T(\pi_1) \cdots w_{\alpha_s}^T(\pi_s).$
Here, $R_s$ denotes the set of all partitions π of the set {1, …, n} into s subsets (denoted by π₁, …, π_s), the number of elements in the subset $\pi_i$ is denoted by $\alpha_i$, and $w_{\alpha_i}^T(\pi_i)$ denotes the truncated correlation function with arguments $x_a, \tau_a, k_a$, where $a \in \pi_i$. This formula expresses correlation functions in terms of truncated functions by summing over all possible partitions of the set of indices.
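The inductive formula is straightforward to implement. The sketch below recovers the truncated functions from the ordinary ones by subtracting, for every partition into more than one block, the product of lower truncated functions; the correlation functions are supplied as a dictionary keyed by index tuples (an illustrative interface and illustrative numbers, not taken from the text).

```python
from itertools import combinations

def partitions(items):
    # Generate all partitions of a list of indices into disjoint blocks.
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k in range(len(rest) + 1):
        for subset in combinations(rest, k):
            block = (first,) + subset
            remaining = [i for i in rest if i not in subset]
            for tail in partitions(remaining):
                yield [block] + tail

def truncated(w, indices, _cache=None):
    # w: dict mapping an index tuple to the correlation function value;
    # returns the truncated function w^T via the inductive formula.
    _cache = {} if _cache is None else _cache
    key = tuple(indices)
    if key not in _cache:
        total = 0.0
        for part in partitions(list(indices)):
            if len(part) == 1:
                continue   # the one-block partition contributes w^T itself
            prod = 1.0
            for block in part:
                prod *= truncated(w, block, _cache)
            total += prod
        _cache[key] = w[key] - total
    return _cache[key]

# Example with n = 3 (all lower correlation functions must be supplied):
w = {(1,): 1.0, (2,): 2.0, (3,): 0.5,
     (1, 2): 2.5, (1, 3): 0.5, (2, 3): 1.0,
     (1, 2, 3): 2.0}
print(truncated(w, (1, 2)))      # 2.5 - 1.0*2.0 = 0.5
print(truncated(w, (1, 2, 3)))   # the fully "connected" part
```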
When there are only two operators, the truncated correlation function has the form
$w_2^T(x_1, \tau_1, k_1, x_2, \tau_2, k_2) = \omega(A_1(x_1, t_1) A_2(x_2, t_2)) - \omega(A_1(x_1, t_1))\,\omega(A_2(x_2, t_2)).$
As ω is translation-invariant and stationary, both the usual and the truncated correlation functions depend only on the differences $x_i - x_j$, $t_i - t_j$. We can say that the cluster property is satisfied if the truncated correlation functions become small as $x_i - x_j \to \infty$. Smallness can be understood in different ways; here, we mean the strongest condition, namely, that at fixed $t_i$ the functions tend to zero faster than any power of the distance $d = \min |x_i - x_j|$. More precisely, we assume that
$|w_n^T(x_1, t_1, \ldots, x_n, t_n)| \le C_s(t)\, d^{-s},$
where s is any natural number and C s ( t ) is a polynomial function of times t i .
We can proceed to the momentum representation by applying the Fourier transform with respect to spatial variables. The invariance with respect to spatial translations leads to the appearance of the δ -function of the sum of momenta p i . It follows from the cluster property that the truncated correlation function in the momentum representation is a smooth function of the momenta multiplied by the δ -function:
$\nu_n(p_2, \ldots, p_n, t_1, \ldots, t_n)\, \delta(p_1 + \cdots + p_n).$
Note that the Fourier transform of a fast-decreasing function is smooth.
In relativistic quantum theory, the cluster property is satisfied if the particle masses are bounded from below by a positive number (the mass gap).

8.2. Green’s Functions: Connection to the Scattering Matrix

A correlation function is defined as ω(M), where $M = A_1(x_1, t_1) \cdots A_r(x_r, t_r)$. In the definition of the Green's function, we replace M with the chronological product, in which the same factors are ordered by time in descending order. (The chronological product is not defined when some of the times coincide, though this is irrelevant for our considerations.) We can say that the Green's function is the average (expectation value) of the chronological product with respect to ω. Equivalently, we can say that we are taking the expectation value of this product with respect to the vector θ corresponding to ω in the GNS construction.
We obtain the function
$G_n = \omega(T(A_1(x_1, t_1) \cdots A_r(x_r, t_r))) = \langle \theta |\, T(\hat{A}_1(x_1, t_1) \cdots \hat{A}_r(x_r, t_r))\, | \theta \rangle,$
which is called the Green’s function in the ( x , t ) -representation (i.e., in the coordinate representation).
As always, we can proceed to the momentum representation by taking the Fourier transform over x . This is what is called the ( p , t ) -representation (momentum and time). In addition, we can take the (inverse) Fourier transform with respect to the time variable, in which case the Green’s functions will be in the ( p , ϵ ) -representation, where the main variables are momenta and energies. We will need all of these representations.
Due to translational invariance, the Green's function in the (x, t)-representation depends only on the differences $x_i - x_j$, $t_i - t_j$; therefore, we have the factor $\delta(p_1 + \cdots + p_r)$ in the (p, t)-representation, which corresponds to the momentum conservation law. In the (p, ϵ)-representation, we additionally have the factor $\delta(\epsilon_1 + \cdots + \epsilon_r)$, which corresponds to the energy conservation law.
Let us consider the poles of Green’s function in the ( p , ϵ ) -representation. It should be noted that we always ignore δ -functions when talking about the poles. In particular, when Green’s function includes only two operators, in the ( p , ϵ ) -representation we have two momenta, two energies, and δ -functions depending on the momenta and energies:
$G(p_1, \epsilon_1 | A, A')\, \delta(p_1 + p_2)\, \delta(\epsilon_1 + \epsilon_2).$
The function $G(p_1, \epsilon_1 | A, A')$ depends on the variables $p_1$ and $\epsilon_1$. It is important to note that the poles of such two-point Green's functions with respect to the energy at fixed momentum correspond to particles. These poles depend on the momentum, and the corresponding function ε(p) provides the dispersion law for particles (the dependence of the energy on the momentum). These well-known facts can be easily deduced from the reasoning that we use below.
We will prove here that, in order to find the scattering amplitudes, one should consider the asymptotic behavior of the Green's function in the (p, t)-representation as t → ±∞. This is the first and most basic observation. The other observation is that this asymptotic behavior in the (p, t)-representation is governed by the poles in the (p, ϵ)-representation; more precisely, the residues at these poles describe the asymptotics. This is called “the on-shell value of the Green's function”.
There is a well-known mathematical fact: if the asymptotic behavior of a function ρ(t) as t → ±∞ has the form $e^{itE_\pm} A_\pm$, or, put another way, if there is a limit $\lim_{t \to \pm\infty} e^{-itE_\pm} \rho(t) = A_\pm$, then the (inverse) Fourier transform ρ(ϵ) has poles at the points $E_\pm \pm i0$ with residues $2\pi i A_\pm$; in other words, the limits correspond to the residues at the poles, and the exponents correspond to the positions of the poles. The poles are slightly shifted from the real axis into the complex plane, either up or down. This is an extremely important observation.
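A model computation shows where the shifted poles come from: if ρ(t) = $e^{itE} A$ for t > 0, then, inserting a regularizing factor $e^{-\delta t}$ and letting δ → +0,
$\int_0^{\infty} dt\, e^{-i\epsilon t}\, e^{itE} A\, e^{-\delta t} = \frac{-iA}{\epsilon - E - i\delta} \longrightarrow \frac{-iA}{\epsilon - E - i0},$
a pole just above the real axis with residue proportional to A; the t → −∞ asymptotics contributes a pole on the other side of the axis. (The overall factor in the residue, such as 2π, depends on the chosen normalization of the Fourier transform.)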
One can either look at the poles in the (p, ϵ)-representation or at the asymptotics in the (p, t)-representation. We show that the calculation of the scattering amplitudes reduces to finding the asymptotic behavior of the Green's functions in the (p, t)-representation. Turning to the (p, ϵ)-representation, we can say that the scattering amplitudes are expressed in terms of the on-shell values of the Green's functions. This is the Lehmann–Symanzik–Zimmermann (LSZ) formula.
Below, we will prove the LSZ formula under certain conditions. First, we assume that the theory has an interpretation in terms of particles. This means that the Møller matrices S ± are unitary. Both S and S + provide unitary equivalence between the free Hamiltonian in the asymptotic space H a s and the Hamiltonian in the space H obtained with the GNS procedure. Second, we assume that the conservation laws for energy and momentum guarantee the stability of particles. The second condition will be relaxed in the next lecture.
As we want to simplify the notation, we discuss the case in which there is only one type of particle. Recall that we previously considered a generalized function Φ ( p ) corresponding to the state of a particle with a given momentum p and that this state is an eigenvector for both momentum and energy operators. The Hamiltonian acts on Φ ( p ) as multiplication by the function ε ( p ) (dispersion law):
$\hat{H}\,\Phi(p) = \varepsilon(p)\,\Phi(p), \qquad \hat{P}\,\Phi(p) = p\,\Phi(p).$
We must remember here that Φ ( p ) is a generalized function, and does not really exist. In order for all of this to make exact mathematical sense, we should integrate it with some test function ϕ ( p ) to obtain a vector Φ ( ϕ ) = d p ϕ ( p ) Φ ( p ) .
Now, we want to make the assumption that the one-particle spectrum does not overlap with the multi-particle spectrum.
Let us formulate this assumption more precisely. We denote by $H_0$ the one-dimensional subspace containing the vector θ, by $H_1$ the smallest closed subspace of H containing all vectors Φ(ϕ) (the one-particle space), and by $H_M$ the orthogonal complement of the direct sum $H_0 \oplus H_1$ (the multiparticle space). A corresponding decomposition exists in the asymptotic space. We assume that the joint spectra of the Hamiltonian and the momentum operator in these three spaces do not overlap.
The asymptotic Hamiltonian is free; it (and hence, H ^ ) has a spectrum completely determined by the function ε ( p ) . The energies of multiparticle excitations are simply the sums ε ( p 1 ) + + ε ( p n ) , while the corresponding momenta are p 1 + + p n . If we want to say that the one-particle spectrum does not overlap with the multi-particle spectrum, we must require the inequality
$\varepsilon(p_1 + \cdots + p_n) \ne \varepsilon(p_1) + \cdots + \varepsilon(p_n).$
This means that a particle with momentum $p_1 + \cdots + p_n$ cannot decay into particles with momenta $p_1, \ldots, p_n$; the conservation laws forbid such a decay.
Now, we will formulate the LSZ formula. To do this, we fix some elements A i A of the algebra A . Recall that we are working with smooth elements, though this is not especially important here. In addition, it is required that we obtain a non-zero vector by applying the operator A ^ i to the vector θ (which in relativistic quantum theory is interpreted as the physical vacuum) and projecting to one-particle space. More precisely, we require the projection of the vector A ^ i θ to be a one-particle state of the form
$\Phi(\phi_i) = \int \phi_i(p)\, \Phi(p)\, dp,$
where $\phi_i(p)$ is a function which does not vanish anywhere. The projection of this vector onto the vector θ must vanish.
Let us consider Green’s functions containing both the elements A i and their adjoint elements A i * . We take the Green’s function in the ( p , t ) -representation:
$G_{mn} = \omega\bigl(T(A_1^*(x_1, t_1) \cdots A_m^*(x_m, t_m)\, A_{m+1}(x_{m+1}, t_{m+1}) \cdots A_{m+n}(x_{m+n}, t_{m+n}))\bigr).$
Then, we proceed to the (p, ϵ)-representation. It is convenient to change the sign of the variables $p_i$ and $\epsilon_i$ for 1 ≤ i ≤ m. We multiply the Green's function in the (p, ϵ)-representation by the expression
$\prod_{1 \le i \le m} \overline{\Lambda_i(p_i)}\,(\epsilon_i + \varepsilon(p_i)) \prod_{m < j \le m+n} \Lambda_j(p_j)\,(\epsilon_j - \varepsilon(p_j)),$
where we have introduced the notation $\Lambda_i(p) = \phi_i(p)^{-1}$.
Then, we take the limit $\epsilon_i \to -\varepsilon(p_i)$ for 1 ≤ i ≤ m and $\epsilon_j \to \varepsilon(p_j)$ for m < j ≤ m + n.
Only the poles contribute to the limit; in other words, the calculation boils down to taking the residues of the poles.
We can carry out this procedure in two steps. First, we multiply the Green’s function by
$\prod_{1 \le i \le m} (\epsilon_i + \varepsilon(p_i)) \prod_{m < j \le m+n} (\epsilon_j - \varepsilon(p_j))$
and take the limit $\epsilon_i \to -\varepsilon(p_i)$ for 1 ≤ i ≤ m and $\epsilon_j \to \varepsilon(p_j)$ for m < j ≤ m + n.
At the end, we multiply by Λ_j(p_j) for j > m and by Λ̄_i(p_i) for i ≤ m. In physics, this is called the renormalization of the wave function. If these factors are not included, we can talk about on-shell Green's functions; if they are included, we say that we are considering normalized on-shell Green's functions.
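To see what "only the poles contribute" means in the simplest possible setting, here is a toy computation (ours, not part of the LSZ argument): a model Green's function with a simple pole on the mass shell plus an arbitrary regular part; multiplying by the vanishing factor and taking the on-shell limit extracts precisely the residue.

```python
# Toy illustration of the pole-extraction step (our sketch; the model
# Green's function Z/(eps - eps_p) + regular part is an assumption).
import sympy as sp

eps_var, eps_p, Z = sp.symbols('epsilon varepsilon Z')
G = Z / (eps_var - eps_p) + sp.sin(eps_var)   # pole + arbitrary regular part
on_shell = sp.limit((eps_var - eps_p) * G, eps_var, eps_p)
print(on_shell)   # -> Z : only the pole contributes to the limit
```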
The basic statement in the approach of Lehmann, Symanzik, and Zimmermann is that the normalized on-shell Green's function provides the scattering amplitude. To prove this, we will first consider the case where the operators Â_i simply provide one-particle states Â_i θ = Φ(ϕ_i), i.e., there is no need to project. We call these good operators. At the end of the lecture, we will explain that the general case can be reduced to this particular case.
Thus far, we have considered the case in which there is only one type of particle. Let us now consider the case in which there are many types of particles; in other words, there are many generalized functions Φ_k(p) which are eigenvectors for both momentum and energy:
P̂ Φ_k(p) = p Φ_k(p),  Ĥ Φ_k(p) = ε_k(p) Φ_k(p),
which have different dispersion laws ε_k(p) given by smooth functions. As always, the Φ_k(p) are generalized functions, i.e., we should integrate them with test functions to obtain vectors from H. We consider test functions from the space S of smooth fast-decreasing functions. To guarantee that time shifts are well-defined in the space S, we should assume that the functions ε_k(p) grow at most polynomially.
As already mentioned, we will work with good operators B_k ∈ A (operators which are smooth and transform the vector θ into one-particle states B̂_k θ = Φ_k(ϕ_k)). Now, we define the operator B̂_k(f, t) depending on the function f = f(p) as follows:
B̂_k(f, t) = ∫ f̃(x, t) B̂_k(x, t) dx,
where the function f̃(x, t) is obtained as the Fourier transform of the function f(p) e^{iε_k(p)t} with respect to the momentum variable.
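The behavior of such wave packets can be made concrete numerically. The following sketch (ours; the Gaussian f and the dispersion law are illustrative assumptions, and the sign of the velocity depends on the Fourier conventions) computes f̃(x, t) directly from the Fourier integral and shows its peak moving linearly with the group velocity.

```python
# Our illustration: f_tilde(x, t) as the Fourier transform over p of
# f(p) e^{i eps(p) t}; the packet's peak travels at the group velocity.
import numpy as np

m, p0 = 1.0, 2.0
p = np.linspace(-20, 20, 2001)
dp = p[1] - p[0]
f = np.exp(-(p - p0)**2)                 # a Gaussian test function f(p)
eps = np.sqrt(p**2 + m**2)               # illustrative dispersion law
x = np.linspace(-40, 40, 801)

def packet(t):
    """Direct quadrature of the Fourier integral defining f_tilde(x, t)."""
    phase = np.exp(1j * (np.outer(x, p) + eps * t))
    return (phase * f).sum(axis=1) * dp

for t in (0.0, 10.0, 20.0):
    peak = x[np.argmax(np.abs(packet(t)))]
    print(f"t = {t:4.1f}:  |f_tilde| peaks near x = {peak:6.2f}")
# the peak drifts linearly with speed |eps'(p0)| = p0/eps(p0) ~ 0.89
```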
Similar operators were considered in Section 7.2. They have the property that, when applying them to θ, we obtain a t-independent one-particle state
B̂_k(f, t) θ = Φ_k(f ϕ_k);
hence,
Ḃ_k(f, t) θ = 0,
where the dot stands for the time derivative (in Section 7.2, the function ϕ_k was equal to 1). In general, what was said in Section 7.2 can be repeated here as well. The fact that the resulting state is independent of time is the result of a formal calculation. The calculations become quite simple if we introduce the operators
B̂_k(p, t) = ∫ dx e^{−ipx} B̂_k(x, t).
If this operator is applied to θ , we obtain a one-particle state that does not depend on t.
Now, we repeat the considerations of Section 7.2, though the notation has changed because we do not want to work with elementary spaces; thus, we write the indices explicitly.
We introduce a vector
Ψ(k_1, f_1, …, k_n, f_n | t_1, …, t_n) = B̂_{k_1}(f_1, t_1) ⋯ B̂_{k_n}(f_n, t_n) θ,
where it is assumed that the functions f 1 , , f n have compact supports.
Now, as in Section 7.2, we consider the vectors v_i(p) = ∇ε_{k_i}(p), which can be interpreted as velocities. We denote by U_i an open set containing all possible velocities v_i(p), where p belongs to the support of the function f_i. We require that all these sets be pairwise disjoint, and we call such functions f_1, …, f_n non-overlapping. This means that all classical velocities are different; therefore, the wave packets are moving in different directions. Then, as explained in Section 6.3, the corresponding wave functions almost do not overlap in the coordinate representation (i.e., the essential supports do not overlap).
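The following small sketch (ours; the dispersion law is again the illustrative relativistic one) makes the non-overlapping condition concrete: it computes the interval of group velocities over each support and checks pairwise disjointness.

```python
# Our sketch of the non-overlapping condition: the velocity sets U_i
# attached to the momentum supports must be pairwise disjoint.
import numpy as np

def velocity_range(p_support, m=1.0):
    """Interval of group velocities eps'(p) = p/sqrt(p^2+m^2) on a support."""
    p = np.linspace(*p_support, 201)
    v = p / np.sqrt(p**2 + m**2)
    return v.min(), v.max()

supports = [(-3.0, -2.0), (0.5, 1.0), (2.0, 4.0)]   # supports of f_1, f_2, f_3
ranges = sorted(velocity_range(s) for s in supports)
disjoint = all(a_hi < b_lo for (_, a_hi), (b_lo, _) in zip(ranges, ranges[1:]))
print(ranges, "pairwise disjoint:", disjoint)
```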
Now, taking the limit t_i → ±∞, we will prove that the vector Ψ(k_1, f_1, …, k_n, f_n | t_1, …, t_n) has a limit, which we denote by
Ψ(k_1, f_1, …, k_n, f_n | ±∞).
The proof uses the same reasoning as in Section 7.2. Again, we assume that t_i = t (i.e., all times coincide). In order to prove that there is a limit, we should prove that the derivative with respect to t is a summable function, and this condition is satisfied. By definition, the vector Ψ is the result of repeatedly applying the operators B̂_{k_i}(f_i, t) to θ. When we differentiate this expression with respect to t, a dot (denoting the time derivative) appears over one of the operators B̂. We can move the operator with the dot to the rightmost place using asymptotic commutativity and (39) (this works only because we deal with non-overlapping functions); we obtain additional summands that are summable functions of t. The operator with a dot applied to θ gives zero due to (61); hence, the limit exists.
Because the limit exists, we can define Møller matrices. To do this, we introduce the asymptotic space H_as as a Fock representation of the operators a_k^+(f), a_k(f) and define the Møller matrices S_− and S_+ as operators defined on a subset of H_as and taking values in H̄, following the formula
Ψ(k_1, f_1, …, k_n, f_n | ±∞) = S_± (a_{k_1}^+(f_1 ϕ_{k_1}) ⋯ a_{k_n}^+(f_n ϕ_{k_n}) |0⟩),
where |0⟩ is the Fock vacuum. This is the same formula as in the last lecture, except that we now have the factors ϕ_{k_i} (recall that a good operator B̂_{k_i} acting on θ provides Φ_{k_i}(ϕ_{k_i})). The Møller matrices are defined on a dense subspace of the asymptotic Hilbert space H_as.
It can be proved, as we now do, that the Møller matrices provide isometric embeddings of the asymptotic space H_as into the space H̄. The physical meaning of the Møller matrices can be understood from the following formula (which was written in the last lecture in different notation):
e^{−iĤt} Ψ(k_1, f_1, …, k_n, f_n | ±∞) = S_± (a_{k_1}^+(f_1 ϕ_{k_1} e^{−iε_{k_1}t}) ⋯ a_{k_n}^+(f_n ϕ_{k_n} e^{−iε_{k_n}t}) |0⟩).
This formula means that when we consider evolution in the space H̄, the action of the evolution operator on the vector Ψ in the limit t → ±∞ corresponds in the asymptotic space to the evolution governed by a free Hamiltonian. In other words, the evolution of the vector Ψ for large |t| corresponds to the evolution of a system of n distant particles with non-overlapping wave functions f_1 ϕ_{k_1} e^{−iε_{k_1}t}, …, f_n ϕ_{k_n} e^{−iε_{k_n}t}.
Our definition of S_± could be ambiguous. For example, we can use different good operators, and it is not a priori clear whether we obtain the same answer. However, we can prove that the answer does not depend on our choice. We can derive from the cluster property that S_± are isometric operators, that is, they preserve the norm and the scalar product. Such operators cannot be multivalued: if two vectors coincide, the distance between them is 0; hence, two coinciding vectors must go to coinciding vectors. At the same time, we can see that the vector Ψ(k_1, f_1, …, k_n, f_n | ±∞) does not change when the arguments (k_i, f_i) and (k_j, f_j) are permuted.
The main line of proof is as follows.
To define the Møller matrices, we used the vectors Ψ(k_1, f_1, …, k_n, f_n | t_1, …, t_n) specified by Formula (62). Note that according to (62) such a vector is obtained by repeatedly applying the operators B to θ. It is easy to see that the scalar product of two such vectors can be expressed in terms of correlation functions defined as the expectation values of products of the operators B and B^*. The correlation functions are expressed in terms of truncated correlation functions, and among the truncated correlation functions only the two-point ones survive in the limit t → ±∞ if we require the cluster property. This remark relates the scalar product of two vectors of the form Ψ(k_1, f_1, …, k_n, f_n | ±∞) to the scalar product in the asymptotic space.
This allows us to say that a Møller matrix is an isometric mapping.
Having introduced the notion of the Møller matrix, we now introduce the notions of the out-operators and in-operators:
a_in(f) S_− = S_− a(f),  a_in^+(f) S_− = S_− a^+(f),
a_out(f) S_+ = S_+ a(f),  a_out^+(f) S_+ = S_+ a^+(f).
Again, we do not write an index here describing the types of particles.
It is easy to check that
a_in^+(fϕ) = lim_{t→−∞} B̂(f, t),  a_out^+(fϕ) = lim_{t→+∞} B̂(f, t).
The limit in (63) exists on the set of all vectors of the form Ψ(k_1, f_1, …, k_n, f_n | ±∞) provided that f, f_1, …, f_n is a non-overlapping family of functions. Interestingly, when the dimension of space d ≥ 3, under certain conditions this limit exists without the non-overlapping condition. This is insignificant for our purposes, because the non-overlapping condition provides a limit on a dense subset, which is sufficient.
On the basis of what we have said above, we can explicitly write out how the operators we have defined act on the in-states:
a_in^+(fϕ) Ψ(f_1, …, f_n | −∞) = Ψ(f, f_1, …, f_n | −∞),
a_out^+(fϕ) Ψ(f_1, …, f_n | +∞) = Ψ(f, f_1, …, f_n | +∞),
a_in(f) Ψ(ϕ^{−1}f, f_1, …, f_n | −∞) = Ψ(f_1, …, f_n | −∞),
a_out(f) Ψ(ϕ^{−1}f, f_1, …, f_n | +∞) = Ψ(f_1, …, f_n | +∞).
These formulas can be seen as definitions of the operators a_in and a_out. Roughly speaking, the operators a_{in/out}^+ add one function to Ψ(f_1, …, f_n | ∓∞) and their adjoint operators destroy one of these functions.
If the operators S_+ and S_− are unitary, we can say that the theory has an interpretation in terms of particles. In this case, as well as in the more general case when the image of S_− coincides with the image of S_+, we can define the scattering matrix (S-matrix)
S = S_+^* S_−
as a unitary operator in the asymptotic space H_as. The asymptotic space is a Fock space. It has a generalized basis
|p_1, …, p_n⟩ = (1/n!) a^+(p_1) ⋯ a^+(p_n) |0⟩.
In this basis, the matrix elements of the unitary operator S (the scattering amplitudes) can be expressed in terms of in-operators and out-operators. These are the matrix elements whose squared moduli provide the effective scattering cross-sections. We obtain the following formula:
S_{mn}(p_1, …, p_m | q_1, …, q_n) = ⟨a_in^+(q_1) ⋯ a_in^+(q_n) θ, a_out^+(p_1) ⋯ a_out^+(p_m) θ⟩.
This follows directly from the definition of in-operators and out-operators. In Formula (63) and in the following, we omit the numerical coefficients (m!)^{−1}(n!)^{−1}.
The above formula is proved only for the case when all the momentum values p_i, q_j are different. More precisely, we must assume that all vectors v(p_i) = ∇ε(p_i), v(q_j) = ∇ε(q_j) are different. When the function ε(p) is strictly convex, it is sufficient to assume that all p_i, q_j are different. While this is not an essential constraint, it is present. Formula (63) should be understood in the sense of generalized functions. This means that the set of functions f_i(p_i), g_j(q_j) with non-overlapping closures of the sets U(f_i), U(g_j) should be taken as test functions.
We can now rewrite Formula (63) in different ways:
S_{mn}(f_1, …, f_m | g_1, …, g_n) =
= ∫ d^m p d^n q ∏_i f_i(p_i) ∏_j g_j(q_j) S_{mn}(p_1, …, p_m | q_1, …, q_n) =
= ⟨a_in^+(g_1) ⋯ a_in^+(g_n) θ, a_out^+(f̄_1) ⋯ a_out^+(f̄_m) θ⟩.
Recalling that we defined the S-matrix (scattering matrix) by taking limits t → ±∞ and using Formula (63), we arrive at the following representation:
S_{mn}(f_1, …, f_m | g_1, …, g_n) =
= lim_{t→+∞, τ→−∞} ⟨θ| B̂(f̄_m ϕ^{−1}, t)^* ⋯ B̂(f̄_1 ϕ^{−1}, t)^* B̂(g_1 ϕ^{−1}, τ) ⋯ B̂(g_n ϕ^{−1}, τ) |θ⟩ =
= lim_{t→+∞, τ→−∞} ω(B(f̄_m ϕ̄^{−1}, t)^* ⋯ B(f̄_1 ϕ̄^{−1}, t)^* B(g_1 ϕ^{−1}, τ) ⋯ B(g_n ϕ^{−1}, τ)),
where B(f, t)^* = ∫ dx B^*(x, t) \overline{f̃(x, t)}.
We can write a more general formula
S_{mn}(f_1, …, f_m | g_1, …, g_n) =
= lim_{t_i→+∞, τ_j→−∞} ω(B_m(f̄_m ϕ_m^{−1}, t_m)^* ⋯ B_1(f̄_1 ϕ_1^{−1}, t_1)^* B_{m+1}(g_1 ϕ_{m+1}^{−1}, τ_1) ⋯ B_{m+n}(g_n ϕ_{m+n}^{−1}, τ_n)),
where all B_i are different good operators and B̂_i θ = Φ(ϕ_i).
A very important observation follows from the non-overlapping condition: in the limit t_i → +∞, τ_j → −∞, the order of factors is irrelevant both in the group with times tending to plus infinity and in the group with times tending to minus infinity. This means that for large times we can rearrange these operators. In particular, we can consider them to be ordered by time. This means that we can regard the expression under the sign of the limit as a Green's function. This ends the proof of the statement that the matrix element of the scattering matrix can be expressed in terms of the asymptotic behavior of a Green's function.
We can express the operators B̂_k(f, t) in terms of B̂_k(p, t) and obtain the following result:
S_{mn}(f_1, …, f_m | g_1, …, g_n) =
= ∫ d^{m+n}p lim_{t_i→+∞, τ_j→−∞} ⟨θ| f_m(p_m) ϕ̄_m^{−1}(p_m) e^{−iε_m(p_m)t_m} B̂_m(p_m, t_m)^* ⋯ f_1(p_1) ϕ̄_1^{−1}(p_1) e^{−iε_1(p_1)t_1} B̂_1(p_1, t_1)^* ×
g_1(p_{m+1}) ϕ_{m+1}^{−1}(p_{m+1}) e^{iε_{m+1}(p_{m+1})τ_1} B̂_{m+1}(p_{m+1}, τ_1) ⋯ g_n(p_{m+n}) ϕ_{m+n}^{−1}(p_{m+n}) e^{iε_{m+n}(p_{m+n})τ_n} B̂_{m+n}(p_{m+n}, τ_n) |θ⟩.
We obtain the following formula for the matrix elements of the scattering matrix:
S_{mn}(p_1, …, p_m | p_{m+1}, …, p_{m+n}) =
= lim ⟨θ| ϕ̄_m^{−1}(p_m) e^{−iε_m(p_m)t_m} B̂_m(p_m, t_m)^* ⋯ ϕ̄_1^{−1}(p_1) e^{−iε_1(p_1)t_1} B̂_1(p_1, t_1)^* ×
ϕ_{m+1}^{−1}(p_{m+1}) e^{iε_{m+1}(p_{m+1})t_{m+1}} B̂_{m+1}(p_{m+1}, t_{m+1}) × ⋯ ×
ϕ_{m+n}^{−1}(p_{m+n}) e^{iε_{m+n}(p_{m+n})t_{m+n}} B̂_{m+n}(p_{m+n}, t_{m+n}) |θ⟩,
where the limit is taken as t_1, …, t_m → +∞ and t_{m+1}, …, t_{m+n} → −∞.
This formula tells us that, starting with good operators, we can express the scattering matrix in terms of Green's functions in the (p, t)-representation, or more precisely, in terms of their asymptotics at t_i → +∞ for i ≤ m and t_j → −∞ for j > m. We have the factors ϕ_i^{−1}, which are exactly the factors Λ that were introduced in order to obtain normalized Green's functions. The fact that we are considering asymptotics means that we are now taking on-shell Green's functions in the energy representation. The fact that we have the factors ϕ_i^{−1} means that we have obtained normalized Green's functions, which for our purposes is the end of the story.
We have provided a proof of the LSZ formula for good operators. From this, as we have said, one can draw the conclusion that the same is true for a much broader class of operators. We will explain this in a situation where there is only one type of particle.
In the approach of Lehmann, Symanzik, and Zimmermann, the operators A_i are almost arbitrary. It is only necessary that the projection of the vector Â_i θ onto the one-particle states is nonzero and that the projection of this vector onto the vector θ vanishes.
What is important to us is that in the definition of the on-shell Green's function these operators can be replaced by smooth operators A_i' = ∫ α(x, t) A_i(x, t) dx dt. It is easy to confirm that this does not change the normalized on-shell Green's functions. The proof is based on the remark that A_i'(x, t) can be obtained as a convolution of α(x, t) with A_i(x, t). This is the first observation; the second observation is that for an appropriate choice of α one can regard the operators A_i' as good operators. Specifically, we can take α(x, t) in such a way that the support of its Fourier transform α̂(p, ω) does not intersect the multiparticle spectrum and does not contain zero (we assume that the one-particle spectrum does not intersect the multiparticle spectrum). In this case, we automatically obtain a good operator.
Let us sketch the proof of this fact. We have already said that the operator A_i'(x, t) is obtained from A_i(x, t) by convolution with α(x, t). In the (p, ϵ)-representation, the convolution turns into multiplication by the Fourier transform α̂(p, ω) of α(x, t). If we consider the spectrum of the energy and momentum operators, multiplication by the function α̂(p, ω) in the (p, ϵ)-representation kills all points of the spectrum where this function is equal to zero. The function we consider here kills the multi-particle spectrum, and we obtain a good operator.
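The mechanism is the elementary fact that convolution becomes multiplication after a Fourier transform. The following one-dimensional sketch (ours; a sharp frequency cutoff plays the role of α̂) shows a "spectral component" being killed by a convolution.

```python
# Our illustration: smoothing is a convolution, so in Fourier space it
# multiplies by alpha_hat; where alpha_hat vanishes, the spectrum dies.
import numpy as np

n = 1024
t = np.linspace(0, 2 * np.pi, n, endpoint=False)
signal = np.cos(3 * t) + np.cos(40 * t)        # "one-particle" + "multi-particle"

spectrum = np.fft.fft(signal)
freqs = np.fft.fftfreq(n, d=t[1] - t[0]) * 2 * np.pi
alpha_hat = (np.abs(freqs) < 10).astype(float) # vanishes for |omega| >= 10
filtered = np.fft.ifft(alpha_hat * spectrum).real

assert np.allclose(filtered, np.cos(3 * t), atol=1e-10)
print("the component at omega = 40 has been killed by the convolution")
```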

9. Lecture 9

9.1. Introduction and Reminder

In this lecture, as in Lecture 8, we will consider scattering theory in the algebraic approach while assuming asymptotic commutativity and the cluster property. However, instead of the conjecture that the one-particle spectrum does not overlap with the multiparticle spectrum, we will make a weaker assumption. In addition to the standard LSZ formula, we will prove its analog for the inclusive scattering matrix.
The reasoning is much the same as in the previous lectures, though we will try to make this lecture independent of the previous two.
We will use the notion of generalized Green’s functions (Section 5.2). These functions appear naturally in the Keldysh formalism and in the formalism of L-functionals. We want to show how the inclusive scattering matrix is expressed in terms of generalized Green’s functions. For this purpose, we will re-prove the LSZ formula. We first show how an ordinary scattering matrix is expressed in terms of ordinary Green’s functions, though the proofs are constructed in such a way that it is clear that they can be repeated for the inclusive scattering matrix and generalized Green’s functions.
We will change the notation somewhat here. We denote the time variable by the letter τ and write B(τ, x) instead of B(x, τ). The state ω, as before, is assumed to be translation-invariant and stationary, while the corresponding vector in the pre-Hilbert space H is denoted by θ.
The ordinary Green's function in the state ω is the expectation value ω(N) of the chronological product of the operators B_i ∈ A:
N = T(B_1(τ_1, x_1) ⋯ B_n(τ_n, x_n)),
where the times are decreasing (chronological order). In the generalized Green's function
G_{nn} = ω(M N),
we have both a chronological product N in which times are decreasing and an anti-chronological product
M = T^{opp}(B_1^*(τ_1', x_1') ⋯ B_n^*(τ_n', x_n'))
in which the times are increasing. We will prove that the inclusive scattering matrix is expressed in terms of the asymptotic behavior of the generalized Green’s functions in the representation where the arguments are momenta and time. Following the general properties of the Fourier transform, this means that the inclusive scattering matrix is expressed in terms of the poles of the generalized Green’s functions, where the arguments are energies and momenta. More precisely, it coincides with the normalized generalized on-shell Green’s function.

9.2. Møller Matrix

The starting point of the theory is a *-algebra A on which time translations T_τ and spatial translations T_a act as automorphisms, together with a translation-invariant stationary state ω. Applying the Gelfand–Naimark–Segal (GNS) construction to this state, we obtain a pre-Hilbert space H in which the one-particle physical states ∫ dp f(p) Φ(p) lie.
We restrict ourselves to the case when there is only one type of particle. In other words, we consider the generalized vector function Φ ( p ) , which is an eigenvector for both the energy operator and the momentum operator:
Ĥ Φ(p) = ϵ(p) Φ(p),  P̂ Φ(p) = p Φ(p).
Here, we could add an index corresponding to the type of particle in order to treat the general case.
Another definition we will use is that of a smooth operator. We say that the operator B̂ provided by the formula
B̂ = ∫ dτ dx α(τ, x) Â(τ, x),  α ∈ S,  A ∈ A,
is a smooth operator; here α belongs to the space S of fast-decreasing smooth functions. Applying a shift in space and time, we obtain the operator
B̂(τ, x) = ∫ dτ' dx' α(τ' − τ, x' − x) Â(τ', x'),
which will be a smooth function of τ and x.
For the operator B̂(τ, x), we take the Fourier transform with respect to the spatial variables:
B̂(τ, x) = ∫ dp e^{ipx} B̂(τ, p).
Here, we consider the Fourier transform in the sense of generalized functions; more precisely, the operator B ^ ( τ , p ) should be considered as a regular function of τ and a generalized function of p .
It is convenient to introduce the following notations:
B(p, τ) = e^{iϵ(p)τ} B̂(τ, p),
where ϵ ( p ) is the dispersion law for the particle in question, and
B(f, τ) = T_τ B(T_{−τ}f) T_{−τ} = ∫ dp f(p) B(p, τ) = ∫ dp dx e^{iϵ(p)τ − ipx} f(p) B̂(τ, x),
where f ( p ) is a smooth function having compact support.
We assume that the function Ḃ(f, τ)θ is summable:
∫ dτ ‖Ḃ(f, τ)θ‖ < ∞,
and that B(f, τ)θ tends to a one-particle state as τ → ±∞:
lim_{τ→±∞} B(f, τ)θ = Φ(g),
where g = fϕ. We say that an element B ∈ A obeying (64) and (65) is admissible. We will verify that almost all elements are admissible in a theory having a particle interpretation.
The following expression in which B i are admissible elements can be considered as a vector representing an n-particle state:
Ψ(τ) = Ψ(f_1, …, f_n | τ) = B_1(f_1, τ) ⋯ B_n(f_n, τ) θ.
We define the notion of the in-state by taking the limit τ → −∞ in this expression, and the notion of the out-state by taking the limit τ → +∞:
Ψ(f_1, …, f_n | ±∞) = lim_{τ→±∞} Ψ(f_1, …, f_n | τ),
where the limit is taken in the Hilbert space H̄. These limits describe the scattering process. They exist under certain conditions. The simplest requires that the commutators of the operators B_k(f_k, τ) and Ḃ_l(f_l, τ) (where the dot denotes differentiation with respect to τ) tend to zero fast enough at τ → ±∞:
‖[Ḃ_k(f_k, τ), B_l(f_l, τ)]‖ ≤ C/(1 + |τ|^a),
where a > 1 is a fixed number (it is sufficient here to require that the left-hand side be a summable function).
If this condition is satisfied, then the limit (67) exists. The proof is similar to the proofs presented in Section 7.2 and Section 8.2. Let us differentiate Ψ(τ), as represented by Formula (66), with respect to time. If the derivative is a summable function of τ, then the limit exists. The Leibniz rule produces n summands, each of which contains a derivative Ḃ_i. Everything is wonderful when this derivative is in the last place, because we assumed that the function Ḃ(f, τ)θ is summable. If the derivative is not in the last place, then due to the conditions on the commutators (68) we can move it to the last place and obtain a summable function; we have to pay with commutators, which are themselves summable functions. In this case, the difference Ψ(τ_1) − Ψ(τ_2), represented as an integral of Ψ̇(τ), becomes small as τ_1, τ_2 → ±∞; hence, due to the completeness of the Hilbert space, we obtain the existence of the limit (67).
Our main tool here has been the smallness of the commutator. The statement (68) that the operators entering the expression (66) almost commute at large τ can be derived from asymptotic commutativity together with the requirement that the functions f_k(p) appearing in Formula (68) do not overlap. Recall that when we considered the behavior of the wave function f_k as a function of τ, we saw that the set of possible velocities plays an important role in the analysis of evolution in x-space (see Section 6.3); we define the set U_f as an open set containing all vectors of the form v = ∇ϵ(p), where p belongs to the support of the function f(p). We require that the sets U_{f_k} do not overlap. Roughly speaking, this means that the particles move in different directions and the essential supports of the wave functions in coordinate space are far apart. We will impose this condition all the time. It is not always satisfied; however, we require it to be satisfied for families of functions f_1, …, f_n belonging to a dense subset of the space S^n. If the asymptotic commutativity condition is imposed, it follows that the commutators we need are small.
The asymptotic commutativity condition, first of all, means that the commutator of the operators B_k taken at the same time but at distant spatial points will be small. The condition of smallness can be varied; however, we need this commutator to at least be small in the following sense:
‖[B̂_k(τ, x), B̂_l(τ, x')]‖ < C/(1 + |x − x'|^a),
where a > 1 is a fixed number. This estimate must be satisfied for the operators themselves as well as for their time and space derivatives.
Recall now that all operators B̂_k are smooth, that is, they can be obtained by smoothing some operators: B̂_k = ∫ g_k(t, x) Â_k(t, x) dx dt, where g_k ∈ S. The strong asymptotic commutativity condition can be imposed on the operators Â_k:
‖[Â_k(t, x), Â_l(t', x')]‖ < C_a(t − t')/(1 + |x − x'|^a)
for any a. This means that we require the commutators to decrease faster than any power when the spatial distance tends to infinity. The numerator contains a polynomial function C_a(t). It is not necessary to impose conditions on the derivatives, as they follow from (70).
In order to derive the condition (69) from strong asymptotic commutativity, it is sufficient to note that the left-hand side reduces to an integral of an expression including the product of functions f_k with different indices, whose essential supports are distant from each other, and to use (39) (see Section 6.3). These integrals will be small, as for distant operators the commutators tend to zero faster than any power. It is very easy to make an estimate in this case.
One can expect that strong asymptotic commutativity takes place when there is a gap in the spectrum of the Hamiltonian, i.e., the spectrum belongs to a ray (ϵ, +∞) starting at some positive ϵ. In the case of relativistic theory, this corresponds to the case when the particle masses are bounded from below by a positive number. When the mass is zero, there will be no strong asymptotic commutativity, though weaker conditions (a > 1) can be fulfilled in conformal theories where all anomalous dimensions are > 1/2.
We define the notion of the Møller matrix as follows. We start with the notion of asymptotic space. The asymptotic space H_as is defined as the representation space of the Fock representation of the canonical commutation relations:
[a(p), a^+(p')] = δ(p − p'),  [a(p), a(p')] = [a^+(p), a^+(p')] = 0.
Here, we are working with momentum variables.
Instead of the generalized operator functions a(p), a^+(p), we can consider the operators a(f) = ∫ dp f(p) a(p), a^+(f) = ∫ dp f(p) a^+(p), where f ∈ 𝔥 = S. Space and time translations act in the Fock space; this is obvious if we remember that this space can be represented as a completion of the direct sum of symmetric powers of the elementary space 𝔥 = S. Let us define the Møller matrices as mappings S_± : H_as → H̄ from the asymptotic space H_as to H̄ (the completion of H). This mapping is defined as follows:
Ψ(f_1, …, f_n | ±∞) = S_± (a^+(g_1) ⋯ a^+(g_n) |0⟩),
where f_i and g_i are related by Formula (65): lim_{τ→±∞} B(f_i, τ)θ = Φ(g_i). This condition is equivalent to the relation
S_± a^+(g_i) |0⟩ = Ψ(f_i | ±∞).
On the left-hand side of Formula (71), we have an in-state or out-state that depends on the functions f_i. We impose the condition that the functions f_i and g_i be related in the following way: f_i = g_i ϕ_i^{−1}, where ϕ_i does not vanish anywhere. Various properties of the Møller matrix can be derived from Formula (71). One of them is that the Møller matrix commutes with spatial and temporal translations. This follows almost immediately from the definitions (see Section 7.2).
Let us show that Formula (71) defines the Møller matrix as an isometric mapping S_± : H_as → H̄.
First of all, let us prove that the Møller matrices do not depend on the choice of the operators B. Recall that the operators B in Formula (66) can be different. Let us change one of these operators, say, the last operator B_n, by replacing it with another operator while leaving the function g_n unchanged. Then, the limit does not change. This follows from (72), because the change of the last operator reduces to the change of the one-particle state, which is fixed by g_n.
Now, let us change the operator B_k, where k < n. Due to asymptotic commutativity, the operators B_i have vanishingly small commutators at large |τ|; thus, we can move the operator B_k to the last place. This must be paid for with commutators; however, in the limit this does not change anything. After moving B_k to the last place, we can apply the previous reasoning. This proves that the Møller matrix does not depend on the choice of the operators B_i used in the definition (66).
For the Møller matrix to be well-defined, Formula (71) must have symmetry with respect to the variables g i . This symmetry is present because the commutators are small in the case where there is strong asymptotic commutativity and we are dealing with non-overlapping functions.
If we impose the cluster property in addition to asymptotic commutativity, we can prove that the Møller matrices are isometric (hence, they are single-valued maps that do not depend on the choice of admissible operators B_i). Recall that they were previously defined only for sets of non-overlapping functions; however, as they are isometric, they can be extended to the whole asymptotic Hilbert space. We find that the operators S_± are isometric embeddings of the asymptotic space H_as into the space H̄.
Note that all the above statements can be derived from the cluster property without applying asymptotic commutativity; however, with asymptotic commutativity everything is much more transparent.

9.3. The Scattering Matrix: The LSZ Formula

Now, we want to define the notion of the scattering matrix. This notion is reasonable if the theory has an interpretation in terms of particles, which implies that Møller matrices are not only isometric, but are unitary operators. In this case, they define an isomorphism of the asymptotic space H a s and the space H ¯ . This means, roughly speaking, that any (or almost any) state decays into particles over time. Something similar was discussed in Section 6.1 under the name of the “soliton resolution conjecture”.
Let us now proceed to the calculation of the matrix elements of the scattering matrix. Let there be some initial state and a final state, respectively denoted by the letters i and f. The scattering matrix is defined by the formula S = S_+^* S_−. Note that S_− specifies the mapping H_as → H̄ and S_+^* acts in the opposite direction; hence, the scattering matrix is an operator in the asymptotic space. Here, we must take its matrix elements between states in the asymptotic space: ⟨f|S|i⟩ = ⟨f|S_+^* S_−|i⟩.
Consider states in the asymptotic space that are obtained from the Fock vacuum by applying some number of a + operators:
|i⟩ = a^+(g_1) ⋯ a^+(g_n) |0⟩,  |f⟩ = a^+(g_1') ⋯ a^+(g_m') |0⟩.
Expressing the matrix element in terms of operators B, we arrive at the following formula:
⟨f|S|i⟩ = lim_{τ_k'→+∞, τ_j→−∞} ⟨θ| B_m'^*(f_m', τ_m') ⋯ B_1'^*(f_1', τ_1') B_1(f_1, τ_1) ⋯ B_n(f_n, τ_n) |θ⟩,
where f_j = g_j ϕ_j^{−1}, f_k' = g_k' ϕ_k'^{−1}. Formulas (66) and (67) are used here with a slight improvement. Previously, in determining the in-vector, it was assumed that all times in Formula (66) were the same. While this is sufficient for the definition of the Møller matrix, here it is convenient (though not necessary) to assume that the times are different, though all of them tend to plus or minus infinity, respectively. The proof of this can be easily obtained, but we do not provide it here.
A very important observation is that in Formula (73) all times can be ordered in descending order. Clearly, τ' is greater than τ, as τ' → +∞ and τ → −∞. If we take two times τ_{i_1}, τ_{i_2} from the same group, then the commutator of the corresponding operators is small, as the corresponding functions f_{i_1}, f_{i_2} do not overlap. In such a case, we can arrange the operators in descending order of times and obtain an expectation value of a chronological product or, in other words, a Green's function.
In order to keep things simple, we use the standard generalized basis |i⟩ = |p_1, …, p_n⟩, |f⟩ = |p_1', …, p_m'⟩. Actually, in physics, the matrix elements of the scattering matrix must be taken exactly in such a basis, which provides scattering amplitudes of particles with given momenta. Let us now rewrite Formula (73) in this basis. The operators in this formula can be represented as B(f, τ) = ∫ dp f(p) B(p, τ). On the other hand, recall that we previously introduced the notation B(p, τ) = e^{iϵ(p)τ} B̂(τ, p).
As a result, we obtain the following expression for the matrix element in the standard basis:
⟨f|S|i⟩ = lim_{τ_k'→+∞, τ_j→−∞} ω(∏_k e^{−iϵ(p_k')τ_k'} (ϕ̄_k')^{−1} B̂_k'^*(τ_k', p_k') ∏_j e^{iϵ(p_j)τ_j} ϕ_j^{−1} B̂_j(τ_j, p_j)).
In this formula, we can place the numerical factors in front of the sign of the (linear) functional ω. In the remaining expression ω(∏_k B̂_k'^*(τ_k', p_k') ∏_j B̂_j(τ_j, p_j)), we can assume that the times are ordered; hence, we obtain a Green's function, which is what was required.
We can see that the scattering matrix can be expressed in terms of the asymptotic behavior of the Green's function in the (τ, p)-representation. As already explained, we can determine the asymptotics in the (ϵ, p)-representation using the poles with respect to the energy variable, which provides the LSZ formula for admissible operators.
Møller matrices commute with time and space shifts, and as such with the energy and momentum operators. If the theory can be interpreted in terms of particles, then they specify the unitary equivalence of the operators Ĥ and P̂ in the space H̄ with the free Hamiltonian ∫ dp ϵ(p) a^+(p) a(p) and the momentum operator ∫ dp p a^+(p) a(p) in the asymptotic space. It follows that the joint spectrum of Ĥ and P̂ coincides with the spectrum of a free boson.
Let us now introduce in-operators and out-operators obeying CCR by the following formulas:
a_out^+(f) S_+ = S_+ a^+(f),  a_in^+(f) S_− = S_− a^+(f),  a_out(f) S_+ = S_+ a(f),  a_in(f) S_− = S_− a(f).
It is obvious that
Ĥ = ∫ dp ϵ(p) a_in^+(p) a_in(p),
P̂ = ∫ dp p a_in^+(p) a_in(p).
Similar formulas are valid for out-operators.
It is important to note that on a dense subset of S_−(H_as) we have
a_out^+(g) = lim_{τ→+∞} B(f, τ),
where g = f ϕ is defined by (65) (see Section 8.2).
If there is a representation in terms of particles, any vector in H̄ has the following decomposition:
Σ_{r≥0} ∫ dp_1 ⋯ dp_r c_r(p_1, …, p_r) a_in^+(p_1) ⋯ a_in^+(p_r) θ.
This follows from the similar decomposition in the asymptotic space.
Let us represent the vector B̂θ in the form (75). For simplicity, assume that ⟨θ|B̂|θ⟩ = 0. Then, we can represent B̂(f, τ)θ as
B̂(f, τ)θ = ∫ dp ϕ(p) f(p) a_in^+(p) θ +
Σ_{r≥2} ∫ dp_1 ⋯ dp_r e^{iτ(ε(p_1) + ⋯ + ε(p_r) − ε(p_1 + ⋯ + p_r))} c_r(p_1, …, p_r) f(p_1 + ⋯ + p_r) a_in^+(p_1) ⋯ a_in^+(p_r) θ.
In this formula, the summation in (75) is over all r; for r = 1 the time dependence disappears due to a cancellation in the exponent, and this contribution provides the first summand. The sum over r ≥ 2 tends to zero in most cases because the phase τ(ε(p_1) + ⋯ + ε(p_r) − ε(p_1 + ⋯ + p_r)) tends to infinity as |τ| → ∞ for almost all values of the arguments. If we assume that the energy and momentum conservation laws forbid the decay of the particle, as was assumed in Section 8.2, the phase always tends to infinity. We obtain the relation (65) with g(p) = ϕ(p)f(p), together with the fact that the projection of B̂(f, τ)θ onto the one-particle space is equal to Φ(g).
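The decay of the r ≥ 2 terms is a stationary-phase effect, which can be seen numerically. In the sketch below (ours; one space dimension, an illustrative dispersion law, and a smooth coefficient standing in for c_2 f) the oscillatory integral decreases roughly like |τ|^{−1/2}.

```python
# Our illustration: for m > 0 the phase tau*(eps(p) + eps(q-p) - eps(q))
# is non-degenerate, so the r = 2 contribution is an oscillatory
# integral that decays as |tau| -> infinity (stationary phase).
import numpy as np

m, q = 1.0, 0.7
p = np.linspace(-8, 8, 200001)
dp = p[1] - p[0]
eps = lambda k: np.sqrt(k**2 + m**2)
g = np.exp(-p**2)                       # smooth stand-in for c_2 * f
phase = eps(p) + eps(q - p) - eps(q)

for tau in (1, 10, 100, 1000):
    integral = np.sum(g * np.exp(-1j * tau * phase)) * dp
    print(f"tau = {tau:5d}   |integral| = {abs(integral):.4f}")
# the modulus falls off roughly like |tau|**(-1/2)
```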
Differentiating the expression for B ^ ( f , τ ) θ with respect to τ , we can obtain (64) by means of similar reasoning.
This means that in theories having a particle interpretation almost all elements B ∈ A are admissible. Indeed, it seems that all operators encountered in physics are admissible.
This concludes the proof of the LSZ formula.

9.4. Inclusive Scattering Matrix

Now, we want to apply these results to express the inclusive scattering matrix in terms of generalized Green's functions. For this purpose, we will consider in-states. Earlier, we considered them as vectors in Hilbert space; now, we will consider them as positive functionals on the algebra. Every vector Ψ ∈ H̄ specifies a positive functional on the algebra. If we apply some operator, for example B(f, τ), to a vector Ψ, then to obtain the positive functional corresponding to B(f, τ)Ψ we should apply the operator L(f, τ) = B̃(f, τ) B(f, τ) to the positive functional corresponding to Ψ. Recall that an algebra element defines two operators on functionals: we can either multiply the argument from the right by the element, or multiply the argument from the left by the adjoint element.
In terms of such operators, the in-state corresponding to the vector (67) and considered as a positive functional can be written in the following form:
ν = lim_{τ→−∞} L(f_1, τ) ⋯ L(f_n, τ) ω.
Now, consider the following expression:
⟨1 | L(f_1', τ') ⋯ L(f_n', τ') L(f_1, τ) ⋯ L(f_n, τ) | ω⟩.
One can prove the existence of the limit of this expression as τ' → +∞, τ → −∞, assuming that all functions f_i and f_j' do not overlap (recall that asymptotic commutativity is always assumed). We denote this limit by Q; it can be represented in the form
Q = lim_{τ'→+∞} ⟨1 | L(f_1', τ') ⋯ L(f_n', τ') | ν⟩.
Expressing Q in terms of the functions g_k, g_k' related to the functions f_k, f_k' by Formula (65), we obtain an expression Q = Q(g_1, …, g_n, g_1', …, g_n') that we call the inclusive scattering matrix. All inclusive cross-sections can be calculated in terms of the inclusive scattering matrix. This is clear from the expression
Q = ν(a_out(g_n') ⋯ a_out(g_1') a_out^+(g_1') ⋯ a_out^+(g_n')),
which follows from the relation ⟨1|σ⟩ = σ(1) and the formula lim_{τ→+∞} B(f_k', τ) = a_out^+(g_k') (see (74)). Recall that the notions of the inclusive cross-section and inclusive scattering matrix were discussed at the end of Section 7.2.
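In a finite-dimensional toy model (ours; states represented by density matrices rather than L-functionals) the operator L is simply ρ → B ρ B^*, and the two multiplications described above reproduce the same expectation values:

```python
# Our sketch: multiplying the argument of the functional from the right
# by B and from the left by B^* corresponds to rho -> B rho B^*.
import numpy as np

rng = np.random.default_rng(1)
d = 4
B = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
X = rng.normal(size=(d, d)); X = X + X.conj().T   # an observable

A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = A @ A.conj().T
rho /= np.trace(rho)                              # a density matrix

omega = lambda rho, X: np.trace(rho @ X)          # the state as a functional
L_rho = B @ rho @ B.conj().T                      # action of L on the state

# the same number in the "functional" picture: omega(B^* X B)
assert np.allclose(omega(L_rho, X), omega(rho, B.conj().T @ X @ B))
print("L acts on density matrices as rho -> B rho B^*")
```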
Q is a non-linear expression with respect to g and g'; it is quadratic (or rather, Hermitian). Any quadratic expression can be extended to a bilinear form, and a Hermitian expression can be extended to a sesquilinear form, meaning that in one variable it will be linear, while in the other it will be antilinear.
In order to obtain an expression that is linear (or antilinear) in each argument, we introduce the notation L(g̃, g, τ) = B̃(f̃, τ) B(f, τ). The variables f and f̃ are separated here; what used to be treated as a Hermitian expression is now treated as a sesquilinear expression depending on a doubled number of variables:
ρ(g̃_1, g_1, …, g̃_n, g_n, g̃_1', g_1', …, g̃_n', g_n') =
= lim_{τ'→+∞, τ→−∞} ⟨1 | L(g̃_1', g_1', τ') ⋯ L(g̃_n', g_n', τ') L(g̃_1, g_1, τ) ⋯ L(g̃_n, g_n, τ) | ω⟩.
This expression will also be called the inclusive scattering matrix, and we will express it in terms of generalized Green’s functions. First of all, we represent it in the form
ρ(g̃_1, g_1, …, g̃_n, g_n, g̃_1', g_1', …, g̃_n', g_n') =
= lim_{τ'→+∞, τ→−∞} ⟨1 | B(f_1', τ') ⋯ B(f_n', τ') B̃(f̃_n', τ') ⋯ B̃(f̃_1', τ') ×
B(f_1, τ) ⋯ B(f_n, τ) B̃(f̃_n, τ) ⋯ B̃(f̃_1, τ) | ω⟩.
It is convenient to use the more general formula
ρ(g̃_1, g_1, …, g̃_n, g_n, g̃_1', g_1', …, g̃_n', g_n') =
= lim ⟨1 | B(f_1', τ_1') ⋯ B(f_n', τ_n') B̃(f̃_n', τ̃_n') ⋯ B̃(f̃_1', τ̃_1') ×
B(f_1, τ_1) ⋯ B(f_n, τ_n) B̃(f̃_n, τ̃_n) ⋯ B̃(f̃_1, τ̃_1) | ω⟩,
where the limit is taken as τ_j', τ̃_j' → +∞ and τ_i, τ̃_i → −∞.
We can assume that in this expression the times are ordered. Part of the times tend to +∞ and another part to −∞; due to asymptotic commutativity, we can rearrange the factors within each group in any order, in particular, in order of decreasing times. Applying Formula (23), we can see that the expression under the limit sign can be expressed in terms of generalized Green's functions. The inclusive scattering matrix is expressed in terms of the time asymptotics of these functions. The same reasoning was used for the usual scattering matrix; only the number of arguments is doubled here.

10. Lecture 10

10.1. Elimination of Redundant States

In this lecture, we plan to explain how one can represent quantum mechanics as a form of classical mechanics in which we have the ability to measure only a part of the observables. From the point of view of physics, this is quite natural. While our instruments allow us to measure some observables, it is possible that there exist better instruments that could allow us to measure other things.
We will use the geometric approach to quantum theory. In the geometric approach, we start from a set of states, which is a bounded convex closed subset C_0 of a complete topological vector space L. The evolution operators must belong to some group V consisting of automorphisms of the set of states C_0 (either all automorphisms or part of them). The evolution operator σ_A(t) satisfies the equation
dσ_A(t)/dt = A σ_A(t),
where A ∈ Lie(V) is an element of the Lie algebra of the group V (the “Hamiltonian”). This equation must have a solution; that is, the “Hamiltonian” must generate a one-parameter subgroup σ_A(t) of the group V. We would like to say that “Hamiltonians” are observables in the geometric approach; however, observables should provide numbers, and as such we say instead that an observable is a pair (A, a), where A ∈ Lie(V) is a “Hamiltonian” and a is a linear functional invariant with respect to the group σ_A(t) generated by the operator A, which is equivalent to the condition a(Az) = 0.
In ordinary quantum mechanics, an observable is specified by a self-adjoint operator Â; the corresponding “Hamiltonian” acts on density matrices as a commutator (up to a numerical factor), and the functional a is defined by the formula a(K) = Tr ÂK.
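In a finite-dimensional sketch (ours) this looks as follows: the flow σ_A(t) is conjugation by e^{−iÂt}, and the functional K → Tr ÂK is constant along it.

```python
# Our sketch of an observable (A, a) in ordinary quantum mechanics:
# dK/dt = -i [A_hat, K] and a(K) = Tr(A_hat K) is invariant.
import numpy as np

rng = np.random.default_rng(2)
d, t = 3, 0.7
M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
A_hat = M + M.conj().T                     # a self-adjoint operator

N = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
K = N @ N.conj().T
K /= np.trace(K)                           # a density matrix

a = lambda K: np.trace(A_hat @ K).real     # the functional a(K)

w, V = np.linalg.eigh(A_hat)               # diagonalize to exponentiate
U = V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T   # sigma_A(t)
K_t = U @ K @ U.conj().T                   # solves dK/dt = -i [A_hat, K]

assert np.isclose(a(K_t), a(K))
print("a(K) = Tr(A_hat K) is invariant under the flow generated by A_hat")
```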
The group V acts naturally on observables: A transforms according to the adjoint representation, and a transforms as a function on L.
Let us investigate whether there are redundant states in our theory. If there are two states x, y ∈ C_0 such that a(x) = a(y) for every observable (A, a), then we can say that one of these two states can be eliminated. If we identify those states which provide the same answer for all observables, the result will be a new theory without redundant states which is essentially equivalent to the original theory.

10.2. Quantum Mechanics from Classical Mechanics

Let us now apply these considerations to the case where the classical theory is taken as the starting point. In this theory, pure states are points of the phase space (symplectic manifold) M, while mixed states are probability distributions on M, and each mixed state can be uniquely represented as a mixture of pure states; physical observables are real functions on M, and a function a specifies a vector field A on M as the Hamiltonian vector field with Hamiltonian a. Identifying the vector field with a first-order differential operator, we can express A in terms of Poisson brackets as follows: Af = {a, f}. We assume that by integrating this vector field we obtain a one-parameter group σ_A(t) of canonical transformations (symplectomorphisms) of the manifold M. This group additionally acts on mixed states and on observables, describing the evolution of these objects. The equation of motion for the probability density (the Liouville equation) has the form dρ/dt = {a, ρ}, and the equation of motion for the observables has the form df/dt = {a, f}.
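For one degree of freedom this structure can be spelled out symbolically. The following sketch (ours; the harmonic-oscillator Hamiltonian is an illustrative choice, and the overall sign of the bracket is a convention that varies between texts) computes the Hamiltonian vector field and checks conservation of a.

```python
# Our sketch of Af = {a, f} on the (q, p) phase plane.
import sympy as sp

q, p = sp.symbols('q p')

def poisson(f, g):
    """Poisson bracket {f, g}; the overall sign is a convention."""
    return sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q)

a = (p**2 + q**2) / 2                  # harmonic-oscillator Hamiltonian
print(poisson(a, q))                   # -> -p  (rotation of the phase plane)
print(poisson(a, p))                   # -> q
print(sp.simplify(poisson(a, a)))      # -> 0: the Hamiltonian is conserved
```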
Suppose that our devices are able to see only a part of the observables and that the set Λ of “observable observables” is a linear space that is closed with respect to the Poisson bracket. Let us index this set by elements of a Lie algebra denoted by 𝔤 (here, the mapping γ → a_γ transforming γ ∈ 𝔤 to a_γ ∈ Λ is an isomorphism of the Lie algebras 𝔤 and Λ).
The Hamiltonian vector fields A_γ with Hamiltonians a_γ specify an action of the Lie algebra 𝔤 on M. The assumption that the vector fields A_γ generate one-parameter subgroups means that this action is induced by the action of a simply connected Lie group G having 𝔤 as its Lie algebra.
The above considerations can be applied to infinite-dimensional symplectic manifolds as well as to infinite-dimensional Lie algebras and Lie groups; however, in the infinite-dimensional case these considerations are not rigorous.
Define the moment map μ from M to 𝔤* as the mapping x → μ_x, where μ_x(γ) = a_γ(x); here, x ∈ M, γ ∈ 𝔤, and 𝔤* denotes the space of linear functionals on 𝔤. This mapping is G-equivariant with respect to the coadjoint action of G on 𝔤*; in other words, it commutes with transformations from the group G. For each state of the classical system (i.e., for each probability distribution ρ on M) we can define a point ν(ρ) ∈ 𝔤* as the integral of μ_x over x ∈ M with the measure ρ:
ν(ρ) = ∫_{x∈M} μ_x dρ.
The point ν ( ρ ) belongs to the convex envelope N of μ ( M ) (the convex envelope of a subset E of a topological vector space L is defined as the smallest convex closed subset of L containing E).
The group G acts naturally on the space of classical states. It follows from the G-equivariance of the moment map that the mapping ν is a G-equivariant mapping of classical states into 𝔤* equipped with the coadjoint action of G.
We can say that two classical states (i.e., two probability distributions ρ and ρ') are equivalent if
∫_{x∈M} a_γ(x) dρ = ∫_{x∈M} a_γ(x) dρ'
for each γ ∈ 𝔤. In other words, two states are equivalent if calculations with these states provide the same results for each Hamiltonian a_γ; that is, our devices cannot distinguish these two states.
Let us now prove the following statement: two states ρ and ρ' are equivalent if and only if ν(ρ) = ν(ρ').
First, note that, for each γ ∈ 𝔤,
ν(ρ)(γ) = ∫_{x∈M} μ_x(γ) dρ = ∫_{x∈M} a_γ(x) dρ.
Similarly,
ν(ρ')(γ) = ∫_{x∈M} μ_x(γ) dρ' = ∫_{x∈M} a_γ(x) dρ'.
These formulas imply the above statement.
In the classical theory where only Hamiltonians from the set Λ = {a_γ : γ ∈ 𝔤} are allowed, equivalent states must be identified (i.e., we eliminate redundant states). The mapping ν induces a bijective mapping of the space of equivalence classes onto the set N obtained as the convex envelope of the set μ(M) (“quantum states”), and the G-equivariance of the mapping ν means that the evolution of classical states is consistent with the evolution of quantum states.
Let us apply our constructions to the complex projective space CP. We define this space as the sphere ‖x‖ = 1 in a complex Hilbert space H with the identification x ∼ λx, where λ ∈ ℂ, |λ| = 1. The group U of unitary operators acts transitively on CP. There exists a unique (up to a constant factor) U-invariant symplectic structure on this space; this allows us to consider the complex projective space as a homogeneous symplectic manifold.
Suppose that we can observe only Hamiltonians of the form a_C(x) = ⟨x, Cx⟩, where C is a self-adjoint operator. The set Λ of such Hamiltonians is a Lie algebra with respect to the Poisson bracket. This Lie algebra is isomorphic to the Lie algebra 𝔤 of self-adjoint operators, where the operation is defined as the commutator multiplied by i. The one-parameter group of unitary operators corresponding to the Hamiltonian a_C is provided by the formula σ(t) = e^{iCt}.
The moment map transforms a point x into a linear functional on the space of self-adjoint operators, mapping the operator C into Tr K_x C, where K_x is the projection of H onto the vector x, i.e., K_x(z) = ⟨z, x⟩x. Recall that, in our notation, the points of the complex projective space are represented by normalized vectors. The convex envelope of the image of the moment map consists of positive-definite self-adjoint operators with unit trace, i.e., it consists of density matrices.
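A finite-dimensional numerical sketch (ours) of this picture: mixing rank-one projectors K_x with arbitrary probability weights always produces a density matrix, i.e., lands in the convex envelope of the image of the moment map.

```python
# Our sketch: points of projective space -> rank-one projectors K_x;
# probability mixtures of such projectors are density matrices.
import numpy as np

rng = np.random.default_rng(3)
d, n = 3, 5
xs = rng.normal(size=(n, d)) + 1j * rng.normal(size=(n, d))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)    # n unit vectors
weights = rng.dirichlet(np.ones(n))                # a probability distribution

K = sum(w * np.outer(x, x.conj()) for w, x in zip(weights, xs))

# K is self-adjoint, has unit trace, and a non-negative spectrum
assert np.allclose(K, K.conj().T)
assert np.isclose(np.trace(K).real, 1.0)
assert np.linalg.eigvalsh(K).min() > -1e-12
print("the mixture of projectors K_x is a density matrix")
```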
It can be seen that we obtain ordinary quantum mechanics by applying our general construction to a complex projective space. In this case, our considerations are close to the “nonlinear quantum mechanics” of Weinberg, who proposed considering the classical theory on C P as a deformation of quantum mechanics.
The orbits of the group G in the coadjoint representation (in the space 𝔤* dual to the Lie algebra 𝔤 of the group G) provide a rich source of examples of the above construction. These orbits play an important role in representation theory. They are homogeneous symplectic manifolds; by quantizing them, we can obtain unitary representations of the group G.
In order to introduce the structure of a symplectic manifold on an orbit, we should note that one can define a Poisson bracket for functions on 𝔤* (elements of the Lie algebra 𝔤 can be thought of as linear functions on 𝔤*; for these functions, the Poisson bracket is defined as the operation in the Lie algebra; using the properties of the Poisson bracket, we can then define the bracket for arbitrary smooth functions). The space 𝔤* has the structure of a Poisson manifold; however, this manifold is not symplectic (the Poisson bracket is degenerate). By restricting the Poisson bracket to an orbit, we obtain a nondegenerate Poisson bracket specifying a symplectic structure.
Elements of the Lie algebra 𝔤 define a family Λ of functions on the orbit; we can apply the above construction to this family. In the case under consideration, the moment map is simply the embedding of the orbit into 𝔤*. We can see that when considering only Hamiltonians from the family Λ as “observable observables” in the classical theory, we obtain a theory in which the set of states is the convex envelope of the orbit. The group V can be identified with the group G.
Let us now illustrate the above constructions in the case where G is the group U of unitary transformations of a Hilbert space H. In this case, we can identify the elements of the Lie algebra 𝔤 with bounded self-adjoint operators. With a suitable choice of topology in 𝔤, one can identify the dual space 𝔤* with a linear space of self-adjoint operators having a trace. To simplify the notation, we assume that the Hilbert space is finite-dimensional; however, the same considerations can be applied to the infinite-dimensional case as well.
Consider the orbits of U in the space 𝔤* (i.e., in the coadjoint representation).
If dim H = n, an orbit is indexed by real numbers λ_1, …, λ_r (the eigenvalues of the operators belonging to the orbit) and positive integers k_1, …, k_r (the multiplicities of the eigenvalues). The multiplicities must satisfy the condition k_1 + ⋯ + k_r = n. The stationary subgroup in U(n) of a point belonging to the orbit is isomorphic to the direct product of the groups U(k_i), meaning that the orbit is homeomorphic to U(n)/(U(k_1) × ⋯ × U(k_r)) (i.e., to a flag manifold).
If r = 2, then the orbit is homeomorphic to a Grassmannian. If r = 2, k_1 = n − 1, k_2 = 1, we obtain a complex projective space. (The Grassmannian G_k(H) is defined as the space of all k-dimensional subspaces of the space H, and can be viewed as a symplectic U-manifold.)

11. Notations, Conventions, and Definitions

When talking about vector spaces, we always have in mind vector spaces over R or over C . If the field is not specified, then we have in mind a complex vector space.
An algebra is a vector space over R or over C equipped with an operation of multiplication satisfying the distributivity axiom
( a + b ) c = a c + b c , c ( a + b ) = c a + c b
where a , b are elements of the algebra and c is an element of the algebra or a number.
An algebra is unital if there exists an element 1 obeying 1 · a = a · 1 = a .
An algebra is associative if ( a b ) c = a ( b c ) .
A *-algebra is an associative algebra with an antilinear involution * obeying a** = a and (ab)* = b*a*. A homomorphism or automorphism of a *-algebra should agree with the involution. We use the notation A for a *-algebra and Aut(A) for its group of automorphisms.
The algebra of bounded linear operators in Hilbert space is a *-algebra with respect to the involution A → A*, where A* denotes the operator adjoint (Hermitian conjugate) to A. We say that a self-adjoint operator A is positive definite if ⟨x, Ax⟩ ≥ 0. Note that in more standard terminology such operators are called positive semi-definite.
A Lie algebra is an algebra with an operation [a, b] obeying [a, b] = −[b, a] and [[a, b], c] + [[b, c], a] + [[c, a], b] = 0 (the Jacobi identity).
A topological group, vector space, algebra, etc., is a group, vector space, algebra, etc., equipped with a topology in such a way that all operations are continuous.
We always assume that a map (mapping) of topological spaces (or homomorphism of topological groups, etc.) is continuous.
Note that sometimes it is convenient to consider topological algebras where the operation of multiplication is defined only on a dense subset. For example, we can consider the Lie algebra of (not necessarily bounded) self-adjoint operators in Hilbert space (the commutator of two unbounded self-adjoint operators is not necessarily well-defined).
A derivation of an algebra is a linear operator D obeying the Leibniz rule D ( a b ) = D a · b + a · D b . A commutator of derivations is a derivation; hence, we can talk about the Lie algebra of derivations.
If we are dealing with a topological algebra, we can consider derivations defined on a dense subset. Their commutator is not necessarily well-defined; nonetheless, we can regard the set of such derivations as a topological Lie algebra that can be considered as the Lie algebra of the automorphism group of the algebra.
We say that a derivation D is an infinitesimal automorphism if it can be considered as a tangent vector of a one-parameter subgroup of the automorphism group, i.e., if there exists a solution of the equation dU/dt = DU(t) with the initial condition U(0) = 1 (this solution can be written in the form exp(Dt)).
The set of infinitesimal automorphisms can be regarded as topological Lie algebra, which can be interpreted as the Lie algebra of the group of automorphisms.
More generally, if we have a topological group, we can define a Lie algebra of this group either from the set of tangent vectors to the curves in the group at the unit element or from the set of tangent vectors to one-parameter subgroups.
Note that the above definitions and statements are not rigorous. For example, we have not specified the topology in the group of automorphisms, provided a definition of tangent vector, etc. It should be remembered that several reasonable definitions can be provided for these and other notions. We always disregard these subtleties.
We denote by S = S(ℝ^n) the space of smooth fast decreasing functions on ℝ^n (Schwartz space). More precisely, this space consists of smooth functions f such that
sup_x |x_1^{α_1} ⋯ x_n^{α_n} ∂_1^{β_1} ⋯ ∂_n^{β_n} f(x)|
is finite for any choice of non-negative integers α_i, β_j.
The expressions (78) can be regarded as seminorms specifying the topology in S .
We consider generalized functions on R n (distributions) as continuous linear functionals on S .
More generally, generalized functions are defined as linear functionals on some topological vector space of functions on R n (on the space of test functions); we represent such a functional as a formal integral: f ( ϕ ) = d x f ( x ) ϕ ( x ) .
In the above definitions, test functions (hence, generalized functions) can be regarded as functions taking values in a space C r (as vector-valued functions).
The elementary space h can be considered as a space of smooth fast decreasing functions on R d taking values in C r .
Spatial translations (spatial shifts) are denoted by T_a. In coordinate representation, they act on an element of 𝔥 as a shift of the argument x by a, while in momentum representation they act as multiplication of the function f(k) by e^{iak} (the coordinate and momentum representations are related by the Fourier transform).
Temporal translations (time translations or time shifts) are denoted by T_τ. They should commute with spatial translations; it follows that in momentum representation they act on the elements of 𝔥 by the formula (T_τ f)(k) = e^{−iτE(k)} f(k).
The notations T a and T τ are used for spatial and temporal translations in the elementary space as well as in other situations. Together, the space and time translations generate a commutative group denoted by T .
We assume that all convex sets we consider are closed subsets of some complete topological vector space L. The convex envelope of a set E ⊂ L is the smallest closed convex subset of L containing E.
A convex cone C is by definition a closed convex set that is invariant with respect to dilations x λ x (here, λ denotes a non-negative real number). Note that this definition of a convex cone is not quite standard; usually some additional conditions are imposed.

12. Problems 

0. A complex vector space E is equipped with a non-negative scalar product. Prove that we can obtain a pre-Hilbert space by factorizing E with respect to the vectors with ⟨x, x⟩ = 0. Hint: Check that these vectors constitute a linear subspace of E.
Density matrices are defined as positive semi-definite self-adjoint operators having unit trace and acting in complex Hilbert space.
1. Prove that the set of density matrices is convex. Check that the extreme points of this set are the one-dimensional projectors $K_\Psi(x) = \langle x, \Psi\rangle \Psi$ with $\|\Psi\| = 1$ (i.e., they are in one-to-one correspondence with non-zero vectors of Hilbert space under the identification $\Psi \sim \lambda\Psi$).
2. Prove that the set of density matrices in two-dimensional Hilbert space is a three-dimensional ball and that the set of its extreme points is a two-dimensional sphere.
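A numerical illustration for this problem (ours; not a proof): writing a 2×2 density matrix as $\frac{1}{2}(I + \mathbf{r}\cdot\boldsymbol{\sigma})$, positivity holds exactly when $|\mathbf{r}| \le 1$, so the set is the unit ball and the pure states sit on its boundary.

```python
import numpy as np

# Parameterize 2x2 density matrices by a Bloch vector r and check that
# positivity of K is equivalent to |r| <= 1.
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def density_matrix(r):
    K = 0.5 * np.eye(2, dtype=complex)
    for r_i, s_i in zip(r, sigma):
        K += 0.5 * r_i * s_i
    return K

rng = np.random.default_rng(0)
for _ in range(1000):
    r = rng.uniform(-1.0, 1.0, size=3)
    if abs(np.linalg.norm(r) - 1.0) < 1e-6:
        continue                                   # skip the boundary
    K = density_matrix(r)
    assert np.isclose(np.trace(K).real, 1.0)       # unit trace
    positive = np.linalg.eigvalsh(K).min() >= -1e-12
    assert positive == (np.linalg.norm(r) < 1.0)   # positivity <=> inside ball
```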
The linear envelope $\mathcal{T}$ of the set of density matrices is the space of all self-adjoint operators belonging to the trace class (a self-adjoint operator belongs to the trace class if it has discrete spectrum and the series of its eigenvalues is absolutely convergent). We consider $\mathcal{T}$ as a normed space with the norm $\|T\| = \sum_k |\lambda_k|$, where $\lambda_k$ are the eigenvalues of T. By definition, an automorphism of the set of density matrices is a bicontinuous linear operator in $\mathcal{T}$ generating a bicontinuous map of the set of density matrices onto itself. (A map is bicontinuous if it is continuous and has a continuous inverse.) It is obvious that a unitary operator U specifies an automorphism of the set of density matrices by the formula $T \mapsto UTU^{-1}$.
3. Prove that automorphisms of the set of density matrices are in one-to-one correspondence with unitary operators.
We do not know how to solve this problem (this does not mean that it is difficult; we simply have not tried). Perhaps this fact has been proven somewhere; however, we do not know of any such reference.
In the next problems, the term “operator” means “linear operator”. It is convenient to define $e^{tA}$, where A is an operator, as the solution of the equation
$$\frac{dU(t)}{dt} = A\, U(t)$$
with the initial condition $U(0) = 1$. If $(A_1, \dots, A_n)$ is a family of commuting operators, then we define $e^{\sum_i t_i A_i}$ as the product $e^{t_1 A_1} \cdots e^{t_n A_n}$.
4. Prove that $e^{iaP} = T_a$, where P denotes the momentum operator $P = -i\nabla$ and $T_a$ stands for the translation operator transforming the function $f(x)$ into the function $f(x + a)$.
5. Let us define an operator C A acting in the space of operators by the formula C A ( X ) = [ A , X ] (for definiteness, it can be assumed that A and X are bounded operators in Hilbert space, although this assumption is not important here). Prove that
$$e^{tC_A}(X) = e^{tA}\, X\, e^{-tA},$$
or equivalently that
$$e^{tA} X e^{-tA} = X + t[A, X] + \frac{t^2}{2!}[A, [A, X]] + \frac{t^3}{3!}[A, [A, [A, X]]] + \cdots$$
Hint: differentiate these equalities.
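Both forms of the identity can also be verified numerically; the sketch below is ours, with arbitrary random matrices, and compares $e^{tA} X e^{-tA}$ with a truncation of the commutator series.

```python
import numpy as np
from scipy.linalg import expm

# Compare exp(tA) X exp(-tA) with a truncation of the commutator series
# X + t[A,X] + (t^2/2!)[A,[A,X]] + ...
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
X = rng.standard_normal((4, 4))
t = 0.3

lhs = expm(t * A) @ X @ expm(-t * A)

term, rhs = X.copy(), X.copy()
for n in range(1, 30):
    term = (t / n) * (A @ term - term @ A)   # accumulates t^n/n! * C_A^n(X)
    rhs = rhs + term

assert np.allclose(lhs, rhs)
```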
6. Let us assume that the commutator of operators X and Y is a number C (or more generally, an operator C commuting with X and Y). Prove the following:
$$e^X e^Y = e^{X + Y}\, e^{\frac{1}{2}C},$$
$$e^X e^Y = e^C\, e^Y e^X.$$
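A convenient finite-dimensional testing ground for this problem (our choice, not from the text) is a pair of nilpotent 3×3 matrices whose commutator is central:

```python
import numpy as np
from scipy.linalg import expm

# X = E_12 and Y = E_23 are nilpotent; [X, Y] = E_13 commutes with both,
# so both forms of the identity hold exactly.
X = np.array([[0, 1, 0], [0, 0, 0], [0, 0, 0]], dtype=float)
Y = np.array([[0, 0, 0], [0, 0, 1], [0, 0, 0]], dtype=float)
C = X @ Y - Y @ X                         # equals E_13, central here

assert np.allclose(C @ X, X @ C) and np.allclose(C @ Y, Y @ C)
assert np.allclose(expm(X) @ expm(Y), expm(X + Y) @ expm(C / 2))
assert np.allclose(expm(X) @ expm(Y), expm(C) @ expm(Y) @ expm(X))
```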
7. Let us assume that, for an operator A acting in Banach space, the norms of the operators $e^{tA}$, where $t \in \mathbb{R}$, are uniformly bounded, i.e., that
$$\sup_{-\infty < t < +\infty} \| e^{tA} \|$$
is finite. Prove that all eigenvalues of A are purely imaginary.
8. Let A denote an operator acting in finite-dimensional complex vector space. Assume that the norms of operators e t A where t R are uniformly bounded. Prove that the operator A is diagonalizable (i.e., there exists a basis consisting of eigenvectors of A).
Hint: use Jordan normal form. Prove that all Jordan cells are one-dimensional.
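A short numerical illustration of the mechanism behind problems 7 and 8 (our sketch): for a Jordan cell with a purely imaginary eigenvalue, $e^{tA}$ contains polynomially growing entries, so its norm cannot stay bounded.

```python
import numpy as np
from scipy.linalg import expm

# A 2x2 Jordan cell with eigenvalue i gives exp(tA) = e^{it} [[1, t], [0, 1]],
# whose norm grows like |t|.
A = np.array([[1j, 1], [0, 1j]])
for t in (1.0, 10.0, 100.0):
    print(t, np.linalg.norm(expm(t * A), 2))
```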
9. Let us consider a Grassmann algebra with generators ϵ 1 , ϵ 2 , ϵ 3 , ϵ 4 . Calculate
( ϵ 1 + ϵ 2 ) ( ϵ 2 + ϵ 3 ) ( ϵ 3 + ϵ 4 ) ,
$$\int (\epsilon_1 + \epsilon_2)(\epsilon_2 + \epsilon_3)(\epsilon_3 + \epsilon_4)(\epsilon_4 + \epsilon_1)\, d^4\epsilon,$$
$$\frac{\partial}{\partial \epsilon_2}\, (\epsilon_1 \epsilon_2 \epsilon_3),$$
cos ( ϵ 1 ϵ 2 + ϵ 3 ϵ 4 ) .
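These calculations can be checked mechanically. In the following minimal Python sketch (the dict representation and the helper gmul are ours, purely illustrative), a Grassmann monomial is encoded as a sorted tuple of generator indices with a coefficient, and multiplication inserts the anticommutation signs.

```python
import itertools

# Multiply two Grassmann-algebra elements, each stored as
# {sorted tuple of generator indices: coefficient}.
def gmul(f, g):
    out = {}
    for (m1, c1), (m2, c2) in itertools.product(f.items(), g.items()):
        if set(m1) & set(m2):
            continue                              # repeated generator gives 0
        merged, sign = list(m1 + m2), 1
        for i in range(len(merged)):              # bubble sort, counting swaps
            for j in range(len(merged) - 1):
                if merged[j] > merged[j + 1]:
                    merged[j], merged[j + 1] = merged[j + 1], merged[j]
                    sign = -sign
        out[tuple(merged)] = out.get(tuple(merged), 0) + sign * c1 * c2
    return {k: v for k, v in out.items() if v != 0}

f = {(1,): 1, (2,): 1}                            # epsilon_1 + epsilon_2
g = {(2,): 1, (3,): 1}                            # epsilon_2 + epsilon_3
h = {(3,): 1, (4,): 1}                            # epsilon_3 + epsilon_4
print(gmul(gmul(f, g), h))
# The Berezin integral with d^4 epsilon picks out the coefficient of (1,2,3,4).
```

For the last expression, note that cos can be computed from its Taylor series, which terminates because high powers of the even element $\epsilon_1\epsilon_2 + \epsilon_3\epsilon_4$ vanish.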
10. Let us consider a unital associative algebra with commuting generators x 1 , , x n and anticommuting generators ξ 1 , , ξ n (tensor product of polynomial algebra and Grassmann algebra). Prove that d 2 = 0 and Δ 2 = 0 , where
$$d = \sum_i \xi_i \frac{\partial}{\partial x_i},$$
$$\Delta = \sum_i x_i \frac{\partial}{\partial \xi_i}.$$
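As a hint for the first identity, note the standard symmetry argument:
$$d^2 = \sum_{i,j} \xi_i \xi_j\, \frac{\partial^2}{\partial x_i\, \partial x_j} = 0,$$
since $\xi_i \xi_j$ is antisymmetric in $i, j$ while the mixed partial derivatives are symmetric. For $\Delta^2$ the roles are reversed: $x_i x_j$ is symmetric, while the Grassmann derivatives anticommute.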
A Weyl algebra is defined as a unital associative algebra with generators $a_k^*, a_k$ obeying CCR ($[a_k, a_l^*] = \delta_{k,l}$, $[a_k, a_l] = [a_k^*, a_l^*] = 0$). An involution * in a Weyl algebra transforms $a_k$ into $a_k^*$. A Fock representation of a Weyl algebra is a representation with a cyclic vector $|0\rangle$ obeying $\hat a_k |0\rangle = 0$. The scalar product is defined by the condition that $\hat a_k^*$ is adjoint to $\hat a_k$.
Fock space is the completion of the space of the Fock representation.
11. The Poisson vector is defined by the formula $\Psi_\lambda = e^{\lambda \hat a^*} |0\rangle$, where $\lambda \hat a^* = \sum_k \lambda_k \hat a_k^*$.
(a) Check that the Poisson vector is an eigenvector of all operators a ^ k .
(b) Find the scalar product of two Poisson vectors.
12. Let us consider the Hamiltonian H ^ ( t ) = ω ( t ) a ^ * a ^ (a harmonic oscillator with a time-dependent frequency).
(a) Let us suppose that $\omega(t) = \omega_0 + \omega \exp(-\alpha t)$ for $t \geq 0$ (here, $\alpha$ is a small positive number). Calculate the evolution of the matrix entries of the density matrix in the $\hat H$-representation (i.e., express the matrix entries of $K(t)$ in terms of the matrix entries of $K(0)$). Here, $\hat H = \hat H(0)$.
(b) Assuming that $\omega$ is a random variable uniformly distributed on some interval, with given expectation value and dispersion, calculate the average of the matrix entries of the density matrix $K(t)$ for $t = \mathrm{const}/\alpha$.
13. A molecule is placed near a microwave oven. Provide a rough estimate of the decoherence time for this molecule.
Hint: decoherence appears if the phase factors entering the expressions for the non-diagonal matrix entries of the density matrix are changed significantly by the electric field of the oven. This change can be calculated using perturbation theory; the first-order contribution comes from the dipole moment. Information about the dipole moments of the ground state and excited states can be found online; only the orders of magnitude of these moments and of the electric field are needed.
14. Let us define the L-functional corresponding to the density matrix K by the formula
$$L_K(\alpha) = \operatorname{Tr}\, e^{-\alpha \hat a^*} e^{\alpha^* \hat a} K.$$
Consider the case when there is only one degree of freedom, [ a ^ , a ^ * ] = 1 .
Calculate the L-functional corresponding to the coherent state (i.e., to the normalized Poisson vector).
Remember that every normalized vector $\Phi$ defines a density matrix $K_\Phi$ and $\operatorname{Tr} A K_\Phi = \langle A\Phi, \Phi \rangle$. The coherent state is a normalized eigenvector of $\hat a$.
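This problem can be sanity-checked numerically in a truncated Fock space. The sketch below is ours and assumes the sign convention of the formula above; it evaluates $L_K(\alpha)$ directly, so the reader can compare the printed value with their closed-form answer.

```python
import numpy as np
from scipy.linalg import expm

# Build a truncated Fock space, form the density matrix of a coherent
# state, and evaluate L_K(alpha) directly from the definition above.
N = 40                                        # truncation dimension
a = np.diag(np.sqrt(np.arange(1, N)), 1)      # annihilation operator
ad = a.conj().T                               # creation operator

z, alpha = 0.7 + 0.2j, 0.3 - 0.5j
vac = np.zeros(N, dtype=complex); vac[0] = 1.0
psi = expm(z * ad - np.conj(z) * a) @ vac     # coherent state |z> = D(z)|0>
K = np.outer(psi, psi.conj())                 # density matrix K_Phi

L = np.trace(expm(-alpha * ad) @ expm(np.conj(alpha) * a) @ K)
print(L)         # compare with the closed-form answer to the problem
print(abs(L))    # the modulus should equal 1 up to truncation error
```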
15. Let us assume that $\operatorname{supp}(\phi)$ is a compact set. Then, for large $|\tau|$ we have
$$|(T_\tau \phi)(x)| < C_n \left(1 + |x|^2 + \tau^2\right)^{-n},$$
where $x/\tau \notin U_\phi$, the initial data $\phi = \phi(x)$ is the Fourier transform of $\phi(k)$, and n is an arbitrary integer.
Here, $\operatorname{supp}(\phi)$ is the closure of the set of points where $\phi(k) \neq 0$,
$U_\phi$ is the set of all points of the form $\nabla \epsilon(k)$ where k belongs to a neighborhood of $\operatorname{supp}(\phi)$,
$(T_\tau \phi)(x) = \int dk\, e^{ikx - i\epsilon(k)\tau} \phi(k)$, and the function $\epsilon(k)$ is smooth.
16. A generalized function $\rho(\epsilon)$ is a sum of the function $\frac{A}{\epsilon - E - i0}$ and a square-integrable function. Find the asymptotic behavior of its Fourier transform $\rho(t)$.
17. Let us consider a system with the Hamiltonian $\hat H = \int \epsilon(p)\, \hat a^*(p) \hat a(p)\, dp$ and momentum operator $\hat P = \int p\, \hat a^*(p) \hat a(p)\, dp$, where $\hat a, \hat a^*$ obey CCR.
(a) Calculate the time and space translations in the formalism of L-functionals.
(b) Find the operators $\hat a(\tau, p)$, $\hat a^*(\tau, p)$ and the corresponding operators acting on the L-functionals.
(c) Prove that the L-functional $\omega = \exp\left( -\int dp\, \alpha^*(p)\, n(p)\, \alpha(p) \right)$ specifies a stationary translation-invariant state (here, $n(p)$ is an arbitrary function).
(d) Calculate the two-point generalized Green functions for the state ω in the ( τ , p ) and ( ε , p ) representations.

Author Contributions

Investigation, I.F. and A.S.; methodology, A.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Schwarz, A. Geometric approach to quantum theory. SIGMA Symmetry Integr. Geom. Methods Appl. 2020, 16, 020.
2. Schwarz, A. Geometric and algebraic approaches to quantum theory. Nucl. Phys. B 2021, 973, 115601.
3. Schwarz, A. Scattering in Algebraic Approach to Quantum Theory—Associative Algebras. Universe 2022, 8, 660.
4. Schwarz, A. Scattering in Geometric Approach to Quantum Theory. Universe 2022, 8, 663.
5. Schwarz, A. Scattering in Algebraic Approach to Quantum Theory—Jordan Algebras. Universe 2023, 9, 173.
6. Shvarts, A.S. New formulation of quantum theory. Dokl. Akad. Nauk SSSR 1967, 173, 793.
7. Schwarz, A. Inclusive scattering matrix and scattering of quasiparticles. Nucl. Phys. B 2020, 950, 114869.
8. Lehmann, H.; Symanzik, K.; Zimmermann, W. On the formulation of quantized field theories—II. Nuovo C. 1957, 6, 319–333.
9. Van Leeuwen, R.; Dahlen, N.E.; Stefanucci, G.; Almbladh, C.O.; von Barth, U. Introduction to the Keldysh Formalism; Springer: Berlin/Heidelberg, Germany, 2006; pp. 33–59.
10. Bogoliubov, N.N.; Shirkov, D.V. Introduction to the Theory of Quantized Fields; Interscience: New York, NY, USA, 1959.
11. Akhiezer, A.I.; Berestetskii, V.B. Quantum Electrodynamics; Interscience: New York, NY, USA, 1965.
12. Itzykson, C.; Zuber, J.-B. Quantum Field Theory; McGraw-Hill: New York, NY, USA, 1980.
13. Peskin, M.E.; Schroeder, D.V. An Introduction to Quantum Field Theory; Westview Press: Boulder, CO, USA, 1995.
14. Weinberg, S. The Quantum Theory of Fields; Cambridge University Press: Cambridge, UK, 1995.
15. Stone, M. The Physics of Quantum Fields; Springer: Berlin/Heidelberg, Germany, 2000.
16. Srednicki, M. Quantum Field Theory; Cambridge University Press: Cambridge, UK, 2007.
17. Zee, A. Quantum Field Theory in a Nutshell, 2nd ed.; Princeton University Press: Princeton, NJ, USA, 2010.
18. Mandl, F.; Shaw, G. Quantum Field Theory, 2nd ed.; John Wiley & Sons: Chichester, UK, 2010.
19. Duncan, A. The Conceptual Framework of Quantum Field Theory; Oxford University Press: Oxford, UK, 2012.
20. Schwartz, M.D. Quantum Field Theory and the Standard Model; Cambridge University Press: Cambridge, UK, 2014.
21. Ilisie, V. Concepts in Quantum Field Theory: A Practitioner’s Toolkit; Springer: Cham, Switzerland, 2015.
22. Dütsch, M. From Classical Field Theory to Perturbative Quantum Field Theory; Springer: Berlin/Heidelberg, Germany, 2019; Volume 74.
23. Williams, A. Introduction to Quantum Field Theory: Classical Mechanics to Gauge Field Theories; Cambridge University Press: Cambridge, UK, 2022.
24. Jordan, P.; von Neumann, J.; Wigner, E.P. On an algebraic generalization of the quantum mechanical formalism. In The Collected Works of Eugene Paul Wigner; Springer: Berlin/Heidelberg, Germany, 1993.
25. Kontsevich, M. Deformation quantization of Poisson manifolds. Lett. Math. Phys. 2003, 66, 157–216.
26. Berezin, F.A. The Method of Second Quantization; Elsevier: Amsterdam, The Netherlands, 1966; Volume 24, pp. 1–228.
27. Berezin, F.A. Covariant and contravariant symbols of operators. Math. USSR-Izv. 1972, 6, 1117–1151.
28. Berezin, F.A.; Shubin, M.A. The Schrödinger Equation; Mathematics and Its Applications (Soviet Series); Kluwer Academic Publishers Group: Dordrecht, The Netherlands, 1991; Volume 66.
29. Tyupkin, Y.S.; Fateev, V.A.; Shvarts, A.S. Classical limit of the S matrix in quantum field theory. Sov. Phys. Dokl. 1975, 20, 194.
30. Faddeev, L.D.; Takhtajan, L.A. Hamiltonian Methods in the Theory of Solitons; Springer: Berlin/Heidelberg, Germany, 1987; Volume 23.
31. Faddeev, L.D.; Korepin, V.E. Quantum theory of solitons. Phys. Rep. 1978, 42, 1–87.
32. Soffer, A. Soliton dynamics and scattering. Int. Congr. Math. 2006, 3, 459–471.
33. Liu, B.; Soffer, A. The large time asymptotic solutions of nonlinear Schrödinger type equations. Appl. Numer. Math. 2023, in press.
34. Tao, T. Why are solitons stable? Bull. Am. Math. Soc. 2009, 46, 1–33.
35. Strauss, W.A. Nonlinear scattering theory. In Scattering Theory in Mathematical Physics, Proceedings of the NATO Advanced Study Institute, Denver, CO, USA, 11–29 June 1973; Springer: Dordrecht, The Netherlands, 1973.
36. Araki, H.; Haag, R. Collision cross sections in terms of local observables. Commun. Math. Phys. 1967, 4, 77–91.
37. Haag, R. Local Quantum Physics; Texts and Monographs in Physics, 2nd ed.; Springer: New York, NY, USA, 1996.
38. Borchers, H.-J. On revolutionizing quantum field theory with Tomita’s modular theory. J. Math. Phys. 2000, 41, 3604–3673.
39. Hunziker, W.; Sigal, I.M. The quantum n-body problem. J. Math. Phys. 2000, 41, 3448–3510.
40. Segal, I. Space-time decay for solutions of wave equations. Adv. Math. 1976, 22, 305–311.
Back to TopTop