Abstract
Explicit formulae for the $4\times 4$ Lorentz transformation matrices corresponding to a pure boost and a pure three-dimensional rotation are very well known. Significantly less well known is the explicit formula for a general Lorentz transformation with arbitrary non-zero boost and rotation parameters. We revisit this more general formula by presenting two different derivations. The first derivation (which is somewhat simpler than previous ones appearing in the literature) evaluates the exponential $\exp A$ of a $4\times 4$ real matrix $A$, where $A$ is a product of the diagonal matrix $\mathrm{diag}(+1,-1,-1,-1)$ and an arbitrary $4\times 4$ real antisymmetric matrix. The formula for $\exp A$ depends only on the eigenvalues of $A$ and makes use of the Lagrange interpolating polynomial. The second derivation exploits the observation that the spinor product $\eta^\dagger\bar\sigma^\mu\chi$ transforms as a Lorentz four-vector, where $\chi$ and $\eta$ are two-component spinors. The advantage of the latter derivation is that the corresponding formula for a general Lorentz transformation reduces to the computation of the trace of a product of $2\times 2$ matrices. Both computations are shown to yield equivalent expressions for the Lorentz transformation matrix $\Lambda$.
1. Introduction
In the theory of special relativity, space and time are combined into Minkowski spacetime (e.g., see Ref. [1]). Two different inertial reference frames (with coinciding origins fixed) are related through a Lorentz transformation. Equivalently, consider a four-vector, , with squared-length (with an implied double sum over the repeated indices ), where is the Minkowski spacetime metric. One can also define the Lorentz transformation as a symmetry transformation of a four-vector, , that preserves the length of . Since the length of a four-vector is a scalar quantity and thus invariant under a Lorentz transformation, it follows that , which serves as the general definition of the Lorentz transformation matrix [cf. Equations (30)–(32)]. Moreover, this same equation implies that is an invariant tensor. Indeed, the Lorentz transformations (along with spacetime translations) are the maximally allowed symmetry transformations of Minkowski spacetime in which the spacetime metric is left invariant (e.g., see Ref. [2]).
Consider two inertial reference frames with coinciding origins, where one reference frame is moving with respect to the other with three-vector velocity . The corresponding Lorentz transformation is called a Lorentz boost. The boost parameters are defined by the components of the three-vector , where and c is the speed of light. However, this is not the most general Lorentz transformation. For example, let R be an arbitrary orthogonal matrix of unit determinant, i.e., a proper rotation matrix parametrized by the components of the three-vector (such that is the angle of rotation, counterclockwise, about a fixed axis that lies along the unit vector ). Then, the transformation and is also a Lorentz transformation as it leaves the Minkowski spacetime metric invariant. The corresponding matrix representations of the general Lorentz boost and three-dimensional rotation are quite well known [see Equations (22) and (26), respectively] and are reviewed in Section 2.
A more general Lorentz transformation matrix, which shall henceforth be denoted by , corresponds to a simultaneous boost and rotation. As shown in Section 3, can be expressed as the exponential of a matrix,
In contrast to and , which correspond to a Lorentz boost matrix and a three-dimensional rotation, respectively, an explicit form for is much less well known.
The first published formula for $\Lambda(\vec\zeta,\vec\theta)$ appeared in Ref. [3]. Subsequent derivations have also been given in Refs. [4,5,6]. These derivations are based on the Cayley–Hamilton theorem of linear algebra (e.g., see Section 8.4 of Ref. [7]), which asserts that any $n\times n$ matrix $A$ satisfies its own characteristic equation, $p(x)\equiv\det(A-x\mathbf{I}_n)=0$, where $\mathbf{I}_n$ is the $n\times n$ identity matrix and $p(x)$ is an $n$th-order polynomial whose roots are the eigenvalues of $A$. That is, $p(A)$ is equal to the zero matrix. It follows that for any integer $k\geq n$, the matrix $A^k$ can be expressed as a linear combination of $\mathbf{I}_n$, $A$, $A^2$, ..., $A^{n-1}$. In particular,
where each of the coefficients is an infinite series whose terms depend on the eigenvalues of A. Note that by setting either or in Equation (1), one can easily compute the resulting matrix exponential to derive the well-known expressions given in Equations (22) and (26), respectively. In contrast, if both the boost vector and the rotation vector are non-zero, then the corresponding computation of the matrix exponential, which is carried out in Refs. [3,4], is significantly more difficult. In Ref. [5], this computation is performed by showing that a Lorentz transformation matrix g exists such that the matrix in block matrix form is made up of very simple matrix blocks. The exponential is then easy to evaluate directly via its Taylor series to obtain the coefficients , and
Finally, Ref. [6] derives a system of four linear equations for the coefficients in Equation (2), whose solution provides the desired expression for .
In this paper, we shall provide a somewhat simpler and more straightforward evaluation of as compared to the derivations given in Refs. [3,4,5,6]. In Section 2, we first exhibit the explicit forms for the general Lorentz boost and the three-dimensional rotation matrices of Minkowski spacetime, which correspond to special cases of the more general Lorentz transformation matrix, as noted above. In Section 3, an expression for the most general Lorentz transformation is then derived. Indeed, it is sufficient to consider the set of all Lorentz transformations that are continuously connected to the identity, known as the proper orthochronous Lorentz transformations (e.g., see Ref. [1]). The matrix representation of any element of this latter set can be expressed in the form given by Equation (1), as discussed below Equation (40). In Section 4, we explicitly evaluate Equation (1) for arbitrary boost and rotation parameters. We then demonstrate that an alternative derivation of can be given that only involves the manipulation of matrices, by making use of two-component spinors. In particular, we show in Section 5 that the most general proper orthochronous Lorentz transformation matrix can be expressed as a trace of the product of four matrices, which is then explicitly evaluated. Both methods for computing are carried out in pedagogical detail. In Section 6, we check that both computations yield the same expression for . Final remarks are presented in Section 7, and some related discussions are relegated to the appendices.
2. Lorentz Transformations—Special Cases
In a first encounter with special relativity, a student learns how the spacetime coordinates change between two inertial reference frames K and . If the spacetime coordinates with respect to K are and the spacetime coordinates with respect to are , where is moving relative to K with velocity in the x direction, then
where c is the speed of light and
It is straightforward to generalize the above results for an arbitrary velocity by writing
where is the projection of along the direction of , and is perpendicular to (so that ). The definition of implies that
where . Note that 0 ≤ β < 1 for any particle of non-zero mass.
In light of Equation (10), Equations (4)–(7) are equivalent to
where . Note that for any particle of non-zero mass. More explicitly,
which yield and = 0 as required. Inserting the expressions given in Equation (14) back into Equations (11)–(13), we end up with the well-known result (e.g., see Equation (11.19) of Ref. [8]):
Following Equation (11.20) of Ref. [8], it is convenient to introduce the boost parameter (also called the rapidity),
since the definitions of and are consistent with the relation . In particular, note that . We then define the boost vector to be the vector of magnitude that points in the direction of . Since Equation (17) yields , it follows that
In terms of the boost vector and its magnitude , Equations (15) and (16) yield
Before proceeding, it is instructive to distinguish between active and passive Lorentz transformations (e.g., see Ref. [1]). The Lorentz transformation discussed above is a passive transformation, since the reference frame K (specified by the coordinate axes) is transformed into , while leaving the observer fixed. Equivalently, one can consider an active transformation, in which the coordinate axes are held fixed while the location of the observer in spacetime is boosted using the inverse of the transformation specified by Equations (19) and (20). That is, a spacetime point of the observer located at is transformed by the boost to using Equations (19) and (20) with replaced by . Henceforth, all Lorentz transformations treated in this paper will correspond to active transformations.
The transformation that boosts the spacetime point to is given by
where the matrix can be written in block matrix form as
after converting Equations (19) and (20) to an active transformation via . In Equation (22),
where the Latin indices refer to the x, y, and z components of the three-vector , and there is an implicit sum over the repeated index j on the right hand side of Equation (21).
The matrix is sometimes inaccurately called the Lorentz transformation matrix. In fact, this matrix represents a special type of Lorentz transformation consisting of a boost without rotation [the latter is indicated by the second argument of ]. Furthermore, note that is the identity matrix. Any Lorentz transformation of the form can be continuously deformed into the identity matrix by continuously shrinking the vector to the zero vector.
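As a numerical illustration of the pure boost discussed above, the following Python sketch builds the $4\times 4$ boost matrix from a boost vector and checks that it preserves the Minkowski metric. It is a minimal sketch under stated assumptions: the block form used (with $\cosh\zeta$ in the time-time entry, $(\zeta^i/\zeta)\sinh\zeta$ in the mixed entries, and $\delta^{ij}+(\zeta^i\zeta^j/\zeta^2)(\cosh\zeta-1)$ in the spatial block) is the standard active boost and is assumed to agree with Equation (22); the function names `boost_matrix` and `metric_defect` are hypothetical.

```python
import math

# Mostly-minus metric, diag(+1, -1, -1, -1)
G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(X, Y):
    """Plain nested-list matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def boost_matrix(zeta):
    """Active boost for the boost vector zeta (rapidity = |zeta|).

    Assumed block form: cosh in the 00 entry, n^i sinh in the 0i and i0
    entries, delta^ij + n^i n^j (cosh - 1) in the spatial block.
    """
    z = math.sqrt(sum(c * c for c in zeta))
    L = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    if z == 0.0:
        return L  # a zero boost vector gives the identity matrix
    n = [c / z for c in zeta]
    ch, sh = math.cosh(z), math.sinh(z)
    L[0][0] = ch
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = n[i] * sh  # plus sign: active boost (assumed)
        for j in range(3):
            L[i + 1][j + 1] += n[i] * n[j] * (ch - 1.0)
    return L

def metric_defect(L):
    """Largest entry of |L^T G L - G|; zero for an exact Lorentz matrix."""
    D = matmul(transpose(L), matmul(G, L))
    return max(abs(D[i][j] - G[i][j]) for i in range(4) for j in range(4))

# Rapidity from the speed: tanh(zeta) = beta, so cosh(zeta) = gamma.
beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
L = boost_matrix((math.atanh(beta), 0.0, 0.0))
```

For $\beta = 0.6$ the entries `L[0][0]` and `L[1][0]` should reproduce $\gamma$ and $\gamma\beta$, respectively, and `metric_defect` should vanish up to rounding for any boost vector.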
Another example of a Lorentz transformation is a three-dimensional proper rotation of the vector into the vector by an angle , counterclockwise, about a fixed axis , where R is a orthogonal matrix of unit determinant, and the time coordinate is not transformed. In this notation, is a unit vector (i.e., ). It is then convenient to define a three-vector quantity called the rotation vector,
where . In the case of a proper three-dimensional rotation, the transformation of the spacetime point to is given by
where the matrix can be written in block matrix form as
where [] are the components of the zero row [column] vector (with i, ), and
In Equation (27), the Levi–Civita symbol is defined by [] when is an even [odd] permutation of 123, and if any two of the indices coincide. Equation (27) is known as Rodrigues’ rotation formula (e.g., see Refs. [9,10]). A clever proof of this formula is provided in Appendix A.
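Rodrigues' rotation formula can be checked numerically with a short Python sketch. The component form assumed here is $R^{ij} = \delta^{ij}\cos\theta + n^i n^j(1-\cos\theta) - \epsilon^{ijk}n^k\sin\theta$, which is the usual expression for an active counterclockwise rotation and is assumed to coincide with Equation (27); the helper names `eps` and `rotation_matrix` are hypothetical.

```python
import math

def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2 (+1 even, -1 odd, 0 repeated)."""
    return (i - j) * (j - k) * (k - i) // 2

def rotation_matrix(theta_vec):
    """Active counterclockwise rotation by |theta_vec| about theta_vec/|theta_vec|."""
    theta = math.sqrt(sum(c * c for c in theta_vec))
    if theta == 0.0:
        return [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    n = [c / theta for c in theta_vec]
    c, s = math.cos(theta), math.sin(theta)
    return [[(c if i == j else 0.0)
             + n[i] * n[j] * (1.0 - c)
             - s * sum(eps(i, j, k) * n[k] for k in range(3))
             for j in range(3)] for i in range(3)]

def orthogonality_defect(R):
    """Largest entry of |R^T R - I|."""
    return max(abs(sum(R[k][i] * R[k][j] for k in range(3)) - (1.0 if i == j else 0.0))
               for i in range(3) for j in range(3))

# Quarter turn about the z-axis: the x-axis should be mapped onto the y-axis.
R = rotation_matrix((0.0, 0.0, math.pi / 2))
```

The trace of the resulting matrix equals $1 + 2\cos\theta$, which provides a quick consistency check for an arbitrary axis.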
3. General Lorentz Transformations
Consider a four-vector $v^\mu = (v^0\,;\,\vec v)$. Under an active Lorentz transformation, the spacetime components of the four-vector transform as
$$v'^{\mu} = \Lambda^\mu{}_\nu\, v^\nu, \qquad (28)$$
where the Greek indices such as $\mu,\nu\in\{0,1,2,3\}$, and there is an implied sum over any repeated upper/lower index pair. The quantities $\Lambda^\mu{}_\nu$ can be viewed as the elements of a $4\times 4$ real matrix, where $\mu$ labels the row and $\nu$ labels the column. In special relativity, the metric tensor (in a rectangular coordinate system) is given by the diagonal matrix
$$g_{\mu\nu} = \mathrm{diag}(+1,-1,-1,-1), \qquad (29)$$
where the so-called mostly minus convention for the metric tensor has been chosen.
To construct a Lorentz-invariant scalar quantity that is unchanged under a Lorentz transformation, one only needs to combine tensors in such a way that all upper/lower index pairs are summed over and no unsummed indices remain. For example,
Using Equations (28) and (30), it follows that
Since the four-vector v is arbitrary, it follows that
Equation (32) defines the most general Lorentz transformation matrix $\Lambda$. The set of all such $4\times 4$ Lorentz transformation matrices is a group (under matrix multiplication) and is denoted by O(1,3). Here, the notation (1,3) refers to the number of plus and minus signs in the metric tensor [cf. Equation (29)]. In particular, O(1,3) is a Lie group, appropriately called the Lorentz group (e.g., see Refs. [1,2,10]).
After taking the determinant of both sides of Equation (32), one obtains $(\det\Lambda)^2 = 1$. Hence,
$$\det\Lambda = \pm 1. \qquad (33)$$
Moreover, by setting both free indices equal to zero in Equation (32) and summing over the repeated indices, one obtains
$$(\Lambda^0{}_0)^2 = 1 + \sum_{i=1}^{3}(\Lambda^i{}_0)^2 \geq 1, \qquad (34)$$
so that either $\Lambda^0{}_0\geq 1$ or $\Lambda^0{}_0\leq -1$.
The Lie group SO(1,3) is the group of proper Lorentz transformation matrices that satisfy $\det\Lambda = 1$. The elements of the subgroup of SO(1,3) that also satisfy $\Lambda^0{}_0\geq 1$ are continuously connected to the identity element [the $4\times 4$ identity matrix, denoted by $\mathbf{I}_4$] and constitute the set of proper orthochronous Lorentz transformations, which is often denoted by SO0(1,3). Three examples of Lorentz transformations that are not continuously connected to the identity are as follows:
In particular, there is no way to continuously change the parameters of a proper orthochronous Lorentz transformation to yield a Lorentz transformation with and/or in light of Equations (33) and (34).
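The classification by the signs of $\det\Lambda$ and $\Lambda^0{}_0$ can be illustrated with a few lines of Python. The three matrices used below (space inversion, time inversion, and their product) are standard examples of Lorentz transformations that are not continuously connected to the identity; that they are the same three examples referred to above is an assumption. The function names are hypothetical.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def diag(*entries):
    """Diagonal matrix with the given entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def classify(L):
    """Label a Lorentz matrix by the sign of det(L) and of its 00 element."""
    proper = "proper" if det(L) > 0 else "improper"
    ortho = "orthochronous" if L[0][0] >= 1 else "non-orthochronous"
    return proper + " " + ortho

identity = diag(1, 1, 1, 1)
P = diag(1, -1, -1, -1)    # space inversion (assumed example)
T = diag(-1, 1, 1, 1)      # time inversion (assumed example)
PT = diag(-1, -1, -1, -1)  # product of the two

labels = [classify(M) for M in (identity, P, T, PT)]
```

Only the identity falls in the proper orthochronous class; each of the other three matrices sits in a different disconnected component, since neither $\det\Lambda$ nor the sign of $\Lambda^0{}_0$ can change under a continuous deformation.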
The complete list of Lorentz transformations is then given by
Consequently, to determine the explicit form of the most general Lorentz transformation, it suffices to consider the explicit form of the most general proper orthochronous Lorentz transformation.
The Lie algebra of the Lorentz group is obtained by considering an infinitesimal Lorentz transformation,
where $A$ is a $4\times 4$ matrix that depends on infinitesimal Lorentz group parameters. In particular, terms that are quadratic or of higher order in the infinitesimal group parameters are neglected. Inserting Equation (37) into Equation (32), and denoting by $G$ the $4\times 4$ matrix whose matrix elements are $g_{\mu\nu}$, it follows that
$$(\mathbf{I}_4 + A)^{\mathsf T}\, G\, (\mathbf{I}_4 + A) = G. \qquad (38)$$
Keeping only terms up to linear order in the infinitesimal group parameters, we conclude that $A^{\mathsf T}G = -GA$, or equivalently (since $G$ is a diagonal matrix),
$$(GA)^{\mathsf T} = -GA. \qquad (39)$$
That is, $GA$ is a real antisymmetric matrix. Hence, the Lie algebra of the Lorentz group, henceforth denoted by $\mathfrak{so}(1,3)$, consists of all $4\times 4$ real matrices $A$ such that $GA$ is an antisymmetric matrix.
To construct a proper orthochronous Lorentz transformation, one can choose any real matrix A that satisfies Equation (39), and consider a large positive integer n such that is an infinitesimal quantity. Then, a proper orthochronous Lorentz transformation is obtained by applying a sequence of n infinitesimal Lorentz transformations in the limit as ,
Note that is continuously connected to the identity matrix since one can continuously deform A into the zero matrix. Hence, it follows that . However, one can make a stronger statement: the exponential map, , is surjective. A proof of this result can be found in Section 6.3 of Ref. [10]. That is, the set of all proper orthochronous Lorentz transformations consists of matrices of the form , where is a real antisymmetric matrix.
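The construction of a finite Lorentz transformation as a limit of many infinitesimal ones can be observed numerically. The sketch below is a minimal illustration, assuming only that $A = GS$ with $S$ an arbitrary real antisymmetric matrix; the particular entries of $S$ and the names `compound` and `metric_defect` are hypothetical. It forms $(\mathbf{I}_4 + A/n)^n$ with $n = 2^k$ by repeated squaring and measures how far the result is from satisfying the defining condition of a Lorentz transformation matrix.

```python
G_DIAG = [1.0, -1.0, -1.0, -1.0]  # diagonal of the mostly-minus metric

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# An arbitrary antisymmetric S (illustrative numbers), and A = G S.
upper = {(0, 1): 0.3, (0, 2): -0.2, (0, 3): 0.5,
         (1, 2): 0.7, (1, 3): 0.1, (2, 3): 0.4}
S = [[0.0] * 4 for _ in range(4)]
for (i, j), value in upper.items():
    S[i][j], S[j][i] = value, -value
A = [[G_DIAG[i] * S[i][j] for j in range(4)] for i in range(4)]

def compound(A, k):
    """Return (I + A/n)^n with n = 2**k, using k successive squarings."""
    n = 2 ** k
    M = [[(1.0 if i == j else 0.0) + A[i][j] / n for j in range(4)]
         for i in range(4)]
    for _ in range(k):
        M = matmul(M, M)
    return M

def metric_defect(L):
    """Largest entry of |L^T G L - G|."""
    return max(abs(sum(L[k][i] * G_DIAG[k] * L[k][j] for k in range(4))
                   - (G_DIAG[i] if i == j else 0.0))
               for i in range(4) for j in range(4))

coarse = metric_defect(compound(A, 10))  # n = 1024
fine = metric_defect(compound(A, 20))    # n = 1048576
```

The defect shrinks roughly in proportion to $1/n$, consistent with the product of infinitesimal transformations converging to an exact Lorentz transformation as $n\to\infty$.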
Let us first reconsider the two special cases examined in Section 2. A matrix representation of an infinitesimal boost is obtained by evaluating Equation (22) to leading order in ,
where the three matrices are defined by
Similarly, a matrix representation of an infinitesimal rotation is obtained by evaluating Equations (26) and (27) to leading order in (with ),
where the three matrices are defined by
The six matrices and satisfy the following commutation relations:
where and there is an implicit sum over the repeated index ℓ.
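The commutation relations can be verified by direct matrix multiplication. To keep the arithmetic exact, the sketch below works with real matrices $K_i$ and $S_i$, hypothetically defined as $-i$ times the boost and rotation generators of the text, so that a general Lie algebra element reads $\vec\zeta\cdot\vec K + \vec\theta\cdot\vec S$. The explicit entries assumed here ($K_i$ with unit entries in positions $0i$ and $i0$, and $S_k$ with spatial block $-\epsilon^{ijk}$) follow from the active conventions described above but are not quoted from Equations (42) and (44). In this real form the relations read $[S_i,S_j]=\epsilon_{ijk}S_k$, $[S_i,K_j]=\epsilon_{ijk}K_k$, and $[K_i,K_j]=-\epsilon_{ijk}S_k$.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def boost_gen(i):
    """Real boost generator K_i (assumed form): ones at (0, i+1) and (i+1, 0)."""
    M = [[0] * 4 for _ in range(4)]
    M[0][i + 1] = M[i + 1][0] = 1
    return M

def rot_gen(k):
    """Real rotation generator S_k (assumed form): spatial block -eps(i, j, k)."""
    M = [[0] * 4 for _ in range(4)]
    for i in range(3):
        for j in range(3):
            M[i + 1][j + 1] = -eps(i, j, k)
    return M

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(4)] for i in range(4)]

def eps_sum(i, j, gens, sign=1):
    """sign * sum_k eps(i, j, k) * gens[k] as a 4x4 matrix."""
    return [[sign * sum(eps(i, j, k) * gens[k][r][c] for k in range(3))
             for c in range(4)] for r in range(4)]

K = [boost_gen(i) for i in range(3)]
S = [rot_gen(k) for k in range(3)]

relations_hold = all(
    commutator(S[i], S[j]) == eps_sum(i, j, S)
    and commutator(S[i], K[j]) == eps_sum(i, j, K)
    and commutator(K[i], K[j]) == eps_sum(i, j, S, sign=-1)
    for i in range(3) for j in range(3))
```

The minus sign in the boost-boost commutator is the algebraic origin of the fact that two non-collinear boosts compose into a boost combined with a rotation.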
Using Equations (41) and (43), it follows that the matrix representation of a general infinitesimal Lorentz transformation, to linear order in the boost and rotation parameters, is given by
Note that we also could have written in Equation (46), since the infinitesimal Lorentz transformations commute at linear order.
In light of the remarks below Equation (40), one can conclude that the most general proper orthochronous Lorentz transformation matrix is a matrix given by
Here, we follow the conventions of Refs. [11,12]. Note that in the notation of Ref. [8], and , where the matrix representations of and are given in Equation (11.91) of Ref. [8] and yield . The argument of exp differs by an overall sign with Equation (11.93) of Ref. [8], where a passive Lorentz transformation is employed, which amounts to replacing with .
Equations (42), (44) and (47) imply that
As anticipated in Equation (39), is the most general real antisymmetric matrix, which depends on six real independent parameters and (). The satisfy the commutation relations [Equation (45)] of the real Lie algebra . As indicated in Equation (48), A is a real linear combination of the six Lie algebra generators and thus constitutes a general element of . In Section 4, we provide an explicit computation of .
Before moving on, we shall introduce a useful notation that assembles the matrices into six independent non-zero matrices, (with , ) such that
Note that Equation (49) implies that , so that the six independent matrices can be taken to be and (). The matrix elements of the are given by
where indicates the row and indicates the column of the corresponding matrix.
Using Equation (49), one can check that Equation (50) is equivalent to Equations (42) and (44). In addition, the commutation relations exhibited in Equation (45) now take the following form:
One can also assemble the boost and rotation parameters into a second rank antisymmetric tensor by defining
With this new notation, Equation (47) can be rewritten as
where . As usual, there is an implied sum over each pair of repeated upper/lower indices.
4. An Explicit Evaluation of
We now proceed to evaluate , where A is given by Equation (48). First, we compute the characteristic polynomial of A,
where
Solving Equation (55) for and yields
Note that and so that . The individual signs of a and b are not determined, but none of the results that follow depend on these signs. The eigenvalues of A, denoted by (), are the solutions of , which are given by
If , then the four eigenvalues of A [Equation (58)] are distinct, which implies that A is a diagonalizable matrix.
To evaluate $\exp A$ for a diagonalizable matrix $A$, we shall make use of a formula [Equation (60) below] that is based on the Lagrange interpolating polynomial. Consider an $n\times n$ matrix $A$ with $n$ eigenvalues of which $m$ are distinct and denoted by $\lambda_i$ ($i=1,2,\ldots,m$). The matrix $A$ is diagonalizable if and only if (e.g., see Section 8.3.2 of Ref. [7] or Section 7.11 of Ref. [13])
$$(A-\lambda_1\mathbf{I}_n)(A-\lambda_2\mathbf{I}_n)\cdots(A-\lambda_m\mathbf{I}_n) = 0, \qquad (59)$$
where $\mathbf{I}_n$ is the $n\times n$ identity matrix. Note that if $m=n$ (i.e., all $n$ eigenvalues are distinct), then $A$ is diagonalizable, since in this case Equation (59) is automatically satisfied due to the Cayley–Hamilton theorem.
Any function of a diagonalizable matrix A is given by the following formula (e.g., see Equations (7.3.6) and (7.3.11) of Ref. [13], Equation (5.4.17) of Ref. [14], or Chapter V, Section 2.2 of Ref. [15]):
if and if . Note that .
Applying Equation (60) to , where A is given by Equation (48), under the assumption that , it follows that
Simplifying the above expression yields
Combining terms, we end up with
where a and b are defined in Equation (55) and
in agreement with the results previously obtained in Refs. [3,4,5,6].
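The interpolation-based evaluation can be cross-checked numerically. The Python sketch below is a minimal illustration under stated assumptions: the matrix $A$ is taken to have the boost components in its first row and column and $-\epsilon^{ijk}\theta^k$ in its spatial block; its eigenvalues are taken to be $\pm ia$ and $\pm b$ with $a^2-b^2 = |\vec\theta|^2-|\vec\zeta|^2$ and $ab = \vec\theta\cdot\vec\zeta$; and the four coefficients are obtained by interpolating $e^x$ at these eigenvalues rather than being quoted from Equations (63)–(65). All function names are hypothetical. The result is compared with a truncated Taylor series of the exponential.

```python
import math

def eps(i, j, k):
    return (i - j) * (j - k) * (k - i) // 2

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def generator(zeta, theta):
    """Assumed form of A: boost entries in row/column 0, -eps*theta in the 3x3 block."""
    A = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        A[0][i + 1] = A[i + 1][0] = zeta[i]
        for j in range(3):
            A[i + 1][j + 1] = -sum(eps(i, j, k) * theta[k] for k in range(3))
    return A

def expm_taylor(A, terms=40):
    """Reference value: truncated Taylor series of exp(A)."""
    result = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(4)] for i in range(4)]
    return result

def lorentz_closed_form(zeta, theta):
    """exp(A) as a cubic polynomial in A; requires zeta . theta != 0."""
    z2 = sum(c * c for c in zeta)
    t2 = sum(c * c for c in theta)
    zt = sum(x * y for x, y in zip(zeta, theta))
    if abs(zt) < 1e-12:
        raise ValueError("degenerate case ab = 0 is not handled in this sketch")
    disc = math.sqrt((t2 - z2) ** 2 + 4.0 * zt ** 2)  # equals a^2 + b^2
    a = math.sqrt(0.5 * (disc + t2 - z2))
    b = math.sqrt(0.5 * (disc + z2 - t2))  # overall signs of a, b do not matter
    f0 = (b * b * math.cos(a) + a * a * math.cosh(b)) / disc
    f1 = (b * b * math.sin(a) / a + a * a * math.sinh(b) / b) / disc
    f2 = (math.cosh(b) - math.cos(a)) / disc
    f3 = (math.sinh(b) / b - math.sin(a) / a) / disc
    A = generator(zeta, theta)
    A2 = matmul(A, A)
    A3 = matmul(A2, A)
    return [[f0 * (1.0 if i == j else 0.0) + f1 * A[i][j]
             + f2 * A2[i][j] + f3 * A3[i][j] for j in range(4)] for i in range(4)]

zeta, theta = (0.3, -0.2, 0.5), (0.4, 0.1, -0.7)
closed = lorentz_closed_form(zeta, theta)
series = expm_taylor(generator(zeta, theta))
max_diff = max(abs(closed[i][j] - series[i][j]) for i in range(4) for j in range(4))
```

Because each coefficient is even in $a$ and in $b$, the result is insensitive to the sign choices for $a$ and $b$; the degenerate cases with $ab=0$ would need the separate limiting formulas and are deliberately excluded here.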
The matrix A and its powers can be conveniently written in block matrix form:
and
The element of can be simplified by noting that the element of any antisymmetric matrix must be of the form (after summing over the repeated index k). Thus,
Multiplying the above equation by and summing over i and j yields
It follows that That is, we have derived the identity
Thus, the matrix [Equation (67)] can be rewritten in a more convenient form,
Consider separately the case of . The eigenvalues given in Equation (58) are no longer distinct. If and , then the matrix A is diagonalizable since A satisfies Equation (59), i.e., . In particular, if then Equation (55) implies that and . Plugging these results into Equations (66) and (71) yields . Consequently, one can make use of Equation (60) with to obtain
One can check that Equation (72) coincides with the limit of Equations (63)–(65) after making use of .
Likewise, if and , then the matrix A is diagonalizable since A satisfies Equation (59), i.e., . In particular, if , then Equation (55) implies that and . Plugging these results into Equations (66) and (71) yields . Consequently, one can make use of Equation (60) with to obtain
One can check that Equation (73) coincides with the limit of Equations (63)–(65) after making use of .
Finally, in the case of , Equation (55) yields and . Using Equation (71), it then follows that . Thus, the Taylor series of the exponential terminates and one obtains
Although one cannot directly employ Equation (60) in this final case (since A is no longer diagonalizable), one can still recover Equation (74) either by taking the limit of Equation (72) or the limit of Equation (73).
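The terminating Taylor series can be seen explicitly in a simple example. The sketch below assumes the same block form of $A$ as described in the text (boost components in the first row and column, $-\epsilon^{ijk}\theta^k$ in the spatial block) and picks the illustrative parameters $\vec\zeta=(1,0,0)$ and $\vec\theta=(0,1,0)$, for which $|\vec\zeta|=|\vec\theta|$ and $\vec\zeta\cdot\vec\theta=0$. The helper names are hypothetical.

```python
def eps(i, j, k):
    return (i - j) * (j - k) * (k - i) // 2

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def generator(zeta, theta):
    """Assumed form of A: boost entries in row/column 0, -eps*theta in the 3x3 block."""
    A = [[0] * 4 for _ in range(4)]
    for i in range(3):
        A[0][i + 1] = A[i + 1][0] = zeta[i]
        for j in range(3):
            A[i + 1][j + 1] = -sum(eps(i, j, k) * theta[k] for k in range(3))
    return A

A = generator((1, 0, 0), (0, 1, 0))  # |zeta| = |theta| and zeta . theta = 0
A2 = matmul(A, A)
A3 = matmul(A2, A)                   # expected to be the zero matrix

# With A^3 = 0 the exponential series stops after the quadratic term.
Lam = [[(1.0 if i == j else 0.0) + A[i][j] + 0.5 * A2[i][j] for j in range(4)]
       for i in range(4)]

G_DIAG = [1.0, -1.0, -1.0, -1.0]
defect = max(abs(sum(Lam[k][i] * G_DIAG[k] * Lam[k][j] for k in range(4))
                 - (G_DIAG[i] if i == j else 0.0))
             for i in range(4) for j in range(4))
```

In this example $A$ is nilpotent yet non-zero, so it cannot be diagonalized, while the three-term sum still satisfies the defining condition of a Lorentz transformation exactly.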
It is instructive to check the two limiting cases exhibited in Section 2. First, if , then and . It then follows that
where is a matrix of zeros. Using Equations (72) and (75), we obtain
in agreement with Equation (22).
Second, if , then and . It follows that
Using Equations (73) and (77), we end up with
after identifying . We have thus recovered Equation (26) and Rodrigues’ rotation formula [Equation (27)].
A final limiting case of interest is the most general orthochronous Lorentz transformation in spacetime dimensions. In this case, we can choose and , which implies that [cf. Equation (55)]. Without loss of generality, one can take and , where is the square of the rotation angle (in two space dimensions, there is no danger in confusing with the second component of the vector ). Hence, Equation (73) yields
where the matrices A and are given in block-diagonal form by
with i, (and an implied sum over ), , and .
5. An Explicit Evaluation of
In Section 3, we remarked that a general element of the Lie algebra $\mathfrak{so}(1,3)$ is a real linear combination of the six boost and rotation generators. In particular, the $4\times 4$ matrix $A$ defined in Equation (48) provides a four-dimensional matrix representation of $\mathfrak{so}(1,3)$. The corresponding matrix that represents a general element of the proper orthochronous Lorentz group, SO0(1,3), is then obtained by exponentiation, $\Lambda = \exp A$. In this section, we will take advantage of the existence of a two-dimensional matrix representation of $\mathfrak{so}(1,3)$. It is noteworthy that by exponentiating this two-dimensional representation, one obtains a two-dimensional matrix representation of the group of $2\times 2$ complex matrices with unit determinant, which defines the Lie group SL(2,ℂ). Thus, the two-dimensional matrix representation of SL(2,ℂ) provides representation matrices $M$ [defined in Equation (81) below] for the elements of SO0(1,3). However, in this case, the matrices $M$ and $-M$ of SL(2,ℂ) represent the same element of SO0(1,3) [cf. Equation (91)].
For example, consider the general element of the two-dimensional representation of that is given by
where and are the boost and rotation vectors that parametrize an element of the proper orthochronous Lorentz group and are the three Pauli matrices assembled into a vector whose components are the matrices,
It is convenient to define a fourth Pauli matrix, , where is the identity matrix. We can then define the four Pauli matrices in a unified notation. Following the notation of Refs. [11,12], we define
where . Note that these sigma matrices have been defined with an upper (contravariant) index. They are related to sigma matrices with a lower (covariant) index in the usual way:
However, the use of the spacetime indices and is slightly deceptive since the sigma matrices defined above are fixed matrices that do not change under a Lorentz transformation.
It is also convenient to introduce the set of matrices,
One can then rewrite Equation (81) in the following form that is reminiscent of Equation (53),
That is, the six independent matrices are generators of the Lie algebra of , henceforth denoted by . It is straightforward to check that the matrices possess the same commutation relations as the matrices [cf. Equation (51)], which establishes the isomorphism .
Under an active Lorentz transformation, a two-component spinor (where ) transforms as
Suppose that and are two-component spinors and consider the spinor product . Under a Lorentz transformation,
We assert that the quantity transforms as a Lorentz four-vector,
The standard proof of this assertion based on the analysis of infinitesimal Lorentz transformations is given in Appendix B. (See also Appendix C, where the corresponding result is obtained by employing the four-component spinor formalism.) Equations (88) and (89) imply that the following identity must be satisfied:
Multiplying Equation (90) on the right by and using , it follows that
It is now convenient to introduce the complex vector, , and the associated quantity,
One can now evaluate the matrix exponential [cf. Equation (81)] by making use of Equation (60) if . The corresponding eigenvalues of are . Hence,
Note that the limit as is continuous and yields .
Since the Pauli matrices are hermitian,
We shall evaluate $\Lambda^\mu{}_\nu$ in four separate cases depending on whether each of the spacetime indices $\mu$ and $\nu$ is 0 or a spatial index $i\in\{1,2,3\}$. In particular, using block matrix notation, Equation (91) yields
where we have used to obtain the final matrix expression above.
Plugging Equations (93) and (94) into Equation (91) and evaluating the traces,
we end up with the following expressions:
where means the complex conjugate of the previous term and is defined in Equation (92). Note that since is a complex quantity, and in Equations (99)–(102).
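The trace formula lends itself to a direct numerical check against the matrix exponential of Section 4. The sketch below is a minimal illustration under several assumptions that are not quoted from the text: $M = \exp(-\tfrac12\vec z\cdot\vec\sigma)$ with the complex vector $\vec z = \vec\zeta + i\vec\theta$; the closed form $M = \cosh(\Delta/2) - (\vec z\cdot\vec\sigma/\Delta)\sinh(\Delta/2)$ with $\Delta^2 = \vec z\cdot\vec z$; the trace formula in the form $\Lambda^\mu{}_\nu = \tfrac12\,\mathrm{Tr}(M^\dagger\bar\sigma^\mu M\sigma_\nu)$ with $\bar\sigma^\mu = (1,-\vec\sigma)$ and $\sigma_\nu = (1,-\vec\sigma)$; and the block form of $A$ used for the comparison. All function names are hypothetical.

```python
import cmath

I2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]  # Pauli matrices

def mm(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(X):
    return [[complex(X[j][i]).conjugate() for j in range(2)] for i in range(2)]

def sl2c_matrix(zeta, theta):
    """M = exp(-z.sigma/2) with z = zeta + i*theta (assumed convention)."""
    z = [complex(zeta[k], theta[k]) for k in range(3)]
    zs = [[sum(z[k] * SIGMA[k][r][c] for k in range(3)) for c in range(2)]
          for r in range(2)]
    delta = cmath.sqrt(sum(c * c for c in z))
    if abs(delta) < 1e-9:
        ch, sh_over_delta = 1.0, 0.5  # smooth limit as delta -> 0
    else:
        ch, sh_over_delta = cmath.cosh(delta / 2), cmath.sinh(delta / 2) / delta
    return [[ch * I2[r][c] - sh_over_delta * zs[r][c] for c in range(2)]
            for r in range(2)]

def lorentz_from_sl2c(M):
    """Lambda^mu_nu = (1/2) Tr(M^dagger sigmabar^mu M sigma_nu)."""
    minus_sigma = [[[-x for x in row] for row in s] for s in SIGMA]
    sigma_bar_up = [I2] + minus_sigma  # (1, -sigma)
    sigma_down = [I2] + minus_sigma    # metric-lowered (1, sigma) -> (1, -sigma)
    Md = dagger(M)
    L = [[0.0] * 4 for _ in range(4)]
    for mu in range(4):
        left = mm(mm(Md, sigma_bar_up[mu]), M)
        for nu in range(4):
            P = mm(left, sigma_down[nu])
            L[mu][nu] = (0.5 * (P[0][0] + P[1][1])).real
    return L

def eps(i, j, k):
    return (i - j) * (j - k) * (k - i) // 2

def exp_generator(zeta, theta, terms=40):
    """Taylor series of exp(A) for the assumed 4x4 form of A."""
    A = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        A[0][i + 1] = A[i + 1][0] = zeta[i]
        for j in range(3):
            A[i + 1][j + 1] = -sum(eps(i, j, k) * theta[k] for k in range(3))
    result = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mm(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(4)] for i in range(4)]
    return result

zeta, theta = (0.3, -0.2, 0.5), (0.4, 0.1, -0.7)
M = sl2c_matrix(zeta, theta)
from_trace = lorentz_from_sl2c(M)
from_exp = exp_generator(zeta, theta)
max_diff = max(abs(from_trace[i][j] - from_exp[i][j])
               for i in range(4) for j in range(4))
```

Since $M$ enters the trace quadratically, replacing $M$ by $-M$ leaves the resulting $4\times 4$ matrix unchanged, which makes the two-to-one correspondence between the two groups explicit.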
We can check the results of Equations (99)–(102) in three special cases. First, consider the case of a pure boost, where . Then, and . Plugging these values into Equations (99)–(102) yields the following block matrix form:
which again reproduces the result of Equation (22).
Second, consider the case of . Then, and . Plugging these values into Equations (99)–(102) and writing yields
Once again, we have recovered Equation (26) and Rodrigues’ rotation formula [Equation (27)].
Third, one can check that Equations (99)–(102) reduce to the most general orthochronous Lorentz transformation in spacetime dimensions for if we take , , and , which implies that . The resulting formulae reproduce the expressions obtained in Equations (79) and (80).
Finally, it is instructive to consider the case of an infinitesimal Lorentz transformation. Working to linear order in and , note that in light of Equation (92). Hence, Equations (99)–(102) reduce to the following result given in block matrix form:
which coincides with Equation (46).
6. Reconciling the Results of Section 4 and Section 5
In this section, we shall verify that the explicit expressions for obtained, respectively, in Section 4 and Section 5 coincide in the general case of non-zero boost and rotation parameters.
First, it is convenient to rewrite Equations (56) and (57) as follows:
where $\Delta$ is defined in Equation (92). As noted below Equation (57), $a$ and $b$ are real, but their undetermined signs have no impact on the expressions obtained for the matrix elements of $\Lambda(\vec\zeta,\vec\theta)$. Using Equation (55), we can fix the relative sign of $a$ and $b$ by choosing $ab = \vec\theta\cdot\vec\zeta$. It then follows that
After taking the positive square root, the signs of a and b are now fixed by identifying
One can check that Equations (99)–(102) are unchanged if and/or . This reflects the fact that the expressions obtained for the matrix elements of do not depend on the choice of signs for a and b.
Thus, Equations (63)–(66) and (71) yield:
after making use of Equation (106). We now employ the following two identities:
Hence, Equations (108) and (109) yield
in agreement with Equation (99).
Next, Equations (63)–(66) and (71) yield
Using Equation (55), it follows that and [the latter with the sign conventions adopted above Equation (107)]. Inserting these results into Equation (113), we obtain
We can rewrite Equation (114) with the help of some identities. It is straightforward to show that
Collecting the results obtained above, we end up with
in agreement with Equation (100).
The computation of is nearly identical. The only change is due to the change in the sign multiplying the term proportional to the Levi–Civita tensor. Consequently, it is convenient to replace Equation (117) with an equivalent form:
Hence, we end up with
in agreement with Equation (101).
Finally, we use Equations (63)–(66) and (71) to obtain
The following identities can be derived:
Note that the terms proportional to in Equation (121) combine nicely and yield
after putting .
Collecting the results obtained above, we end up with
in agreement with Equation (102).
We have therefore verified by an explicit computation that the results obtained in Equations (63)–(65) are equivalent to Equations (99)–(102). In particular, we have established that
where .
7. Final Remarks
The main goal of this paper is to exhibit an explicit form for the proper orthochronous Lorentz transformation matrix as a function of general boost and rotation parameters and . Whereas the matrices and are well known and appear in many textbooks, the explicit form for more general is much less well known. Two different derivations are provided for . One derivation evaluates the exponential of a real matrix A that satisfies [where ], and a second derivation evaluates , where the matrix . Although the results obtained by the two computations look somewhat different at first, we have verified by explicit calculation that these two results are actually equivalent.
One can also obtain the most general proper orthochronous Lorentz transformation in another way by invoking the following theorem (e.g., see Section 1.5 of Ref. [1], Section 6.6 of Ref. [16], or Section 4.5 of Ref. [17]):
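Every proper orthochronous Lorentz transformation possesses a unique factorization into a product of a boost and a rotation in two different ways:
$$\Lambda(\vec\zeta,\vec\theta) = \Lambda(\vec\zeta{}',\vec 0)\,\Lambda(\vec 0,\vec\theta{}') = \Lambda(\vec 0,\vec\theta{}'')\,\Lambda(\vec\zeta{}'',\vec 0), \qquad (129)$$
for an appropriate choice of parameters $\{\vec\zeta{}',\vec\theta{}'\}$ and $\{\vec\zeta{}'',\vec\theta{}''\}$, respectively. Equation (129) is called the polar decomposition of SO0(1,3) in Refs. [10,18,19].
In particular, if none of the parameters are zero, then the primed parameters generally differ from $\vec\zeta$ and $\vec\theta$ due to the fact that boosts and rotations do not commute [as a consequence of the commutation relations given in Equation (45)]. Indeed, for generic non-vanishing boost and rotation parameters,
$$\Lambda(\vec\zeta,\vec 0)\,\Lambda(\vec 0,\vec\theta) \neq \Lambda(\vec 0,\vec\theta)\,\Lambda(\vec\zeta,\vec 0). \qquad (130)$$
In contrast to Equation (130), when considering infinitesimal Lorentz transformations, the boost matrix [Equation (41)] and the rotation matrix [Equation (43)] commute at linear order, which results in Equation (46). The effects of the noncommutativity appear first at quadratic order in the boost and rotation parameters.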
Given the parameters , (or , ), it would be quite useful to be able to obtain expressions for the corresponding parameters of . The formulae that determine in Equation (129) are quite complicated [20], although they could in principle be derived by using the explicit matrix representations given in this paper. This is left as an exercise for the reader.
Funding
This research was partially supported by the U.S. Department of Energy Grant number DE-SC0010107.
Data Availability Statement
Data are contained within the article.
Acknowledgments
I am grateful to João P. Silva for discussions in which he challenged me to provide an explicit proof of Equation (128) and for his encouragement during the writeup of this work.
Conflicts of Interest
The author declares no conflicts of interest.
Appendix A. Rodrigues’ Rotation Formula
A proper rotation matrix [which satisfies and ] represents an active transformation consisting of a counterclockwise rotation by an angle about an axis with respect to a fixed Cartesian coordinate system. For example, the matrix representation of the counterclockwise rotation by an angle about the z-axis is given by
The matrix elements of will be denoted by , where the indices of the tensors in this Appendix are written in the lowered position to simplify the typography of the presentation. The goal of this Appendix is to provide a simple derivation of Rodrigues’ formula for an active (counterclockwise) rotation by an angle about an axis that points along the unit vector . Note that since is a unit vector, it follows that
The traditional approach to deriving Rodrigues’ rotation formula involves the computation of the exponential of an arbitrary real antisymmetric matrix (e.g., see Refs. [9,10]). Below, we provide an alternative derivation of the formula for that makes use of the techniques of tensor algebra.
Consider how changes under an orthogonal change of basis, which can be viewed as a orthogonal transformation of the coordinate axes. Using the well-known results derived in any textbook on matrices and linear algebra, one can check that the transformation of the components of under a change of basis corresponds to the transformation law of a second-rank Cartesian tensor. Likewise, the are components of a vector (equivalently, a first-rank tensor). Two other important quantities of the analysis are the invariant tensors (the Kronecker delta) and (the Levi–Civita tensor). If we invoke the covariance of Cartesian tensor equations, then one must be able to express in terms of a second-rank tensor composed of , and , as there are no other tensors in the problem that could provide a source of indices. Thus, the form of the formula for must be
where there is an implicit sum over the repeated index k in the last term of Equation (A3). The numbers a, b and c are real scalar quantities. As such, a, b and c are functions of $\theta$, since the rotation angle is the only scalar variable in this problem.
We now determine the conditions that are satisfied by a, b and c. The first condition is obtained by noting that
$$R(\hat{\boldsymbol{n}},\theta)\,\hat{\boldsymbol{n}}=\hat{\boldsymbol{n}}. \tag{A4}$$
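This is clearly true, since $R(\hat{\boldsymbol{n}},\theta)$, when acting on a vector, rotates the vector around the axis $\hat{\boldsymbol{n}}$, whereas any vector parallel to the axis of rotation is invariant under the action of $R(\hat{\boldsymbol{n}},\theta)$. In terms of components,
$$R_{ij}\,n_j=n_i. \tag{A5}$$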
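To determine the consequence of this equation, we insert Equation (A3) into Equation (A5). The term proportional to c drops out because $\epsilon_{ijk}n_j n_k=0$ by the antisymmetry of $\epsilon_{ijk}$. In light of Equation (A2), it follows immediately that $n_i(a+b)=n_i$. Hence,
$$a+b=1. \tag{A6}$$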
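Since the formula for $R_{ij}$ given by Equation (A3) must be completely general, it must hold for any special case. In particular, consider the case where $\hat{\boldsymbol{n}}=\hat{\boldsymbol{z}}$. In this case, Equations (A1) and (A3) yield
$$R_{11}=\cos\theta=a,\qquad R_{12}=-\sin\theta=c\,\epsilon_{123}\,n_3=c, \tag{A7}$$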
after using $\epsilon_{123}=1$. Consequently, Equations (A6) and (A7) yield
$$a=\cos\theta,\qquad b=1-\cos\theta,\qquad c=-\sin\theta. \tag{A8}$$
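Inserting these results into Equation (A3), we obtain Rodrigues’ rotation formula:
$$R_{ij}(\hat{\boldsymbol{n}},\theta)=\cos\theta\,\delta_{ij}+(1-\cos\theta)\,n_i n_j-\sin\theta\,\epsilon_{ijk}n_k. \tag{A9}$$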
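Note that
$$R(\hat{\boldsymbol{n}},\theta+2\pi k)=R(\hat{\boldsymbol{n}},\theta),\qquad k=0,\pm 1,\pm 2,\ldots,$$
$$[R(\hat{\boldsymbol{n}},\theta)]^{-1}=R(\hat{\boldsymbol{n}},-\theta)=R(-\hat{\boldsymbol{n}},\theta).$$
Combining these two results, it follows that
$$R(\hat{\boldsymbol{n}},2\pi-\theta)=R(-\hat{\boldsymbol{n}},\theta),$$
which implies that any three-dimensional proper rotation can be described by a counterclockwise rotation by an angle $\theta$ about some axis $\hat{\boldsymbol{n}}$, where $0\leq\theta\leq\pi$.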
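The following is a minimal numerical sketch of Rodrigues’ formula, assuming the standard form $R_{ij}=\cos\theta\,\delta_{ij}+(1-\cos\theta)n_in_j-\sin\theta\,\epsilon_{ijk}n_k$ for an active counterclockwise rotation. The function and variable names (`rodrigues`, `max_diff`, `axis`) and the numerical values are illustrative choices and are not taken from the paper. The sketch checks orthogonality, the special case of a rotation about the z-axis, and the fact that a rotation by $2\pi-\theta$ about $\hat{\boldsymbol{n}}$ coincides with a rotation by $\theta$ about $-\hat{\boldsymbol{n}}$, which is the property behind the restricted angle range.

```python
import math

def rodrigues(n, theta):
    """Rotation matrix R_ij = cos(t) d_ij + (1 - cos(t)) n_i n_j - sin(t) eps_ijk n_k."""
    norm = math.sqrt(sum(x * x for x in n))
    n = [x / norm for x in n]          # enforce the unit-vector condition
    c, s = math.cos(theta), math.sin(theta)

    def eps(i, j, k):
        # Levi-Civita symbol for indices in {0, 1, 2}
        return (i - j) * (j - k) * (k - i) // 2

    return [[c * (i == j) + (1 - c) * n[i] * n[j]
             - s * sum(eps(i, j, k) * n[k] for k in range(3))
             for j in range(3)] for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

theta = 0.7
axis = (1.0, 2.0, 2.0)                 # arbitrary illustrative axis, normalised internally
R = rodrigues(axis, theta)
identity = [[float(i == j) for j in range(3)] for i in range(3)]

# Orthogonality: R R^T should equal the identity.
orthogonality_error = max_diff(matmul(R, transpose(R)), identity)

# Special case n = z should reproduce the familiar rotation about the z-axis.
Rz = rodrigues((0.0, 0.0, 1.0), theta)
Rz_expected = [[math.cos(theta), -math.sin(theta), 0.0],
               [math.sin(theta),  math.cos(theta), 0.0],
               [0.0, 0.0, 1.0]]
z_axis_error = max_diff(Rz, Rz_expected)

# A rotation by 2*pi - theta about n equals a rotation by theta about -n.
minus_axis = tuple(-x for x in axis)
reflection_error = max_diff(rodrigues(axis, 2 * math.pi - theta),
                            rodrigues(minus_axis, theta))

print(orthogonality_error, z_axis_error, reflection_error)
```

All three printed numbers should be at the level of floating-point rounding.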
Appendix B. The Spinor Product Transforms as a Lorentz Four-Vector
Equation (89) asserts that the spinor product transforms as a Lorentz four-vector. In light of Equation (88), it follows that Equation (90) must be satisfied (and vice versa). In this Appendix, we shall establish Equation (90) by demonstrating that both sides of this identity agree to first order in the rotation and boost parameters.
In addition to the defined in Equation (85), it is convenient to introduce the set of matrices,
Then, using the properties of the Pauli matrices, Equations (81) and (86) yield
Working to first order in the parameters and making use of Equations (50), (53), (86), and (A14),
It then follows that
One can easily derive the following identity [11,12]:
Hence, Equation (A18) yields
after using the antisymmetry of in the penultimate step above. After employing Equation (A15) in the final step above, we conclude that
thereby confirming Equation (90). In particular, it follows that the spinor product transforms as a Lorentz four-vector in light of Equations (88) and (89), as previously noted. Equation (A21) is a statement of the well-known isomorphism $\mathrm{SO}(1,3)_0\cong \mathrm{SL}(2,\mathbb{C})/\mathbb{Z}_2$, since the $\mathrm{SL}(2,\mathbb{C})$ matrices $M$ and $-M$ correspond to the same Lorentz transformation $\Lambda$.
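The two-to-one correspondence between $\mathrm{SL}(2,\mathbb{C})$ matrices and Lorentz transformations can be illustrated numerically. The sketch below assumes one common convention, $\Lambda^\mu{}_\nu=\tfrac12\,\mathrm{Tr}(s_\mu M s_\nu M^\dagger)$ with $s=(\mathbf{I},\sigma_1,\sigma_2,\sigma_3)$, obtained from $X\to MXM^\dagger$ acting on $X=x^\nu s_\nu$. This may differ from the paper's Equation (91) in the placement of $M$ and $M^\dagger$ or in the sign conventions for the parameters. All names and numerical values are illustrative.

```python
import math

# Identity and Pauli matrices, used as a basis for 2x2 Hermitian matrices.
S = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ETA = [1.0, -1.0, -1.0, -1.0]   # diagonal of the Minkowski metric

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def lorentz_from_sl2c(M):
    """Lambda[mu][nu] = (1/2) Tr(S_mu M S_nu M^dagger), a real 4x4 matrix."""
    Md = dagger(M)
    lam = []
    for mu in range(4):
        row = []
        for nu in range(4):
            P = mul(S[mu], mul(M, mul(S[nu], Md)))
            row.append(0.5 * (P[0][0] + P[1][1]).real)
        lam.append(row)
    return lam

def metric_error(lam):
    """Largest entry of |Lambda^T eta Lambda - eta|."""
    err = 0.0
    for a in range(4):
        for b in range(4):
            val = sum(lam[m][a] * ETA[m] * lam[m][b] for m in range(4))
            target = ETA[a] if a == b else 0.0
            err = max(err, abs(val - target))
    return err

zeta, theta = 0.8, 0.5           # illustrative rapidity and rotation angle
boost_z = [[math.exp(zeta / 2), 0], [0, math.exp(-zeta / 2)]]
rot_x = [[math.cos(theta / 2), -1j * math.sin(theta / 2)],
         [-1j * math.sin(theta / 2), math.cos(theta / 2)]]
M = mul(boost_z, rot_x)          # determinant 1 by construction
minus_M = [[-x for x in row] for row in M]

L = lorentz_from_sl2c(M)
L_minus = lorentz_from_sl2c(minus_M)
sign_error = max(abs(L[m][k] - L_minus[m][k]) for m in range(4) for k in range(4))
L_boost = lorentz_from_sl2c(boost_z)

print(metric_error(L), sign_error, L_boost[0][0], L_boost[0][3])
```

In this convention the pure-boost example gives $\Lambda^0{}_0=\cosh\zeta$ and $\Lambda^0{}_3=\sinh\zeta$, and replacing $M$ by $-M$ leaves every entry of $\Lambda$ unchanged because $M$ appears quadratically.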
Of course, the derivation of Equation (A21) is much simpler than a direct derivation of Equation (91), which requires the explicit evaluation of all the relevant matrix exponentials. Indeed, having derived Equation (A21) to first order in the transformation parameters, we can assert that this result must be true for arbitrary values of those parameters. A derivation based on the infinitesimal forms of the matrices involved is sufficient because of the strong constraints imposed by the group multiplication law of the Lorentz group near the identity element, which in light of the discussion following Equation (40) implies that a proper orthochronous Lorentz transformation can be expressed as an exponential of an element of the corresponding Lie algebra.
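The statement that a proper orthochronous Lorentz transformation is the exponential of a Lie algebra element can be checked numerically. The sketch below assumes a Lie algebra element of the form $A=\eta S$, with $\eta=\mathrm{diag}(1,-1,-1,-1)$ and $S$ an arbitrary real antisymmetric matrix, in line with the description of $A$ in the abstract. The ordering of the product, the truncated Taylor series used for the exponential, and the numerical entries of $S$ are assumptions made for illustration, not the paper's method of evaluation.

```python
ETA = [1.0, -1.0, -1.0, -1.0]   # diagonal of the Minkowski metric (assumed signature +,-,-,-)

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expm_series(A, terms=60):
    """Truncated Taylor series for exp(A); adequate for matrices of modest norm."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds A^k / k!
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# An arbitrary real antisymmetric matrix (illustrative numbers only).
upper = {(0, 1): 0.3, (0, 2): -0.2, (0, 3): 0.5, (1, 2): 0.4, (1, 3): -0.1, (2, 3): 0.6}
Sm = [[0.0] * 4 for _ in range(4)]
for (i, j), v in upper.items():
    Sm[i][j], Sm[j][i] = v, -v

A = [[ETA[i] * Sm[i][j] for j in range(4)] for i in range(4)]   # A = eta * S
Lam = expm_series(A)

# Defining property of a Lorentz transformation: Lam^T eta Lam = eta.
metric_error = max(
    abs(sum(Lam[m][a] * ETA[m] * Lam[m][b] for m in range(4)) - (ETA[a] if a == b else 0.0))
    for a in range(4) for b in range(4)
)
print(metric_error, Lam[0][0])
```

Since $A^{\mathsf T}\eta+\eta A=0$ for any such $A$, the exponential preserves the metric, and the time-time entry is at least 1, as expected for an orthochronous transformation.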
There is a second inequivalent two-dimensional matrix representation of $\mathrm{SL}(2,\mathbb{C})$ whose general element is represented by the matrix $(M^{-1})^\dagger$, as discussed in greater detail in Refs. [11,12]. This leads to a second identity that is similar to that of Equation (A21):
One can derive Equation (A22) by again working to first order in the parameters and making use of Equations (A15)–(A17):
In light of the identity [11,12],
it follows that
which establishes Equation (A22) after employing Equation (A15) in the final step above.
Multiplying Equation (A22) on the right by and using , it follows that
which provides yet another formula for the most general orthochronous Lorentz transformation matrix. Using block matrix notation, Equation (A26) yields
after noting that [cf. Equation (83)]. Comparing with Equation (95), we see that and , which results in and , or equivalently and . In addition, the block off-diagonal elements of have changed sign. Under these replacements, it is straightforward to check that the resulting expressions for are the same as those obtained previously in Equations (99)–(102). That is, Equation (A26) is established by explicit calculation.
Appendix C. $\bar{\Psi}\gamma^\mu\Psi$ Transforms as a Lorentz Four-Vector
Most textbook treatments of the Dirac equation employ the more familiar four-component spinors and Dirac gamma matrices (e.g., see Ref. [21]). The relation between the two-component and four-component spinor formalisms is briefly presented in this Appendix. Further details can be found in Refs. [11,12].
One can construct four-component spinors
in terms of a pair of two-component spinors and . The Dirac gamma matrices are defined via their anticommutation relations:
In the so-called chiral representation of the gamma matrices,
It is convenient to introduce
where . The Dirac adjoint spinor is defined by
The matrices and satisfy
Four-component spinors transform under an active Lorentz transformation as
where
combines the two inequivalent two-dimensional matrix representations of ,
To compute the corresponding matrix inverses, simply change the overall sign of the parameters . For example,
In light of Equation (A34), one can easily check that the matrix satisfies
Using Equations (A32) and (A35), it then follows that
Finally, taking the hermitian conjugate of Equation (A40) and using Equation (A33) [which implies that in light of Equation (A29)], we end up with
under an active Lorentz transformation.
It is now straightforward to verify that the identities, Equations (A21) and (A22), derived in Appendix B, are equivalent to
after employing Equations (A30) and (A36). Consequently, in light of Equations (A35), (A42) and (A43), it follows that under an active Lorentz transformation,
That is, under a Lorentz transformation, transforms as a four-vector. Moreover, using , Equation (A43) yields
Of course, Equation (A45) is equivalent to Equations (91) and (A26) taken together.
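As a concrete check of the gamma-matrix algebra used in this Appendix, the sketch below builds the gamma matrices in a chiral representation and verifies the anticommutation relations $\{\gamma^\mu,\gamma^\nu\}=2\eta^{\mu\nu}\mathbf{I}_4$. It assumes the convention of Refs. [11,12], namely $\gamma^\mu$ with vanishing diagonal blocks, $\sigma^\mu=(\mathbf{I},\boldsymbol{\sigma})$ in the upper-right block and $\bar\sigma^\mu=(\mathbf{I},-\boldsymbol{\sigma})$ in the lower-left block, together with $\eta=\mathrm{diag}(1,-1,-1,-1)$. The block layout and all names in the code are assumptions for illustration.

```python
I2 = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ZERO2 = [[0, 0], [0, 0]]
ETA = [1, -1, -1, -1]            # diagonal of the Minkowski metric

def neg(a):
    return [[-x for x in row] for row in a]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

sigma = [I2] + PAULI                         # sigma^mu    = (1,  sigma_i)
sigma_bar = [I2] + [neg(p) for p in PAULI]   # sigmabar^mu = (1, -sigma_i)
gamma = [block(ZERO2, sigma[mu], sigma_bar[mu], ZERO2) for mu in range(4)]

def anticommutator_error(mu, nu):
    """Largest entry of |gamma^mu gamma^nu + gamma^nu gamma^mu - 2 eta^{mu nu} 1|."""
    ab = matmul(gamma[mu], gamma[nu])
    ba = matmul(gamma[nu], gamma[mu])
    target = 2 * ETA[mu] if mu == nu else 0
    return max(abs(ab[i][j] + ba[i][j] - (target if i == j else 0))
               for i in range(4) for j in range(4))

clifford_error = max(anticommutator_error(mu, nu) for mu in range(4) for nu in range(4))
print(clifford_error)
```

Because every matrix entry is 0, ±1 or ±i, the check is exact and the reported error is zero.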
References
1. Sexl, R.U.; Urbantke, H.K. Relativity, Groups, Particles: Special Relativity and Relativistic Symmetry in Field and Particle Physics; Springer: Wien, Austria, 2001.
2. Markoutsakis, M. Geometry, Symmetries, and Classical Physics—A Mosaic; CRC Press: Boca Raton, FL, USA, 2022.
3. Zeni, J.R.; Rodrigues, W.A., Jr. The Exponential of the Generators of the Lorentz Group and the Solution of the Lorentz Force. Hadron. J. 1990, 3, 317–327.
4. Geyer, C.M. Catadioptric Projective Geometry: Theory and Applications. Ph.D. Dissertation, University of Pennsylvania, Philadelphia, PA, USA, 2003.
5. Dimitro, G.K.; Mladenov, I.M. A New Formula for the Exponents of the Generators of the Lorentz Group. In Proceedings of the Seventh International Conference on Geometry, Integrability and Quantization, Varna, Bulgaria, 2–10 June 2005; Mladenov, I.M., de León, M., Eds.; pp. 98–115.
6. Andrica, D.; Rohan, R.-A. A new way to derive the Rodrigues formula for the Lorentz group. Carpathian J. Math. 2014, 30, 23–29.
7. Carrell, J.B. Groups, Matrices, and Vector Spaces; Springer Science+Business Media, LLC: New York, NY, USA, 2017.
8. Jackson, J.D. Classical Electrodynamics, 3rd ed.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1999.
9. Marsden, J.E.; Ratiu, T.S. Introduction to Mechanics and Symmetry—A Basic Exposition of Classical Mechanical Systems, 2nd ed.; Springer: New York, NY, USA, 1999.
10. Gallier, J.; Quaintance, J. Differential Geometry and Lie Groups—A Computational Perspective; Springer Nature Switzerland AG: Cham, Switzerland, 2020.
11. Dreiner, H.K.; Haber, H.E.; Martin, S.P. Two-component spinor techniques and Feynman rules for quantum field theory and supersymmetry. Phys. Rep. 2010, 494, 1–196.
12. Dreiner, H.K.; Haber, H.E.; Martin, S.P. From Spinors to Supersymmetry; Cambridge University Press: Cambridge, UK, 2023.
13. Meyer, C.D. Matrix Analysis and Applied Linear Algebra; SIAM: Philadelphia, PA, USA, 2000.
14. Mehta, M.L. Matrix Theory—Selected Topics and Useful Results; Hindustan Publishing Corporation: New Delhi, India, 1989.
15. Gantmacher, F.R. Theory of Matrices; Chelsea Publishing Company: New York, NY, USA, 1959; Volume I.
16. Rao, K.N.S. The Rotation and Lorentz Groups and Their Representations for Physicists; Wiley Eastern Limited: New Delhi, India, 1988.
17. Scheck, F. Mechanics: From Newton’s Laws to Deterministic Chaos, 6th ed.; Springer: Berlin, Germany, 2018.
18. Moretti, V. The interplay of the polar decomposition theorem and the Lorentz group. Lect. Notes Semin. Interdiscip. Mat. 2006, 5, 153–171.
19. Urbantke, H.K. Elementary Proof of Moretti’s Polar Decomposition Theorem for Lorentz Transformations. arXiv 2002, arXiv:math-ph/0211077.
20. Karplyuk, K.S.; Kozak, M.I.; Zhmudskyy, O.O. Factorization of the Lorentz Transformations. Ukr. J. Phys. 2023, 68, 19–24.
21. Peskin, M.E.; Schroeder, D.V. An Introduction to Quantum Field Theory; Westview Press: Boulder, CO, USA, 1995.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).