Finn Kennedy

The Lorentz Group and the Dirac Equation
MATH 4500

May 2024

The generalized orthogonal group, O(p,q)

Introduction

The group O(p,q) is defined as the set of all real matrices that preserve the following symmetric bilinear form:

\[(x,y)_{pq} = x_1y_1 + x_2y_2 + \cdots + x_py_p - x_{p+1}y_{p+1} - \cdots - x_{p+q}y_{p+q}\]

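This means that for all \(\Lambda \in O(p,q)\) and all vectors \(x, y\),

\[(\Lambda x,\Lambda y)_{pq} = (x,y)_{pq}\]

The form itself can be written in matrix notation as

\[(x,y)_{pq} = x^T\eta y = (x,\eta y)\]

where \((\cdot,\cdot)\) is the ordinary dot product and \(\eta\) is the \((p+q)\times(p+q)\) diagonal matrix with \(+1\)’s for the first p diagonal entries and \(-1\)’s for the next q entries. Since the invariance must hold for every pair \(x, y\), this implies the following: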

\[(x,\eta y) = (\Lambda x,\eta\Lambda y) = (x,\Lambda^T\eta\Lambda y) \implies \Lambda^T\eta\Lambda = \eta\]

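It follows that \(\det(\Lambda) = \pm1\): taking determinants of both sides gives \(\det(\Lambda^T)\det(\eta)\det(\Lambda) = \det(\eta)\), and since \(\det(\Lambda^T) = \det(\Lambda)\) and \(\det(\eta) \neq 0\), we get \(\det(\Lambda)^2 = 1\).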

The special generalized orthogonal group, SO(p,q), is likewise defined as the subgroup of O(p,q) with \(\det=1\).

In summary: \[O(p,q) = \{\Lambda\in GL(p+q,\mathbb{R}) \mid \Lambda^T\eta\Lambda=\eta \}\] \[SO(p,q) = \{\Lambda\in O(p,q) \mid \det(\Lambda)=1 \}\]
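The defining condition can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`eta`, `in_O`) are illustrative choices and not part of any library. It tests a hyperbolic rotation, which lies in O(1,1), against an ordinary rotation, which lies in O(2) but not in O(1,1).

```python
import math

def eta(p, q):
    """Diagonal matrix with p entries +1 followed by q entries -1."""
    n = p + q
    return [[(1.0 if i < p else -1.0) if i == j else 0.0 for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def in_O(lam, p, q, tol=1e-12):
    """True if lam^T eta lam equals eta up to rounding error."""
    e = eta(p, q)
    lhs = matmul(transpose(lam), matmul(e, lam))
    return all(abs(lhs[i][j] - e[i][j]) < tol
               for i in range(p + q) for j in range(p + q))

a = 0.7
# Hyperbolic rotation mixing one "+" direction with one "-" direction.
boost = [[math.cosh(a), math.sinh(a)],
         [math.sinh(a), math.cosh(a)]]
# Ordinary rotation by the same angle.
rot = [[math.cos(a), -math.sin(a)],
       [math.sin(a), math.cos(a)]]

det_boost = boost[0][0] * boost[1][1] - boost[0][1] * boost[1][0]
print(in_O(boost, 1, 1), in_O(rot, 1, 1), in_O(rot, 2, 0), det_boost)
```

The same `in_O` check applies unchanged to any signature, for example `in_O(lam, 1, 3)` for a candidate Lorentz transformation.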

Topology

When p and q are both non-zero (otherwise we are simply dealing with the orthogonal group), O(p,q) has four connected components. These four components correspond to preserving or flipping orientation on the p- and q-dimensional subspaces on which O(p,q) acts.

To see why this is the case, consider the maximal compact subgroup of O(p,q), which is \(O(p)\times O(q)\). Via the polar decomposition, O(p,q) deformation retracts onto this subgroup, so the two have the same number of connected components. Since O(p) and O(q) each have two connected components (corresponding to determinant +1 and -1), the product has four, and so does O(p,q), given \(p,q\neq 0\) \(^{[8]}\).

Lorentz group

In special relativity, the speed of light, c, is the same in every inertial reference frame. So, given two inertial reference frames with coordinates \((t,x,y,z)\) and \((t',x',y',z')\), any event, or spacetime vector, will have the same spacetime interval from the origin in either reference frame\(^{[2]}\):

\[c^2t^2-x^2-y^2-z^2 = c^2t'^2 - x'^2 - y'^2 - z'^2\]

Working in units where \(c = 1\), this condition is the same as preserving the quadratic form Q:

\[Q: (t,x,y,z)\mapsto t^2-x^2-y^2-z^2\]

The Lorentz group is therefore defined as O(1,3), the group which preserves Q, or equivalently, which preserves the spacetime interval in Minkowski spacetime, written as \(\mathbb{R}^{1,3}\).
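As a concrete instance, the sketch below applies the standard boost along the x axis to a sample event and compares Q before and after. It assumes units with \(c = 1\); the function names and the sample numbers are illustrative only.

```python
import math

def boost_x(event, beta):
    """Boost along x with velocity beta = v/c, in units where c = 1."""
    t, x, y, z = event
    g = 1.0 / math.sqrt(1.0 - beta * beta)   # Lorentz factor
    return (g * (t - beta * x), g * (x - beta * t), y, z)

def Q(event):
    """Minkowski quadratic form t^2 - x^2 - y^2 - z^2."""
    t, x, y, z = event
    return t * t - x * x - y * y - z * z

event = (2.0, 1.0, -0.5, 3.0)    # arbitrary sample event
moved = boost_x(event, 0.6)      # same event seen from the boosted frame

# The coordinates change, but the two values of Q agree up to rounding.
print(event, moved)
print(Q(event), Q(moved))
```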

The four components of the Lorentz group can be understood in terms of whether or not a given transformation flips the orientation of the space and/or time dimensions. Transformations which preserve the direction of the time axis are called orthochronous, and those with determinant \(+1\) are called proper; an orthochronous transformation is proper exactly when it also preserves spatial orientation. The component of the Lorentz group which is both proper and orthochronous is called the restricted Lorentz group, \(SO^+(1,3)\). It is the connected component containing the identity, and it is the only one of the four components that is itself a subgroup.
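The four components can be told apart by two signs. The sketch below labels a transformation by the sign of its determinant and the sign of its time-time entry \(\Lambda_{00}\), taking "proper" to mean \(\det\Lambda = +1\) (the usual convention, stated here as an assumption). The representatives I, P, T and PT are the standard discrete examples; the helper names are illustrative.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def component(lam):
    """Return (proper, orthochronous) for an element lam of O(1,3)."""
    proper = det(lam) > 0
    orthochronous = lam[0][0] > 0
    return proper, orthochronous

def diag(*entries):
    n = len(entries)
    return [[float(entries[i]) if i == j else 0.0 for j in range(n)]
            for i in range(n)]

identity = diag(1, 1, 1, 1)
P = diag(1, -1, -1, -1)     # parity: flips the three space axes
T = diag(-1, 1, 1, 1)       # time reversal
PT = diag(-1, -1, -1, -1)   # both at once

for name, lam in [("I", identity), ("P", P), ("T", T), ("PT", PT)]:
    print(name, component(lam))
```

Each of the four matrices lands in a different component, and only the identity lands in the one labelled (True, True).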

The rest of this paper deals with the restricted Lorentz group, which is often just called the Lorentz group for brevity.

Lie algebra and Generators

First, an aside on the Lie algebra of the Lorentz group. Understanding the Lie algebra will be important for understanding why the double cover of the restricted Lorentz group has two inequivalent two-dimensional irreducible representations, which will correspond to left and right chiral spinors in physics.

The Lie algebra is calculated by taking a differentiable path A(t) in the Lorentz group such that A(0) = I, and calling its derivative at the identity \(A'(0) = X\). Since the path stays in the group,

\[A(t)^T\eta A(t) = \eta\]

Differentiating, and using the product rule:

\[\frac{d}{dt}(A(t)^T\eta A(t)) = A'(t)^T\eta A(t) + A(t)^T\eta'A(t) + A(t)^T\eta A'(t) = \frac{d}{dt}(\eta)\]

Noticing that \(\eta' = 0\), because \(\eta\) is a constant matrix, this reduces to:

\[A'(t)^T\eta A(t) + A(t)^T\eta A'(t) = 0\]

Setting t = 0, where \(A(0) = I\) and \(A'(0) = X\):

\[X^T\eta+\eta X = 0\]

Multiplying on the right by \(\eta\), and noticing that \(\eta = \eta^{-1}\),

\[\mathfrak{so}(1,3) = \{X\in M(4,\mathbb{R}) \mid \eta X\eta = -X^T\}\]

The matrices that satisfy this equation are of the form:

\[X = \begin{pmatrix} 0 & a & b & c \\ a & 0 & d & e \\ b & -d & 0 & f \\ c & -e & -f & 0 \end{pmatrix}\]

Here each of the six coefficients a, b, c, d, e, f multiplies one basis vector of the tangent space at the identity, so the Lie algebra is six-dimensional. The basis matrices attached to a, b, c are named \(K_1,K_2,K_3\), and those attached to d, e, f are named \(J_1,J_2,J_3\) (up to ordering and sign conventions), as is convention. These are the generators of the group: exponentiating them back into the Lorentz group gives spatial rotations (the J’s) and Lorentz boosts, or hyperbolic rotations (the K’s)\(^{[4]}\).
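The relation between the algebra and the group can be illustrated numerically. The sketch below builds the general matrix X from the six coefficients, checks \(X^T\eta + \eta X = 0\), and exponentiates one boost generator and one rotation generator using a truncated power series. The series-based `expm` is an assumption made to keep the example dependency-free, not a recommended general-purpose method.

```python
import math

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(4)] for i in range(4)]

def scale(a, s):
    return [[s * v for v in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

def generic_X(a, b, c, d, e, f):
    """General Lie algebra element in the form given in the text."""
    return [[0, a, b, c], [a, 0, d, e], [b, -d, 0, f], [c, -e, -f, 0]]

def in_algebra(x):
    """True if X^T eta + eta X = 0."""
    zero = [[0.0] * 4 for _ in range(4)]
    return close(add(matmul(transpose(x), ETA), matmul(ETA, x)), zero)

def in_group(lam):
    """True if lam^T eta lam = eta."""
    return close(matmul(transpose(lam), matmul(ETA, lam)), ETA)

def expm(x, terms=40):
    """Matrix exponential as a truncated power series (small matrices only)."""
    result, term = IDENTITY, IDENTITY
    for k in range(1, terms):
        term = scale(matmul(term, x), 1.0 / k)
        result = add(result, term)
    return result

K1 = generic_X(1, 0, 0, 0, 0, 0)     # boost generator (coefficient a)
J_xy = generic_X(0, 0, 0, 1, 0, 0)   # rotation generator (coefficient d)
boost = expm(scale(K1, 0.5))         # hyperbolic rotation in the t-x plane
rotation = expm(scale(J_xy, 0.5))    # ordinary rotation in the x-y plane

print(in_algebra(generic_X(1.0, -2.0, 0.5, 3.0, 0.1, -0.7)))
print(in_group(boost), in_group(rotation))
print(boost[0][0], math.cosh(0.5), rotation[1][1], math.cos(0.5))
```

The boost has cosh and sinh entries in its t-x block, while the rotation has cos and sin entries in its x-y block, which is the sense in which the K's generate hyperbolic rotations.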

The double cover of \(SO^+(1,3)\)

In quantum mechanics, the mapping \(\varphi:SL(2,\mathbb{C}) \rightarrow SO^+(1,3)\) is of particular importance. The following is an argument that this homomorphism is a surjective 2-to-1 mapping, and thus that \(SL(2,\mathbb{C})\) is a double cover of \(SO^+(1,3)\). In fact, \(SL(2,\mathbb{C})\) is the universal covering group of \(SO^+(1,3)\), and is isomorphic to the group \(Spin(1,3)\).

To investigate this, a homomorphism between the two groups must be defined. \(SL(2,\mathbb{C})\) acts on the space of \(2\times2\) complex matrices, so if we can identify a real subspace of this space with Minkowski space, \(\mathbb{R}^{1,3}\), we can ask which matrices in \(SL(2,\mathbb{C})\) act on it in the same way as the Lorentz group acts on \(\mathbb{R}^{1,3}\). The identification uses the \(2\times2\) Hermitian matrices and is constructed as follows:

\[X = \begin{pmatrix} t + z & x-iy \\ x+iy & t-z \\ \end{pmatrix} \longleftrightarrow \begin{pmatrix} t & x & y & z \end{pmatrix} = \vec{x}\]

Where it is noted that:

\[\det(X) = (\vec{x},\vec{x}) = t^2-x^2-y^2-z^2\]

This means that the Minkowski bilinear form is captured by the determinant in our representation of the vector \(\vec{x}\) as a \(2\times2\) Hermitian matrix. Notice that \(X = t\sigma_0 + x\sigma_1 + y\sigma_2 + z\sigma_3\), so the basis being used is really the identity together with the Pauli matrices:

\[\sigma_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ \end{pmatrix} \quad \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \\ \end{pmatrix} \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \\ \end{pmatrix} \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \\ \end{pmatrix}\]

Matrices \(S\in SL(2,\mathbb{C})\) transform vectors X as \(X \mapsto SXS^\dagger\), which again gives a Hermitian matrix. If this action is to reproduce the Lorentz group, then it should preserve the determinant, just as Lorentz transformations preserve the bilinear form. This is true because every matrix in \(SL(2,\mathbb{C})\) has \(\det(S) = 1\), and hence \(\det(S^\dagger) = \overline{\det(S)} = 1\), and therefore:

\[\det(SXS^\dagger) = \det(S)\det(X)\det(S^\dagger)= \det(X)\]
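The determinant argument can be checked on a concrete example. The sketch below uses Python's built-in complex numbers; the sample vector and the matrix `S` are arbitrary choices (any S with determinant 1 would do), and the helper names are illustrative.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def to_matrix(t, x, y, z):
    """Hermitian matrix X associated with the vector (t, x, y, z)."""
    return [[t + z, x - 1j * y], [x + 1j * y, t - z]]

def minkowski(t, x, y, z):
    return t * t - x * x - y * y - z * z

vec = (2.0, 1.0, -0.5, 3.0)
X = to_matrix(*vec)
S = [[2.0, 1j], [0.0, 0.5]]               # det(S) = 1, S is not unitary
X_new = matmul(matmul(S, X), dagger(S))   # transformed vector S X S^dagger

print(det(X), minkowski(*vec))   # determinant equals the Minkowski form
print(det(X_new))                # unchanged by the transformation
```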

To make this more explicit, we can write a Lorentz transformation \(\vec{x'} = \Lambda \vec{x}\) in terms of a transformation S in \(SL(2,\mathbb{C})\). To read off the components of a matrix X in the basis \(\sigma_n\), we need an inner product on the space of matrices:

\[(\sigma_n,\sigma_m) = \frac{1}{2}Tr(\sigma_n\sigma_m)\]

Using the trace of the product of two matrices as the inner product on our "vector" space works well, as \((\sigma_n,\sigma_m)=\delta_{mn}\). Therefore, writing \(X = x_m\sigma_m\) with an implied sum over m:

\[x'_n = (S X S^\dagger,\sigma_n) = \frac{1}{2}Tr(S \sigma_m S^\dagger\sigma_n)x_m = \Lambda_{nm}x_m\]

so that \(\Lambda_{nm} = \frac{1}{2}Tr(S \sigma_m S^\dagger\sigma_n)\). Since S appears twice in this formula, the matrices \(+S\) and \(-S\) both map to the same \(\Lambda\), so the map is at least 2-to-1. Furthermore, the kernel, Ker\(\varphi\), the set of all matrices S that map to the identity in \(SO^+(1,3)\), is exactly \(\{+I,-I\}\). Indeed, for \(S = \pm I\) we get \(\frac{1}{2}Tr(S \sigma_m S^\dagger\sigma_n) = \frac{1}{2}Tr(\sigma_m\sigma_n) = \delta_{nm}\). Conversely, if \(SXS^\dagger = X\) for every Hermitian X, then taking \(X = \sigma_0\) gives \(SS^\dagger = I\), so S commutes with every \(\sigma_m\) and must be a multiple of the identity, and \(\det(S) = 1\) then forces \(S = \pm I\).
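The trace formula can be evaluated directly. The following sketch computes the \(4\times4\) matrix with entries \(\frac{1}{2}Tr(S\sigma_mS^\dagger\sigma_n)\) for a given S, then checks that the result preserves \(\eta\), that \(+S\) and \(-S\) give the same matrix, and that a diagonal S reproduces a boost along z. The function name `lorentz_from_sl2c` and the sample matrices are illustrative assumptions.

```python
import math

SIGMA = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a[0])))

def lorentz_from_sl2c(S):
    """Entry (n, m) is (1/2) Tr(S sigma_m S^dagger sigma_n), which is real."""
    Sd = dagger(S)
    lam = []
    for n in range(4):
        row = []
        for m in range(4):
            prod = matmul(matmul(S, SIGMA[m]), matmul(Sd, SIGMA[n]))
            row.append(0.5 * (prod[0][0] + prod[1][1]).real)
        lam.append(row)
    return lam

a = 0.8
S_boost = [[math.exp(a / 2), 0.0], [0.0, math.exp(-a / 2)]]   # boost along z
S_general = [[2.0, 1j], [0.0, 0.5]]                            # some S with det 1
minus_S = [[-v for v in row] for row in S_general]

lam_boost = lorentz_from_sl2c(S_boost)
lam = lorentz_from_sl2c(S_general)

print(close(matmul(transpose(lam), matmul(ETA, lam)), ETA))   # preserves eta
print(close(lam, lorentz_from_sl2c(minus_S)))                 # +S and -S agree
print(lam_boost[0][0], math.cosh(a), lam_boost[0][3], math.sinh(a))
```

The time-time entry equals \(\frac{1}{2}Tr(SS^\dagger)\), which is always positive, consistent with the image being orthochronous.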

This mapping is surjective, as it can be shown that every rotation and boost generating \(SO^+(1,3)\) is the image of some matrix in \(SL(2,\mathbb{C})\).

So, by the first isomorphism theorem, \(SO^+(1,3) \cong SL(2,\mathbb{C})/\{\pm I\}\). Furthermore, because \(SL(2,\mathbb{C})\) is simply connected, we can say that it is the universal covering group of \(SO^+(1,3)\), and therefore, \(SL(2,\mathbb{C})\cong Spin(1,3)\)\(^{[3,4]}\).

A not so brief discussion of representations

In quantum field theory, representations of groups are very important. This is because, in general, the equations governing wave functions should keep the same form under transformations by certain symmetry groups. For the Lorentz group, this means that the laws of physics should be the same in every inertial reference frame. A representation of a group G is a way for the group to act linearly on a vector space \(V\), or specifically, a homomorphism \(\varphi: G \rightarrow GL(V)\), where we are essentially choosing a way to describe each group element as a matrix. If we instead wish for a description of the group on the dual vector space \(V^*\), the matrices representing the group G will in general not be the same. In QFT, these different possible vector spaces are taken to be solution spaces for a particle’s wave function. So, if a theory is invariant under a group \(G\), its solutions can be organized into the vector spaces of the irreducible representations of \(G\). By describing how different irreducible representations act on the same vector space, we can then distinguish different possible particle states.

We seek to do this for the double cover of \(SO^+(1,3)\), \(SL(2,\mathbb{C})\), as the wave function of a particle in quantum mechanics is defined with complex coefficients, and thus requires a representation of the group on a complex vector space. In general this process can be fairly difficult. However, for simply connected matrix Lie groups (which \(SL(2,\mathbb{C})\) is), there is a one-to-one correspondence between representations of the Lie group and representations of its Lie algebra\(^{[1]}\).

This is a very powerful theorem that makes identifying representations much easier. If we notice that:

\[\mathfrak{sl}(2,\mathbb{C}) = \{X \in M(2,\mathbb{C}) \mid Tr(X) = 0\}\] \[\mathfrak{su}(2) = \{X \in M(2,\mathbb{C}) \mid X + X^\dagger = 0, Tr(X) = 0\}\]

We can then write some matrix X in \(\mathfrak{sl}(2,\mathbb{C})\) as:

\[X = \frac{X - X^\dagger}{2} + i\frac{X + X^\dagger}{2i}\]

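Redefining the first term as \(X_1\) and the second as \(iX_2\), we can see that both \(X_1\) and \(X_2\) are traceless and anti-Hermitian, so both belong to \(\mathfrak{su}(2)\). In other words, \(\mathfrak{sl}(2,\mathbb{C}) = \mathfrak{su}(2) \oplus i\,\mathfrak{su}(2)\) is the complexification of \(\mathfrak{su}(2)\). Because complex representations of a real Lie algebra are determined by its complexification, what matters for us is the complexification of \(\mathfrak{sl}(2,\mathbb{C})\) regarded as a real Lie algebra, which splits into two commuting copies\(^{[7]}\):

\[\mathfrak{sl}(2,\mathbb{C})_{\mathbb{C}} \cong \mathfrak{su}(2)_{\mathbb{C}}\oplus\mathfrak{su}(2)_{\mathbb{C}}\]

(physicists often abbreviate this as \(\mathfrak{su}(2)\times\mathfrak{su}(2)\)). This is an extremely powerful statement, because we can now see that the irreducible representations of \(SL(2,\mathbb{C})\) are simply going to be built from pairs of irreducibles of \(SU(2)\), due to the one-to-one correspondence between representations of a simply connected group and of its Lie algebra. That is, for two irreducible representations \((\pi_{s_1},V^{s_1})\), \((\pi_{s_2},V^{s_2})\) of \(SU(2)\), an irreducible representation of \(SL(2,\mathbb{C})\) will be given by\(^{[4]}\):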

\[(\pi_{s_1}\otimes\pi_{s_2},V^{s_1}\otimes V^{s_2})\]

We will be most interested in the two representations \((\frac{1}{2},0)\) and \((0,\frac{1}{2})\). These are the representations that act on the left-handed and right-handed Weyl spinors, respectively.
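The splitting of a traceless matrix into an anti-Hermitian part and i times an anti-Hermitian part, used above, can be verified on an example. This is a minimal sketch; the sample matrix `X` is arbitrary (any traceless complex \(2\times2\) matrix works) and the helper names are illustrative.

```python
def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def combine(a, b, ca, cb):
    """Return ca*a + cb*b entrywise."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

def in_su2(a, tol=1e-12):
    """True if a is anti-Hermitian and traceless."""
    ad = dagger(a)
    anti_hermitian = all(abs(a[i][j] + ad[i][j]) < tol
                         for i in range(2) for j in range(2))
    traceless = abs(a[0][0] + a[1][1]) < tol
    return anti_hermitian and traceless

X = [[1 + 2j, 3 - 1j], [0.5j, -1 - 2j]]   # traceless, so X is in sl(2,C)
Xd = dagger(X)
X1 = combine(X, Xd, 0.5, -0.5)             # (X - X^dagger) / 2
X2 = combine(X, Xd, 1 / 2j, 1 / 2j)        # (X + X^dagger) / (2i)
rebuilt = combine(X1, X2, 1, 1j)           # X1 + i X2

print(in_su2(X), in_su2(X1), in_su2(X2))
print(rebuilt == X or all(abs(rebuilt[i][j] - X[i][j]) < 1e-12
                          for i in range(2) for j in range(2)))
```

X itself is not in \(\mathfrak{su}(2)\), but both pieces are, and they recombine to X.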

It will now be useful to understand how the two representations act differently on the same vector space. Two matrix representations, \(\pi_1(g)\) and \(\pi_2(g)\), are called equivalent if there exists an invertible matrix A such that, for all g:

\[\pi_1(g) = A^{-1}\pi_2(g)A\]

Given our first representation S of \(SL(2,\mathbb{C})\), the only other two-dimensional form that is not "reachable" by conjugation by some A will be \((S^\dagger)^{-1}\), as conjugation by A will not change the eigenvalues of the matrix S, but complex conjugation can\(^{[4]}\). This is in contrast to \(SU(2)\), whose matrices have eigenvalues \(e^{\pm i\theta}\) that are merely swapped by complex conjugation, meaning the complex conjugate of any matrix in \(SU(2)\) gives an equivalent representation.

Furthermore, in general, given a representation \(\Pi(g)\) of a group on a complex vector space, a second representation is given by \(\Pi^*(g) = \Pi(g^{-1})^\dagger\) \(^{[1]}\), which may or may not be equivalent to the first.

This means that under \(SL(2,\mathbb{C})\), left-handed and right-handed spinors will transform as:

\[\psi'_L = S\psi_L\] \[\psi'_R = (S^\dagger)^{-1}\psi_R\]
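The relationship between the two transformation laws can be probed numerically. The sketch below checks that \(S \mapsto (S^\dagger)^{-1}\) respects matrix multiplication, that it equals the complex conjugate of S up to conjugation by \(\epsilon = i\sigma_2\) (a standard identity for determinant-one matrices, brought in here as an assumption beyond the text), that the traces of S and \((S^\dagger)^{-1}\) can differ (so no change of basis relates them), and that for a unitary matrix the two laws coincide. All names and sample matrices are illustrative.

```python
import cmath

EPS = [[0, 1], [-1, 0]]        # epsilon = i * sigma_2
EPS_INV = [[0, -1], [1, 0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conj(a):
    return [[complex(v).conjugate() for v in row] for row in a]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def inverse(a):
    d = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

def right(S):
    """Matrix acting on right-handed spinors: (S^dagger)^{-1}."""
    return inverse(dagger(S))

def trace(a):
    return a[0][0] + a[1][1]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

lam = cmath.exp(0.3 + 0.4j)
S = [[lam, 0], [0, 1 / lam]]          # det 1, complex eigenvalues
S2 = [[2.0, 1j], [0.0, 0.5]]          # det 1, not unitary
U = [[0.6, 0.8j], [0.8j, 0.6]]        # an element of SU(2)

print(close(right(matmul(S, S2)), matmul(right(S), right(S2))))   # homomorphism
print(close(right(S2), matmul(matmul(EPS, conj(S2)), EPS_INV)))   # conjugate rep
print(trace(S), trace(right(S)))     # complex conjugates of each other
print(close(right(U), U))            # no distinction inside SU(2)
```

Since similar matrices share a trace, the differing traces in the third line show that no single matrix A can conjugate S into \((S^\dagger)^{-1}\) for all S.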

Physicists typically reach this result by looking at the parity transformations of representations of the Lie algebra of the Lorentz group, with the idea that parity will flip one irreducible to its dual. Importantly, parity leaves rotations invariant, but flips the signs of Lorentz boosts. Exponentiating the parity-transformed Lie algebra basis back into the group yields \((S^\dagger)^{-1}\) in terms of \(S \in SL(2,\mathbb{C})\), as in the above relationship. This is analogous to constructing the dual representation using the complexification of the \(\mathfrak{so}(1,3) \cong \mathfrak{sl}(2,\mathbb{C})\) algebra.\(^{[5]}\)

The Dirac Equation and Weyl spinors

Using the language of left and right spinors, the Dirac equation can be derived. First, we start by defining a Dirac spinor, which transforms under the direct sum of the left and right representations\(^{[6]}\):

\[\Psi' = D(S)\Psi\] \[\Psi = \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix}, \quad D(S) = \begin{pmatrix} S & 0 \\ 0 & (S^\dagger)^{-1} \end{pmatrix}\]

Typically, in a well-behaved quantum system we can immediately start taking inner products and expectation values. Here, however, S need not be unitary, so in general \(D(S)^\dagger D(S) \neq 1\) and the usual inner product \(\Psi^\dagger\Psi\) is not Lorentz invariant. Therefore, in order to relate this representation to real-world measurements, a new inner product has to be defined. To do this we have to introduce the gamma matrices (or Dirac matrices). Following [5], the Dirac adjoint is defined as:

\[\bar{\psi} = \psi^\dagger \gamma_0\]

\[\gamma_0 = \begin{pmatrix} 0 & I_2 \\ I_2 & 0 \end{pmatrix}\]

With this definition, \(\bar{\psi}\psi\) is a Lorentz scalar, meaning it is invariant under Lorentz transformations, because \(D(S)^\dagger\gamma_0D(S) = \gamma_0\). The rest of the gamma matrices are:

\[\gamma_i = \begin{pmatrix} 0 & \sigma_i \\ -\sigma_i & 0 \end{pmatrix}\]

Where \(\sigma_i\) is the i’th Pauli matrix. The gamma matrices generate the Clifford algebra, Cliff\((1,3)\), and satisfy the anti-commutation relation \(\{\gamma_\mu,\gamma_\nu\}=2\eta_{\mu\nu}I_4\).
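Both claims about the gamma matrices can be checked by direct multiplication. The sketch below assembles the four matrices from \(2\times2\) blocks exactly as written above, tests the anti-commutation relation for every pair of indices, and then checks on one non-unitary sample S that \(D(S)^\dagger\gamma_0D(S) = \gamma_0\) while \(D(S)^\dagger D(S) \neq I\). Helper names such as `block` and `D` are illustrative.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ETA_DIAG = [1, -1, -1, -1]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def neg(a):
    return [[-v for v in row] for row in a]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

# gamma_0 and gamma_1..3 in the chiral basis used in the text.
GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, s, neg(s), Z2) for s in PAULI]
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def clifford_ok():
    """Check {gamma_mu, gamma_nu} = 2 eta_{mu nu} I_4 for all index pairs."""
    for mu in range(4):
        for nu in range(4):
            ab = matmul(GAMMA[mu], GAMMA[nu])
            ba = matmul(GAMMA[nu], GAMMA[mu])
            anti = [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]
            coeff = 2 * ETA_DIAG[mu] if mu == nu else 0
            target = [[coeff * I4[i][j] for j in range(4)] for i in range(4)]
            if not close(anti, target):
                return False
    return True

def inverse2(a):
    d = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

def D(S):
    """Block-diagonal Dirac representation diag(S, (S^dagger)^{-1})."""
    return block(S, Z2, Z2, inverse2(dagger(S)))

S = [[2.0, 1j], [0.0, 0.5]]   # det 1, not unitary
DS = D(S)

print(clifford_ok())
print(close(matmul(matmul(dagger(DS), GAMMA[0]), DS), GAMMA[0]))
print(close(matmul(dagger(DS), DS), I4))
```

The last check fails for this S, which is why the plain inner product \(\Psi^\dagger\Psi\) is replaced by the Dirac adjoint.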

By the same reasoning as for the Lorentz scalar, \(\bar{\psi}\gamma_\mu\psi\) transforms as a Lorentz vector: its four components mix under Lorentz transformations exactly as the components of a spacetime vector do. Note that identification of these matrices allows us to write any vector in Minkowski spacetime as (introducing Feynman slash notation):

\[\slashed{x} = t\gamma_0 + x\gamma_1 + y\gamma_2 + z\gamma_3\]

This shows the connection between our description of the double cover of the Lorentz group and our new representation on the Dirac spinors. The Minkowski norm, Q, is recovered as \(\slashed{x}^2 = Q(t,x,y,z)\,I_4\), because \(\gamma_0^2 = 1\), \(\gamma_i^2 = -1\) for \(i = 1,2,3\), and distinct gamma matrices anti-commute. Our representation on the Dirac spinor space is just the direct sum of our two irreducible representations, with the \(\gamma\) matrices playing the role that the Pauli matrices played for each of them separately.
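The way the slashed vector encodes the Minkowski norm can be seen by squaring it. The sketch below rebuilds the gamma matrices in the chiral basis given earlier, forms \(t\gamma_0 + x\gamma_1 + y\gamma_2 + z\gamma_3\) for a sample vector, and compares its square with \(Q\) times the identity. The sample vector and helper names are illustrative.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def neg(a):
    return [[-v for v in row] for row in a]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

GAMMA = [block(Z2, I2, I2, Z2)] + [block(Z2, s, neg(s), Z2) for s in PAULI]

def slash(vec):
    """Return t*gamma_0 + x*gamma_1 + y*gamma_2 + z*gamma_3."""
    out = [[0] * 4 for _ in range(4)]
    for coeff, g in zip(vec, GAMMA):
        out = [[out[i][j] + coeff * g[i][j] for j in range(4)]
               for i in range(4)]
    return out

def Q(vec):
    t, x, y, z = vec
    return t * t - x * x - y * y - z * z

vec = (2.0, 1.0, -0.5, 3.0)
squared = matmul(slash(vec), slash(vec))

# The square is Q(vec) times the 4x4 identity, up to rounding.
print(Q(vec))
print([squared[i][i] for i in range(4)])
```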

Now, to get the Dirac equation, we attempt to write a Lorentz invariant action (where the \(\mu\) has become a superscript of the \(\gamma\) matrices to distinguish contravariant from covariant indices):

\[\int d^4x\, \bar{\psi}(i\gamma^\mu\partial_\mu-m)\psi\]

Here the Lagrangian density in the integral is motivated by looking for a first-order "square root" of the Klein-Gordon equation. We can now show that this Lagrangian is Lorentz invariant:

\[\mathcal{L} = i\bar{\psi}\gamma^\mu\partial_\mu\psi- \bar{\psi}m\psi\]

The mass term \(\bar{\psi}m\psi\) is a Lorentz scalar, and in the first term the Lorentz vector index on \(\bar{\psi}\gamma^\mu\psi\) is contracted with the index on \(\partial_\mu\), which again produces a scalar. Both terms are therefore invariant under the action of the Lorentz group. Then, variation of the Lagrangian with respect to \(\bar{\psi}\) yields\(^{[6]}\):

\[(i\gamma^\mu\partial_\mu-m)\psi = 0\]

This is the Dirac equation. It can also be written using Feynman slash notation, where the \(\gamma\) matrices are contracted into the derivative:

\[(i\slashed{\partial}-m)\psi = 0\]
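As a consistency check on the "square root" idea, the following standard computation sketches why every solution of the Dirac equation also solves the Klein-Gordon equation. It assumes the anti-commutation relation with raised indices, \(\{\gamma^\mu,\gamma^\nu\} = 2\eta^{\mu\nu}\), and that partial derivatives commute. Applying \((i\gamma^\nu\partial_\nu + m)\) to the Dirac equation, the cross terms cancel:

\[(i\gamma^\nu\partial_\nu + m)(i\gamma^\mu\partial_\mu - m)\psi = -(\gamma^\nu\gamma^\mu\partial_\nu\partial_\mu + m^2)\psi\]

Since \(\partial_\nu\partial_\mu\) is symmetric in its indices, only the symmetric part of \(\gamma^\nu\gamma^\mu\) contributes:

\[\gamma^\nu\gamma^\mu\partial_\nu\partial_\mu = \frac{1}{2}\{\gamma^\nu,\gamma^\mu\}\partial_\nu\partial_\mu = \eta^{\mu\nu}\partial_\mu\partial_\nu\]

So each component of \(\psi\) satisfies:

\[(\eta^{\mu\nu}\partial_\mu\partial_\nu + m^2)\psi = 0\]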

Importance of the Dirac Equation

The Dirac equation is one of the most important results in physics. Solutions to the Dirac equation describe spin-\(\frac{1}{2}\) particles such as the electron. Furthermore, when these particle fields are quantized, which was not done here, both matter and anti-matter solutions are found. Using this equation, Dirac correctly predicted anti-matter before it was experimentally confirmed.

Citations

[1] Hall, B. Lie Groups, Lie Algebras, and Representations: An Elementary Introduction, 2nd edition, Springer, 2015.
[2] Einstein, A. Relativity: the Special and General Theory, Penguin Random House, 1995.
[3] Feng, Y, Stange, K. The spin homomorphism SL2(C) → SO(1,3)(R), Lecture notes, University of Colorado, Boulder, Department of Mathematics.
[4] Woit, P. Quantum Theory, Groups and Representations: An Introduction, Springer, 2017.
[5] Tong, D. Quantum Field Theory: University of Cambridge Part III Mathematical Tripos, Lecture notes, University of Cambridge, Department of Applied Mathematics and Theoretical Physics, 2006-2007.
[6] Rabin, J. Introduction to Quantum Field Theory for Mathematicians, IAS/Park City Mathematics Series, Volume 1, 1995.
[7] Stillwell, J. Naive Lie Theory, Springer, 2008.
[8] Serre, D. Matrices, Theory and Applications 2nd edition, Springer, 2010.
The following Wikipedia articles were very helpful for understanding, and were fact checked against the previous sources: "Lorentz Group", "Lorentz Transformation", "Indefinite orthogonal group", "Representation theory of the Lorentz group", "Dirac equation", "Dual Representation", "Weyl equation".