“This month’s read” will cover some challenging, but fascinating concepts in field theory and the calculus of variations. Most physicists have encountered the notion of a (Faddeev–Popov) ‘ghost’ at some point in their career and might have thought: “What in the world is this? Do we really need it?” On the other hand, science aficionados might have encountered that same term in (popular) science news, where it is often depicted as something mysterious and magical. This post aims to clear up some of the mystery and explain why these objects are ‘necessary’ for a rigorous mathematical treatment of the subject.


The story starts with Dirac’s treatment of constrained dynamics. He found a way to elegantly incorporate constraints into the Hamiltonian formulation of classical mechanics. Extending this formulation in an algebraic manner quite naturally leads to the notion of ‘ghosts’, even in the particle setting, where the so-called BRST formalism is introduced most easily. Going beyond particles is then rather straightforward, although the Lagrangian treatment is more natural in that case (field theories are usually treated from the Lagrangian point of view, even though a Hamiltonian formulation remains possible).

At the same time, to formally treat variational problems and their subtleties, it will be necessary to pass beyond the classical geometry of phase spaces and enter the realm of jet bundles. The resulting structure will be a combination of the variational bicomplex and the BV-BRST complex. Although this might look like some casual name-dropping, all these notions will be made clear by the end of this post.


One of the foundational formulations of classical mechanics was introduced by Hamilton in 1833. The central object in this formulation is the Hamiltonian (function) $H:\mathbb{R}^{2n}\rightarrow\mathbb{R}$, which depends on both the ordinary coordinates $q^i$ and their associated momenta $p_i$. To give an example, consider a particle moving in 3D. The coordinates are then $q^1=x$, $q^2=y$ and $q^3=z$, and the associated momenta are $p_1=mv_1$, $p_2=mv_2$ and $p_3=mv_3$. However, for more involved systems, such as those involving electromagnetism, these expressions can change.
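
For a particle in a potential $V$, for instance, the Hamiltonian takes the familiar form $H(q,p)=\frac{1}{2m}p_ip^i+V(q)$, whereas for a charged particle in an electromagnetic field the canonical momenta differ from the kinetic ones and

\[H(q,p) = \frac{1}{2m}\bigl(p_i-eA_i(q)\bigr)\bigl(p^i-eA^i(q)\bigr)+e\phi(q)\,.\]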

But what about constraints of the form

\[\xi(q,p)=0\,.\]

For example, in the point-particle case, consider a pendulum attached to a rigid bar (of length $a\in\mathbb{R}^+$). This would be modeled by a constraint of the form

\[\xi_{\text{pendulum}}(q,p) := r^2-a^2 = (x^2+y^2+z^2)-a^2\,.\]

Another example would be a solid wall (e.g. to the left of $x=0$). This would be modeled by a constraint of the form1

\[\xi_{\text{wall}}(q,p) := \begin{cases}+\infty & x<0\\ 0 & x\geq0\,.\end{cases}\]

In the Lagrangian setting2, where we are simply minimizing (or extremizing) the action, these constraints are easily incorporated. We just have to replace the action

\[S[q,\dot{q}] := \int_{t_1}^{t_2}L(q,\dot{q})\,dt\]

by the extended action

\[S[q,\dot{q},\lambda] := \int_{t_1}^{t_2}\Bigl(L(q,\dot{q}) + \sum_a\lambda^a\xi_a\Bigr)\,dt\,,\]

where the $\lambda^a$ are called Lagrange multipliers. The constraint equations then simply follow from the extremality conditions

\[\frac{\delta S}{\delta\lambda^a}=\xi_a=0\,.\]
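
For the pendulum above, assuming a standard kinetic term, the extended action reads

\[S[q,\lambda] = \int_{t_1}^{t_2}\Bigl(\tfrac{1}{2}m\,\dot{q}^i\dot{q}_i - V(q) + \lambda\,(q^iq_i-a^2)\Bigr)\,dt\,,\]

and the Euler–Lagrange equations become $m\ddot{q}_i = -\partial_iV(q) + 2\lambda q_i$ together with the constraint $q^iq_i=a^2$; the multiplier term is nothing but the constraint force exerted by the bar.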

All of this can also easily be done in the Hamiltonian framework, thereby opening up the study of Poisson geometry to the inclusion of constraints. Hamilton’s equations of motion read

\[\frac{\partial H}{\partial p_i} = \dot{q}^i \qquad\text{and}\qquad \frac{\partial H}{\partial q^i}=-\dot{p}_i\,.\]

After introducing the Poisson bracket

\[\{f,g\} := \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q^i} - \frac{\partial g}{\partial p_i}\frac{\partial f}{\partial q^i}\,,\]

these equations can be rewritten as

\[\dot{q}^i = \{H,q^i\} \qquad\text{and}\qquad \dot{p}_i = \{H,p_i\}\,.\]
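
To make these conventions tangible, here is a minimal sympy sketch (purely illustrative) that checks the equations above for a one-dimensional harmonic oscillator:

```python
import sympy as sp

q, p, m, k = sp.symbols('q p m k', real=True)
H = p**2/(2*m) + k*q**2/2  # harmonic oscillator

def poisson(f, g):
    # the bracket convention used above: {f,g} = f_p g_q - g_p f_q
    return sp.diff(f, p)*sp.diff(g, q) - sp.diff(g, p)*sp.diff(f, q)

print(poisson(H, q))  # p/m   =  dq/dt
print(poisson(H, p))  # -k*q  =  dp/dt
```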

The constrained equations of motion are then generated by replacing the original Hamiltonian $H$ by the extended Hamiltonian

\begin{gather} \label{extended_hamiltonian} H_E(q,p,\lambda) := H(q,p) + \sum_a\lambda^a\xi_a(q,p)\,, \end{gather}

where the $\lambda^a$ play a role similar to the Lagrange multipliers in the Lagrangian framework. These can also be derived from the Hamiltonian action

\[S_E[q,p,\lambda] := \int_{t_1}^{t_2}\left(\dot{q}^ip_i-H(q,p)+\sum_a\lambda^a\xi_a(q,p)\right)\,dt\,.\]

Dirac’s remarkable insight was not to stop here, but to consider the consistency conditions on the constraint functions and to study the algebraic structure that arises. So, assume that we start with a set of constraint functions $\{\xi_a\}_{a\leq n}$, the so-called primary constraints. Under Hamiltonian evolution, the constraints should be weakly conserved, i.e. conserved when the equations of motion and the constraints are satisfied:

\[\{H_E,\xi_a\}\approx0\,,\]

where $f(q,p)\approx0$ means that the phase space function $f$ vanishes on shell, i.e. on the space of solutions of the equations of motion. These consistency conditions either are tautologies (no new constraints are obtained), impose conditions on the Lagrange multipliers, or lead to new equations that do not involve the Lagrange multipliers:

\[\chi_b(q,p)=0\,.\]
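
As an illustration, take the pendulum constraint from before together with a standard kinetic Hamiltonian $H=\frac{1}{2m}p_ip^i+V(q)$. Its consistency condition yields

\[\{H_E,\xi_{\text{pendulum}}\} = \frac{2}{m}\,q^ip_i \approx 0\,,\]

i.e. the radial momentum has to vanish on shell, which is a genuinely new restriction on phase space.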

These new constraints are said to be secondary.3 Besides the primary–secondary distinction, another classification exists. A phase space function is said to be first class if its Poisson bracket with every constraint vanishes weakly. For example, the consistency conditions state that the (extended) Hamiltonian is first class! If a function is not first class, it is said to be second class. Note that the primary/secondary and first/second-class distinctions are independent of one another. For first-class constraints, the weak vanishing of their mutual brackets can be expressed as follows:

\begin{gather} \label{algebra} \{\xi_a,\xi_b\} = C^c_{ab}\xi_c + T^i_{ab}\frac{\delta S}{\delta x^i}\,. \end{gather}

Now, for people who have some background in Lie algebra theory (or more advanced subjects such as Kac–Moody algebras), this equation might look familiar. However, it is worthwhile to analyze it in detail. Three cases can be considered:

  • $T^i_{ab}=0$ and $C^c_{ab}$ are constants: In this case, the constraints form an ordinary Lie algebra under the Poisson bracket. These constraint algebras are also called closed algebras, since they can be exponentiated to obtain a group, a Lie group even (a concrete sketch follows below this list).
  • $T^i_{ab}=0$: Here, the $C^c_{ab}(q,p)$ are called the structure functions of the constraint algebra. Such algebras are also called soft algebras.
  • General case: These are called open algebras since they only close on shell.
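
As a concrete instance of the first case, the components of angular momentum close among themselves under the bracket introduced earlier. A small, purely illustrative sympy sketch:

```python
import sympy as sp

qx, qy, qz, px, py, pz = sp.symbols('q_x q_y q_z p_x p_y p_z', real=True)
qs, ps = [qx, qy, qz], [px, py, pz]

def poisson(f, g):
    # same bracket convention as before
    return sum(sp.diff(f, p)*sp.diff(g, q) - sp.diff(g, p)*sp.diff(f, q)
               for q, p in zip(qs, ps))

# angular momenta: a standard example of functions closing under the bracket
Lx, Ly, Lz = qy*pz - qz*py, qz*px - qx*pz, qx*py - qy*px

# the bracket of two of them is again a linear combination of the L's,
# with constant coefficients: a closed (Lie) algebra
print(sp.simplify(poisson(Lx, Ly) + Lz))  # 0
```
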
Note

Technically, it is not entirely correct that all closed algebras correspond to Lie algebras. This is only the case when the constraints are independent, the so-called irreducible case. For reducible constraint algebras, the resulting structure is that of an $L_\infty$-algebra. For soft algebras, we obtain a similar situation. However, these are not algebras in the proper sense: they form Lie algebroids or, for reducible theories, $L_\infty$-algebroids. And to finalize this remark, the last case gives rise to central extensions of $L_\infty$-algebroids.4

Although not entirely relevant for my story, I do want to include some more information about Dirac’s work, since he is one of the gods of theoretical physics. The keen reader might have noticed that the above algebraic structure applies to the first-class constraints only. But what about second-class constraints? The simplest solution, due to Dirac, was to just remove them from the picture. Instead of using the original Poisson bracket, the Dirac bracket is introduced:

\[\{f,g\}_D := \{f,g\} - \{f,\chi_a\}C^{ab}\{\chi_b,g\}\,,\]

where $C^{ab}$ is the inverse of the matrix $C_{ab}:=\{\chi_a,\chi_b\}$.
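
To see the Dirac bracket in action, here is a small sympy sketch for the textbook pair of second-class constraints $\chi_1=q^1$ and $\chi_2=p_1$ (a particle frozen in its first direction); the setup is purely illustrative:

```python
import sympy as sp

q1, q2, p1, p2 = sp.symbols('q_1 q_2 p_1 p_2', real=True)
qs, ps = [q1, q2], [p1, p2]

def poisson(f, g):
    # same bracket convention as before
    return sum(sp.diff(f, p)*sp.diff(g, q) - sp.diff(g, p)*sp.diff(f, q)
               for q, p in zip(qs, ps))

chi = [q1, p1]                                   # second-class pair
C = sp.Matrix(2, 2, lambda a, b: poisson(chi[a], chi[b]))
Cinv = C.inv()                                   # invertible <=> second class

def dirac(f, g):
    corr = sum(poisson(f, chi[a])*Cinv[a, b]*poisson(chi[b], g)
               for a in range(2) for b in range(2))
    return sp.simplify(poisson(f, g) - corr)

print(dirac(q1, p1))  # 0: the constrained pair drops out
print(dirac(q2, p2))  # equals poisson(q2, p2): the rest is untouched
```

Every phase space function now has identically vanishing Dirac bracket with $\chi_1$ and $\chi_2$, so the second-class constraints can be imposed before or after computing brackets.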


Now, what we are really interested in is the algebra of physical observables. We start off with a general phase space $M$ (usually modeled as a symplectic or Poisson manifold) together with its space of smooth functions $C^\infty(M)$. However, the constraints reduce this structure in two ways:

  • they cut out a submanifold $\Sigma\subseteq M$, which is called the constraint surface, and
  • they foliate the phase space by gauge orbits, in the sense that they generate transformations that leave the physical states invariant, the so-called gauge transformations.

If we denote the ideal generated by the constraints by $\mathcal{N}$, the first step is implemented as follows:

\begin{gather} C^\infty(\Sigma) \cong C^\infty(M)/\mathcal{N}\,. \end{gather}

The second step is implemented by passing to the invariant functions $C^\infty(\Sigma)^{\mathcal{N}}$.
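
To keep a concrete (textbook) example in mind: take a single constraint $\xi=p_1$ on $\mathbb{R}^{2n}$. The constraint surface is the hyperplane $p_1=0$ and, since $\{p_1,f\}=\partial f/\partial q^1$, the gauge transformations generated by $\xi$ are translations in $q^1$. The physical observables are therefore the functions of the remaining variables $(q^2,\ldots,q^n,p_2,\ldots,p_n)$: every first-class constraint removes two phase space directions.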

Now, how do we characterize the algebra $C^\infty(\Sigma)$? In general, $\Sigma$ will not be a nice manifold and, hence, describing its algebra of smooth functions can be difficult analytically. (Quotient spaces are notoriously hard to handle from a differential-geometric point of view.) For this reason, it is interesting to consider a more algebraic approach. What we want to do is build a space of functions in which everything generated by the constraints, i.e. every combination $f^a(q,p)\,\xi_a(q,p)$, is set to zero. The solution to this problem was originally introduced in mathematics by Koszul and will allow us to treat the irreducible case.5

We start with the function space $C^\infty(M)$. These functions are generated by the variables $q^1,\ldots,q^n,p_1,\ldots,p_n$. To these variables, we now adjoin a new odd variable $\mathcal{P}_a$ for every constraint: $\mathcal{P}_a\mathcal{P}_b=-\mathcal{P}_b\mathcal{P}_a$. These variables are called antighosts. On the (tensor) product algebra $C^\infty(M)\otimes\mathbb{C}[\mathcal{P}_a]$, which is polynomial in the antighosts, one can then define a derivation $\delta_K$ as follows (it is extended to arbitrary functions through the Leibniz rule):

\[\delta_K q^i=\delta_K p_i = 0\qquad\text{and}\qquad\delta_K\mathcal{P}_a = \xi_a\,.\]

The next step will be obvious to people who know some homological algebra. It is not hard to see that the above rules turn $\delta_K$ into a nilpotent operator, i.e. $\delta_K^2=0$. Accordingly, we can form the quotient

\[H^\bullet(\delta_K) := \frac{\ker(\delta_K)}{\mathrm{im}(\delta_K)}\,.\]

This group, called a cohomology group, obtains a grading from the antighost degree, which is defined as follows:

\[\mathrm{antigh}\bigl(f(q,p)\mathcal{P}_{a_1}\cdots\mathcal{P}_{a_n}\bigr) := n\,.\]
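
As a quick illustration of both the graded Leibniz rule ($\delta_K$ is an odd derivation, since it maps the odd $\mathcal{P}_a$ to the even $\xi_a$) and nilpotency, consider a monomial of antighost degree 2:

\[\delta_K(\mathcal{P}_a\mathcal{P}_b) = \xi_a\mathcal{P}_b - \mathcal{P}_a\xi_b = \xi_a\mathcal{P}_b - \xi_b\mathcal{P}_a \qquad\Longrightarrow\qquad \delta_K^2(\mathcal{P}_a\mathcal{P}_b) = \xi_a\xi_b - \xi_b\xi_a = 0\,.\]

Note that $\delta_K$ lowers the antighost degree by exactly one.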

Now, it can be shown that functions $f$ that vanish on shell, i.e. on the constraint surface, must be of the form $f^a(q,p)\xi_a(q,p)$ and, hence, that

\[f = f^a\xi_a = \delta_K(f^a\mathcal{P}_a)\,,\]

i.e. they lie in the image of $\delta_K$. It follows that the cohomology in antighost degree 0 recovers the functions on the constraint surface:

\begin{gather} C^\infty(\Sigma)\cong H^0(\delta_K)\,. \end{gather}

To model the gauge transformations, a different approach is followed, one that does preserve the geometric aspects of the theory. Every constraint $\xi_a$ induces an infinitesimal gauge transformation given by the following vector field:

\[X_a := \{\xi_a,\cdot\}\,.\]

These span a vector space called the space of longitudinal vector fields. As in ordinary differential geometry, we can now consider its dual, the longitudinal complex $\Omega_L^\bullet(\Sigma)$, which is spanned by the longitudinal 1-forms $\eta^a$. This complex comes equipped with a de Rham-like differential:

\begin{gather} \newcommand{\dr}{\mathrm{d}} \label{CE} \begin{aligned} \dr f &:= \{\xi_a,f\}\,\eta^a\,,\\
\dr\eta^c &:= -\frac{1}{2}C^c_{ab}\,\eta^a\wedge\eta^b\,, \end{aligned} \end{gather}

where $C^c_{ab}$ are the structure functions of the gauge algebra \eqref{algebra}. The longitudinal 1-forms are also called ghosts! It is a nice exercise to show that $\delta_K^2=0$ and $\dr^2\approx0$. Moreover, the longitudinal derivative can be extended to the antighost sector through the choice

\[\dr\mathcal{P}_a := C^c_{ba}\eta^b\mathcal{P}_c\,.\]

This has the effect that $\delta_K\dr=-\dr\delta_K$, which implies that $\dr$ descends to a differential on $H^\bullet(\delta_K)$. The induced cohomology, denoted by $H^\bullet(\dr\mid\delta_K)$, is called the BV-BRST cohomology (for Batalin, Vilkovisky and Becchi, Rouet, Stora and Tyutin).
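
To see why $\dr$ is only weakly nilpotent, apply it twice to a function $f$ and use the Jacobi identity of the Poisson bracket:

\[\dr^2f = \frac{1}{2}\Bigl(\{\xi_a,\{\xi_b,f\}\}-\{\xi_b,\{\xi_a,f\}\}-C^c_{ab}\{\xi_c,f\}\Bigr)\,\eta^a\wedge\eta^b = \frac{1}{2}\Bigl(\{\{\xi_a,\xi_b\},f\}-C^c_{ab}\{\xi_c,f\}\Bigr)\,\eta^a\wedge\eta^b\,.\]

The failure of nilpotency is thus controlled by how much $\{\xi_a,\xi_b\}$ differs from $C^c_{ab}\xi_c$: for a soft algebra the leftover term $\frac{1}{2}\xi_c\{C^c_{ab},f\}\,\eta^a\wedge\eta^b$ is proportional to the constraints and vanishes weakly, while open algebras produce additional terms proportional to the equations of motion.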

Note
A small digression about the BRST complex is in order. Most people familiar with quantum field theory will have heard about the BRST complex at some point, but fewer will have heard about the BV part. The reason is that the BRST complex, i.e. the ghost sector, describes the quotient by the gauge transformations, which is more relevant (at first sight). The work of Batalin and Vilkovisky (and many others) was to also include the intersection of phase space with the constraints (and equations of motion in the Lagrangian framework).

Now, for the BRST complex itself, Eq. \eqref{CE} might look familiar to some. The differential acting on the ghosts $\eta^a$ has the same action as the Chevalley–Eilenberg differential. This Chevalley–Eilenberg algebra encodes exactly the $L_\infty$-algebroid underlying the constraint algebra, as explained in the note above.

Now, people who have ever come into contact with gauge theory and (BV-)BRST cohomology know that this is still not the end of the story. The same holds for people with a background in homological algebra. The particular situation whereby an ‘almost’ differential descends to the cohomology of another differential is a well-studied one. Homological perturbation theory says that there exists a differential $s$ on $C^\infty(M)\otimes\mathbb{C}[\mathcal{P}_a]\otimes\mathbb{C}[\eta^a]$ such that

\begin{gather} (C^\infty(M)/\mathcal{N})^\mathcal{N}\cong H^0(\dr\mid\delta_K)\cong H^0(s)\,. \end{gather}

This BRST differential $s$ can, moreover, be constructed as follows (hence the name perturbation theory):

\[s = \delta_K + \dr + \cdots\,.\]

What makes this even more interesting is that it can be generated as a canonical transformation in the extended Poisson bracket, i.e. there exists a function $\Omega\in C^\infty(M)\otimes\mathbb{C}[\mathcal{P}_a]\otimes\mathbb{C}[\eta^a]$ such that

\[s\,\cdot = \{\Omega,\cdot\}\,.\]

This function is called the BRST charge and it encodes all the structure induced by the constraints! Isn’t that neat? Physical observables are those functions $f$ (of ghost number zero) such that $\{\Omega,f\}=0$, modulo those of the form $\{\Omega,g\}$.
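
For an irreducible system of first-class constraints, the expansion of the BRST charge in antighost number starts as (up to convention-dependent signs)

\[\Omega = \eta^a\xi_a - \frac{1}{2}\,C^c_{ab}\,\eta^a\eta^b\,\mathcal{P}_c + \cdots\,,\]

where the higher terms are fixed, order by order in the antighosts, by demanding $\{\Omega,\Omega\}=0$, which is equivalent to $s^2=0$. For a closed algebra with constant structure constants the series terminates after these two terms, whereas soft and open algebras generically require higher-order corrections.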


For field theory, a similar approach is followed. However, here it is more useful to work in the Lagrangian picture and to replace phase space coordinates by actual trajectories. What does this mean?

Well, every point $(q,p)$ in phase space can be seen as an initial condition at some time $t_1\in\mathbb{R}$. Using Hamilton’s equations (and some regularity conditions), one can then obtain the values of $q$ and $p$ at any other time $t_2$. We thus obtain a trajectory $q:\mathbb{R}\rightarrow M$, where $M$ now denotes the configuration space. The covariant phase space $\mathcal{E}$ is defined as the space of all these trajectories. In field theory, the fundamental objects are not trajectories, but fields living on a spacetime manifold $M$ and (usually) taking values in some vector space $V$. The covariant phase space is then the space of sections of some vector bundle over $M$ modeled on $V$: $\mathcal{E}=\Gamma(E)$ for some vector bundle $\pi:E\rightarrow M$. This is a lot of technical terminology, but suffice it to say that, when $M$ is for example the Euclidean space $\mathbb{R}^4$ and the bundle is trivial, the covariant phase space is given by $\mathcal{E}=C^\infty(\mathbb{R}^4,V)$.

Now, we usually start from an action $S[\phi]$ and consider the equations of motion

\[\frac{\delta S}{\delta\phi^I}=0\]

for all fields $\phi^I$. The shell $\Sigma\subseteq\mathcal{E}$ is then the subset of the covariant phase space consisting of the solutions to these equations. It plays the same role as the constraint surface in the foregoing section. If our action $S$ has a gauge symmetry, it is a well-known fact that we obtain, for each generator of this symmetry, a Noether identity:

\[\delta_\varepsilon S = \frac{\delta S}{\delta\phi^I}\,\delta_\varepsilon\phi^I\equiv\frac{\delta S}{\delta\phi^I}R^I_a\varepsilon^a=0\,.\]

In other words, if the action functional has a gauge symmetry, the equations of motion are not independent!6 However, even without the presence of proper gauge symmetries, there is always a form of trivial gauge symmetry present! Every transformation of the form

\[\delta_\varepsilon\phi^I := \varepsilon^{IJ}\frac{\delta S}{\delta\phi^J}\,,\]

where $\varepsilon^{IJ}$ is antisymmetric in its indices, gives rise to a symmetry of the action. This full gauge algebra, consisting of the proper gauge symmetries and the so-called zilch symmetries, will now play the role of the constraint algebra.
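
The textbook example of a genuine Noether identity is Maxwell theory: for $S[A]=-\frac{1}{4}\int F_{\mu\nu}F^{\mu\nu}\,d^4x$ with gauge transformation $\delta_\varepsilon A_\mu=\partial_\mu\varepsilon$, the equations of motion $\frac{\delta S}{\delta A_\mu}=\partial_\nu F^{\nu\mu}$ satisfy

\[\partial_\mu\frac{\delta S}{\delta A_\mu}=\partial_\mu\partial_\nu F^{\nu\mu}=0\]

identically, i.e. without using the equations of motion themselves.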


  • P. A. M. Dirac (1950). Generalized Hamiltonian Dynamics. Canadian Journal of Mathematics, 2, 129–148.

  1. $+\infty$ could be replaced by any positive value without altering the final result (at least classically). 

  2. Passing between the Hamiltonian and Lagrangian frameworks requires us to replace the canonical coordinates $(q,p)$ with the generalized coordinates $(q,\dot{q})$, but we will ignore this technicality. 

  3. Note that the distinction between primary and secondary constraints is mainly artificial: one could add the secondary constraints to the initial set and obtain the same theory. For a refinement, see the Dirac conjecture.

  4. Central extensions pop up everywhere in mathematics and physics. They arise, for example, in the study of projective representations in quantum mechanics (see my Master thesis) or in conformal field theory (the Virasoro algebra). 

  5. To handle reducible cases, the extension by Tate is required. However, the most important ideas are already present in the work by Koszul and, hence, we will restrict ourselves to this case. 

  6. This is the main difference between global (rigid) symmetries and local (gauge) symmetries. Whereas the former lead, through Noether’s (first) theorem, to conservation laws, the latter lead, through Noether’s second theorem, to identities among the equations of motion.