
This Month's Read III: Electric-magnetic duality
The previous “This month’s read” posts were about probability theory and (quantum) logic. This month’s post will cover something completely different: electromagnetism (and some generalizations). Although we have experimentally discovered only electric charges (magnetic fields are generated by moving or ‘spinning’ electric charges), the laws of electromagnetism exhibit a variety of beautiful symmetries and duality relations when we introduce magnetic charges as well!
This post is structured as follows:
- Introduction to electromagnetism
- Quantization of electromagnetism
- Aharonov–Bohm and Aharonov–Casher effects
- Montonen–Olive duality
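Classical electromagnetism is entirely characterized by Maxwell’s equations[^1] $\newcommand{\vector}[1]{\vec{\mathbf{#1}}}$
\begin{align}
\vector{\nabla}\cdot\vector{E} &= \rho\,,\\
\label{B_gauss} \vector{\nabla}\cdot\vector{B} &= 0\,,\\
\label{faraday} \vector{\nabla}\times\vector{E} &= -\frac{\partial\vector{B}}{\partial t}\,,\\
\vector{\nabla}\times\vector{B} &= \vector{J}+\frac{\partial\vector{E}}{\partial t}\,,
\end{align}
together with the Lorentz force \begin{gather} \vector{F} = q(\vector{E}+\vector{v}\times\vector{B})\,. \end{gather} In these equations, $\vector{E}$ and $\vector{B}$ denote the electric and magnetic fields, respectively, $q$ denotes the electric charge, $\rho$ denotes the electric charge density and $\vector{J}$ denotes the electric current density. Note that we work in natural units where $c=1=\varepsilon_0$.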
To better understand the theory of electromagnetism and eventually quantize it, we first have to unearth its symmetries a bit better, in particular its ‘gauge structure’. This structure is related to certain redundancies in the definition of the electric and magnetic fields.
First, start with Gauss’s law \eqref{B_gauss} for the magnetic field. This equation, through the Helmholtz decomposition, shows that
\begin{gather} \vector{B} = \vector{\nabla}\times\vector{A} \end{gather}
for some vector potential $\vector{A}:\mathbb{R}^3\rightarrow\mathbb{R}^3$. Inserting this in Faraday’s law \eqref{faraday} gives
\begin{gather} \label{electric_potential} \vector{E} = -\vector{\nabla}\varphi - \frac{\partial\vector{A}}{\partial t} \end{gather}
for some smooth function $\varphi:\mathbb{R}^3\rightarrow\mathbb{R}$. Now, due to the properties of the $\vector{\nabla}$-operator,
\[\vector{\nabla}\times(\vector{\nabla}f)=0 \qquad\text{and}\qquad \vector{\nabla}\cdot(\vector{\nabla}\times\vector{X})=0\]for any smooth function $f:\mathbb{R}^3\rightarrow\mathbb{R}$ and smooth vector field $\vector{X}:\mathbb{R}^3\rightarrow\mathbb{R}^3$, the choice of potentials $(\varphi,\vector{A})$ is not unique. For every smooth function $\chi:\mathbb{R}^3\rightarrow\mathbb{R}$ (which may also depend on time), we can perform the simultaneous transformations
\[\varphi\longrightarrow\varphi-\frac{\partial\chi}{\partial t} \qquad\text{and}\qquad \vector{A}\longrightarrow\vector{A}+\vector{\nabla}\chi\]without actually altering the electric and magnetic fields. This freedom will be essential once we consider quantum effects! However, for the moment, let us just consider the electromagnetic potentials without paying too much attention to this freedom.
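As a quick check (a sketch that only uses the identities above and the fact that partial derivatives commute), substituting the transformed potentials into the definitions of the fields gives back the same fields. The primes are introduced only for this check and denote the fields computed from the transformed potentials:

\begin{align*}
\vector{B}' &= \vector{\nabla}\times(\vector{A}+\vector{\nabla}\chi) = \vector{B} + \vector{\nabla}\times(\vector{\nabla}\chi) = \vector{B}\,,\\
\vector{E}' &= -\vector{\nabla}\left(\varphi-\frac{\partial\chi}{\partial t}\right) - \frac{\partial}{\partial t}\left(\vector{A}+\vector{\nabla}\chi\right) = \vector{E} + \vector{\nabla}\frac{\partial\chi}{\partial t} - \frac{\partial}{\partial t}\vector{\nabla}\chi = \vector{E}\,.
\end{align*}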
In classical mechanics, most forces that we encounter are of a specific kind: they are conservative. This means that they can be derived from a potential (similar to the electric field):
\[\vector{F} = -\vector{\nabla} V\]for a smooth function $V:\mathbb{R}^3\rightarrow\mathbb{R}$. One might wonder whether the Lorentz force is also of this type, but there is an immediate obstacle: conservative forces depend only on the position $\vector{r}$, whereas the Lorentz force also depends on the velocity $\vector{v}\equiv\dot{\vector{r}}$! Luckily, there is a beautiful solution. The notion of a conservative force can be generalized to potentials that also depend on derivatives of the position (the velocity and, in principle, even higher-order derivatives). To this end, we need to say goodbye to the Newtonian realm and let the Lagrangian formulation enter the stage.
Newton’s second law
\[\vector{F} = \frac{d\vector{p}}{dt}\]is equivalent to the following set of equations, called the Euler–Lagrange equations:
\[\frac{\partial L}{\partial q^i} = \frac{d}{dt}\frac{\partial L}{\partial\dot{q}^i}\,,\]where $L:=T-V$ is the Lagrangian (function), the difference between the kinetic energy and the potential energy, and the $q^i$ are the ‘generalized coordinates’ (these are just our preferred coordinates in which to solve the dynamics of the system). In a slightly less common form, they read
\[\frac{d}{dt}\frac{\partial T}{\partial\dot{q}^i} - \frac{\partial T}{\partial q^i} = Q_i\,,\]with
\[Q_i := \vector{F}\cdot\frac{\partial\vector{r}}{\partial q^i}\]the generalized forces (note that if we work in ordinary coordinates, i.e. $\vector{q}=\vector{r}$, then $\vector{Q}=\vector{F}$ and this form reduces to $m\ddot{\vector{r}}=\vector{F}$). The operator
\[\delta_{\text{EL}} := \sum_{i=1}^3\left(\frac{\partial}{\partial q^i}-\frac{d}{dt}\frac{\partial}{\partial\dot{q}^i}\right)\delta q^i\]is also called the Euler–Lagrange operator or variational derivative. It follows that any generalized force of the form
\[\sum_{i=1}^3Q_i\,\delta q^i = -\delta_{\text{EL}}U\,,\]for a generalized potential $U\equiv U(\vector{r},\vector{v},t)$, will lead to valid Euler–Lagrange equations with Lagrangian $L=T-U$ (for a velocity-independent potential, this reduces to $\vector{F}=-\vector{\nabla}U$). The fun part is now to check that the Lorentz force is exactly of this form, with
\[U(\vector{r},\vector{v},t) = q\bigl(\varphi(\vector{r},t) - \vector{v}\cdot\vector{A}(\vector{r},t)\bigr)\,.\](This will be left as an exercise for the reader 😉.)
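For readers who want to compare notes, here is one way the check can go. This is a sketch in Cartesian coordinates. It writes the generalized force as $F_i=\frac{d}{dt}\frac{\partial U}{\partial v_i}-\frac{\partial U}{\partial x_i}$ (the sign for which a velocity-independent $U$ gives $\vector{F}=-\vector{\nabla}U$), takes $U=q(\varphi-\vector{v}\cdot\vector{A})$ with $\varphi$ the scalar potential, and sums over the repeated index $j$:

\begin{align*}
\frac{\partial U}{\partial v_i} &= -qA_i\,, \qquad \frac{\partial U}{\partial x_i} = q\,\partial_i\varphi - qv_j\,\partial_iA_j\,,\\
\frac{d}{dt}\frac{\partial U}{\partial v_i} &= -q\,\partial_tA_i - qv_j\,\partial_jA_i\,,\\
F_i &= q(-\partial_i\varphi-\partial_tA_i) + qv_j(\partial_iA_j-\partial_jA_i) = q\bigl(\vector{E}+\vector{v}\times\vector{B}\bigr)_i\,.
\end{align*}

The second line uses that $\vector{A}$ is evaluated along the trajectory, so its total time derivative picks up the convective term $v_j\partial_jA_i$. The last step uses the identity $\bigl(\vector{v}\times(\vector{\nabla}\times\vector{A})\bigr)_i=v_j(\partial_iA_j-\partial_jA_i)$.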
The last step required for (canonical) quantization is to go from the Lagrangian to the Hamiltonian framework. To this end, we need to derive (conjugate or canonical) momenta from the Lagrangian:
\[p_i := \frac{\partial L}{\partial\dot{q}^i}\,.\]For ordinary potentials, the only contribution comes from the kinetic energy and we would obtain the classical relation
\[\vector{p} = m\vector{v}\,.\]However, since the Lorentz potential is velocity dependent, an additional term is obtained:
\[p_i = m\dot{q}^i + qA_i\,.\]It follows that the canonical momenta are not equal to the kinetic momenta!
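To make the extra term explicit, here is a short worked version for a single particle in Cartesian coordinates. It is a sketch that assumes the Lagrangian $L=T-U$ with the velocity-dependent potential $U=q(\varphi-\vector{v}\cdot\vector{A})$ of the Lorentz force, where $\varphi$ is the scalar potential:

\begin{align*}
L &= \tfrac{1}{2}m\vector{v}^2 - q\varphi + q\vector{v}\cdot\vector{A}\,,\\
\vector{p} &= \frac{\partial L}{\partial\vector{v}} = m\vector{v}+q\vector{A}\,,\\
H &= \vector{p}\cdot\vector{v}-L = \tfrac{1}{2}m\vector{v}^2+q\varphi = \frac{(\vector{p}-q\vector{A})^2}{2m}+q\varphi\,.
\end{align*}

The vector potential enters the Hamiltonian only through the combination $\vector{p}-q\vector{A}$, i.e. through the kinetic momentum $m\vector{v}$.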
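Before quantizing electromagnetism, we will first consider what happens when we introduce magnetic charges. In this case, we have to modify Maxwell’s equations in the following way:
\begin{align*}
\vector{\nabla}\cdot\vector{E} &= \rho_e\,,\\
\vector{\nabla}\cdot\vector{B} &= \textcolor{red}{\rho_m}\,,\\
\vector{\nabla}\times\vector{E} &= -\frac{\partial\vector{B}}{\partial t}\textcolor{red}{{}-\vector{J}_m}\,,\\
\vector{\nabla}\times\vector{B} &= \frac{\partial\vector{E}}{\partial t}+\vector{J}_e\,,
\end{align*}
where $(\rho_e,\vector{J}_e)$ and $(\rho_m,\vector{J}_m)$ denote the electric and magnetic charge and current densities, respectively. Moreover, the Lorentz force also has to be generalized:
\[\vector{F} = q(\vector{E}+\vector{v}\times\vector{B}) \textcolor{red}{+ g(\vector{B}-\vector{v}\times\vector{E})}\,,\]where $g$ is the magnetic (monopole) charge. Note that, with this formulation, these equations are completely symmetric under the following transformations:
\[(\vector{E},\vector{B})\longrightarrow(\vector{B},-\vector{E})\,, \qquad (\rho_e,\rho_m)\longrightarrow(\rho_m,-\rho_e)\,, \qquad (\vector{J}_e,\vector{J}_m)\longrightarrow(\vector{J}_m,-\vector{J}_e)\,, \qquad (q,g)\longrightarrow(g,-q)\,.\]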
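As a sanity check of this symmetry, one can substitute the standard duality transformation $(\vector{E},\vector{B},q,g)\longrightarrow(\vector{B},-\vector{E},g,-q)$ into the generalized Lorentz force. The transformation is assumed here in this form, and the overall sign convention can differ between references:

\begin{align*}
\vector{F} &\longrightarrow g\bigl(\vector{B}+\vector{v}\times(-\vector{E})\bigr) - q\bigl(-\vector{E}-\vector{v}\times\vector{B}\bigr)\\
&= q(\vector{E}+\vector{v}\times\vector{B}) + g(\vector{B}-\vector{v}\times\vector{E})\,.
\end{align*}

The force law is mapped to itself: the electric and magnetic terms simply trade places.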
To get a cleaner presentation, it is helpful to introduce a new notation motivated by special relativity. In spacetime (whether it be classic Minkowski space $\mathbb{R}^{1,3}$ or some spacetime 4-manifold $M$ modeled on $\mathbb{R}^{1,3}$), we have one temporal component and three spatial components. Instead of treating these separately, we will combine the 4 coordinates into what is called a 4-vector:
\[x^\mu:=(t,\vector{r})\equiv(t,x^1,x^2,x^3)\,.\]In a similar way, we can also define the 4-momentum and 4-gradient:
\[p^\mu := (E,p^1,p^2,p^3) \qquad\text{and}\qquad \partial_\mu:=(\partial_t,\vector{\nabla})\equiv(\partial_t,\partial_1,\partial_2,\partial_3)\,,\]where $E$ denotes the energy (not the electric field). Note the position of the index $\mu$. For the coordinate vector it is a superscript, whereas for the derivative it is a subscript! The reason why we usually do not pay attention to this detail in classical mechanics is that all spatial coordinates are equivalent. However, when including time, the situation changes and indices are raised and lowered using a metric (tensor) $\eta_{\mu\nu}$. For Minkowski spacetime, the metric is given by $\eta\equiv\mathrm{diag}(+1,-1,-1,-1)$ in the mostly-minus[^2] convention. For example, consider the 4-vector $\xi^\mu\equiv(\xi^0,\xi^1,\xi^2,\xi^3)$. When the indices are superscripts, it is called a contravariant vector. Lowering the indices, i.e. turning it into a covariant vector, is done as follows (with an implicit sum over repeated indices and some abuse of notation where we treat row and column vectors as equivalent):
\begin{align*}
\xi_\mu &= \eta_{\mu\nu}\xi^\nu\\
&= \begin{pmatrix}
+1&0&0&0\\
0&-1&0&0\\
0&0&-1&0\\
0&0&0&-1
\end{pmatrix}
\begin{pmatrix}
\xi^0\\
\xi^1\\
\xi^2\\
\xi^3\\
\end{pmatrix}\\
&= (\xi^0,-\xi^1,-\xi^2,-\xi^3)\,.
\end{align*}
Now, let us introduce the electromagnetic 4-potential $A^\mu:=(\varphi,\vector{A})$. This lets us introduce a 2-index object that is of paramount importance in electromagnetism: the field strength. (The second equality in the following equation can again be seen as an exercise for the reader.)
\begin{gather}
\label{field_strength}
F^{\mu\nu} := \partial^\mu A^\nu - \partial^\nu A^\mu =
\begin{pmatrix}
0&-E_1&-E_2&-E_3\\
E_1&0&-B_3&B_2\\
E_2&B_3&0&-B_1\\
E_3&-B_2&B_1&0
\end{pmatrix}
\end{gather}
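For readers who want a head start on that exercise, two representative entries can be worked out as follows. This sketch assumes the mostly-minus metric from above, so that $\partial^\mu=(\partial_t,-\vector{\nabla})$, and the 4-potential $A^\mu=(\varphi,\vector{A})$ with $\varphi$ the scalar potential. Here, $A_1$ and $A_2$ denote Cartesian components of the 3-vector $\vector{A}$:

\begin{align*}
F^{01} &= \partial^0A^1-\partial^1A^0 = \partial_tA_1+\partial_1\varphi = -E_1\,,\\
F^{12} &= \partial^1A^2-\partial^2A^1 = -(\partial_1A_2-\partial_2A_1) = -(\vector{\nabla}\times\vector{A})_3 = -B_3\,.
\end{align*}

The remaining entries follow in the same way, with the antisymmetry $F^{\mu\nu}=-F^{\nu\mu}$ fixing the lower triangle.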
As a final piece of notation, we introduce the de Rham differential. It is a way to formalize the usual notions of infinitesimal variation and differentials:
\[\mathrm{d}f := (\partial_\mu f)\mathrm{d}x^\mu \qquad\text{and}\qquad \mathrm{d}(\theta_\nu\mathrm{d}x^\nu) := (\partial_\mu\theta_\nu)\,\mathrm{d}x^\mu\wedge\mathrm{d}x^\nu = \frac{1}{2}(\partial_\mu\theta_\nu-\partial_\nu\theta_\mu)\,\mathrm{d}x^\mu\wedge\mathrm{d}x^\nu \qquad\text{and}\qquad\cdots\,.\]The first formula says that a variation of a function $f:\mathbb{R}^n\rightarrow\mathbb{R}$ is given by varying each of the coordinates a little bit in the sense of first-order Taylor expansions. The coordinate variations $\mathrm{d}x^\mu$ are, however, now not just infinitesimal numbers. They are (co)vectors in a well-defined way! The second formula extends this to 1-forms, where the wedge product is antisymmetric: $\mathrm{d}x^\mu\wedge\mathrm{d}x^\nu=-\mathrm{d}x^\nu\wedge\mathrm{d}x^\mu$.
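With these definitions, we now see that $F=\mathrm{d}A$ and, moreover, that Maxwell’s equations (with both electric and magnetic charges) are equivalent to
\[\mathrm{d}F = j_M\qquad\text{and}\qquad \mathrm{d}(\ast F)=j_E\]for electric and magnetic currents $j_E$ and $j_M$, respectively (written as 3-forms, with signs that depend on conventions)! Note that the electric current sources $\ast F$, whereas the magnetic current sources $F$ itself: since $\mathrm{d}^2=0$, the relation $F=\mathrm{d}A$ forces $\mathrm{d}F=0$ and can, hence, only hold away from magnetic charges. The dual field strength $\ast F$ is defined as follows (in local coordinates):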
\[^\ast\!F_{\mu\nu} = \frac{1}{2}\varepsilon_{\mu\nu\kappa\lambda}F^{\kappa\lambda}\]with $\varepsilon$ the totally antisymmetric Levi-Civita symbol. The Lorentz force law also obtains a symmetric form this way:
\[m\ddot{x}^\mu = (qF^{\mu\nu}+g\ ^\ast\!F^{\mu\nu})\dot{x}_\nu\,.\]In quantum mechanics, the fundamental equation is Schrödinger’s equation (where we again use natural units with $\hbar=1$):
\begin{gather} \label{schrodinger} i\frac{\partial}{\partial t}\psi = \widehat{H}\psi\,, \end{gather}
with $\widehat{H}$ the Hamiltonian operator. In the vacuum, for free particles, this Hamiltonian only consists of a kinetic term:
\[\widehat{H}_{\text{free}} := \frac{\widehat{p}^2}{2m}\,.\]Now, a charged particle in an electric field has a potential energy equal to $V=q\varphi$, where $\varphi:\mathbb{R}^3\rightarrow\mathbb{R}$ is the scalar potential from Eq. \eqref{electric_potential}. The question becomes what to do with the vector potential? Does it also play a role? Well, as noted above, the canonical momenta obtain a contribution proportional to $\vector{A}$. So, to retrieve the kinetic momenta, we simply have to subtract this contribution:
\[\widehat{H}_{\text{EM}} = \frac{(\widehat{p}-q\vector{A})^2}{2m}+q\varphi\,.\]This substitution $\widehat{p}\longrightarrow\widehat{p}-q\vector{A}$ is called minimal coupling.
Now, for (canonical) quantization, the canonical momentum $\widehat{p}_i$ is replaced with the differential operator $-i\partial_i$. By analogy, when incorporating an interaction such as electromagnetism, one introduces the covariant derivative:[^3]
\begin{gather} \label{spatial_covariant_derivative} \nabla_i := \partial_i - iqA_i\,. \end{gather}
This way, the electromagnetically coupled Schrödinger equation becomes
\[i\frac{\partial}{\partial t}\psi = \left(-\frac{\nabla^2}{2m}+q\varphi\right)\psi\,,\]where $\nabla^2:=\sum_i\nabla_i\nabla_i$. In fact, if one works with 4-vectors as in the previous section, the electromagnetically coupled Schrödinger equation is simply the free one written in ‘covariant’ form. To this end, the 4-dimensional covariant derivative, which includes a temporal component, is defined as follows:[^4]
\begin{gather} \label{covariant_derivative} \nabla_\mu := \partial_\mu + iqA_\mu\,. \end{gather}
With these conventions, we can see that the ‘covariant’ Schrödinger equation
\begin{gather} \label{covariant_schrodinger} i\nabla_t\psi = -\frac{\nabla^2}{2m}\psi \end{gather}
captures both the free-particle case when $\nabla_\mu=\partial_\mu$ and the electromagnetic case when $A_\mu\neq0$!
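To see how the temporal component reproduces the scalar potential term, one can expand the covariant equation by hand. The sketch below assumes $A_\mu=(\varphi,-\vector{A})$, i.e. the 4-potential lowered with the mostly-minus metric, with $\varphi$ the scalar potential. It writes the equation with the overall sign that reduces to the free Schrödinger equation when the potentials vanish. In the spatial part, $A_i$ denotes the Cartesian components of the 3-vector $\vector{A}$:

\begin{align*}
\nabla_t &= \partial_t+iq\varphi\,, \qquad \nabla_i = \partial_i-iqA_i\,,\\
i\nabla_t\psi &= i\partial_t\psi-q\varphi\psi\,,\\
i\nabla_t\psi = -\frac{\nabla^2}{2m}\psi \quad&\Longleftrightarrow\quad i\partial_t\psi = \left(-\frac{\nabla^2}{2m}+q\varphi\right)\psi\,.
\end{align*}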
Although this section is not entirely related to electric-magnetic duality, it is a very interesting topic on its own!
In classical mechanics, the electromagnetic potentials are actually of no physical importance. They are useful for calculations, and the gauge freedom often lets us choose a gauge that further simplifies them, but it is only the electric and magnetic fields that are of physical importance! There is no classical electromagnetic phenomenon that cannot be characterized purely in terms of $\vector{E}$ and $\vector{B}$.
In quantum mechanics, we will now see that this is not true anymore. We can detect (and people have done so) whether an electromagnetic potential is present or not, even in regions where the electromagnetic field is zero. Consider an infinitely long solenoid (long relative to the motion of the electrons is sufficient in practice). Electromagnetism predicts that, given a uniform current in the wire, there exists a magnetic field inside the solenoid, whereas the magnetic field is zero outside the solenoid. However, this does not mean that $\vector{A}=0$ outside the solenoid, only that $\vector{A}$ is pure gauge. This means that, for every simply-connected subset $O\subset\mathbb{R}^3\backslash\mathcal{S}$ (where $\mathcal{S}\cong\mathbb{R}$ denotes the solenoid), there exists some smooth function $\chi:O\rightarrow\mathbb{R}$ such that $\vector{A}|_O=\vector{\nabla}\chi$. Note that $\mathbb{R}^3$ with the solenoid removed is not simply connected and, hence, the gauge function $\chi$ cannot be defined everywhere. Instead, space has to be covered by at least two patches (see Figure 2).
Consider an electron travelling from one side of the solenoid to the other. This can happen in two ways, as shown in Figure 1: either along the northern side (path $\gamma_1$) or along the southern side (path $\gamma_2$). The total wave function of the electron will be a superposition of these two contributions. Now, in the first section on electromagnetism, we saw that the vector potential is only defined up to a gradient. In other words, potentials that are pure gauge can be transformed away without altering the physics. In Figure 2 below, two patches covering the outside of the solenoid are shown, together with the associated gauge functions generating the electromagnetic potential.
Transforming the potential away simply means turning the electromagnetically coupled Schrödinger equation \eqref{covariant_schrodinger} into the free Schrödinger equation \eqref{schrodinger}. However, to make the Schrödinger equation fully gauge invariant, we actually need a final piece of information. The wave function itself will also transform. In fact, it simply obtains a phase factor since the physics itself remains invariant:
\[\psi(\vector{r},t)\longrightarrow\exp\left(iq\int_\gamma\vector{A}\cdot d\vector{l}\right)\psi(\vector{r},t)\,.\]Another interpretation, where the potential is not explicitly gauged away, is that of parallel transport. As the particle moves through space along the path $\gamma$, it picks up a phase factor $\exp\left(iq\int_\gamma\vector{A}\cdot d\vector{l}\right)$. At the point where the two paths meet again, the total wave function is given by
\begin{align*}
\Psi(\vector{r},t) &= \psi_{1,A}(\vector{r},t) + \psi_{2,A}(\vector{r},t)\\
&= \exp\left(iq\int_{\gamma_1}\vector{A}\cdot d\vector{l}\right)\psi_1(\vector{r},t) + \exp\left(iq\int_{\gamma_2}\vector{A}\cdot d\vector{l}\right)\psi_2(\vector{r},t)\\
&= \exp\left(iq\int_{\gamma_2}\vector{A}\cdot d\vector{l}\right)\left[\textcolor{darkgreen}{\exp\left(iq\oint\vector{A}\cdot d\vector{l}\right)}\psi_1(\vector{r},t)+\psi_2(\vector{r},t)\right]\\
&= \exp\left(iq\int_{\gamma_2}\vector{A}\cdot d\vector{l}\right)\left[\textcolor{darkgreen}{\exp\left(iq\Phi\right)}\psi_1(\vector{r},t)+\psi_2(\vector{r},t)\right]\,,
\end{align*}
where $\Phi$ is the magnetic flux through the solenoid (here, we made use of the Kelvin–Stokes theorem). It follows that the interference pattern will depend on the flux inside the solenoid, even though the particle itself does not move through there!
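To see the flux dependence explicitly, consider the simplest idealized case where the two partial waves have equal magnitude at the detection point: $\psi_1=|\psi|e^{i\delta}$ and $\psi_2=|\psi|$. The relative phase $\delta$ is an assumption of this sketch and encodes the path difference in the absence of the solenoid. The overall phase factor drops out of the probability density:

\begin{align*}
|\Psi|^2 &= \bigl|e^{i(q\Phi+\delta)}+1\bigr|^2\,|\psi|^2\\
&= 2|\psi|^2\bigl(1+\cos(q\Phi+\delta)\bigr)\,.
\end{align*}

The interference fringes therefore shift as the flux is varied and return to their original position whenever $\Phi$ changes by $2\pi/q$.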
Now, in the spirit of electric-magnetic duality, there exists another effect, the Aharonov–Casher effect, for electric fields and magnetic ‘charges’. In this case, a magnetic dipole (such as a neutral atom) moves around a charged wire. It is a standard result in electromagnetism that a magnetic dipole moment $\vector{\boldsymbol{\mu}}$ couples to a magnetic field through an interaction term of the form $-\vector{\boldsymbol{\mu}}\cdot\vector{B}$. So, why then do we get an effect in an electric field $\vector{E}$? The reason is the same as for the spin-orbit interaction in atomic physics.
In special relativity, it is well known that transforming between two reference frames that are in motion with respect to each other mixes up the temporal and spatial components of 4-vectors (and higher-order tensors). These are the Lorentz transformations. Now, since the electric and magnetic fields are merely components of the relativistic field strength $F_{\mu\nu}$ as in Eq. \eqref{field_strength}, they will also get mixed up. The transformations for these fields are as follows:
\begin{align*}
\vector{E}'_\parallel &= \vector{E}_\parallel\\
\vector{B}'_\parallel &= \vector{B}_\parallel\\
\vector{E}'_\perp &= \gamma(\vector{E}_\perp+\vector{v}\times\vector{B})\\
\vector{B}'_\perp &= \gamma(\vector{B}_\perp-\vector{v}\times\vector{E})
\end{align*}
where the subscripts $\parallel$ and $\perp$ denote the components parallel and perpendicular to the relative velocity $\vector{v}$, and
\[\gamma:=\frac{1}{\sqrt{1-v^2/c^2}}\]is the Lorentz factor. In our case, there is only an initial electric field, so $\vector{B}=0$ and, hence, $\vector{B}' = -\gamma\vector{v}\times\vector{E}$ when we transform to the rest frame of the neutral particle. For $v\ll c$, which is also necessary for our considerations (where we silently make use of the adiabatic theorem), the Lorentz factor is approximately equal to 1, so $\vector{B}'\approx-\vector{v}\times\vector{E}$. The total interaction term is then given by
\[\widehat{H}_{\text{int}}=\vec{\boldsymbol{\mu}}\cdot(\vector{v}\times\vector{E})\,.\]Note that the triple product is cyclic and, hence, $\widehat{H}_{\text{int}}=-\vector{v}\cdot(\vec{\boldsymbol{\mu}}\times\vector{E})$. This is of the same form as the velocity-dependent part $-q\vector{v}\cdot\vector{A}$ of the electromagnetic potential energy: we have found an effective electromagnetic potential with $q\vector{A}_{\text{eff}}=\vec{\boldsymbol{\mu}}\times\vector{E}$. From here on, the Aharonov–Casher effect proceeds as the Aharonov–Bohm effect: the phase difference between two paths is equal to $\oint(\vec{\boldsymbol{\mu}}\times\vector{E})\cdot d\vector{l}$.
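As an illustration, the phase can be evaluated for an idealized geometry. The following sketch assumes an infinitely long straight wire along the $z$-axis with uniform charge per unit length $\Lambda$ (a symbol introduced only here), a dipole moment $\vec{\boldsymbol{\mu}}=\mu\,\widehat{e}_z$ parallel to the wire, and a circular path of radius $s$ around it. With $\varepsilon_0=1$, Gauss’s law gives the radial field of the wire:

\begin{align*}
\vector{E} &= \frac{\Lambda}{2\pi s}\,\widehat{e}_s\,, \qquad \vec{\boldsymbol{\mu}}\times\vector{E} = \frac{\mu\Lambda}{2\pi s}\,\widehat{e}_\phi\,,\\
\oint(\vec{\boldsymbol{\mu}}\times\vector{E})\cdot d\vector{l} &= \int_0^{2\pi}\frac{\mu\Lambda}{2\pi s}\,s\,d\phi = \mu\Lambda\,.
\end{align*}

The result does not depend on the radius of the path, in the same way that the Aharonov–Bohm phase only depends on the enclosed flux.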
After quite a long trip through the lands of electromagnetism and quantum mechanics, we can finally begin the story of (quantum) electric-magnetic duality. Let us start with a famous result by Dirac. Assume that there exist magnetic monopoles in our universe. Mirroring the behaviour of electric charges, the magnetic field of such a monopole would be given by a Coulomb-like field:
\[\vector{B} = \frac{g}{4\pi}\frac{\vector{r}}{r^3}\,.\]Because $\vector{B}$ is singular at the origin, we do not have $\vector{B}=\vector{\nabla}\times\vector{A}$ globally (similar to how we had to locally find gauge functions in the Aharonov–Bohm effect). In fact, the best we can do is find such a potential everywhere up to a half-infinite line: the Dirac string. Let us choose two such potentials, one where the Dirac string is aligned with the positive $z$-axis and one where it is aligned with the negative $z$-axis. In spherical coordinates, this reads as follows:
\begin{align*}
\vector{A}_+ &:= \frac{g}{4\pi r}\frac{1-\cos\theta}{\sin\theta}\widehat{e}_\phi\,,\\
\vector{A}_- &:= -\frac{g}{4\pi r}\frac{1+\cos\theta}{\sin\theta}\widehat{e}_\phi\,.
\end{align*}
On their common domain ($\mathbb{R}^3$ with the $z$-axis removed), there exists (again locally) a gauge function $\chi:\mathbb{R}^3\backslash\mathbb{R}\rightarrow\mathbb{R}$ such that $\vector{A}_+-\vector{A}_-=\vector{\nabla}\chi$. For example (note the discontinuity of this function):
\[\chi = \frac{g}{2\pi}\phi\,.\]Now, as in the Aharonov–Bohm effect, we can consider transporting a charged particle around this monopole in a closed loop. Because the wave function has to be single valued, this implies that
\begin{gather} \label{dirac_condition} qg=2\pi k \end{gather}
for some $k\in\mathbb{Z}$. Hence, as soon as there exists a single magnetic monopole in the universe (sadly we have not detected any), all electric charges have to be multiples of a single minimal charge $e$. The existence of magnetic monopoles would, thus, clarify why we only observe quantized electric charges!
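The step from single-valuedness to the quantization condition can be spelled out as follows. This is a sketch assuming that, on the overlap of the two patches, the wave functions are related by the gauge transformation $\psi_+=e^{iq\chi}\psi_-$ with $\chi=g\phi/2\pi$ as above. Going once around the $z$-axis sends $\phi$ to $\phi+2\pi$:

\begin{align*}
e^{iq\chi(\phi+2\pi)} &= e^{iq\chi(\phi)}\,e^{iqg}\,,\\
e^{iqg}=1 \quad&\Longleftrightarrow\quad qg=2\pi k\,,\quad k\in\mathbb{Z}\,.
\end{align*}

For a fixed magnetic charge $g$, the smallest nonzero electric charge allowed by this condition is $e=2\pi/g$.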
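For dyons, i.e. particles carrying both an electric and a magnetic charge, the analogous quantization condition leaves two possibilities when the theory is CP invariant:
- all electric dyon charges are integral multiples of $e$: $q=ke$, or
- all electric dyon charges are half-integral multiples of $e$.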
For this final section, we have to generalize a bit further. Instead of considering plain electromagnetism, we will consider Yang–Mills–Higgs theory, such as in the Standard Model or the Georgi–Glashow model. In this case, two things change. First of all, the gauge fields $A_\mu$ no longer take values in the one-dimensional Lie algebra of $\mathrm{U}(1)$, as they do in electromagnetism, which is a so-called $\mathrm{U}(1)$ gauge theory. Instead, they take values in the Lie algebra $\mathfrak{su}(2)$ of the group of unitary $2\times2$-matrices with determinant 1: this is an $\mathrm{SU}(2)$ gauge theory. Secondly, the gauge fields are now coupled to a scalar field $\vector{\phi}$ on spacetime, the Higgs field. The Lagrangian looks as follows:
\begin{gather} \mathcal{L}_{\text{YMH}} = -\frac{1}{4}F_{\mu\nu}\cdot F^{\mu\nu} + \frac{1}{2}\nabla_\mu\vector{\phi}\cdot\nabla^\mu\vector{\phi} - V(\phi)\,, \end{gather}
where
- $F$ is the field strength (tensor) as before, but now taking values in the Lie algebra $\mathfrak{su}(2)\cong\mathfrak{so}(3)$,
- $\vector{\phi}$ is the Higgs field taking values in $\mathbb{R}^3$ (the adjoint representation of $\mathrm{SU}(2)$), and
- $V$ is the Higgs potential, which typically has the Mexican hat shape $V(\phi)=\lambda^2(\phi^2-a^2)^2$.
Whereas the energy density in ordinary electromagnetism is given by
\[\mathcal{H} = \frac{1}{2}(\vector{E}\cdot\vector{E}+\vector{B}\cdot\vector{B})\,,\]it is given by
\begin{gather} \label{energy_density} \mathcal{H} = \frac{1}{2}\sum_{i=1}^3\left(\vector{E}_i\cdot\vector{E}_i+\vector{B}_i\cdot\vector{B}_i + \nabla_i\vector{\phi}\cdot\nabla_i\vector{\phi}\right) + \frac{1}{2}\vector{\Pi}\cdot\vector{\Pi} + V(\phi)\,, \end{gather}
in Yang–Mills–Higgs theory, where $\vector{\Pi}$ is defined as the conjugate momentum to the Higgs field: $\vector{\Pi}:=\nabla_0\vector{\phi}$. The energy (density) has three ‘electromagnetic contributions’ (one for each component or ‘color’ of the gauge field) and a contribution coming from interactions with the Higgs field and the Higgs field itself.[^5]
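We try to find finite-energy solutions to the equations of motion, called instantons in this post (in the literature, such solutions are more commonly called solitons or monopoles, with the term instanton reserved for finite-action solutions in Euclidean signature). To obtain a finite energy $\int_{\mathbb{R}^3}\mathcal{H}\,d\vector{r}<+\infty$, the solution must asymptotically converge to a vacuum solution, i.e. at infinity, the following conditions should hold: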
\[F_{\mu\nu}=0 \qquad\qquad \nabla_\mu\vector{\phi}=0 \qquad\qquad V(\phi)=0\,.\]The Mexican hat potential attains its minimum when $\phi^2=a^2$ and, hence, the Higgs field at spatial infinity gives a function $\vector{\phi}_\infty:S^2\rightarrow S^2$.[^6] A classical fact in mathematics, in algebraic topology to be precise, is that continuous functions between two $n$-spheres are classified, up to continuous deformation, by an integer: their degree. The topological index of an instanton is defined as the degree of its Higgs field at spatial infinity $\vector{\phi}_\infty$. Now, this topological index cannot change under continuous transformations such as time evolution or gauge transformations. This means that if we start with an instanton of degree $N_\phi\in\mathbb{Z}$, we will always have an instanton of degree $N_\phi$: these solutions are stable in time! (The topology of the solutions leads to superselection sectors.) An example of such solutions are the ’t Hooft–Polyakov instantons, whose magnetic charge is given by
\[g=-4\pi\frac{N_\phi}{e} = -\frac{1}{2ea^3}\int_{S^2}\varepsilon^{abc}\vector{\phi}\cdot\left(\partial_b\vector{\phi}\times\partial_c\vector{\phi}\right)\,dS_a\,.\]Unlike the Dirac monopole, these solutions do not have a singular point at the origin and the magnetic charge is a purely topological phenomenon: it does not have to be put in by hand. Moreover, note that the ’t Hooft–Polyakov instantons have a magnetic charge that is at least twice the minimal charge allowed by the Dirac condition \eqref{dirac_condition}.
Now, let us consider the mass of these instantons. In their rest frame, the mass is given by the energy ($c=1$):
\[m = \int_{\mathbb{R}^3}\mathcal{H}\,d\vector{r}\,.\]From Eq. \eqref{energy_density}, we can bound this as follows:
\[m\geq\frac{1}{2}\int_{\mathbb{R}^3}\sum_{i=1}^3\left(\vector{E}_i\cdot\vector{E}_i+\vector{B}_i\cdot\vector{B}_i+\nabla_i\vector{\phi}\cdot\nabla_i\vector{\phi}\right)\,d\vector{r}\,.\]For any angle $\theta\in[0,2\pi[$, this is also bounded by (this is obtained by ‘completing the square’)
\[m\geq\sin\theta\int_{\mathbb{R}^3}\sum_{i=1}^3\vector{E}_i\cdot\nabla_i\vector{\phi}\,d\vector{r} + \cos\theta\int_{\mathbb{R}^3}\sum_{i=1}^3\vector{B}_i\cdot\nabla_i\vector{\phi}\,d\vector{r}\,.\]With a bit of work, the keen reader might now notice that the two terms are actually proportional to the electric and magnetic charges, respectively (through the divergence theorem, the Bianchi identity[^7] and the equations of motion):
\[\int_{\mathbb{R}^3}\sum_{i=1}^3\vector{E}_i\cdot\nabla_i\vector{\phi}\,d\vector{r} = aq \qquad\text{and}\qquad \int_{\mathbb{R}^3}\sum_{i=1}^3\vector{B}_i\cdot\nabla_i\vector{\phi}\,d\vector{r} = ag\,.\]The bound, hence, becomes $m\geq aq\sin\theta+ag\cos\theta$ and is strongest when $\tan\theta=q/g$. Substituting this back into the inequality above gives the Bogomol’nyi bound:
\[m\geq a\sqrt{q^2+g^2}\,.\]For static solutions, the electric field vanishes and, to saturate the bound, the potential should also vanish identically. The only possibility for this to hold without vanishing magnetic charge is that $\lambda=0$. Solutions with these properties saturate the Bogomol’nyi bound and satisfy the Bogomol’nyi equation (note that, as for the magnetic charge, the magnetic field is generated by the Higgs mechanism):
\[\vector{B}_i = \nabla_i\vector{\phi}\,.\]Accordingly, they are called Bogomol’nyi–Prasad–Sommerfield (BPS) instantons. The total particle content of this theory is summarized in the table below.
Particle | Mass | Electric charge | Magnetic charge | Spin |
---|---|---|---|---|
Photon | $0$ | $0$ | $0$ | $1$ |
Higgs boson | $0$ | $0$ | $0$ | $0$ |
$W^\pm$ | $aq$ | $\pm q$ | $0$ | $1$ |
BPS (anti-)instanton | $ag$ | $0$ | $\pm g$ | $0$ |
Looking at the properties of these particles, a few things should stand out. First of all, the Bogomol’nyi bound is saturated for all particles. Moreover, there exists a $\mathbb{Z}_2$ symmetry as in the case of ordinary electromagnetism that interchanges electric and magnetic charges (note that the Bogomol’nyi bound is clearly invariant under electric-magnetic duality). This symmetry consists of two parts in this case:
- Exchanging charges: $(q,g)\longrightarrow(g,-q)$.
- Exchanging particles: instantons $\longleftrightarrow$ (massive) gauge bosons.
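The two steps behind the Bogomol’nyi bound, completing the square and optimizing over the angle, can be sketched as follows, together with its invariance under the charge exchange in the list above. The shorthand $f(\theta)$ is introduced only for this check. The sum over $i$ and the integral over $\mathbb{R}^3$ are left implicit in the first two lines:

\begin{align*}
0 &\leq \tfrac{1}{2}\bigl|\vector{E}_i-\sin\theta\,\nabla_i\vector{\phi}\bigr|^2+\tfrac{1}{2}\bigl|\vector{B}_i-\cos\theta\,\nabla_i\vector{\phi}\bigr|^2\\
&= \tfrac{1}{2}\bigl(\vector{E}_i\cdot\vector{E}_i+\vector{B}_i\cdot\vector{B}_i+\nabla_i\vector{\phi}\cdot\nabla_i\vector{\phi}\bigr)-\sin\theta\,\vector{E}_i\cdot\nabla_i\vector{\phi}-\cos\theta\,\vector{B}_i\cdot\nabla_i\vector{\phi}\,,\\
f(\theta) &:= aq\sin\theta+ag\cos\theta\,, \qquad f'(\theta)=0 \iff \tan\theta=\frac{q}{g}\,,\\
\max_\theta f(\theta) &= a\,\frac{q^2+g^2}{\sqrt{q^2+g^2}} = a\sqrt{q^2+g^2}\,.
\end{align*}

Since $q^2+g^2$ is unchanged under $(q,g)\longrightarrow(g,-q)$, the bound takes the same value for the dual charges. In particular, the masses $aq$ and $ag$ in the table are exchanged.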
Given this symmetry, Montonen and Olive conjectured that there was a more general duality at play: does there exist a gauge-theoretic description of the theory in which the monopoles are fundamental and the (massive) gauge bosons arise as topological composites? Note that, because of the Dirac condition \eqref{dirac_condition}, one theory will have a small coupling constant, whereas the other will have a large one, which is a more general property of dual theories! Since perturbation theory is not applicable to strongly coupled theories (such as QCD), such a dual picture with small coupling would be of major interest. This approach is, for example, considered in the holographic principle.
To finish this section, it is interesting to consider one last modification to the Lagrangian. The terms that were included in the Yang–Mills–Higgs Lagrangian all contributed to the (classical) equations of motion. However, it is possible to add a term that does not affect the equations of motion since it is a total derivative, but does affect quantum mechanical properties (through the path integral):
\[\mathcal{L}_{\theta} = \theta\frac{e^2}{32\pi^2}F_{\mu\nu}\cdot\!\ ^\ast\!F^{\mu\nu}\,.\]This $\theta$-term is, for example, important for a complete treatment of CP symmetry and the strong CP problem. (Similar to the instanton number, the values of $\theta$ determine superselection sectors.) Moreover, it can be shown that the charge of ’t Hooft–Polyakov instantons (and, in fact, of all dyons) gets shifted as follows:
\[q = ne + \frac{e\theta}{2\pi}N_\phi\]for some $n\in\mathbb{Z}$ (this is the Witten effect). Including the topological $\theta$-term, the total gauge part of the Lagrangian is given by
\[\mathcal{L}_{\text{gauge}} = \mathcal{L}_{\text{YM}}+\mathcal{L}_\theta = -\frac{1}{4e^2}F_{\mu\nu}\cdot F^{\mu\nu} + \frac{\theta}{32\pi^2}F_{\mu\nu}\cdot\!\ ^\ast\!F^{\mu\nu}\]where a transformation $A_\mu\longrightarrow eA_\mu$ was performed to isolate the dependence on the electric charge. Introducing the complexified field strength $\mathcal{F}_{\mu\nu} = F_{\mu\nu} + i\ ^\ast\!F_{\mu\nu}$, this Lagrangian simply becomes
\[\renewcommand{\Im}{\mathfrak{Im}} \mathcal{L}_{\text{gauge}} = -\frac{1}{32\pi}\Im\left(\tau\mathcal{F}_{\mu\nu}\cdot\mathcal{F}^{\mu\nu}\right)\,,\]where $\tau:=\frac{\theta}{2\pi}+i\frac{4\pi}{e^2}$ is the complexified coupling constant. Two things should now be clear:
- Since $\theta$ is an angle, it is only defined up to multiples of $2\pi$ and, hence, the theory is invariant under the transformation $T:\tau\longrightarrow\tau+1$.
- Electric-magnetic duality says that $e\longrightarrow g=-4\pi/e$ is a symmetry, so, at $\theta=0$ (i.e. CP-conserving theories), the theory is also invariant under the transformation $S:\tau\longrightarrow-1/\tau$.
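The second statement is a one-line computation. Assuming $\theta=0$ and replacing $e$ by the dual coupling $g=-4\pi/e$ in the definition of $\tau$ gives

\begin{align*}
\tau = i\frac{4\pi}{e^2} \;\longrightarrow\; i\frac{4\pi}{g^2} = i\frac{4\pi e^2}{16\pi^2} = i\frac{e^2}{4\pi} = -\frac{1}{\tau}\,.
\end{align*}

In particular, a small coupling $e$ (large imaginary part of $\tau$) is mapped to a large dual coupling, and vice versa.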
The extended Montonen–Olive duality (conjecture) states that physics is invariant under actions of the modular group[^8] $\mathrm{(P)SL}(2,\mathbb{Z})$ on the complexified coupling constant! So, even at $\theta\neq0$, $S$ acts as a symmetry.
To be entirely correct, we have to add a third part to the duality conjecture. The $S$-transformation interchanges the following items:
- Electric and magnetic charges: $(q,g)\longleftrightarrow(g,-q)$.
- Complexified coupling constant: $\tau\longleftrightarrow-1/\tau$.
- Gauge groups:[^9] $G\longleftrightarrow G^\vee$.
One last issue remains (which we will not really treat in detail). The duality conjecture is very pleasing, but the table of the particle spectrum above has two issues. First of all, the saturation of the Bogomol’nyi bound was treated classically, without radiative corrections (i.e. corrections coming from Feynman diagrams with virtual particles). Moreover, the spin of the particles does not match. The gauge bosons are spin 1, whereas the instantons are spin 0. Both of these issues can be solved, however, by considering an appropriate level of supersymmetry. For $N=4$ supersymmetry, it can be shown that one obtains a (super)conformal theory and, accordingly, there are no radiative corrections and there is no running of the coupling constant (or other physical parameters such as the masses of particles), i.e. the $\beta$-function vanishes identically. Moreover, the particles in the supermultiplets have the right spin to allow for the interchange of instantons and gauge bosons.
Some interesting topics were touched upon in this post, e.g.:
- The difference between ‘ordinary’ charges (also called Noether charges due to the relation with Noether’s theorem) and topological charges.
- Superselection rules and sectors.
- Duality principles in physics.
- Modularity of physical theories.
At the moment, the two most likely subjects to be covered in a future post are the difference between Noether charges and topological charges and how this is tied to cohomology theory, and superselection sectors, in particular DHR (Doplicher–Haag–Roberts) theory.
- Figueroa-O’Farrill, J. M. (1998). Electromagnetic duality for children. https://www.maths.ed.ac.uk/~jmf/Teaching/Lectures/EDC.pdf.[^10]
- Montonen, C., & Olive, D. (1977). Magnetic monopoles as gauge particles? https://doi.org/10.1016/0370-2693(77)90076-4.
- Tong, D. (2018). Lecture notes on gauge theory. https://www.damtp.cam.ac.uk/user/tong/gaugetheory.html.
- Witten, E. (1979). Dyons of charge $\frac{e\theta}{2\pi}$. https://cds.cern.ch/record/133312/files/197909065.pdf.
[^1]: Sometimes also called the Maxwell–Heaviside equations because it was Heaviside who put them in this more elegant form!
[^2]: Two (equivalent) conventions exist: mostly-plus and mostly-minus. The former is mainly used in (general) relativity, whereas the latter is mainly used by particle physicists.
[^3]: The covariant derivative $\nabla$ will not be written with a vector arrow to distinguish it from the nabla-operator $\vector{\nabla}$.
[^4]: Note the sign flip with respect to Eq. \eqref{spatial_covariant_derivative} due to the minuses in the Minkowski metric!
[^5]: The reason for separating the temporal term $\vector{\Pi}=\nabla_0\vector{\phi}$ and the spatial terms $\nabla_i\vector{\phi}$ is that in the Lagrangian they occur with different signs (due to the Minkowski metric), whereas here, they all contribute positively.
[^6]: The domain is $S^2$ because we look at spatial infinity; the codomain is $S^2$ because of the condition $\phi^2=a^2$.
[^7]: $\nabla_\mu\ ^\ast\!F^{\mu\nu}=0$
[^8]: This group is generated by the transformations $T$ and $S$.
[^9]: $G^\vee$ denotes the Langlands dual of $G$ (sometimes called the magnetic dual).
[^10]: A reworked version can be found in Lundin, J. (2018). Electromagnetic Duality in $\mathrm{SO}(3)$ Yang–Mills Theory. https://www.diva-portal.org/smash/get/diva2:1222380/FULLTEXT01.pdf.