
This Month's Read III: Electric-magnetic duality
The last “This month’s read” post was about probability theory. This month’s post covers something completely different: electromagnetism (and some generalizations). Although only electric charges have been discovered experimentally (magnetic fields are generated by moving or ‘spinning’ electric charges), the laws of electromagnetism exhibit a variety of beautiful symmetries and duality relations when we introduce magnetic charges as well!
This post is structured as follows:
- Intro to electromagnetism
- Quantizing electromagnetism
- Aharonov effects (with Bohm and Casher)
- Montonen–Olive duality
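Classical electromagnetism is entirely characterized by Maxwell’s equations$^1$ $\newcommand{\vector}[1]{\vec{\mathbf{#1}}}$
\begin{gather}
\vector{\nabla}\cdot\vector{E} = \rho\,,\\
\label{B_gauss}
\vector{\nabla}\cdot\vector{B} = 0\,,\\
\label{faraday}
\vector{\nabla}\times\vector{E} = -\frac{\partial\vector{B}}{\partial t}\,,\\
\vector{\nabla}\times\vector{B} = \vector{J} + \frac{\partial\vector{E}}{\partial t}\,,
\end{gather}
together with the Lorentz force
\begin{gather}
\vector{F} = q(\vector{E}+\vector{v}\times\vector{B})\,.
\end{gather}
In these equations, $\vector{E}$ and $\vector{B}$ denote the electric and magnetic fields, respectively, $q$ denotes the charge, and $\rho$ and $\vector{J}$ denote the electric charge density and current density. Note that we work in natural units where $c=1=\varepsilon_0$.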
To better understand the theory of electromagnetism and eventually quantize it, we first have to unearth its symmetries a bit better, in particular its ‘gauge structure’. This structure is related to certain redundancies in the definition of the electric and magnetic fields.
First, start with Gauss’s law \eqref{B_gauss} for the magnetic field. This equation, through the Helmholtz decomposition, shows that
\begin{gather} \vector{B} = \vector{\nabla}\times\vector{A} \end{gather}
for some vector potential $\vector{A}:\mathbb{R}^3\rightarrow\mathbb{R}^3$. Inserting this in Faraday’s law \eqref{faraday} then gives
\begin{gather} \label{electric_potential} \vector{E} = -\vector{\nabla}\varphi - \frac{\partial\vector{A}}{\partial t} \end{gather}
for some smooth function $\varphi:\mathbb{R}^3\rightarrow\mathbb{R}$. Now, due to the properties of the $\vector{\nabla}$-operator, the choice of potentials $(\varphi,\vector{A})$ is not unique. For every smooth function $\chi:\mathbb{R}^3\rightarrow\mathbb{R}$, we can perform the simultaneous transformations
\[\varphi\longrightarrow\varphi-\frac{\partial\chi}{\partial t} \qquad\text{and}\qquad \vector{A}\longrightarrow\vector{A}+\vector{\nabla}\chi\]without actually altering the electric and magnetic fields. This freedom will be essential once we consider quantum effects! However, for the moment, let us just consider the electromagnetic potentials without paying too much attention to this freedom.
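As a quick check of this claim, here is a short sketch using only the definitions above. It assumes that $\chi$ is smooth (so that partial derivatives commute) and allows it to depend on time as well as on position:
\begin{align*}
\vector{B}' &= \vector{\nabla}\times(\vector{A}+\vector{\nabla}\chi) = \vector{B} + \vector{\nabla}\times\vector{\nabla}\chi = \vector{B}\,,\\
\vector{E}' &= -\vector{\nabla}\left(\varphi-\frac{\partial\chi}{\partial t}\right) - \frac{\partial}{\partial t}\left(\vector{A}+\vector{\nabla}\chi\right) = \vector{E} + \vector{\nabla}\frac{\partial\chi}{\partial t} - \frac{\partial}{\partial t}\vector{\nabla}\chi = \vector{E}\,.
\end{align*}
The first line uses that the curl of a gradient vanishes, the second that the two mixed derivatives cancel.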
In classical mechanics, most forces that we encounter are of a specific kind: they are conservative. This means that, just like the electrostatic field, they can be derived from a potential:
\[\vector{F} = -\vector{\nabla} V\]for a smooth function $V:\mathbb{R}^3\rightarrow\mathbb{R}$. We could now wonder whether the Lorentz force is also of this type. There is, however, immediately an issue. Conservative forces only depend on the position $\vector{r}$, while the Lorentz force depends on the velocity $\vector{v}\equiv\dot{\vector{r}}$ as well! Luckily, there is a beautiful solution. We can generalize the notion of conservative forces to more general settings involving higher-order derivatives of the position (so even beyond the velocity). To this end, we need to say goodbye to the Newtonian realm and let the Lagrangian formulation enter the stage.
Newton’s second law
\[\vector{F} = \frac{d\vector{p}}{dt}\]is equivalent to the following set of equations, called the Euler–Lagrange equations,
\[\frac{\partial L}{\partial q^i} = \frac{d}{dt}\frac{\partial L}{\partial\dot{q}^i}\,,\]where $L:=T-V$ is the Lagrangian (function), the difference between the kinetic energy and the potential energy, and the $q^i$ are the ‘generalized coordinates’ (these are just our preferred coordinates in which to solve the dynamics of the system). In a slightly less common form, they read
\[\frac{d}{dt}\frac{\partial T}{\partial\dot{q}^i} - \frac{\partial T}{\partial q^i} = Q_i\,,\]with
\[Q_i := \vector{F}\cdot\frac{\partial\vector{r}}{\partial q^i}\]the generalized forces (note that if we work in ordinary coordinates, i.e. $\vector{q}=\vector{r}$, then $\vector{Q}=\vector{F}$). The operator
\[\delta_{\text{EL}} := \sum_{i=1}^k\left(\frac{\partial}{\partial q^i}-\frac{d}{dt}\frac{\partial}{\partial\dot{q}^i}\right)\delta q^i\]is also called the Euler–Lagrange operator or variational derivative. It follows that any generalized force of the form
\[Q_i = \frac{d}{dt}\frac{\partial U}{\partial\dot{q}^i} - \frac{\partial U}{\partial q^i}\,,\]i.e. minus the $\delta q^i$-component of $\delta_{\text{EL}}U$, for a generalized potential $U\equiv U(\vector{r},\vector{v},t)$, will again lead to valid Euler–Lagrange equations, now with Lagrangian $L=T-U$. The fun part is now to check that the Lorentz force is exactly of this form, with
\[U(\vector{r},\vector{v},t) = q\bigl(\varphi(\vector{r},t) - \vector{v}\cdot\vector{A}(\vector{r},t)\bigr)\,.\](This will be left as an exercise for the reader 😉.) Now, to go from the Lagrangian framework to the Hamiltonian framework, which is preferred for quantum mechanics, one needs to derive (conjugate or canonical) momenta from the Lagrangian:
\[p_i := \frac{\partial L}{\partial\dot{q}^i}\,.\]For ordinary potentials, the only contribution comes from the kinetic energy and we would obtain the classical relation
\[\vector{p} = m\vector{v}\,.\]However, since the Lorentz potential is velocity dependent, an additional term is obtained:
\[p_i = m\dot{q}^i + qA_i\,.\]It follows that the canonical momenta are not equal to the kinetic momenta!
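To see where this leads, here is a short sketch of the Legendre transform for a single particle in Cartesian coordinates. It assumes the Lagrangian $L=\tfrac{1}{2}m\vector{v}\cdot\vector{v}-q\varphi+q\vector{v}\cdot\vector{A}$, i.e. the kinetic energy minus the velocity-dependent potential of the Lorentz force:
\begin{align*}
\vector{p} &= m\vector{v} + q\vector{A} \quad\Longrightarrow\quad \vector{v} = \frac{\vector{p}-q\vector{A}}{m}\,,\\
H &= \vector{p}\cdot\vector{v} - L = \frac{1}{2}m\vector{v}\cdot\vector{v} + q\varphi = \frac{(\vector{p}-q\vector{A})^2}{2m} + q\varphi\,.
\end{align*}
The vector potential drops out of the energy when it is written in terms of the velocity, but it reappears as soon as the Hamiltonian is expressed through the canonical momentum.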
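Before quantizing electromagnetism, we will first consider what happens when we introduce magnetic charges. In this case, we have to modify Maxwell’s equations in the following way:
\begin{align*}
\vector{\nabla}\cdot\vector{E} &= \rho_e\,,\\
\vector{\nabla}\cdot\vector{B} &= \textcolor{red}{\rho_m}\,,\\
\vector{\nabla}\times\vector{E} &= -\frac{\partial\vector{B}}{\partial t} \textcolor{red}{{}- \vector{J}_m}\,,\\
\vector{\nabla}\times\vector{B} &= \frac{\partial\vector{E}}{\partial t} + \vector{J}_e\,,
\end{align*}
where the subscripts $e$ and $m$ distinguish the electric and magnetic charge and current densities.
Moreover, the Lorentz force also has to be generalized:
\[\vector{F} = q(\vector{E}+\vector{v}\times\vector{B}) \textcolor{red}{+ g(\vector{B}-\vector{v}\times\vector{E})}\,,\]where $g$ is the magnetic (monopole) charge. Note that, with this formulation, these equations are completely symmetric under the following transformations:
\[(\vector{E},\vector{B})\longrightarrow(\vector{B},-\vector{E})\,,\qquad (q,g)\longrightarrow(g,-q)\,,\qquad (\rho_e,\rho_m)\longrightarrow(\rho_m,-\rho_e)\,,\qquad (\vector{J}_e,\vector{J}_m)\longrightarrow(\vector{J}_m,-\vector{J}_e)\,.\]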
To get a cleaner presentation, it is helpful to introduce a new notation motivated by special relativity. In spacetime (whether it be ordinary Minkowski space $\mathbb{R}^{1+3}$ or some spacetime manifold $M$), we have one temporal component and three spatial components. Instead of treating these separately, we will combine the 4 coordinates into what is called a 4-vector:
\[x^\mu:=(t,\vector{r})\equiv(t,x^1,x^2,x^3)\,.\]In a similar way, we can also define the 4-momentum and 4-gradient:
\[p^\mu := (E,p^1,p^2,p^3) \qquad\text{and}\qquad \partial_\mu:=(\partial_t,\vector{\nabla})\equiv(\partial_t,\partial_1,\partial_2,\partial_3)\,.\](Here, $E$ denotes the energy, not the electric field.) Note the position of the index $\mu$. For the coordinate vector it is a superscript, whereas for the derivative it is a subscript! The reason why in classical mechanics we usually do not pay attention to this detail is that all spatial coordinates are equivalent. However, when including time, the situation changes and indices are raised and lowered using a metric (tensor) $\eta_{\mu\nu}$. For Minkowski spacetime, the metric is given by $\eta\equiv\mathrm{diag}(+1,-1,-1,-1)$ in the mostly-minus$^2$ convention. For example, consider the 4-vector $\xi^\mu\equiv(\xi^0,\xi^1,\xi^2,\xi^3)$. When the indices are superscripts, it is called a contravariant vector. Lowering the indices, i.e. turning it into a covariant vector, is done as follows (with some abuse of notation where we treat row and column vectors as equivalent):
\begin{align*}
\xi_\mu &= \eta_{\mu\nu}\xi^\nu\\
&= \begin{pmatrix}
+1&0&0&0\\
0&-1&0&0\\
0&0&-1&0\\
0&0&0&-1
\end{pmatrix}
\begin{pmatrix}
\xi^0\\
\xi^1\\
\xi^2\\
\xi^3\\
\end{pmatrix}\\
&= (\xi^0,-\xi^1,-\xi^2,-\xi^3)\,.
\end{align*}
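As a small illustration of why this bookkeeping matters, contracting a 4-vector with its lowered version gives a quantity that is the same in all inertial frames:
\[\xi_\mu\xi^\mu = \eta_{\mu\nu}\xi^\mu\xi^\nu = (\xi^0)^2-(\xi^1)^2-(\xi^2)^2-(\xi^3)^2\,.\]For the coordinate 4-vector this is the familiar invariant interval $x_\mu x^\mu=t^2-\vector{r}\cdot\vector{r}$. Raising the index of the 4-gradient works the same way (the inverse metric has the same entries), which gives $\partial^\mu=(\partial_t,-\vector{\nabla})$.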
Now, let us introduce the electromagnetic 4-potential $A^\mu:=(\varphi,\vector{A})$. This lets us introduce a 2-index object: the field strength. (The following can again be seen as an exercise for the reader.)
\begin{gather}
\label{field_strength}
F^{\mu\nu} := \partial^\mu A^\nu - \partial^\nu A^\mu \equiv
\begin{pmatrix}
0&-E_1&-E_2&-E_3\\
E_1&0&-B_3&B_2\\
E_2&B_3&0&-B_1\\
E_3&-B_2&B_1&0
\end{pmatrix}
\end{gather}
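For readers who want to check a few entries, here is a sketch of the computation. It assumes $\partial^\mu=(\partial_t,-\vector{\nabla})$, i.e. the 4-gradient with its index raised by the mostly-minus metric, and writes $A^i$, $E_i$ and $B_i$ for the Cartesian components of $\vector{A}$, $\vector{E}$ and $\vector{B}$:
\begin{align*}
F^{0i} &= \partial^0A^i - \partial^iA^0 = \frac{\partial A^i}{\partial t} + \partial_i\varphi = -E_i\,,\\
F^{12} &= \partial^1A^2 - \partial^2A^1 = -(\partial_1A^2 - \partial_2A^1) = -(\vector{\nabla}\times\vector{A})_3 = -B_3\,.
\end{align*}
The remaining entries follow from antisymmetry and from cyclic permutations of the spatial indices.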
As a final piece of notation, we introduce the de Rham differential. It is a way to formalize the usual notions of infinitesimal variation and differentials:
\[\mathrm{d}f := (\partial_\mu f)\mathrm{d}x^\mu \qquad\text{and}\qquad \mathrm{d}(\theta_\nu\mathrm{d}x^\nu) := (\partial_\mu\theta_\nu)\,\mathrm{d}x^\mu\wedge\mathrm{d}x^\nu = \frac{1}{2}(\partial_\mu\theta_\nu-\partial_\nu\theta_\mu)\,\mathrm{d}x^\mu\wedge\mathrm{d}x^\nu \qquad\text{and}\qquad\cdots\,.\]The first formula says that a variation of a function $f:\mathbb{R}^n\rightarrow\mathbb{R}$ is given by varying each of the coordinates a little bit in the sense of first-order Taylor expansions. The coordinate variations $\mathrm{d}x^\mu$ are, however, now not just infinitesimal numbers; they are covectors in a well-defined way. The factor $\frac{1}{2}$ in the second formula compensates for the fact that the sum over all $\mu$ and $\nu$ counts every antisymmetric combination twice.
With these definitions, we now see that $F=\mathrm{d}A$ (at least away from magnetic charges, since $\mathrm{d}^2=0$) and, moreover, that Maxwell’s equations (with both electric and magnetic charges) are equivalent, up to convention-dependent signs, to
\[\mathrm{d}(\ast F) = \ast j\qquad\text{and}\qquad \mathrm{d}F=\ast k\]for electric and magnetic 4-currents $j$ and $k$ (viewed as 1-forms), respectively! The dual field strength $\ast F$ is defined as follows in coordinates:
\[^\ast\!F_{\mu\nu} = \frac{1}{2}\varepsilon_{\mu\nu\kappa\lambda}F^{\kappa\lambda}\]with $\varepsilon$ the totally antisymmetric Levi-Civita symbol. The Lorentz force law in this case also becomes rather symmetric:
\[m\ddot{x}^\mu = (qF^{\mu\nu}+g\ ^\ast\!F^{\mu\nu})\dot{x}_\nu\,.\]In quantum mechanics, the fundamental equation is Schrödinger’s equation (where we again use natural units with $\hbar=1$):
\begin{gather} \label{schrodinger} i\frac{\partial}{\partial t}\psi = \widehat{H}\psi\,, \end{gather}
with $\widehat{H}$ the Hamiltonian operator. In the vacuum, for free particles, this Hamiltonian only consists of a kinetic term:
\[\widehat{H}_{\text{free}} := \frac{\widehat{p}^2}{2m}\,.\]Now, a charged particle in an electric field has a potential energy equal to $V=q\varphi$, where $\varphi:\mathbb{R}^3\rightarrow\mathbb{R}$ is the scalar potential from Eq. \eqref{electric_potential}. The question then becomes what to do with the vector potential. Does it also play a role? Well, as noted above, the canonical momenta obtain a contribution proportional to $\vector{A}$. So, to retrieve the kinetic momenta, we simply have to subtract this contribution:
\[\widehat{H}_{\text{EM}} = \frac{(\widehat{p}-q\vector{A})^2}{2m}+q\varphi\,.\]This substitution $\widehat{p}\longrightarrow\widehat{p}-q\vector{A}$ is called minimal coupling. Under (canonical) quantization, the canonical momentum $\widehat{p}_i$ is replaced with the differential operator $-i\partial_i$. By analogy, when incorporating an interaction such as electromagnetism, one introduces the covariant derivative:$^3$
\begin{gather} \label{spatial_covariant_derivative} \nabla_i := \partial_i - iqA_i\,. \end{gather}
This way, the electromagnetically coupled Schrödinger equation becomes
\[i\frac{\partial}{\partial t}\psi = \left(-\frac{\nabla^2}{2m}+q\varphi\right)\psi\,.\]In fact, if one works with 4-vectors as in the previous section, the electromagnetically coupled Schrödinger equation is simply the free one written in ‘covariant’ form. To this end, the 4-dimensional covariant derivative, which includes a temporal component, is defined as follows:$^4$
\begin{gather} \label{covariant_derivative} \nabla_\mu := \partial_\mu + iqA_\mu\,. \end{gather}
With these conventions, we can see that the ‘covariant’ Schrödinger equation
\begin{gather} \label{covariant_schrodinger} i\nabla_t\psi = -\frac{\nabla^2}{2m}\psi \end{gather}
captures both the free-particle case when $\nabla_\mu=\partial_\mu$ and the electromagnetic case when Eq. \eqref{covariant_derivative} holds!
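One way to see why this derivative deserves the name ‘covariant’ is the following sketch. It assumes that the wave function picks up a local phase, $\psi\longrightarrow e^{iq\chi}\psi$, under the gauge transformation $\varphi\longrightarrow\varphi-\partial_t\chi$ and $\vector{A}\longrightarrow\vector{A}+\vector{\nabla}\chi$, and it writes $A_i$ for the Cartesian components of $\vector{A}$ as in Eq. \eqref{spatial_covariant_derivative}. Denoting the derivative built from the transformed potentials by a prime, we find
\begin{align*}
\nabla_i'\left(e^{iq\chi}\psi\right) &= \left(\partial_i - iqA_i - iq\,\partial_i\chi\right)\left(e^{iq\chi}\psi\right) = e^{iq\chi}\left(\partial_i + iq\,\partial_i\chi - iqA_i - iq\,\partial_i\chi\right)\psi = e^{iq\chi}\nabla_i\psi\,,\\
\nabla_t'\left(e^{iq\chi}\psi\right) &= \left(\partial_t + iq\varphi - iq\,\partial_t\chi\right)\left(e^{iq\chi}\psi\right) = e^{iq\chi}\left(\partial_t + iq\,\partial_t\chi + iq\varphi - iq\,\partial_t\chi\right)\psi = e^{iq\chi}\nabla_t\psi\,.
\end{align*}
Every term in the covariant Schrödinger equation therefore picks up the same overall phase, so the equation keeps its form under gauge transformations.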
Now, in classical mechanics, the electromagnetic potentials are actually of no importance. They are useful for calculations, and the gauge freedom often lets us pick a gauge that further simplifies calculations, but it is only the electric and magnetic fields that are of physical importance!
In quantum mechanics, we will now see that this is not true anymore. We can detect (and people have done so) whether an electromagnetic potential is present or not, even though the electromagnetic field might be zero. Consider an infinitely long solenoid (long with respect to the electrons’ trajectories is sufficient in practice). Electromagnetism predicts that, given a uniform current in the wire, there exists a magnetic field inside the solenoid, whereas the magnetic field is zero outside the solenoid. However, this does not mean that $\vector{A}=0$ outside the solenoid, only that $\vector{A}$ is pure gauge. This means that, for every simply-connected subset $O\subset\mathbb{R}^3\backslash\mathcal{S}$ (where $\mathcal{S}\cong\mathbb{R}$ denotes the solenoid), there exists some smooth function $\chi:O\rightarrow\mathbb{R}$ such that $\vector{A}|_O=\vector{\nabla}\chi$. Note that $\mathbb{R}^3$ with the solenoid removed is not simply connected and, hence, the gauge function $\chi$ is not defined everywhere. Instead, space has to be covered by at least two patches.
Consider an electron travelling from one side of the solenoid to the other. This can happen in two ways, as shown in Figure 1: either along the northern side (path $\gamma_1$) or along the southern side (path $\gamma_2$). The total wave function of the electron will be a superposition of these two contributions. Now, as we saw in the first section on electromagnetism, the vector potential is only defined up to a gradient. In other words, potentials that are pure gauge can be transformed away without altering the physics. In Figure 2 below, two patches covering the outside of the solenoid are shown, together with the associated gauge functions generating the electromagnetic potential.
Transforming the potential away simply means turning the electromagnetically coupled Schrödinger equation \eqref{covariant_schrodinger} into the free Schrödinger equation \eqref{schrodinger}. However, to make the Schrödinger equation fully gauge-invariant, we actually need a final piece. The wave function itself will also transform. In fact, it simply obtains a phase factor since the physics itself should remain invariant:
\[\psi(\vector{r},t)\longrightarrow\exp\left(iq\int_\gamma\vector{A}\cdot d\vector{l}\right)\psi(\vector{r},t)\,.\]Another interpretation, where the potential is not explicitly gauged away, is that of parallel transport. As the particle moves through space along the path $\gamma$, it picks up a phase $\exp\left(iq\int_\gamma\vector{A}\cdot d\vector{l}\right)$. At the point where the two paths meet again, the total wave function is given by
\begin{align*}
\Psi(\vector{r},t) &= \psi_{1,A}(\vector{r},t) + \psi_{2,A}(\vector{r},t)\\
&= \exp\left(iq\int_{\gamma_1}\vector{A}\cdot d\vector{l}\right)\psi_1(\vector{r},t) + \exp\left(iq\int_{\gamma_2}\vector{A}\cdot d\vector{l}\right)\psi_2(\vector{r},t)\\
&= \exp\left(iq\int_{\gamma_2}\vector{A}\cdot d\vector{l}\right)\left[\textcolor{darkgreen}{\exp\left(iq\oint\vector{A}\cdot d\vector{l}\right)}\psi_1(\vector{r},t)+\psi_2(\vector{r},t)\right]\\
&= \exp\left(iq\int_{\gamma_2}\vector{A}\cdot d\vector{l}\right)\left[\textcolor{darkgreen}{\exp\left(iq\Phi\right)}\psi_1(\vector{r},t)+\psi_2(\vector{r},t)\right]\,,
\end{align*}
where $\Phi$ is the magnetic flux through the solenoid (here, we made use of the Kelvin–Stokes theorem). It follows that the interference pattern will depend on the flux inside the solenoid, even though the particle itself does not move through there!
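To make the dependence on the flux explicit, here is a sketch of the resulting intensity. It assumes that the relative phase factor between the two paths is $e^{iq\Phi}$ (with $\hbar=1$) and it writes $\psi_1\overline{\psi_2}=|\psi_1||\psi_2|e^{i\delta}$, where the angle $\delta$ is a symbol introduced only for this illustration:
\begin{align*}
|\Psi|^2 &= \left|e^{iq\Phi}\psi_1+\psi_2\right|^2\\
&= |\psi_1|^2 + |\psi_2|^2 + 2\,\mathrm{Re}\left(e^{iq\Phi}\psi_1\overline{\psi_2}\right)\\
&= |\psi_1|^2 + |\psi_2|^2 + 2|\psi_1||\psi_2|\cos(\delta+q\Phi)\,.
\end{align*}
The interference fringes are thus shifted by $q\Phi$, and the pattern is unchanged whenever $q\Phi$ is an integer multiple of $2\pi$.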
Now, in the spirit of electric-magnetic duality, there exists another effect for electric fields and magnetic ‘charges’. In this case, a magnetic dipole (such as a neutral atom) moves around a charged wire. It is a standard result in electromagnetism that a magnetic dipole moment $\vector{\boldsymbol{\mu}}$ couples to a magnetic field through an interaction term of the form $-\vector{\boldsymbol{\mu}}\cdot\vector{B}$. So why do we then get an effect in an electric field $\vector{E}$? The reason is the same as for the spin-orbit interaction in atomic physics.
In special relativity, it is well known that transforming between two reference frames that are in motion with respect to each other mixes up the temporal and spatial components of 4-vectors. These are Lorentz transformations. Now, since the electric and magnetic fields are merely components of the relativistic field strength $F_{\mu\nu}$ as in Eq. \eqref{field_strength}, they will also get mixed up. The transformations for these fields are as follows:
\begin{align*}
\vector{E}'_\parallel &= \vector{E}_\parallel\\
\vector{B}'_\parallel &= \vector{B}_\parallel\\
\vector{E}'_\perp &= \gamma(\vector{E}_\perp+\vector{v}\times\vector{B})\\
\vector{B}'_\perp &= \gamma(\vector{B}_\perp-\vector{v}\times\vector{E})
\end{align*}
where
\[\gamma:=\frac{1}{\sqrt{1-v^2/c^2}}\]is the Lorentz factor. In our case, there is only an initial electric field, so $\vector{B}=0$ and, hence, $\vector{B}' = -\gamma\vector{v}\times\vector{E}$ when we transform to the rest frame of the neutral particle. For $v\ll c$, which is also necessary for our considerations (where we silently make use of the adiabatic theorem), the Lorentz factor is approximately equal to 1, so $\vector{B}'\approx-\vector{v}\times\vector{E}$. The total interaction term is then given by
\[\widehat{H}_{\text{int}}=\vec{\boldsymbol{\mu}}\cdot(\vector{v}\times\vector{E})\,.\]Note that the triple product is cyclic and, hence, $\widehat{H}_{\text{int}}=-\vector{v}\cdot(\vec{\boldsymbol{\mu}}\times\vector{E})$. This is of the same form as (part of) the electromagnetic potential energy, i.e. we have found an effective electromagnetic potential $\vector{A}_{\text{eff}}=\vec{\boldsymbol{\mu}}\times\vector{E}$. From here on, the Aharonov–Casher effect proceeds as the Aharonov–Bohm effect: the phase difference between two paths is equal to $\oint\vector{A}_{\text{eff}}\cdot d\vector{l}$.
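As an illustration, here is a sketch under some simplifying assumptions: the wire lies along the $z$-axis and carries a uniform linear charge density $\lambda$, the dipole moment points along the wire, $\vec{\boldsymbol{\mu}}=\mu\,\vector{e}_z$, and the closed path encircles the wire once counterclockwise. The symbols $\lambda$, $\mu$ and the cylindrical coordinates $(r,\theta,z)$ with unit vectors $\vector{e}_r,\vector{e}_\theta,\vector{e}_z$ are introduced only for this example. With $\varepsilon_0=1$, Gauss’s law gives $\vector{E}=\frac{\lambda}{2\pi r}\vector{e}_r$, so that
\begin{align*}
\vector{A}_{\text{eff}} = \vec{\boldsymbol{\mu}}\times\vector{E} &= \frac{\mu\lambda}{2\pi r}\,\vector{e}_z\times\vector{e}_r = \frac{\mu\lambda}{2\pi r}\,\vector{e}_\theta\,,\\
\oint\vector{A}_{\text{eff}}\cdot d\vector{l} &= \int_0^{2\pi}\frac{\mu\lambda}{2\pi r}\,r\,d\theta = \mu\lambda\,.
\end{align*}
The phase depends only on the charge per unit length that the path encloses and not on the shape of the path, just as the Aharonov–Bohm phase depends only on the enclosed flux.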
General electric-magnetic duality (Montonen–Olive duality) will be discussed here in the future.
- J. M. Figueroa-O’Farrill (1998). Electromagnetic duality for children. https://www.maths.ed.ac.uk/~jmf/Teaching/Lectures/EDC.pdf.
1. Sometimes also called the Maxwell–Heaviside equations because it was Heaviside who put them in this more elegant form!
2. Two (equivalent) conventions exist: mostly-plus and mostly-minus. The latter is mainly used by particle physicists, whereas the former is mainly used in (general) relativity.
3. The covariant derivative $\nabla$ will not be written with a vector arrow to distinguish it from the nabla $\vector{\nabla}$.
4. Note the sign flip with respect to Eq. \eqref{spatial_covariant_derivative} due to the minuses in the Minkowski metric!