1 Introduction

The study of higher-spin theories is motivated by both practical and theoretical questions. Massive higher-spin particles play a phenomenological role in describing composite states (such as those occurring in nuclear resonances of QCD) and are a crucial part of the spectrum of string theories. In the massless case, constructing higher-spin gravitational theories is important for understanding the landscape of possible theories of quantum gravity and has important implications ranging from holography to conformal bootstrap (see [1,2,3,4,5,6] for reviews). But on a purely theoretical level, one can view higher-spin theories as a playground to explore what is—and is not—possible in the general frameworks of classical and quantum field theory.

Over the last 50 years, it has become clear that this is a theoretical playground with many rules. The possible array of higher-spin theories is tightly constrained by many no-go theorems, both for asymptotically flat spacetimes (e.g., [7,8,9]) and asymptotically (A)dS spacetimes (e.g., [10,11,12]); see [3, 4, 13, 14] for overviews. In attempting to evade these no-go theorems, there is no free lunch: providing an explicit construction of a higher-spin theory (even at the classical level) typically requires abandoning at least one feature that we usually think of as desirable for a physically interesting theory, such as locality, unitarity, or manifest covariance. To date, most known examples of local, massless higher-spin theories are either (quasi-)topological [15,16,17,18,19,20,21,22,23,24,25,26,27,28,29] or higher-spin extensions of conformal gravity with higher-derivative equations of motion [30,31,32]. Moreover, those higher-spin theories which are defined in flat space turn out to have trivial S-matrices due to the severe constraints imposed on the interactions by the infinite-dimensional higher-spin symmetry (cf., [28, 33, 34]; see also footnote 1). Depending on the audience, the triviality of higher-spin scattering can either be an intriguing feature of higher-spin theories or a compelling reason to ignore them. In any case, it is an interesting question to ask: Is there a higher-spin theory in flat space with non-trivial scattering amplitudes, and if so what properties must it possess in order to avoid the many familiar constraints and no-go theorems?

In this paper, we study a chiral, higher-spin version of Yang–Mills theory in four-dimensional flat spacetime which has non-trivial tree-level scattering amplitudes. This theory has been partially constructed before in the literature [37,38,39], and our work builds on these previous investigations. For brevity, we refer to this theory as higher-spin Yang–Mills (HS-YM): it has many non-standard properties, which allow it to evade the net of no-go results on higher-spin theories with non-trivial S-matrices. In particular, the spin-s gauge potentials of the theory live in certain “un-balanced” spin-s representations of the Lorentz group, which we refer to as chiral representations for every spin \(s>1\), making the resulting gauge potentials intrinsically chiral. There are two on-shell degrees of freedom (i.e., positive and negative helicity) at each spin in this theory, but only one gauge-invariant field strength, from which the Lagrangian is constructed. The built-in chiral representations mean that the fields, Lagrangian and action functional of the theory are complex-valued in real, Lorentzian Minkowski spacetime.

Having a complex action means that the theory is non-unitary, and the chiral representations break parity invariance. On the one hand, this means that HS-YM fails to have basic properties usually required of physical theories. But, on the other hand, this ensures that HS-YM falls outside the assumptions of practically every no-go theorem constraining higher-spin interactions and scattering amplitudes. The chirality of the theory and higher-spin symmetries mean that in exchange processes, only spin-1 positive helicity particles can contribute (this is related to gauge invariance of the scattering amplitudes), although negative helicity particles can have arbitrary spin. Furthermore, its lack of unitarity and parity are fairly mild: self-dual Yang–Mills, self-dual gravity and conformal fishnet theory are also non-unitary with complex Lagrangians, but nevertheless encode a rich array of physically relevant information (cf., [40,41,42,43,44,45,46,47,48,49] and [50, 51]). In any case, at tree-level one is always free to consider the theory defined in complexified or alternative spacetime signatures (such as Euclidean or (2, 2)-signature), where the notion of complex fields and Lagrangians is less problematic.

In practical terms, we show that HS-YM has non-trivial scattering amplitudes by explicitly calculating the tree-level four-point amplitude using the Feynman rules of the spacetime action. When there are two positive helicity and two negative helicity external states, we find a non-vanishing amplitude, with the spins of the negative helicity particles identical but otherwise arbitrary.

In [38], a self-dual subsector of HS-YM was defined, and it is straightforward to show that the full theory admits a perturbative (i.e., small coupling) expansion around this subsector. We give a description of self-dual HS-YM in terms of twistor theory [52, 53], showing that it is classically integrable. A spacetime manifestation of this is the fact that self-dual HS-YM can be described by an infinite tower of massless, adjoint-valued scalar fields with cubic interactions; this is a higher-spin generalization of the Chalmers-Siegel description of self-dual Yang–Mills theory [54], which has already been written down as a contraction of chiral higher-spin gravity [37].

These facts have several important consequences. Firstly, they mean that HS-YM can be perturbatively expanded around a classically integrable subsector where we have vanishing tree amplitudes. Theories with such a structure can often be described in terms of twistor actions (see footnote 2): classical reformulations of the theory on twistor space with enhanced gauge invariance, which are a powerful tool for computing scattering amplitudes, and HS-YM is no exception. Furthermore, recent results on covariantizing chiral higher-spin theories [19, 20, 28, 37, 68] using twistor-inspired methods and free differential algebras [36, 38, 69,70,71] hint that twistor theory is an ideal framework for constructing local higher-spin theories like HS-YM.

For scattering amplitudes in a helicity grading, the maximal helicity violating (MHV) configuration, with two negative helicity and arbitrarily many positive helicity external states, represents the first non-trivial perturbation away from self-duality and can be computed to all multiplicity directly from the twistor action of HS-YM. Remarkably, this leads to a compact formula for the n-point, color-ordered tree-level MHV amplitude of HS-YM written in spinor-helicity variables:

$$\begin{aligned} {\mathcal {A}}_{n}^{\textrm{MHV}}=\frac{\textrm{g}^{n-2}}{2}\,\delta ^{4}\!\left( \sum _{a=1}^{n}k_a\right) \,\frac{\langle i\,j\rangle ^{2s+2}}{\langle 1\,2\rangle \,\langle 2\,3\rangle \cdots \langle n\,1\rangle }, \end{aligned}$$
(1.1)

where \(\textrm{g}\) is the (dimensionless) coupling constant of the theory; \(k^{\mu }_a\) is the on-shell 4-momentum of the \(a^{\textrm{th}}\) external particle; and the negative helicity particles \(i,j\) have integer spin \(s\ge 1\) while all others have positive helicity and spin-1. While this formula can be guessed from \(n=3,4\) explicit calculations and checked using BCFW recursion [72], the twistor action provides a first-principles derivation of the MHV amplitude.
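As a quick sanity check on (1.1), the kinematic factor can be evaluated numerically and its little-group scaling verified. The following is a minimal Python sketch and not part of the derivation: the function names, the sign convention chosen for the bracket, and the random test kinematics are illustrative assumptions, and the momentum-conserving delta function is omitted.

```python
import random

def angle(a, b):
    """Skew-symmetric bracket <a b> of two 2-component spinors (overall sign is an assumed convention)."""
    return a[0] * b[1] - a[1] * b[0]

def mhv_kinematic_factor(kappas, i, j, s, g=1.0):
    """Evaluate (1.1) without the delta function.

    kappas : list of n spinors (pairs of complex numbers) in colour order
    i, j   : 0-based positions of the two negative helicity, spin-s legs
    """
    n = len(kappas)
    denominator = 1.0
    for a in range(n):
        denominator *= angle(kappas[a], kappas[(a + 1) % n])
    return 0.5 * g ** (n - 2) * angle(kappas[i], kappas[j]) ** (2 * s + 2) / denominator

def random_spinor(rng):
    return (complex(rng.uniform(-1, 1), rng.uniform(-1, 1)),
            complex(rng.uniform(-1, 1), rng.uniform(-1, 1)))

rng = random.Random(0)
n, s, i, j = 6, 3, 1, 4          # six legs; legs 1 and 4 carry negative helicity and spin 3
kappas = [random_spinor(rng) for _ in range(n)]
t = 1.3 + 0.4j                   # little-group rescaling parameter

def rescale(leg):
    """Return a copy of the kinematics with the spinor of one leg multiplied by t."""
    scaled = list(kappas)
    scaled[leg] = (t * kappas[leg][0], t * kappas[leg][1])
    return scaled

base = mhv_kinematic_factor(kappas, i, j, s)
ratio_negative = mhv_kinematic_factor(rescale(i), i, j, s) / base   # negative helicity leg
ratio_positive = mhv_kinematic_factor(rescale(0), i, j, s) / base   # positive helicity leg
```

Rescaling the spinor of a negative helicity leg by \(t\) multiplies the result by \(t^{2s}\), while rescaling a positive helicity leg gives \(t^{-2}\), which is the behaviour expected for helicities \(-s\) and \(+1\), respectively.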

Besides the explicit construction of HS-YM on spacetime and calculation of tree-level scattering amplitudes, our main results can be summarized as follows:

  • Theorem 1: There is a one-to-one correspondence between solutions of the self-dual HS-YM equations and certain holomorphic vector bundles on twistor space, implying the classical integrability of the self-dual sector.

  • Theorem 2: The self-dual sector of HS-YM is described on spacetime by the action

    $$\begin{aligned} \frac{1}{2}\,\sum _{s=1}^{\infty }\,\int \textrm{tr}\left( \textrm{d}\phi ^{(s)}\wedge *\textrm{d}\phi ^{(s)}\right) +\frac{1}{3}\sum _{s=1}^{\infty }\,\int \mu _{a,a}\wedge \textrm{tr}\left( \phi ^{(s)}\,\sum _{r+t=s+1}\textrm{d}\phi ^{(r)}\wedge \textrm{d}\phi ^{(t)}\right) , \end{aligned}$$
    (1.2)

    where \(\{\phi ^{(s)}\}_{s=1,\ldots ,\infty }\) are scalar functions valued in the adjoint representation of the gauge group and \(\mu _{a,a}:=a_{\alpha }\,a_{\beta }\,\textrm{d}x^{\alpha {{\dot{\alpha }}}}\wedge \textrm{d}x^{\beta }{}_{{{\dot{\alpha }}}}\) for \(a_{\alpha }\) some constant spinor.

  • Theorem 3: The classical action of full HS-YM theory is equivalent to an action functional on twistor space which has a local piece corresponding to the self-dual sector and a non-local piece encoding non-self-dual interactions.

The paper is structured as follows. Section 2 provides a definition of the spacetime theory for HS-YM and analyses higher-spin propagating degrees of freedom in the chiral representation. Section 3 computes the 3- and 4-point scattering amplitudes of HS-YM using Feynman rules before presenting the n-point formula (1.1) for tree-level MHV scattering. Section 4 investigates the properties of self-dual HS-YM using twistor theory, establishing classical integrability of the self-dual sector and providing descriptions of it both on twistor space and spacetime. In Sect. 5, we give a twistor action description of full HS-YM and use it to derive our formula for the tree-level MHV amplitudes. Section 6 concludes, and “Appendix A” provides a check on the MHV formula using BCFW recursion.

Notation: Throughout, we denote SL\((2,{\mathbb {C}})\) spinor indices of negative chirality by \(\alpha ,\beta ,\ldots \) \(=0,1\) and SL\((2,{\mathbb {C}})\) spinor indices of positive chirality by \({\dot{\alpha }},{\dot{\beta }},\ldots =\dot{0},\dot{1}\). Spinor indices are raised and lowered using the two-dimensional Levi-Civita symbols:

$$\begin{aligned} b^{\alpha }=\epsilon ^{\alpha \beta }\,b_{\beta }, \qquad b_{\alpha }=b^{\gamma }\,\epsilon _{\gamma \alpha }, \qquad \epsilon ^{\alpha \beta }\,\epsilon _{\alpha \gamma }=\delta ^{\beta }_{\gamma }, \end{aligned}$$
(1.3)

and likewise for dotted indices. We often make use of the spinor helicity notation for SL\((2,{\mathbb {C}})\)-invariant contractions of spinors:

$$\begin{aligned} \langle a\,b\rangle :=a^{\alpha }\,b_{\alpha }, \qquad [c\,d]:=c^{{{\dot{\alpha }}}}\,d_{{{\dot{\alpha }}}}, \end{aligned}$$
(1.4)

where these inner products are skew-symmetric. Totally symmetric combinations of spinor indices will be denoted by \((\alpha _1\cdots \alpha _k)\equiv \alpha (k)\), \(({{\dot{\alpha }}}_1\cdots {{\dot{\alpha }}}_k)\equiv {{\dot{\alpha }}}(k)\), where symmetrization is always assumed to come with a prefactor of \(\frac{1}{k!}\).
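The index conventions above are easy to get wrong by a sign, so a small numerical check can be useful. The sketch below is illustrative only: it reads (1.3) as \(b_{\alpha }=b^{\gamma }\epsilon _{\gamma \alpha }\) with \(\epsilon ^{\alpha \beta }\epsilon _{\alpha \gamma }=\delta ^{\beta }_{\gamma }\), and it assumes the numerical choice \(\epsilon _{01}=+1\), which is not fixed by the text; the helper names are hypothetical.

```python
# Assumed numerical convention: epsilon_{01} = +1; (1.3) then forces epsilon^{01} = +1 as well.
EPS_LOW = ((0, 1), (-1, 0))   # epsilon_{alpha beta}
EPS_UP = ((0, 1), (-1, 0))    # epsilon^{alpha beta}

def lower(b_up):
    """b_alpha = b^gamma epsilon_{gamma alpha}."""
    return tuple(sum(b_up[c] * EPS_LOW[c][a] for c in range(2)) for a in range(2))

def raise_index(b_low):
    """b^alpha = epsilon^{alpha beta} b_beta."""
    return tuple(sum(EPS_UP[a][b] * b_low[b] for b in range(2)) for a in range(2))

def angle(a_up, b_up):
    """<a b> := a^alpha b_alpha, as in (1.4)."""
    b_low = lower(b_up)
    return sum(a_up[k] * b_low[k] for k in range(2))

def epsilon_contraction(beta, gamma):
    """epsilon^{alpha beta} epsilon_{alpha gamma}, summed over alpha."""
    return sum(EPS_UP[alpha][beta] * EPS_LOW[alpha][gamma] for alpha in range(2))

a = (1 + 2j, -0.5 + 1j)
b = (0.3 - 1j, 2 + 0.25j)
round_trip = raise_index(lower(b))   # should reproduce b
```

With these choices, lowering and then raising an index returns the original spinor, and the bracket is skew-symmetric, so that \(\langle a\,a\rangle =0\).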

Note added: While this paper was being prepared, we became aware of [73], which gives an interesting alternative construction of the self-dual sector of HS-YM in terms of non-projective twistor space.

2 The space-time theory

In four dimensions, a spin-s gauge field is usually thought of as a totally symmetric rank-s tensor [74]; exploiting the local isomorphism between the Lorentz group and SL\((2,{\mathbb {C}})\), this is equivalent to representing the spin-s gauge field by an object with s un-dotted/negative chirality SL\((2,{\mathbb {C}})\) spinor indices and s dotted/positive chirality spinor indices. However, there are also “un-balanced” spin-s representations of the gauge field, which have 2s total but unequal numbers of negative/positive chirality spinor indices. The price to be paid by working with such un-balanced representations is that they are not Lorentzian-real, as complex conjugation interchanges the spinor representations in Lorentzian signature, but in complexified spacetime or in Euclidean or split signature they are perfectly well defined (Fig. 1).

Fig. 1: The standard Fronsdal representation (red) versus the chiral representation (blue) of a spin-s gauge potential

Following [38], we will be interested in the un-balanced representation which has \(2s-1\) un-dotted spinor indices and a single dotted spinor index for each integer spin \(s\ge 1\). We refer to this as the chiral representation, although it has also been called by other names (“maximally un-balanced” or “twistor” representations) in the literature. We construct a theory whose field content is a higher-spin generalization of the Yang–Mills gauge potential in the chiral representation:

$$\begin{aligned} \left\{ A_{\alpha {{\dot{\alpha }}}}(x),\, A_{(\alpha \beta \gamma ){{\dot{\alpha }}}}(x),\,\ldots \right\} =\bigcup _{s=1}^{\infty }\left\{ A_{\alpha (2s-1){{\dot{\alpha }}}}\right\} , \end{aligned}$$
(2.1)

where each of these potentials is valued in the adjoint of some simple Lie algebra \({\mathfrak {g}}\). As the notation suggests, for each value of s the associated potential is totally symmetric in its negative chirality spinor indices, and since it is only these un-dotted spinor indices which proliferate when \(s>1\), the theory is intrinsically chiral.

In this section, we review the basic classical structure of this higher-spin Yang–Mills (HS-YM) theory on space-time, including its field content, gauge symmetries and degrees of freedom. We note that aspects of this theory have appeared before in the literature: an action for the self-dual sector was given in [38], and some features of the full theory were identified in [39].

2.1 Fields & action

From now on, we assume that we are working either on complexified Minkowski space \({\mathbb {M}}\), or Euclidean \({\mathbb {R}}^4\) (see footnote 3). A standard method for compactly encoding higher-spin fields is to introduce an auxiliary commuting SL\((2,{\mathbb {C}})\) spinor \(y^\alpha \) (see footnote 4) and consider “master” gauge potentials which are polynomials in these auxiliary parameters/variables. We will adopt a slightly different (but completely equivalent) strategy which treats the master gauge potential not as a polynomial in \(y^{\alpha }\) but as a homogeneous section of a bundle over \({\mathbb {M}}\).

To do this, we introduce an auxiliary commuting SL\((2,{\mathbb {C}})\) spinor \(\lambda ^{\alpha }\) which is considered only up to projective rescalings: that is, we identify \(\lambda ^{\alpha }\sim r\,\lambda ^{\alpha }\) for any \(r\in {\mathbb {C}}^*\). This is equivalent to viewing the projective equivalence class \([\lambda ^\alpha ]\) as homogeneous coordinates on the Riemann sphere \({\mathbb {P}}^1\). In this projective framework, a generic polynomial in \(\lambda ^\alpha \) is not well defined (as it has no fixed weight under the projective scaling). Therefore, we will require the master gauge field to be a section of the infinite jet bundle of the holomorphic tangent bundle of the Riemann sphere, \(J^{\infty }(T{\mathbb {P}}^1)\), which is homogeneous of degree zero under projective rescalings. We will abuse notation slightly by abbreviating this bundle to \(J^{\infty }_{{\mathbb {P}}^1}\), and also using this to denote the space of sections of the bundle.

In this work, we define HS-YM to be a theory of a non-abelian gauge connection

$$\begin{aligned} {\textbf{D}}_{\alpha {{\dot{\alpha }}}}=\partial _{\alpha {{\dot{\alpha }}}}+{\textbf{A}}_{\alpha {{\dot{\alpha }}}}(x|\lambda ), \end{aligned}$$
(2.2)

where the connection 1-form takes values in \(\Omega ^{1}({\mathbb {M}})\otimes {\mathfrak {g}}\otimes J^{\infty }_{{\mathbb {P}}^1}\). Explicitly, this means that \({\textbf{A}}_{\alpha {{\dot{\alpha }}}}\) has an expansion of the form

$$\begin{aligned} {\textbf{A}}_{\alpha {{\dot{\alpha }}}}(x|\lambda )=\sum _{s=1}^{\infty }{\mathcal {A}}_{\beta (2s-2)|\alpha {{\dot{\alpha }}}}(x)\,\lambda ^{\beta (2s-2)}\,\partial _0^{s-1}, \end{aligned}$$
(2.3)

where the spacetime fields \(\{{\mathcal {A}}_{\beta (2s-2)|\alpha {{\dot{\alpha }}}}\}\) are valued in some Lie algebra \({\mathfrak {g}}\) and \(\partial _0\) is the generator of the holomorphic tangent bundle of \({\mathbb {P}}^1\) (i.e., the section which trivializes the holomorphic tangent bundle). Note that \(\lambda ^{\beta (2s-2)}\) is a convenient notation for \(\lambda ^{(\beta _1}\cdots \lambda ^{\beta _{2s-2})}\). Since \(T{\mathbb {P}}^1\cong {\mathcal {O}}(2)\) as holomorphic line bundles, it follows that \(\partial _0\) has weight \(-2\) in \(\lambda ^\alpha \), and thus each term in (2.3) is homogeneous of degree zero in \(\lambda ^\alpha \), as required (see footnote 5).

In order to define the action of the connection \({\textbf{D}}_{\alpha {{\dot{\alpha }}}}\) on objects valued in \({\mathfrak {g}}\otimes J^{\infty }_{{\mathbb {P}}^1}\), we first define a (somewhat trivial) Lie bracket for sections of \({\mathfrak {g}}\otimes J^{k}_{{\mathbb {P}}^1}\) for any \(k\in {\mathbb {N}}\):

$$\begin{aligned}{}[\![f\partial _0^{a},\,g\partial _0^{b}]\!]:=[f,\,g]\,\partial _0^{a+b}, \end{aligned}$$
(2.6)

where \(f,g\) are Lie-algebra valued and \([\cdot ,\cdot ]\) is the usual Lie bracket on \({\mathfrak {g}}\). It is easy to check that (2.6) is itself a Lie bracket, and the connection acts on any adjoint-valued section \(\Phi \) of \(J^{\infty }_{{\mathbb {P}}^1}\) as

$$\begin{aligned} {\textbf{D}}_{\alpha {{\dot{\alpha }}}}\Phi :=\partial _{\alpha {{\dot{\alpha }}}}\Phi +[\![{\textbf{A}}_{\alpha {{\dot{\alpha }}}},\,\Phi ]\!]. \end{aligned}$$
(2.7)

This enables us to define a field strength associated to the higher-spin gauge connection:

$$\begin{aligned} {\textbf{F}}_{\alpha \beta }(x|\lambda ):=\frac{\epsilon ^{{{\dot{\alpha }}}{{\dot{\beta }}}}}{2}\,[\![{\textbf{D}}_{\alpha {{\dot{\alpha }}}},\,{\textbf{D}}_{\beta {{\dot{\beta }}}}]\!]\in \Omega ^2_{-}({\mathbb {M}})\otimes {\mathfrak {g}}\otimes J^{\infty }_{{\mathbb {P}}^1}, \end{aligned}$$
(2.8)

where \(\Omega ^{2}_{-}({\mathbb {M}})\) are the anti-self-dual (ASD) 2-forms on \({\mathbb {M}}\). In particular, we only consider the ASD part of the curvature associated with the partial connection.

Under non-abelian gauge transformations

$$\begin{aligned} {\textbf{A}}_{\alpha {{\dot{\alpha }}}}\rightarrow {\textbf{g}}\,{\textbf{A}}_{\alpha {{\dot{\alpha }}}}\,{\textbf{g}}^{-1}-\partial _{\alpha {{\dot{\alpha }}}}{\textbf{g}}\,{\textbf{g}}^{-1}, \end{aligned}$$
(2.9)

where \({\textbf{g}}\) is valued in \({\mathfrak {g}}\otimes J^{\infty }_{{\mathbb {P}}^1}\), the field strength transforms covariantly, \({\textbf{F}}_{\alpha \beta }\rightarrow {\textbf{g}}\,{\textbf{F}}_{\alpha \beta }\,{\textbf{g}}^{-1}\), as expected. It is easy to see that the non-ASD parts of the curvature of \({\textbf{D}}\) do not transform covariantly with respect to these gauge transformations as a result of the underlying chirality of the construction (i.e., growing higher-spin degrees of freedom in un-dotted chiral representations, but not dotted ones). Thus, we see that the construction is doubly chiral: using the chiral representation (un-dotted) means that one only obtains sensible field strength components of corresponding chirality (ASD).

As it stands, this setup contains too many higher-spin gauge potentials. To see this, simply expand the coefficients \({\mathcal {A}}_{\beta (2s-2)|\alpha {{\dot{\alpha }}}}(x)\) into un-dotted SL\((2,{\mathbb {C}})\) irreducibles:

$$\begin{aligned} {\mathcal {A}}_{\beta (2s-2)|\alpha {{\dot{\alpha }}}}(x)= A_{(\beta (2s-2)\alpha ){{\dot{\alpha }}}}(x)+\epsilon _{\alpha (\beta _1}\,\textsf{A}_{\beta (2s-3)){{\dot{\alpha }}}}(x), \quad \forall s>1. \end{aligned}$$
(2.4)

That is, for \(s\ge 2\), the master gauge field contains not one, but two non-abelian spin-s gauge potentials: \(A_{\alpha (2s-1){{\dot{\alpha }}}}\) (the desired content) and \(\textsf{A}_{\beta (2(s+1)-3){{\dot{\alpha }}}}\). It is easy to see that the superfluous field content decouples from any theory constructed from the \({\textbf{F}}_{\alpha \beta }\), however. Indeed, \(\textsf{A}_{\beta (2s-3){{\dot{\alpha }}}}\) drops out of \({\textbf{F}}_{\alpha \beta }\), and \({\textbf{F}}_{\alpha \beta }\) is left invariant by the local shift transformations

$$\begin{aligned} {\mathcal {A}}_{\beta (2s-2)|\alpha {{\dot{\alpha }}}}(x)\rightarrow {\mathcal {A}}_{\beta (2s-2)|\alpha {{\dot{\alpha }}}}(x)+\epsilon _{\alpha (\beta _1}\,\vartheta _{\beta (2s-3)){{\dot{\alpha }}}}(x), \end{aligned}$$
(2.10)

which can be used to remove all of the \(\textsf{A}_{\beta (2s-3){{\dot{\alpha }}}}\) components of the partial connection.

So without loss of generality, the master gauge potential can be taken to contain exactly the field content (2.1):

$$\begin{aligned} {\textbf{A}}_{\alpha {{\dot{\alpha }}}}(x|\lambda ):=\sum _{s=1}^{\infty } A_{\beta (2s-2)\alpha {{\dot{\alpha }}}}\,\lambda ^{\beta (2s-2)}\,\partial _0^{s-1}, \end{aligned}$$
(2.11)

as desired. This means that the component expansion of \({\textbf{F}}_{\alpha \beta }\) is always totally symmetric in its un-dotted spinor indices, so that

$$\begin{aligned} {\textbf{F}}_{\alpha \beta }(x|\lambda )=\sum _{s=1}^{\infty }F_{(\alpha \beta \gamma (2s-2))}(x)\,\lambda ^{\gamma (2s-2)}\,\partial _0^{s-1}, \end{aligned}$$
(2.12)

with curvature components at each spin \(s\ge 1\) given by:

$$\begin{aligned} F_{\alpha (2s)}:=\partial _{(\alpha _1}{}^{{{\dot{\gamma }}}}A_{\alpha (2s-1)){{\dot{\gamma }}}}+\sum _{r+t=s+1}\left[ A_{(\alpha (2r-1)}{}^{{{\dot{\gamma }}}},\,A_{\alpha (2t-1)){{\dot{\gamma }}}}\right] . \end{aligned}$$
(2.13)

Note that when \(s=1\), this story truncates to the usual spinor description of (the ASD part of) a Yang–Mills gauge field. However, for \(s>1\) the various higher-spin degrees of freedom mix with each other through the commutator terms: the gauge potential of spin \(s>1\) will generate source terms in the field strengths at \(s'>s\).
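The mixing pattern in (2.13) can be made explicit by listing which pairs of spins enter the commutator at a given level. The following is a small illustrative sketch (the function names are hypothetical and nothing here goes beyond the index bookkeeping of (2.13)): it enumerates the pairs \((r,t)\) with \(r+t=s+1\) and checks that each commutator carries exactly \(2s\) un-dotted indices.

```python
def commutator_pairs(s):
    """Spin pairs (r, t) with r + t = s + 1 and r, t >= 1 entering F_{alpha(2s)} in (2.13)."""
    return [(r, s + 1 - r) for r in range(1, s + 1)]

def undotted_index_count(r, t):
    """Un-dotted indices carried by [A_{alpha(2r-1)}, A_{alpha(2t-1)}] once the dotted pair is contracted."""
    return (2 * r - 1) + (2 * t - 1)

# Pairs contributing to the spin-1, spin-2 and spin-3 field strengths.
pairs_by_spin = {s: commutator_pairs(s) for s in (1, 2, 3)}
```

For \(s=1\) only the pair \((1,1)\) appears, which is the usual Yang–Mills commutator, whereas for \(s=3\) the pairs are \((1,3)\), \((2,2)\) and \((3,1)\): a potential of spin \(r\) never enters a field strength of spin lower than \(r\).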

Up to this point, the discussion has been purely kinematical, but we are now ready to define classical HS-YM with a spacetime action functional. For this, we require one additional structure, which is a Möbius-invariant inner product on sections of \(J^{\infty }_{{\mathbb {P}}^1}\)—this is virtually identical to the inner product on the polynomial ring \({\mathbb {C}}[y^\alpha ]\) introduced in [38]. Let

$$\begin{aligned} f=\sum _{a=0}^{\infty }f_{\alpha (2a)}\,\lambda ^{\alpha (2a)}\,\partial _0^{a}, \qquad g=\sum _{b=0}^{\infty }g_{\alpha (2b)}\,\lambda ^{\alpha (2b)}\,\partial _0^{b}, \end{aligned}$$
(2.14)

be any two sections of \(J^{\infty }_{{\mathbb {P}}^1}\). The required inner product is defined as:

$$\begin{aligned} \left\langle \cdot \,|\,\cdot \right\rangle :J^{\infty }_{{\mathbb {P}}^1}\times J^{\infty }_{{\mathbb {P}}^1}\rightarrow {\mathbb {C}}, \qquad \left\langle f|g\right\rangle :=\sum _{a=0}^{\infty }\epsilon ^{\alpha (2a)\beta (2a)}\,f_{\alpha (2a)}\,g_{\beta (2a)}. \end{aligned}$$
(2.15)

Here, \(\epsilon ^{\alpha (2a)\beta (2a)}=\epsilon ^{\alpha _{1}\beta _{1}}\cdots \epsilon ^{\alpha _{2a}\beta _{2a}}\) with symmetrization over \(\alpha \) and \(\beta \) groups of indices, respectively.

Armed with this, the HS-YM action is given by

$$\begin{aligned} S[{\textbf{A}}]=-\frac{1}{\textrm{g}^2}\int _{\mathbb {M}}\textrm{d}^{4}x\,\textrm{tr}\left\langle {\textbf{F}}_{\alpha \beta }|{\textbf{F}}^{\alpha \beta }\right\rangle = -\frac{1}{\textrm{g}^2}\,\sum _{s=1}^{\infty }\int _{{\mathbb {M}}}\textrm{d}^{4}x\,\textrm{tr}\left( F_{\alpha (2s)}\,F^{\alpha (2s)}\right) , \end{aligned}$$
(2.16)

where \(\textrm{g}\) is the dimensionless coupling constant and \(\textrm{tr}(\cdots )\) denotes the trace in \({\mathfrak {g}}\) (i.e., over the adjoint of the gauge group). Restricting to the \(s=1\) sector returns a chiral action which is perturbatively equivalent to standard Yang–Mills theory, as they differ only by a topological term [54].

There is a nice property of HS-YM which is easily observed from this classical action, namely, that it admits a perturbative expansion around the self-dual sector [38]

$$\begin{aligned} F_{\alpha (2s)}=0, \quad \text{ for } \text{ all } s=1,\ldots ,\infty . \end{aligned}$$
(2.17)

To see this, the action (2.16) can be re-written by introducing a set of higher-spin Lagrange multiplier fields

$$\begin{aligned} {\textbf{B}}_{\alpha \beta }(x|\lambda ):=\sum _{s=1}^{\infty }B_{(\gamma (2s-2)|\alpha \beta )}(x)\,\lambda ^{\gamma (2s-2)}\,\partial _0^{s-1}\in \Omega ^{2}_{-}({\mathbb {M}})\otimes {\mathfrak {g}}\otimes J^{\infty }_{{\mathbb {P}}^1}, \end{aligned}$$
(2.18)

as

$$\begin{aligned} S[{\textbf{A}},{\textbf{B}}]=\int _{{\mathbb {M}}}\textrm{d}^{4}x\,\textrm{tr}\left\langle {\textbf{B}}_{\alpha \beta }|{\textbf{F}}^{\alpha \beta }\right\rangle +\frac{\textrm{g}^2}{4}\,\int _{{\mathbb {M}}}\textrm{d}^{4}x\,\textrm{tr}\left\langle {\textbf{B}}_{\alpha \beta }|{\textbf{B}}^{\alpha \beta }\right\rangle . \end{aligned}$$
(2.19)

Note that we do not include any terms which are not totally symmetric in the expansion (2.18) of \({\textbf{B}}_{\alpha \beta }\); this is because such terms decouple from the action when the gauge potential has the form (2.11). At the level of field components, the action (2.19) is simply

$$\begin{aligned} S[{\textbf{A}},{\textbf{B}}]=\sum _{s=1}^{\infty }\left[ \int _{{\mathbb {M}}}\textrm{d}^{4}x\,\textrm{tr}\left( B_{\alpha (2s)}\,F^{\alpha (2s)}\right) +\frac{\textrm{g}^2}{4}\,\int _{{\mathbb {M}}}\textrm{d}^{4}x\,\textrm{tr}\left( B_{\alpha (2s)}\,B^{\alpha (2s)}\right) \right] , \end{aligned}$$
(2.20)

with field equations

$$\begin{aligned} {\textbf{F}}_{\alpha \beta }=-\frac{\textrm{g}^2}{2}\,{\textbf{B}}_{\alpha \beta }, \qquad {\textbf{D}}^{\alpha {{\dot{\alpha }}}}{\textbf{B}}_{\alpha \beta }=0. \end{aligned}$$
(2.21)

Note that the Lagrange multipliers \({\textbf{B}}_{\alpha \beta }\) can be integrated out to return (2.16). When \(\textrm{g}^2\rightarrow 0\) these equations reduce to the self-duality equations (2.17), along with a set of linear non-SD fields (given by \(B_{\alpha (2s)}\)) propagating on the SD HS-YM background. In Sect. 4, we will study the SD sector of HS-YM in some detail, showing that it is classically integrable and admits a twistor correspondence, as well as deriving action functionals (in twistor space and on spacetime) for the purely SD sector.
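The statement that integrating out \({\textbf{B}}_{\alpha \beta }\) returns (2.16) is a Gaussian elimination that can be checked component by component. The sketch below is a one-component toy version with exact rational arithmetic, given only as an illustration: the numerical values of \(F\) and \(\textrm{g}^2\) are arbitrary and the function names are hypothetical.

```python
from fractions import Fraction

def stationary_B(F, g2):
    """Solve dS/dB = F + (g^2 / 2) B = 0 for a single field component."""
    return -2 * F / g2

def first_order_integrand(B, F, g2):
    """One-component analogue of the integrand of (2.19): B F + (g^2 / 4) B^2."""
    return B * F + g2 / 4 * B * B

F = Fraction(3, 7)      # arbitrary sample value of a field-strength component
g2 = Fraction(5, 2)     # arbitrary sample value of g^2
B_star = stationary_B(F, g2)
on_shell = first_order_integrand(B_star, F, g2)
```

The stationary value satisfies \(F=-\tfrac{\textrm{g}^2}{2}B\), matching the first equation in (2.21), and substituting it back gives \(-F^2/\textrm{g}^2\), which is the integrand of (2.16).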

2.2 Linear theory

To get a better feel for the structure of HS-YM, it is instructive to look at the linear theory, which is described by the series of free spin-s actions

$$\begin{aligned} S_{\textrm{free}}[{\textbf{A}}]=\frac{1}{\textrm{g}^2}\,\sum _{s=1}^{\infty }\int _{{\mathbb {M}}}A^{\alpha (2s-1){{\dot{\alpha }}}}\,\Box A_{\alpha (2s-1){{\dot{\alpha }}}}, \end{aligned}$$
(2.22)

after an integration-by-parts, where \(\Box \) is the wave operator. At the linear level, the action is preserved by the gauge transformations

$$\begin{aligned} A_{\alpha (2s-1){{\dot{\alpha }}}}\rightarrow A_{\alpha (2s-1){{\dot{\alpha }}}}+\partial _{(\alpha _{1}|{{\dot{\alpha }}}|}\xi _{\alpha (2s-2))}, \end{aligned}$$
(2.23)

and one can proceed to count the on-shell degrees of freedom by imposing a Lorenz gauge condition

$$\begin{aligned} \partial ^{\alpha _1{{\dot{\alpha }}}}A_{\alpha _1\beta (2s-2){{\dot{\alpha }}}}=0. \end{aligned}$$
(2.24)

This removes \(2s-1\) degrees of freedom from the 4s initially present in \(A_{\alpha (2s-1){{\dot{\alpha }}}}\), leaving \(2s+1\). However, residual gauge transformations which obey \(\Box \xi _{\alpha (2s-2)}=0\) leave the Lorenz gauge (2.24) intact, so this removes a further \(2s-1\) degrees of freedom, leaving only two on-shell degrees of freedom for HS-YM at each spin \(s\ge 1\).
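The counting above relies only on the fact that a totally symmetric object with \(k\) two-valued spinor indices has \(k+1\) independent components. A minimal sketch of the bookkeeping, with hypothetical function names, is:

```python
def symmetric_components(k):
    """Independent components of a totally symmetric rank-k object with two-valued indices."""
    return k + 1

def potential_components(s):
    """A_{alpha(2s-1) alpha-dot}: symmetric in 2s-1 un-dotted indices, times one dotted index."""
    return symmetric_components(2 * s - 1) * 2

def lorenz_conditions(s):
    """The Lorenz gauge condition (2.24) is symmetric in its 2s-2 free un-dotted indices."""
    return symmetric_components(2 * s - 2)

def residual_gauge_parameters(s):
    """The residual parameter xi_{alpha(2s-2)} is symmetric in 2s-2 un-dotted indices."""
    return symmetric_components(2 * s - 2)

def on_shell_dof(s):
    return potential_components(s) - lorenz_conditions(s) - residual_gauge_parameters(s)

dof_table = {s: on_shell_dof(s) for s in range(1, 11)}
```

Every entry of `dof_table` equals two, reproducing \(4s-(2s-1)-(2s-1)=2\) for each spin.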

This means that rather than working with on-shell polarizations, we can label free HS-YM fields by their helicity. However, the underlying chirality of HS-YM means that there is an asymmetry in the definition of positive and negative helicity. A positive helicity, spin-s HS-YM free field is a gauge potential \(A^{(+)}_{\alpha (2s-1){{\dot{\alpha }}}}\) whose linearized ASD curvature vanishes:

$$\begin{aligned} \partial _{(\alpha _1}{}^{{{\dot{\gamma }}}}A^{(+)}_{\alpha (2s-1)){{\dot{\gamma }}}}=0. \end{aligned}$$
(2.25)

On the other hand, a negative helicity, spin-s HS-YM free field is defined by a linearized ASD curvature \(F^{(-)}_{\alpha (2s)}\) which obeys the negative helicity zero-rest-mass (z.r.m.) equation:

$$\begin{aligned} \partial ^{\alpha {{\dot{\alpha }}}}F^{(-)}_{\alpha \beta (2s-1)}=0. \end{aligned}$$
(2.26)

It should be noted that this sort of asymmetric definition is similar to what is encountered when characterizing helicity states in chiral background fields [75,76,77].

Momentum eigenstate representations for positive and negative helicity HS-YM fields will be useful when studying the scattering amplitudes of the theory. Let \(k^{\alpha {{\dot{\alpha }}}}=\kappa ^{\alpha }\tilde{\kappa }^{{{\dot{\alpha }}}}\) be an on-shell, massless (complex) 4-momentum. It is natural to follow the pattern for \(s=1\) helicity states in the spinor-helicity formalism and define [38]:

$$\begin{aligned} A^{(+)}_{\alpha (2s-1){{\dot{\alpha }}}}=\frac{\zeta _{\alpha (2s-1)}\,\tilde{\kappa }_{{{\dot{\alpha }}}}}{\kappa ^{\alpha _1}\cdots \kappa ^{\alpha _{2s-1}}\,\zeta _{\alpha (2s-1)}}\,\textrm{e}^{\textrm{i}\,k\cdot x}, \qquad A^{(-)}_{\alpha (2s-1){{\dot{\alpha }}}}=\frac{\kappa _{\alpha _1}\cdots \kappa _{\alpha _{2s-1}}\,\tilde{\zeta }_{{{\dot{\alpha }}}}}{[\tilde{\kappa }\,\tilde{\zeta }]}\,\textrm{e}^{\textrm{i}\,k\cdot x}, \end{aligned}$$
(2.27)

where \(\zeta _{\alpha (2s-1)},\tilde{\zeta }_{{{\dot{\alpha }}}}\) are constant spinors which obey \([\tilde{\zeta }\,\tilde{\kappa }]\ne 0\) and \(\zeta ^{\beta \alpha (2s-2)} \kappa _{\beta }\ne 0\). It is easy to show that these states obey (2.25) and (2.26), respectively (see footnote 6).

For the negative helicity state, it is obvious that the choice of \(\tilde{\zeta }_{{{\dot{\alpha }}}}\) is pure gauge, as it drops out of the linearized ASD field strength:

$$\begin{aligned} F_{\alpha (2s)}[A^{(-)}]=\textrm{i}\,\kappa _{\alpha _1}\cdots \kappa _{\alpha _{2s}}\,\textrm{e}^{\textrm{i}\,k\cdot x}. \end{aligned}$$
(2.28)

For the positive helicity state, it is clear that \(A^{(+)}\) is independent of the scale of \(\zeta \), and along with the non-degeneracy condition (\(\zeta ^{\beta \alpha (2s-2)}\kappa _{\beta }\ne 0\)), this leaves exactly the residual gauge freedom contained in (2.23) after fixing Lorenz gauge (cf., [78]). In particular, this means that the choice of \(\zeta _{\alpha (2s-1)}\) is not pure gauge.

Indeed, it is easy to show that the difference between two \(A^{(+)}\)s with the same momentum but different choices of \(\zeta _{\alpha (2s-1)}\) is not a gauge transformation (2.23). Furthermore, the only gauge-invariant that can be formed from \(A^{(+)}\) vanishes, by the positive helicity condition (2.25). The only exception to these facts is when \(s=1\), in which case the field is a positive helicity gluon and the choice of \(\zeta _{\alpha }\) is pure gauge.

This has important consequences for scattering amplitudes of the theory: in general, the requirement of gauge invariance means that only spin-1 positive helicity states can be involved, whereas negative helicity states are well defined for arbitrary spin. Once again, this imbalance arises from the intrinsic chirality of the theory: in HS-YM, the only gauge-covariant field strength for \(s>1\) is the ASD one, \(F_{\alpha (2s)}\).

However, the fact that the action (2.16) is gauge-invariant, with two on-shell degrees of freedom for arbitrary spin, makes the \(s=1\) constraint for positive helicity fields somewhat puzzling. Concretely, this is linked with the explicit choice of helicity basis (2.27). One could imagine that this is simply not the most general choice of helicity polarizations, and that there is a better choice which extends in a gauge-covariant way to all spins and helicities. Unfortunately, it is hard to see how (2.27) could be altered or improved. The choice of the negative helicity polarization seems to be the only one which is consistent with little group scaling and matches the \(s=1\) case. The normalization constraint \(\epsilon ^{(+)}_{\alpha (2s-1){{\dot{\alpha }}}}\epsilon ^{(-)\,\alpha (2s-1){{\dot{\alpha }}}}=-1\), needed to recover the completeness relation for the polarization basis (see, e.g., the earlier work [79] where this unbalanced representation of polarization vectors was introduced), then essentially fixes the positive helicity polarization to be that given by (2.27). While not a proof excluding some alternative helicity basis which allows for higher-spin positive helicity degrees of freedom in the theory, this line of reasoning does seem very constraining.
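The normalization constraint quoted above can be checked numerically for the polarizations in (2.27), with the plane-wave factors dropped. The sketch below is illustrative and rests on several assumptions that are not fixed by the text: the numerical convention \(\epsilon _{01}=+1\) for lowering indices, random complex test spinors, and a reference tensor \(\zeta _{\alpha (2s-1)}\) built as a product of \(2s-1\) spinors (only its symmetric part contributes to the contraction). The function names are hypothetical.

```python
import random
from itertools import product

def lower(b_up):
    """b_alpha = b^gamma epsilon_{gamma alpha}, assuming epsilon_{01} = +1."""
    return (-b_up[1], b_up[0])

def component(spinors, idx):
    """Component of a tensor product of spinors for the index tuple idx."""
    value = 1
    for spinor, a in zip(spinors, idx):
        value *= spinor[a]
    return value

def polarization_norm(kappa, kappa_t, zetas_low, zeta_t):
    """Contract eps^(+) (indices down) with eps^(-) (indices up); 2s-1 = len(zetas_low)."""
    m = len(zetas_low)
    kappas_up = [kappa] * m
    kappa_t_low = lower(kappa_t)
    zeta_t_low = lower(zeta_t)
    square = sum(kappa_t[d] * zeta_t_low[d] for d in range(2))   # [kappa~ zeta~]
    indices = list(product(range(2), repeat=m))
    # Denominator of eps^(+): kappa^{a_1} ... kappa^{a_m} zeta_{a_1 ... a_m}
    z = sum(component(kappas_up, idx) * component(zetas_low, idx) for idx in indices)
    total = 0
    for idx in indices:
        for d in range(2):
            eps_plus = component(zetas_low, idx) * kappa_t_low[d] / z
            eps_minus = component(kappas_up, idx) * zeta_t[d] / square
            total += eps_plus * eps_minus
    return total

rng = random.Random(1)

def spinor():
    return (complex(rng.uniform(-1, 1), rng.uniform(-1, 1)),
            complex(rng.uniform(-1, 1), rng.uniform(-1, 1)))

norms = {s: polarization_norm(spinor(), spinor(), [spinor() for _ in range(2 * s - 1)], spinor())
         for s in (1, 2, 3)}
```

In this convention the contraction evaluates to \(-1\) for each of the spins tested, independently of the reference spinors: the un-dotted contraction cancels against the denominator of \(A^{(+)}\), and the dotted contraction gives \(-[\tilde{\kappa }\,\tilde{\zeta }]\).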

Finally, to compute exchanges it will be necessary to have the propagator for HS-YM fields. With the Lorenz gauge condition (2.24), the only propagator is between positive and negative helicity states:

$$\begin{aligned} \langle A^{(+)}_{\alpha (2s-1){{\dot{\alpha }}}}(k)\,A^{(-)\,\beta (2s'-1){{\dot{\beta }}}}(k')\rangle= & {} \delta ^{4}(k+k')\,\delta _{s,s'}\,\delta _{s,1}\frac{\delta _{(\alpha _1}^{(\beta _1}\cdots \delta _{\alpha _{2s-1})}^{\beta _{2s-1})}\,\delta _{{{\dot{\alpha }}}}^{{{\dot{\beta }}}}}{k^2}\nonumber \\= & {} \delta ^{4}(k+k')\,\delta _{s,1}\,\delta _{s',1}\frac{\delta _{\alpha _1}^{\beta _1}\,\delta _{{{\dot{\alpha }}}}^{{{\dot{\beta }}}}}{k^2}, \end{aligned}$$
(2.29)

where the trivial color structure (given by the Killing form on \({\mathfrak {g}}\)) is suppressed. The constraint that the positive helicity particle has spin-1 (a consequence of gauge invariance) of course collapses the propagator to the usual gluon propagator.

3 Scattering amplitudes

Armed with the spacetime action of HS-YM (2.16) and a helicity basis of momentum eigenstates for the external fields, we can now proceed to investigate the structure of scattering amplitudes for this theory. Since the Lagrangian itself is not real-valued in Lorentzian signature, it makes sense for us to work with complex kinematics, leading to non-vanishing tree-level 3-point amplitudes. The vertex structure of the theory and gauge invariance constrain the exchanges to have only spin one at higher points, although the negative helicity external particles can have arbitrary spin. The complex-valued nature of the action, combined with the fact that interactions are always at most single-derivative, means that various no-go theorems prohibiting scattering amplitudes with higher-spin external legs can be evaded.

3.1 3-point amplitudes

As the external legs of any tree-level scattering amplitude in HS-YM are labeled by a helicity, these amplitudes can be denoted by \({\mathcal {M}}_{n}(1_{s_1}^{h_1},\ldots ,n_{s_n}^{h_n})\), where \(h_{i}=\pm \) denotes the helicity (positive or negative) of the \(i^{\textrm{th}}\) external particle. This means that tree amplitudes can be helicity-graded by the number of, say, negative helicity external particles. At 3-points, this means that there are four possible helicity configurations: \((+,+,+)\), \((-,+,+)\), \((-,-,+)\) and \((-,-,-)\). For unitary theories with Lorentzian kinematics, it follows that all tree-level 3-point amplitudes vanish for the trivial reason that

$$\begin{aligned} \sum _{i=1}^{3}k_{i}=\sum _{i=1}^{3}\kappa _{i}^{\alpha }\,\bar{\kappa }_{i}^{{{\dot{\alpha }}}}=0 \quad \Rightarrow \quad \langle i\,j\rangle \,[i\,j]=0, \,\, \forall i,j\in \{1,2,3\}, \end{aligned}$$
(3.1)

so all possible kinematic invariants vanish. Note that for complex kinematics, where the momenta \(k_{i}^{\alpha {{\dot{\alpha }}}}=\kappa _{i}^{\alpha }\tilde{\kappa }_{i}^{{{\dot{\alpha }}}}\) and \(\tilde{\kappa }_{i}^{{{\dot{\alpha }}}}\) is not the complex conjugate of \(\kappa _{i}^{\alpha }\), 3-particle momentum conservation only requires that one chirality of kinematical invariants vanish: namely all contractions of the form \(\langle i\,j\rangle \), or all of the contractions of the form \([i\,j]\). As a consequence, this allows for potentially non-vanishing 3-point scattering amplitude configurations (cf., [80]). For instance, in ordinary Yang–Mills, one has non-vanishing \((-,+,+)\) (i.e., “\(\overline{\text{ MHV }}\)”) and \((-,-,+)\) (i.e., “MHV”) 3-point amplitudes with complex kinematics. For Lorentzian-real theories this analytic continuation plays an important role by giving data with which to seed recursion relations and construct higher-multiplicity scattering amplitudes with real kinematics [72, 81, 82]. However, in a complex theory like HS-YM such complex kinematics are natural from the outset.
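The origin of (3.1), and of its weaker complex counterpart, can be made explicit. The following is a short sketch, assuming the normalization \(2\,k_i\cdot k_j=\langle i\,j\rangle \,[i\,j]\) (which holds up to a convention-dependent sign that plays no role here). Squaring momentum conservation for massless momenta gives

$$\begin{aligned} 0=k_3^2=(k_1+k_2)^2=2\,k_1\cdot k_2=\langle 1\,2\rangle \,[1\,2], \end{aligned}$$

and likewise for the other pairs. For real Lorentzian momenta, \([i\,j]\) is (up to sign) the complex conjugate of \(\langle i\,j\rangle \), so both factors vanish. For complex momenta, suppose instead that \(\langle 1\,2\rangle \ne 0\), so that \([1\,2]=0\). Contracting \(\sum _i\kappa _i^{\alpha }\tilde{\kappa }_i^{{{\dot{\alpha }}}}=0\) with \(\kappa _1\) or \(\kappa _2\) on the undotted index and with \(\tilde{\kappa }_3\) on the dotted index yields

$$\begin{aligned} \langle 1\,2\rangle \,[2\,3]=0, \qquad \langle 2\,1\rangle \,[1\,3]=0, \end{aligned}$$

so all three square brackets vanish while the angle brackets remain unconstrained. Exchanging the roles of \(\kappa \) and \(\tilde{\kappa }\) gives the other branch, in which all angle brackets vanish.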

With this in mind, the tree-level 3-point amplitudes of HS-YM are given by evaluating the cubic terms in the classical action (2.16) with on-shell external wavefunctions; this cubic interaction is given by

$$\begin{aligned} \tilde{\delta }(s_1-s_2-s_3+1)\,\textrm{g}\,\int \textrm{d}^{4}x\,\textrm{tr}\left( \partial _{(\alpha _1}{}^{{{\dot{\gamma }}}} A_{\alpha (2s_1-1)){{\dot{\gamma }}}}\,\left[ A^{\alpha (2s_2-1){{\dot{\beta }}}},\,A^{\alpha (2s_3-1)}{}_{{{\dot{\beta }}}}\right] \right) , \nonumber \\ \end{aligned}$$
(3.2)

with

$$\begin{aligned} \tilde{\delta }(x):=\left\{ \begin{array}{ll} 0 &{} \text{ if } x\ne 0 \\ 1 &{} \text{ if } x=0 \end{array}\right. , \end{aligned}$$
(3.3)

a Kronecker delta. The constraint on the spins is required for the integrand to be well defined. Evaluating this cubic interaction with the momentum eigenstates (2.27)—and recalling that the constant spinors associated with negative helicity particles can be chosen arbitrarily—it is easy to see that both \({\mathcal {M}}_{3}(1^+,2^+,3^+)\) and \({\mathcal {M}}_{3}(1^-,2^-,3^-)\) vanish for the same reasons as in pure Yang–Mills theory.

This leaves only the MHV and \(\overline{\text{ MHV }}\) configurations as non-vanishing 3-point amplitudes. Although gauge invariance dictates that in general only spin-1 positive helicity states are allowed, for now we keep the spins arbitrary. In the \(\overline{\text{ MHV }}\) case, evaluating the cubic vertex on the momentum eigenstates leads in the first instance to:

$$\begin{aligned} {\mathcal {M}}_{3}(1_{s_1}^-,2_{s_2}^+,3_{s_3}^+)=\textrm{i}\,\textrm{g}\,f^{{\textsf{a}}_1{\textsf{a}}_2{\textsf{a}}_3}\,\tilde{\delta }(s_1-s_2-s_3+1)\,\frac{[2\,3]\,\langle \zeta _2\,1\rangle ^{2s_2-1}\,\langle \zeta _3\,1\rangle ^{2s_3-1}}{\langle \zeta _2\,2\rangle ^{2s_2-1}\,\langle \zeta _3\,3\rangle ^{2s_3-1}},\nonumber \\ \end{aligned}$$
(3.4)

where \(f^{\textsf{abc}}\) are the structure constants of the gauge group, the overall momentum conserving delta function has been stripped off and (without loss of generality) we have decomposed the positive helicity reference spinors as

$$\begin{aligned} \zeta _{2}^{\alpha (2s_2-1)}=\zeta _2^{\alpha _1}\cdots \zeta ^{\alpha _{2s_2-1}}_{2}, \end{aligned}$$

etc. Now, on the support of (complex) momentum conservation, it follows that

$$\begin{aligned} \langle \zeta _2\,1\rangle \,[1\,3]+\langle \zeta _2\,2\rangle \,[2\,3]=0, \qquad \langle \zeta _3\,1\rangle \,[1\,2]+\langle \zeta _3\,3\rangle \,[3\,2]=0, \end{aligned}$$
(3.5)

which means that the \(\overline{\text{ MHV }}\) amplitude is equal to

$$\begin{aligned} {\mathcal {M}}_{3}(1_{s_1}^-,2_{s_2}^+,3_{s_3}^+)=\textrm{i}\,\textrm{g}\,f^{{\textsf{a}}_1{\textsf{a}}_2{\textsf{a}}_3}\,\tilde{\delta }(s_1-s_2-s_3+1)\,\frac{[2\,3]^{2s_1+1}}{[1\,2]^{2s_3-1}\,[3\,1]^{2s_2-1}}, \end{aligned}$$
(3.6)

matching the formula found in [38] for the self-dual sector of HS-YM.
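The passage from (3.4) to (3.6) can also be checked numerically. The Python sketch below is illustrative only: the helper names (`bracket`, `make_kinematics`, `mhv_bar_with_reference`, `mhv_bar_closed`), the sample spinor values and the sign convention for the spinor brackets are assumptions made here, and the overall factor \(\textrm{i}\,\textrm{g}\,f^{{\textsf{a}}_1{\textsf{a}}_2{\textsf{a}}_3}\) is dropped. It builds complex 3-point kinematics with all \(\langle i\,j\rangle =0\), then compares the kinematic part of (3.4), for two different choices of reference spinors, with that of (3.6).

```python
def bracket(a, b):
    """Antisymmetric contraction of two 2-component spinors (sign is a convention)."""
    return a[0] * b[1] - a[1] * b[0]


def make_kinematics():
    """Complex 3-point kinematics with all <i j> = 0 and generic [i j]."""
    kappa = (1 + 2j, 3 - 1j)
    c = (1.0, 2.0, -0.5)  # kappa_i = c_i * kappa, so every <i j> vanishes
    lam = [(ci * kappa[0], ci * kappa[1]) for ci in c]
    lt1 = (0.3 + 1j, -2 + 0.5j)
    lt2 = (1.5 - 0.7j, 0.2 + 2j)
    # Momentum conservation reduces to sum_i c_i * lt_i = 0, which fixes lt3.
    lt3 = tuple(-(c[0] * lt1[a] + c[1] * lt2[a]) / c[2] for a in range(2))
    return lam, [lt1, lt2, lt3]


def mhv_bar_with_reference(lam, lt, s2, s3, zeta2, zeta3):
    """Kinematic part of (3.4), with explicit reference spinors zeta2, zeta3."""
    r2 = bracket(zeta2, lam[0]) / bracket(zeta2, lam[1])
    r3 = bracket(zeta3, lam[0]) / bracket(zeta3, lam[2])
    return bracket(lt[1], lt[2]) * r2 ** (2 * s2 - 1) * r3 ** (2 * s3 - 1)


def mhv_bar_closed(lt, s2, s3):
    """Kinematic part of (3.6), with s1 fixed by the spin constraint."""
    s1 = s2 + s3 - 1
    sq12 = bracket(lt[0], lt[1])
    sq23 = bracket(lt[1], lt[2])
    sq31 = bracket(lt[2], lt[0])
    return sq23 ** (2 * s1 + 1) / (sq12 ** (2 * s3 - 1) * sq31 ** (2 * s2 - 1))


lam, lt = make_kinematics()
for s2, s3 in [(1, 1), (2, 1), (2, 3)]:
    closed = mhv_bar_closed(lt, s2, s3)
    a = mhv_bar_with_reference(lam, lt, s2, s3, (0.7, -1.1j), (2 + 1j, 0.4))
    b = mhv_bar_with_reference(lam, lt, s2, s3, (-3j, 1.0), (1.0, 1.0))
    # Both reference choices should agree with the closed form up to rounding.
    print(s2, s3, abs(a - closed) < 1e-9 * abs(closed), abs(b - closed) < 1e-9 * abs(closed))
```

Making all \(\kappa _i\) proportional is the simplest way to realize the branch of 3-point kinematics on which the \(\overline{\text{ MHV }}\) amplitude is supported; any other construction satisfying momentum conservation would serve equally well.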

Observe that the highly constraining 3-point kinematics mean that the result is manifestly gauge-invariant for all spins satisfying the constraint. This is an accident, unique to 3-point amplitudes (as we will soon see). Imposing the constraints \(s_2=s_3=1\) from the start, the remaining spin constraint in (3.6) sets \(s_1=1\) and the whole \(\overline{\text{ MHV }}\) 3-point amplitude collapses to that of pure Yang–Mills.

The 3-point MHV amplitude is evaluated along similar lines, leading to

$$\begin{aligned} {\mathcal {M}}_{3}(1_{s_1}^{-},2_{s_2}^{-},3_{s_3}^{+})= & {} \frac{\textrm{i}\,\textrm{g}}{2}\,f^{{\textsf{a}}_1{\textsf{a}}_2{\textsf{a}}_3}\,\frac{\langle 1\,2\rangle ^{2s_3}}{\langle 2\,3\rangle ^{2s_3-1}\,\langle 3\,1\rangle ^{2s_3-1}}\nonumber \\{} & {} \quad \Big [\langle 1\,2\rangle ^{2s_2-1}\,\langle 3\,1\rangle ^{2s_3-2}\,\tilde{\delta }(s_1-s_2-s_3+1) \nonumber \\{} & {} \quad + \langle 1\,2\rangle ^{2s_1-1}\,\langle 3\,2\rangle ^{2s_3-2}\,\tilde{\delta }(s_2-s_1-s_3+1)\Big ], \end{aligned}$$
(3.7)

where the constraint \(s_3=1\) has been temporarily ignored. Here, the two terms arise from the need to symmetrize over the location of the positive helicity particle in the cubic vertex (3.2). Once again, the constant spinor used to define the positive helicity polarization drops out of the amplitude, leaving an “accidentally” gauge-invariant result for all external spins. A striking thing about this MHV amplitude is that it is not, for generic spins, the helicity conjugate of its \(\overline{\text{ MHV }}\) counterpart (3.6). This is, of course, an unavoidable consequence of the chirality of the theory, which leads to a violation of parity invariance.

When \(s_3=1\) is imposed (as it should have been from the start), (3.7) simplifies to

$$\begin{aligned} {\mathcal {M}}_{3}(1_{s_1}^{-},2_{s_2}^{-},3_{1}^{+})= & {} \frac{\textrm{i}\,\textrm{g}}{2}\,f^{{\textsf{a}}_1{\textsf{a}}_2{\textsf{a}}_3}\,\frac{\langle 1\,2\rangle ^{2}}{\langle 2\,3\rangle \,\langle 3\,1\rangle }\Big [\langle 1\,2\rangle ^{2s_2-1}\,\tilde{\delta }(s_1-s_2) + \langle 1\,2\rangle ^{2s_1-1}\,\tilde{\delta }(s_2-s_1)\Big ] \nonumber \\= & {} \textrm{i}\,\textrm{g}\,f^{{\textsf{a}}_1{\textsf{a}}_2{\textsf{a}}_3}\,\tilde{\delta }(s_1-s_2)\,\frac{\langle 1\,2\rangle ^{2s_1+1}}{\langle 2\,3\rangle \,\langle 3\,1\rangle }, \end{aligned}$$
(3.8)

where both negative helicity external particles must have identical (but otherwise arbitrary) spin. When \(s_1=s_2=s_3=1\) the formula reduces to the 3-point MHV amplitude of pure Yang–Mills, which is the parity conjugate of the \(\overline{\text{ MHV }}\) amplitude with all spin-one external fields. The reason for this is that when restricted to spin-one gauge fields, the action (2.16) differs from the Yang–Mills action only by a topological term, so parity invariance holds perturbatively despite the chirality of the Lagrangian [54]. The same cannot be said of the chiral action of full HS-YM theory, which is clearly not perturbatively equivalent to any parity-invariant theory.

3.2 4-point amplitudes

Now, let us turn to the computation of 4-point tree-level scattering amplitudes in HS-YM. The cubic interactions are extended off-shell and linked together with the propagator (2.29), with the appropriate spin constraints at each vertex in any given Feynman diagram. In addition, we have contributions from the quartic contact interaction

$$\begin{aligned} \tilde{\delta }(s_1+s_2-s_3-s_4)\,\textrm{g}^2\,\int \textrm{d}^{4}x\,\textrm{tr}\left( \Big [A_{(\alpha (2s_1-1)}{}^{{{\dot{\gamma }}}},\,A_{\alpha (2s_2-1)){{\dot{\gamma }}}}\Big ]\,\left[ A^{\alpha (2s_3-1){{\dot{\delta }}}},\,A^{\alpha (2s_4-1)}{}_{{{\dot{\delta }}}}\right] \right) , \nonumber \\ \end{aligned}$$
(3.9)

with the spin constraint ensuring that the spinor contractions are well defined. Unlike the 3-point amplitudes, at this stage gauge invariance requires all positive helicity particles to have spin-1.

Once again, we can proceed by helicity-grading the amplitudes, but the calculation is further simplified by restricting our attention to color-ordered partial amplitudes. In particular, it is easy to show that tree-level scattering amplitudes decompose as

$$\begin{aligned} {\mathcal {M}}_{n}(1^{h_1}_{s_1},\ldots ,n^{h_n}_{s_n})=\textrm{g}^{n-2}\,\delta ^{4}\!\left( \sum _{i=1}^{n}k_i\right) \sum _{\sigma \in S_{n}/{\mathbb {Z}}_n}\textrm{tr}({\textsf{T}}^{{\textsf{a}}_{\sigma (1)}}\cdots {\textsf{T}}^{{\textsf{a}}_{\sigma (n)}})\,{\mathcal {A}}_{n}(\sigma (1^{h_1}_{s_1}),\ldots ,\sigma (n_{s_n}^{h_n})),\nonumber \\ \end{aligned}$$
(3.10)

in terms of a sum over distinct (i.e., non-cyclically related) color-orderings; here \({\textsf{T}}^{{\textsf{a}}}\) are generators of the gauge group and \(h_i=\pm \) is the helicity of the \(i^{\textrm{th}}\) particle. The functions of the kinematic data \({\mathcal {A}}_n\) are the color-ordered partial amplitudes—knowing \({\mathcal {A}}_n\) in any color-ordering thus determines the full amplitude \({\mathcal {M}}_n\).
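Since the sum in (3.10) runs over permutations modulo cyclic rotations, there are \((n-1)!\) distinct color-orderings. As a small illustration (the function name `color_orderings` is a hypothetical choice made here), one representative of each cyclic class can be enumerated by using cyclicity of the trace to place particle 1 in the first slot:

```python
from itertools import permutations


def color_orderings(n):
    """Return one representative of each cyclic class of orderings of 1..n.

    Cyclicity of the trace lets us rotate any ordering so that label 1 comes
    first; the remaining n-1 labels are then permuted freely.
    """
    return [(1,) + rest for rest in permutations(range(2, n + 1))]


orderings = color_orderings(4)
print(len(orderings))  # 6 = (4 - 1)!
print(orderings[0])    # (1, 2, 3, 4)
```

At 4-points this gives six partial amplitudes to consider for each helicity configuration.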

Given that we only have non-vanishing 3-point MHV and \(\overline{\text{ MHV }}\) amplitudes, simple factorization arguments immediately indicate that \({\mathcal {A}}_{4}(1^+,2^+,3^+,4^+)=0\), since the exchanges involved in such an amplitude vanish while the 4-point contact contribution can be eliminated by making appropriate gauge choices for the constant reference spinors in the external states. However, the next helicity configuration, \({\mathcal {A}}_{4}(1^-,2^+,3^+,4^+)\), does not a priori vanish. In this color-ordering, the amplitude receives contributions from exchange diagrams in the s- and t-channels, as well as a contact term (see Footnote 7):

$$\begin{aligned} {\mathcal {A}}_{4}(1_{s_1}^-,2_{1}^+,3_{1}^+,4_{1}^+)= {\widehat{{\mathcal {A}}}}_{4}^s+{\widehat{{\mathcal {A}}}}_{4}^{t}+{\widehat{{\mathcal {A}}}}_{4}^{\textrm{cont}}. \end{aligned}$$

We first compute the s-channel exchange:

$$\begin{aligned} {\widehat{{\mathcal {A}}}}_4^s=(-1)^{3-s_1}\,\frac{[12]^{2-s_1}\,[34]\,f(\zeta _2,\zeta _3,\zeta _4)}{(k_1+k_2)^2}\,\tilde{\delta }(1-s_1), \end{aligned}$$
(3.11)

where the rational function f depends on the auxiliary spinors of the positive helicity fields:

$$\begin{aligned} f(\zeta _2,\zeta _3,\zeta _4):=\frac{\langle \zeta _2\,1\rangle }{\langle \zeta _2\,2\rangle }\,\frac{\langle \zeta _3\,4\rangle }{\langle \zeta _3\,3\rangle }\,\frac{\langle \zeta _4\,3\rangle }{\langle \zeta _4\,4\rangle }\,\frac{\langle \zeta _3\,1\rangle \langle \zeta _4\,2\rangle }{\langle \zeta _3\,3\rangle \langle \zeta _4\,4\rangle }\,\left( \frac{\langle \zeta _4\,1\rangle }{\langle \zeta _4\,2\rangle }\right) ^{s_1}, \end{aligned}$$
(3.12)

which is homogeneous of weight zero in the reference spinors, as required.
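The weight-zero property of (3.12) can be confirmed with a quick numerical test. This is only a sketch: the function name `f_ref`, the sample spinors and the bracket sign convention are assumptions introduced here, and \(s_1=1\) is used because of the Kronecker delta in (3.11). The code rescales each reference spinor independently, and also evaluates \(f\) with all reference spinors aligned with \(\kappa _1\).

```python
def bracket(a, b):
    """Antisymmetric contraction of two 2-component spinors (sign is a convention)."""
    return a[0] * b[1] - a[1] * b[0]


def f_ref(k, z2, z3, z4, s1):
    """The function f of (3.12); k holds the spinors kappa_1..kappa_4."""
    k1, k2, k3, k4 = k
    return (
        bracket(z2, k1) / bracket(z2, k2)
        * bracket(z3, k4) / bracket(z3, k3)
        * bracket(z4, k3) / bracket(z4, k4)
        * bracket(z3, k1) * bracket(z4, k2) / (bracket(z3, k3) * bracket(z4, k4))
        * (bracket(z4, k1) / bracket(z4, k2)) ** s1
    )


def scale(z, t):
    """Rescale a spinor by the complex number t."""
    return (t * z[0], t * z[1])


kappas = [(1.0, 0.5j), (2.0, 1.0 + 1j), (-1j, 3.0), (0.5, -2.0)]
z2, z3, z4 = (1.0, 2.0), (0.3j, 1.0), (-1.0, 1j)

base = f_ref(kappas, z2, z3, z4, s1=1)
rescaled = f_ref(kappas, scale(z2, 2 + 1j), scale(z3, -0.7), scale(z4, 3j), s1=1)
print(abs(base - rescaled) < 1e-9 * abs(base))

# Aligning the reference spinors with kappa_1 sets <zeta_2 1> = 0, so f vanishes.
print(f_ref(kappas, kappas[0], kappas[0], kappas[0], s1=1) == 0)
```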

Now, as the external positive helicity states are spin-1, the choice of the reference spinors is just residual gauge freedom and we can set \(\zeta _{2}^{\alpha }=\zeta _{3}^{\alpha }=\zeta _{4}^{\alpha }=\kappa _{1}^{\alpha }\), from which it immediately follows that \(f(\zeta _2,\zeta _3,\zeta _4)=0\), and thus the s-channel contribution vanishes: \({\widehat{{\mathcal {A}}}}^s_4=0\). A similar calculation shows that the t-channel contribution also vanishes: \({\widehat{{\mathcal {A}}}}^t_4=0\). The only remaining contributions are from the contact interaction; in this color-ordering the contact contributions are of the form

$$\begin{aligned} \tilde{\delta }(s_1-1)\,\frac{\langle 1\,\zeta _2\rangle ^{2s_1-1}\,\langle \zeta _4\,\zeta _3\rangle \,\langle \zeta _2\,\zeta _3\rangle ^{2(1-s_1)}\,[\tilde{\zeta }_1\,2]\,[3\,4]}{[1\,\tilde{\zeta }_1]\,\langle 2\,\zeta _2\rangle \,\langle 3\,\zeta _3\rangle \,\langle 4\,\zeta _4\rangle } -\,(2\leftrightarrow 3). \end{aligned}$$
(3.13)

Clearly, this contribution is always proportional to contractions of the form \(\langle 1\,\zeta _i\rangle \) (for \(i\ne 1\)), which are killed with the residual gauge fixing \(\zeta _i=\kappa _1\).

Thus, it follows that the amplitude in this helicity configuration vanishes:

$$\begin{aligned} {\mathcal {A}}_{4}(1_{s_1}^-,2_{1}^+,3_{1}^+,4_{1}^+)=0, \end{aligned}$$
(3.14)

regardless of the spins of the external fields. Since the only vertices contributing to this amplitude are the \(\overline{\text{ MHV }}\) 3-point ones, the computation of this amplitude is the same as in the purely self-dual theory, and the vanishing of the amplitude is in agreement with light-cone results for the self-dual sector [83,84,85].

Next, we come to the 4-point MHV helicity configuration, with two negative and two positive helicity external fields. Let us begin by computing \({\mathcal {A}}_{4}(1_{s_1}^-,2_{s_2}^-,3_{1}^+,4_{1}^+)\). Once again, in this color-ordering the exchanges are in the s- and t-channels; partially fixing the residual gauge symmetry so that

$$\begin{aligned} \zeta _{3}^{\alpha }=\zeta _{4}^{\alpha }=\zeta ^{\alpha }, \end{aligned}$$
(3.15)

subject to \(\langle \zeta \,3\rangle \ne 0\ne \langle \zeta \,4\rangle \), the s-channel contribution is given by:

$$\begin{aligned} {\widehat{{\mathcal {A}}}}_{4}^{s}=\frac{(-1)^{2-s_1-s_2}}{2}\,\frac{\langle 1\,2\rangle ^{2s_1-2}\,\langle \zeta \,2\rangle \,[\tilde{\zeta }_1|k_1+k_2|\zeta \rangle \,[3\,4]}{[2\,1]\,[1\,\tilde{\zeta }_1]\,\langle 3\,\zeta \rangle \,\langle 4\,\zeta \rangle }\,\tilde{\delta }(s_1-s_2)\,+\,(1\leftrightarrow 2). \nonumber \\ \end{aligned}$$
(3.16)

Here, the remaining spin constraint fixes the two negative helicity particles to have identical spin, \(s_1=s_2\), but otherwise their spin is unconstrained. Similar expressions arise for the t-channel exchange and contact diagram, all with the same spin constraint.

Upon further fixing the gauge redundancy by setting

$$\begin{aligned} \zeta ^{\alpha }=\kappa _{1}^{\alpha }, \qquad \tilde{\zeta }_{1}^{{{\dot{\alpha }}}}=\tilde{\kappa }_{4}^{{{\dot{\alpha }}}}, \end{aligned}$$
(3.17)

and exploiting 4-momentum conservation, the t-channel and contact contributions are easily seen to vanish, while the s-channel contribution collapses to give the full amplitude

$$\begin{aligned} {\mathcal {A}}_{4}(1_{s}^-,2_{s}^-,3_{1}^+,4_{1}^+)=\frac{\langle 1\,2\rangle ^{2s+1}}{\langle 2\,3\rangle \,\langle 3\,4\rangle \,\langle 4\,1\rangle } \end{aligned}$$
(3.18)

for the 4-point MHV amplitude in this color-ordering.

The fact that (3.18) is non-vanishing for generic higher spins \(s>1\) raises the alarm: aren’t we violating well-known no-go theorems constraining S-matrices with higher-spin external states? As alluded to above, the basic properties of HS-YM theory mean that no-go theorems (e.g., Weinberg’s low energy theorem [7], Weinberg–Witten [9], Coleman–Mandula [8], etc.) simply do not apply. In particular, the theory is purely massless, contains no scalars, is parity-violating and non-unitary, and its interactions have at most one derivative. Furthermore, the exchanges themselves are spin-1, so in effect the MHV amplitude corresponds to two negative helicity higher-spin fields interacting with a positive helicity pure gluon background. Various subsets of these properties violate the assumptions of all no-go theorems constraining the tree-level S-matrix.

For completeness, we provide the expression for the MHV amplitude of HS-YM in the color-ordering where the negative helicity particles are not consecutive. Following similar steps to above, one arrives at the formula

$$\begin{aligned} {\mathcal {A}}_{4}(1_{s_1}^-,2_{1}^+,3_{s_3}^-,4_{1}^+)=\tilde{\delta }(s_1-s_3)\,\frac{\langle 1\,3\rangle ^{2s_1+2}}{\langle 1\,2\rangle \langle 2\,3\rangle \,\langle 3\,4\rangle \,\langle 4\,1\rangle }. \end{aligned}$$
(3.19)

Once again, the spins of the external negative helicity particles are identical but otherwise arbitrary.

3.3 n-point MHV amplitudes

Based on the pattern observed at 4-points, it is tempting to conjecture an all-multiplicity formula for the tree-level scattering amplitudes of HS-YM in the MHV helicity configuration (i.e., two negative helicity higher-spin particles and arbitrarily many positive helicity external gluons). The natural conjecture is:

$$\begin{aligned} {\mathcal {A}}_{n}(1^{+}_{1},\ldots ,i^{-}_{s_i},\ldots ,j^{-}_{s_j},\ldots ,n^{+}_{1})=\tilde{\delta }(s_i-s_j)\,\frac{\langle i\,j\rangle ^{2s_i+2}}{\langle 1\,2\rangle \,\langle 2\,3\rangle \cdots \langle n\,1\rangle }, \end{aligned}$$
(3.20)

where particles \(i\) and \(j\) have negative helicity. This formula passes several basic consistency checks: it reduces to the well-known Parke–Taylor formula for n-gluon MHV scattering [86] when \(s_1=\cdots =s_n=1\), carries the correct little group weight in each external particle and has only the usual collinear poles of ordinary Yang–Mills theory.
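Two of these consistency checks are easy to reproduce numerically. The sketch below is an illustration under stated assumptions: the names `parke_taylor`, `hs_mhv` and `rescale_leg` are hypothetical, the spinor values are arbitrary, the Kronecker delta is imposed by hand by giving both negative helicity legs the same spin \(s\), and the little group convention assumed is that rescaling \(\kappa _a\rightarrow t\,\kappa _a\) multiplies the amplitude by \(t^{-2h_a}\).

```python
def bracket(a, b):
    """Angle bracket <a b> of two undotted spinors (sign is a convention)."""
    return a[0] * b[1] - a[1] * b[0]


def parke_taylor(kappas, i, j):
    """<i j>^4 / (<1 2><2 3>...<n 1>) for 0-based negative helicity legs i, j."""
    n = len(kappas)
    denom = 1
    for a in range(n):
        denom *= bracket(kappas[a], kappas[(a + 1) % n])
    return bracket(kappas[i], kappas[j]) ** 4 / denom


def hs_mhv(kappas, i, j, s):
    """Kinematic part of (3.20) with s_i = s_j = s."""
    # <i j>^(2s+2) is written as <i j>^(2s-2) times the Parke-Taylor numerator,
    # which makes the reduction to Parke-Taylor at s = 1 manifest.
    return bracket(kappas[i], kappas[j]) ** (2 * s - 2) * parke_taylor(kappas, i, j)


def rescale_leg(kappas, a, t):
    """Return a copy of the kinematics with kappa_a multiplied by t."""
    out = list(kappas)
    out[a] = (t * kappas[a][0], t * kappas[a][1])
    return out


kappas = [(1.0, 0.5j), (2.0, 1.0 + 1j), (-1j, 3.0), (0.5, -2.0), (1 + 1j, 1.0)]
t, s = 1.5 - 0.5j, 3
base = hs_mhv(kappas, 0, 2, s)

# Negative helicity leg of spin s (h = -s): expected scaling t^(2s).
neg = hs_mhv(rescale_leg(kappas, 0, t), 0, 2, s) / base
# Positive helicity gluon (h = +1): expected scaling t^(-2).
pos = hs_mhv(rescale_leg(kappas, 1, t), 0, 2, s) / base

print(abs(neg - t ** (2 * s)) < 1e-9 * abs(t ** (2 * s)))
print(abs(pos - t ** (-2)) < 1e-9)
```

Because (3.20) depends only on the undotted spinors, the compensating rescaling of \(\tilde{\kappa }_a\) does not enter the check.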

While directly computing this formula from the Feynman rules of HS-YM is clearly not tractable, there are other ways of confirming that it is correct. In “Appendix A,” we confirm (3.20) using BCFW recursion [72] after first showing that HS-YM can indeed be constructed via on-shell recursion. This is possible because of the inherent chirality of the theory, which allows it to evade no-go factorization arguments for higher-spin theories [80,81,82, 87]. In Sect. 5, we derive (3.20) directly from the HS-YM action using twistor theory.

Before concluding this section, it is worth illustrating, in practical terms, why the restriction to spin-1 for positive helicity external particles is necessary. One way to test this is to calculate 4-point amplitudes using the assumption that the reference spinors \(\zeta _{\alpha (2s-1)}\) can be arbitrarily chosen for \(s>1\); that is, by assuming that gauge invariance will be respected. Performing the 4-point MHV calculation and then extrapolating to higher-multiplicity leads to the formula:

$$\begin{aligned}{} & {} {\mathcal {A}}_{n}(1^{+}_{s_1},\ldots ,i^{-}_{s_i},\ldots ,j^{-}_{s_j},\ldots ,n^{+}_{s_n})=\frac{\langle i\,j\rangle ^{4}}{\langle 1\,2\rangle \,\langle 2\,3\rangle \cdots \langle n\,1\rangle } \nonumber \\{} & {} \quad \times \left[ \tilde{\delta }\!\left( s_j-s_i-n+2+\sum _{a\ne i,j}s_a\right) \,\left( \frac{\langle i\,j\rangle ^{s_j-n+1+\sum _{a\ne i,j}s_a}}{\prod _{b\ne i,j} \langle j\,b\rangle ^{s_b-1}}\right) ^2\right. \nonumber \\{} & {} \quad \left. +\tilde{\delta }\!\left( s_i-s_j-n+2+\sum _{a\ne i,j}s_a\right) \,\left( \frac{\langle i\,j\rangle ^{s_i-n+1+\sum _{a\ne i,j}s_a}}{\prod _{b\ne i,j} \langle i\,b\rangle ^{s_b-1}}\right) ^2\right] . \end{aligned}$$
(3.21)

At first, this may seem like a reasonable formula: it carries the correct little group weights, obeys the symmetries imposed by the color-ordering, and collapses to (3.20) when \(s_a=1\) for all \(a\ne i,j\). Furthermore, when the two negative helicity particles are gluons (\(s_i=1=s_j\)), the spin constraints become

$$\begin{aligned} \sum _{a\ne i,j}s_a=n-2 \quad \Rightarrow \quad s_a=1, \end{aligned}$$
(3.22)

for all \(a\ne i,j\), since each \(s_a\ge 1\).

However, the formula (3.21) now has higher-order poles whenever the negative helicity momenta become collinear with any of the positive helicity momenta, regardless of their position in the color-ordering. These are not physical for a colored, two-derivative local field theory, and the root of these spurious singularities can be traced back precisely to identifying the positive helicity reference spinors with some external momenta.
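To see these higher-order poles explicitly, consider the simplest case beyond (3.20). The following is a worked instance obtained by direct substitution into (3.21), with the spin assignment chosen here purely for illustration: take a single positive helicity leg \(b\) with \(s_b=2\) and all other positive helicity legs with spin 1, so that \(\sum _{a\ne i,j}s_a=n-1\). The first term in (3.21) then becomes

$$\begin{aligned} \tilde{\delta }(s_j-s_i+1)\,\frac{\langle i\,j\rangle ^{2s_j+4}}{\langle 1\,2\rangle \,\langle 2\,3\rangle \cdots \langle n\,1\rangle }\,\frac{1}{\langle j\,b\rangle ^{2}}, \end{aligned}$$

which has a double pole as \(\langle j\,b\rangle \rightarrow 0\) even when \(b\) is not adjacent to \(j\) in the color-ordering, and a triple pole when it is adjacent.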

It would, of course, be interesting to explore whether (3.21) can be understood as a valid scattering amplitude in some non-local context, but this is beyond the scope of the current paper.

4 Self-dual sector and integrability

We have already identified a self-dual (SD) sector of HS-YM theory, corresponding to the condition

$$\begin{aligned} F_{\alpha (2s)}=0, \qquad \text{ for } \text{ all } s\ge 1. \end{aligned}$$
(4.1)

For the \(s=1\) truncation of the theory, these are the familiar self-duality equations of Yang–Mills theory, which are known to be classically integrable using a variety of different perspectives (cf., [88,89,90,91]). It is natural to ask if the SD sector of HS-YM is likewise classically integrable.

The scattering amplitude calculations of the previous section hint that this should be true, as we found that \({\mathcal {A}}_{4}(1^+,2^+,3^+,4^+)\) and \({\mathcal {A}}_{4}(1^-,2^+,3^+,4^+)\) vanish for HS-YM. This is indicative of a self-dual sector which is consistent and classically integrable, respectively.

In this section, we answer the question of the classical integrability of HS-YM in the affirmative using twistor theory. In particular, we generalize Ward’s theorem [53] for SD Yang–Mills theory to the higher-spin setting, proving an equivalence between all SD HS-YM fields and certain integrable holomorphic structures in twistor space. We then use this construction and holomorphic Chern–Simons theory in twistor space to arrive at a spacetime description for SD HS-YM as a four-dimensional theory of an infinite tower of adjoint-valued scalars.

4.1 Twistor theory

Penrose’s twistor theory gives a non-local description of spacetime physics in terms of complex projective geometry [52] and has now found many different uses across theoretical and mathematical physics. Rather than provide an extensive review of this rich subject, we give a brief recap of the features required for the study of HS-YM; in-depth reviews can be found in [91,92,93,94,95,96], and we follow the notation of [97].

Let \({\mathbb {M}}\) be complexified Minkowski spacetime; the real spacetimes of various signatures—Lorentzian \({\mathbb {R}}^{1,3}\), Euclidean \({\mathbb {R}}^4\) and Kleinian \({\mathbb {R}}^{2,2}\)—sit inside this complexified spacetime as real slices. The (projective) twistor space \(\mathbb{P}\mathbb{T}\) of \({\mathbb {M}}\) is given by an open subset of three-dimensional complex projective space \({\mathbb {P}}^3\)

$$\begin{aligned} \mathbb{P}\mathbb{T}=\left\{ Z^{A}=(\mu ^{{{\dot{\alpha }}}},\lambda _{\alpha })\in {\mathbb {P}}^3\,|\,\lambda _{\alpha }\ne 0\right\} , \end{aligned}$$
(4.2)

where

$$\begin{aligned} Z^{A}\sim r\,Z^{A}, \quad \forall r\in {\mathbb {C}}^{*}, \end{aligned}$$
(4.3)

are homogeneous coordinates on \({\mathbb {P}}^3\) defined only up to this projective rescaling. We will denote the equivalence class of such homogeneous coordinates under projective rescaling as \([Z^A]\). Since \(\lambda _{\alpha }\ne 0\) on \(\mathbb{P}\mathbb{T}\), there is a natural fibration

$$\begin{aligned} \pi :\mathbb{P}\mathbb{T}\rightarrow {\mathbb {P}}^1, \qquad [Z^{A}]\mapsto [\lambda _{\alpha }], \end{aligned}$$
(4.4)

with \(\lambda _{\alpha }\) serving as homogeneous coordinates on the \({\mathbb {P}}^1\) base of the fibration.

The correspondence between \(\mathbb{P}\mathbb{T}\) and \({\mathbb {M}}\) is given by the incidence relations

$$\begin{aligned} \mu ^{{{\dot{\alpha }}}}=x^{\alpha {{\dot{\alpha }}}}\,\lambda _{\alpha }, \end{aligned}$$
(4.5)

which state that each point \(x\in {\mathbb {M}}\) corresponds to a holomorphic, linearly embedded Riemann sphere \(X\cong {\mathbb {P}}^1\subset \mathbb{P}\mathbb{T}\). Conversely, any point \(Z^{A}=(\mu ^{{{\dot{\alpha }}}},\lambda _{\alpha })\) in twistor space corresponds to a totally null ASD 2-plane in \({\mathbb {M}}\), whose tangent vectors have the form \(\lambda ^{\alpha }v^{{{\dot{\alpha }}}}\) for fixed \(\lambda ^{\alpha }\) (given by the choice of \(Z^A\)) and arbitrary \(v^{{{\dot{\alpha }}}}\). These totally null ASD 2-planes are called \(\alpha \)-planes in \({\mathbb {M}}\).
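The statement that a twistor corresponds to an \(\alpha \)-plane can be illustrated numerically: moving \(x\) along any vector of the form \(\lambda ^{\alpha }v^{{{\dot{\alpha }}}}\) leaves \(\mu ^{{{\dot{\alpha }}}}\) in (4.5) unchanged. The sketch below is a minimal check, in which the function names, the sample values and the sign convention used to raise the spinor index are assumptions made here; the result does not depend on that sign.

```python
def mu_of(x, lam):
    """Incidence relation (4.5): mu^{b} = sum_a x[a][b] * lam[a], with x a 2x2 complex matrix."""
    return tuple(sum(x[a][b] * lam[a] for a in range(2)) for b in range(2))


def raise_index(lam):
    """lam^a = eps^{ab} lam_b with eps^{01} = 1 (an assumed sign convention)."""
    return (lam[1], -lam[0])


def shift_along_alpha_plane(x, lam, v):
    """Move x by the null vector lam^a v^{b}, which is tangent to the alpha-plane."""
    up = raise_index(lam)
    return [[x[a][b] + up[a] * v[b] for b in range(2)] for a in range(2)]


x = [[1 + 1j, 2.0], [0.5j, -1.0]]
lam = (1.0, 2 - 1j)
v = (3j, 0.25)

mu_before = mu_of(x, lam)
mu_after = mu_of(shift_along_alpha_plane(x, lam, v), lam)
# The shift contributes lam^a lam_a v^{b} = 0, so mu is unchanged.
print(all(abs(p - q) < 1e-12 for p, q in zip(mu_before, mu_after)))

# Rescaling lam rescales mu by the same factor: the same point of projective twistor space.
r = 2 - 3j
mu_scaled = mu_of(x, (r * lam[0], r * lam[1]))
print(all(abs(p - r * q) < 1e-12 for p, q in zip(mu_scaled, mu_before)))
```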

There are many interesting results which follow from this basic non-local geometric correspondence between \(\mathbb{P}\mathbb{T}\) and \({\mathbb {M}}\). For our purposes, there are two classic results which will prove most important. The first of these is the Penrose transform, which gives an equivalence between solutions of the massless free field (or zero-rest-mass) equations on \({\mathbb {M}}\) of any integer or half-integer spin, and cohomology classes on \(\mathbb{P}\mathbb{T}\) [98,99,100]. More precisely, this takes the form of an isomorphism:

$$\begin{aligned} \left\{ \text{ massless } \text{ free } \text{ fields } \text{ on } {\mathbb {M}} \text{ of } \text{ helicity } h\right\} \cong H^{0,1}(\mathbb{P}\mathbb{T},{\mathcal {O}}(2h-2)), \end{aligned}$$
(4.6)

where it is assumed that the set of massless free fields comes with some suitable regularity conditions and \(H^{0,1}(\mathbb{P}\mathbb{T},{\mathcal {O}}(2h-2))\) denotes the Dolbeault cohomology group (see Footnote 8) of (0, 1)-forms on \(\mathbb{P}\mathbb{T}\) valued in \({\mathcal {O}}(2h-2)\), the sheaf of holomorphic homogeneous functions of weight \(2h-2\).
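As a quick orientation, (4.6) can be evaluated for the lowest helicities. The entries below follow simply by substituting \(h\) into \(2h-2\) and are included only as an illustration:

$$\begin{aligned} h=0:\ {\mathcal {O}}(-2), \qquad h=+1:\ {\mathcal {O}}(0), \qquad h=-1:\ {\mathcal {O}}(-4), \qquad h=\pm s:\ {\mathcal {O}}(\pm 2s-2). \end{aligned}$$

In particular, a positive helicity field of spin \(s\) is represented by a (0, 1)-form of weight \(2s-2\), and a negative helicity field of spin \(s\) by one of weight \(-2s-2\).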

The second major result of twistor theory that is crucial for us is the Ward correspondence, which gives a one-to-one correspondence between solutions of the SD Yang–Mills equations on \({\mathbb {M}}\) and certain holomorphic vector bundles over \(\mathbb{P}\mathbb{T}\) [53]. Our first result is to generalize this correspondence to the SD sector of HS-YM theory.

4.2 Twistor construction of self-dual HS-YM

As one might expect, there is a higher-spin version of the Ward correspondence [53] for the SD sector of HS-YM:

Theorem 1

There is a one-to-one correspondence between:

  • self-dual HS-YM connections with gauge group GL\((N,{\mathbb {C}})\), and

  • holomorphic bundles \(V=E\otimes J^{\infty }_{{\mathbb {P}}^1}\rightarrow \mathbb{P}\mathbb{T}\), where E is a rank N bundle which is topologically trivial on restriction to any line in \(\mathbb{P}\mathbb{T}\) and \(J^{\infty }_{{\mathbb {P}}^1}\) is identified with the infinite jet bundle of the bundle of horizontal vectors of the fibration \(\pi :\mathbb{P}\mathbb{T}\rightarrow {\mathbb {P}}^1\).

Proof

First, suppose that we are given a self-dual, GL\((N,{\mathbb {C}})\) HS-YM field on \({\mathbb {M}}\); this is characterized by the equations (4.1) for each \(s\ge 1\). Let \(\alpha _Z\) denote the \(\alpha \)-plane in \({\mathbb {M}}\) corresponding to some point \(Z\in \mathbb{P}\mathbb{T}\); tangent vectors of \(\alpha _Z\) have the form \(\lambda ^{\alpha }v^{{{\dot{\alpha }}}}\) for fixed \(\lambda ^{\alpha }\). Due to the chirality of HS-YM, only the ASD part of the field strength has a gauge-invariant definition, and this is set to zero by the SD equations. However, on restriction to \(\alpha _Z\), it follows that any SD HS-YM connection obeys

$$\begin{aligned}{}[\![{\textbf{D}}_{\alpha {{\dot{\alpha }}}},\,{\textbf{D}}_{\beta {{\dot{\beta }}}}]\!]\Big |_{\alpha _Z}=\lambda ^{\alpha }\,\lambda ^{\beta }\,v^{{{\dot{\alpha }}}}\,w^{{{\dot{\beta }}}}\,\frac{\epsilon _{\alpha \beta }}{2}\,[\![{\textbf{D}}^{\gamma }{}_{{{\dot{\alpha }}}},\,{\textbf{D}}_{\gamma {{\dot{\beta }}}}]\!]=0. \end{aligned}$$
(4.7)

In other words, on restriction to an \(\alpha \)-plane, the SD HS-YM connection is totally flat—there is no ambiguity in defining the SD part of the field strength because its restriction to the \(\alpha \)-plane vanishes. Thus, the set of covariantly constant sections valued in the fundamental representation of GL\((N,{\mathbb {C}})\) is a set of constant functions.

Next, define the vector space

$$\begin{aligned} V|_{Z}=\left\{ {\mathfrak {s}}(x|\lambda ) \text{ valued } \text{ in } {\mathbb {C}}^N\otimes J^{\infty }_{{\mathbb {P}}^1}\,|\,{\textbf{D}}_{\alpha {{\dot{\alpha }}}}{\mathfrak {s}}=0\right\} \cong {\mathbb {C}}^N\otimes J^{\infty }_{{\mathbb {P}}^1}, \end{aligned}$$
(4.8)

and make the identification between the auxiliary projective spinor \(\lambda _{\alpha }\) on \({\mathbb {M}}\) and the coordinate on the base of the twistor fibration \(\pi :\mathbb{P}\mathbb{T}\rightarrow {\mathbb {P}}^1\). This provides a holomorphic construction of the fibers of a vector bundle \(V=E\otimes J^{\infty }_{{\mathbb {P}}^1}\rightarrow \mathbb{P}\mathbb{T}\) of appropriate rank, and by construction this bundle will be topologically trivial upon restriction to any twistor line X.

For the converse, given the vector bundle \(V\rightarrow \mathbb{P}\mathbb{T}\), the condition of holomorphicity is equivalent to the bundle being endowed with a partial connection

$$\begin{aligned} \bar{D}: \Omega ^0(\mathbb{P}\mathbb{T}, V)\rightarrow \Omega ^{0,1}(\mathbb{P}\mathbb{T}, V), \qquad \bar{D}^2=0. \end{aligned}$$
(4.9)

Locally, this partial connection can be written in terms of a potential \(\textsf{a}\in \Omega ^{0,1}(\mathbb{P}\mathbb{T},\textrm{End}\,V)\), with

$$\begin{aligned} \bar{D}={\bar{\partial }}+\textsf{a}, \qquad \textsf{a}=\sum _{s=1}^{\infty }a^{(s)}\,\partial _0^{s-1}, \qquad a^{(s)}\in \Omega ^{0,1}(\mathbb{P}\mathbb{T},\textrm{End}\,E\otimes {\mathcal {O}}(2s-2)), \nonumber \\ \end{aligned}$$
(4.10)

subject to

$$\begin{aligned} F^{(0,2)}[{\textsf{a}}]={\bar{\partial }}\textsf{a}+[\![\textsf{a},\,\textsf{a}]\!]=0. \end{aligned}$$
(4.11)

This is the condition that the partial connection on V is holomorphic. Locally, the action on sections of V is given by

$$\begin{aligned} \bar{D}\phi ={\bar{\partial }}\phi +[\![\textsf{a},\,\phi ]\!], \end{aligned}$$
(4.12)

for all \(\phi \in \Omega ^0(\mathbb{P}\mathbb{T},V)\).

The assumption of topological triviality on any twistor line X, combined with a “sufficient smallness” assumption on the data \(\textsf{a}\), implies that \(V|_{X}\) can also be holomorphically trivialized (cf., [102, 103]). This implies the existence of a holomorphic frame

$$\begin{aligned} H(x,\lambda ,\bar{\lambda }):V|_{X}\rightarrow {\mathbb {C}}^{N}\otimes J^{\infty }_{{\mathbb {P}}^1}, \qquad \bar{D}|_{X} H=0. \end{aligned}$$
(4.13)

Any such holomorphic frame is unique only up to transformations of the form

$$\begin{aligned} H\rightarrow H\,{\textbf{g}}(x,\lambda ), \qquad {\textbf{g}}(x,\lambda )=\sum _{s=1}^{\infty }g_{\alpha (2s-2)}(x)\,\lambda ^{\alpha (2s-2)}\,\partial _0^{s-1}, \end{aligned}$$
(4.14)

where the coefficient functions \(g_{\alpha (2s-2)}(x)\) are valued in the Lie algebra \(\mathfrak {gl}_N\).

In terms of the parametrization (4.10), the condition on the holomorphic frame reads

$$\begin{aligned} {\bar{\partial }}|_{X}H+\textsf{a}|_{X}\,H=0. \end{aligned}$$
(4.15)

Since \(\textsf{a}\) is defined on twistor space, the incidence relations (4.5) ensure that \(\lambda ^{\alpha }\partial _{\alpha {{\dot{\alpha }}}}\textsf{a}|_X=0\). Furthermore, since \(\lambda ^{\alpha }\partial _{\alpha {{\dot{\alpha }}}}\) is a holomorphic vector field, it follows that

$$\begin{aligned} {\bar{\partial }}|_{X}\left( H^{-1}\,\lambda ^{\alpha }\partial _{\alpha {{\dot{\alpha }}}}\,H\right) =0. \end{aligned}$$
(4.16)

Thus, \(H^{-1}\,\lambda ^{\alpha }\partial _{\alpha {{\dot{\alpha }}}}\,H\) is a holomorphic section of \({\mathcal {O}}(1)\otimes \mathfrak {gl}_N\otimes J^{\infty }_{{\mathbb {P}}^1}\). By a straightforward extension of Liouville’s theorem to weighted, bundle-valued functions, it follows that

$$\begin{aligned} H^{-1}\,\lambda ^{\alpha }\partial _{\alpha {{\dot{\alpha }}}}\,H=\lambda ^{\alpha }\,{\textbf{A}}_{\alpha {{\dot{\alpha }}}}(x|\lambda ), \end{aligned}$$
(4.17)

where

$$\begin{aligned} {\textbf{A}}_{\alpha {{\dot{\alpha }}}}(x|\lambda ):=\sum _{s=1}^{\infty } A_{{{\dot{\alpha }}}(\alpha \beta (2s-2))}(x)\,\lambda ^{\beta (2s-2)}\,\partial _0^{s-1}. \end{aligned}$$
(4.18)
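The step from (4.16) to (4.17)–(4.18) is essentially a weight count, which can be sketched as follows. It uses only the standard fact that a global holomorphic section of \({\mathcal {O}}(n)\rightarrow {\mathbb {P}}^1\) with \(n\ge 0\) is a homogeneous polynomial of degree n in \(\lambda \). The coefficient of \(\partial _0^{s-1}\) in \(H^{-1}\,\lambda ^{\alpha }\partial _{\alpha {{\dot{\alpha }}}}\,H\) has weight \(1+(2s-2)=2s-1\), so

$$\begin{aligned} \left. H^{-1}\,\lambda ^{\alpha }\partial _{\alpha {{\dot{\alpha }}}}\,H\right| _{\partial _0^{s-1}}=A_{{{\dot{\alpha }}}\,\alpha _1\cdots \alpha _{2s-1}}(x)\,\lambda ^{\alpha _1}\cdots \lambda ^{\alpha _{2s-1}}, \qquad \dim H^0({\mathbb {P}}^1,{\mathcal {O}}(2s-1))=2s. \end{aligned}$$

The 2s independent coefficients are exactly the components of a totally symmetric undotted spinor of rank \(2s-1\); for \(s=1\) this is the familiar \(\lambda ^{\alpha }A_{\alpha {{\dot{\alpha }}}}(x)\) of the Ward correspondence.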

Under a change of holomorphic frame (4.14), it is easy to see that \({\textbf{D}}_{\alpha {{\dot{\alpha }}}}=\partial _{\alpha {{\dot{\alpha }}}}+{\textbf{A}}_{\alpha {{\dot{\alpha }}}}\) transforms as (2.8) (Footnote 9). Thus, we recover the field content of HS-YM in terms of the usual higher-spin gauge connection.

The self-duality condition arises as a consequence of the integrability of the partial connection: \(\bar{D}^2=0\) imposes a constraint on the Lax pair \(\lambda ^{\alpha }{\textbf{D}}_{\alpha {{\dot{\alpha }}}}\), which is simply

$$\begin{aligned}{}[\![\lambda ^{\alpha }{\textbf{D}}_{\alpha {{\dot{\alpha }}}},\,\lambda ^{\beta }{\textbf{D}}_{\beta {{\dot{\beta }}}}]\!]=0 \quad \Leftrightarrow \quad F_{\alpha (2s)}=0 \,\,\, \forall s\ge 1. \end{aligned}$$
(4.19)

Thus, we obtain the SD HS-YM equations from the holomorphic bundle \(V\rightarrow \mathbb{P}\mathbb{T}\), as desired. \(\square \)
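To make the equivalence in (4.19) more explicit, here is a sketch that assumes the standard spinor decomposition of the curvature into its SD and ASD parts. The symbol \(\tilde{{\textbf{F}}}_{{{\dot{\alpha }}}{{\dot{\beta }}}}\) and all normalizations are assumptions, since the conventions of Sect. 2 are not reproduced here:

$$\begin{aligned} [\![{\textbf{D}}_{\alpha {{\dot{\alpha }}}},\,{\textbf{D}}_{\beta {{\dot{\beta }}}}]\!]&=\epsilon _{\alpha \beta }\,\tilde{{\textbf{F}}}_{{{\dot{\alpha }}}{{\dot{\beta }}}}+\epsilon _{{{\dot{\alpha }}}{{\dot{\beta }}}}\,{\textbf{F}}_{\alpha \beta }, \\ \lambda ^{\alpha }\lambda ^{\beta }\,[\![{\textbf{D}}_{\alpha {{\dot{\alpha }}}},\,{\textbf{D}}_{\beta {{\dot{\beta }}}}]\!]&=\epsilon _{{{\dot{\alpha }}}{{\dot{\beta }}}}\,\sum _{s=1}^{\infty }F_{\alpha (2s)}(x)\,\lambda ^{\alpha (2s)}\,\partial _0^{s-1}. \end{aligned}$$

The first term drops out because \(\lambda ^{\alpha }\lambda ^{\beta }\epsilon _{\alpha \beta }=0\). Requiring the remainder to vanish for every \(\lambda \) and at every order in \(\partial _0\) forces each totally symmetric coefficient \(F_{\alpha (2s)}\) to vanish separately.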

It is easy to adapt this theorem to other gauge groups, following the usual prescriptions (cf., [95]). For instance, to get gauge group SU(N), one must supplement the conditions on V with the requirement that it admit a positive real form and have trivial determinant line bundle. The theorem also descends to the real slices \({\mathbb {R}}^4\) and \({\mathbb {R}}^{2,2}\) (but not to real-valued fields on \({\mathbb {R}}^{1,3}\) due to the chirality of the theory and the SD sector). For Euclidean reality conditions, one requires the real form to be preserved under the anti-holomorphic involution which acts as the antipodal map on twistor lines, whereas for split signature the real form must descend to the \(\mathbb{R}\mathbb{P}^3\) real slice of twistor space.

This theorem also implies that:

Corollary 4.1

The SD sector of HS-YM theory is classically integrable.

Proof

This follows straightforwardly from the proof of Theorem 1, which equates the SD equations (4.1) with the integrability condition for an elliptic operator \(\bar{D}^2=0\) on twistor space. Equivalently, there is an integrable Lax pair associated with the SD sector, given by \(\lambda ^{\alpha }{\textbf{D}}_{\alpha {{\dot{\alpha }}}}\). \(\square \)

4.3 Action functional for the self-dual sector

The self-dual sector of pure Yang–Mills theory has many descriptions in terms of action functionals in four dimensions which translate the classical integrability into a constraint on some auxiliary degrees of freedom. These formulations include the Chalmers–Siegel action (written in terms of an adjoint-valued scalar) [54] and a four-dimensional Wess–Zumino–Witten (WZW) model [104]. Using the Ward correspondence, it turns out that these and many other spacetime actions for SD Yang–Mills can be derived by performing dimensional reductions from holomorphic Chern–Simons theories on twistor space [46, 105, 106]; these theories require certain choices of boundary conditions to be well defined. Different gauge choices in twistor space induce different spacetime descriptions.

In light of Theorem 1, it is natural to ask if similar constructions hold for self-dual HS-YM. As self-duality is equated with integrability of the partial connection \(\bar{D}\) on a bundle \(E\otimes J^{\infty }_{{\mathbb {P}}^1}\rightarrow \mathbb{P}\mathbb{T}\), the natural starting point is an action functional on \(\mathbb{P}\mathbb{T}\) whose only equation of motion is (4.11): \(F^{(0,2)}[\textsf{a}]=0\). These are precisely the equations of motion of a holomorphic Chern–Simons theory for the partial connection \(\bar{D}\) [107, 108]. In general, these theories are only well defined on Calabi–Yau manifolds, where there is a global section of the canonical bundle to wedge against the holomorphic Chern–Simons form; since \(\mathbb{P}\mathbb{T}\) is not Calabi–Yau, making sense of the theory requires choosing some boundary conditions. Here, we only make one such choice; there are many others which would be interesting to investigate further.

Let us restrict our attention to Euclidean reality conditions, for which \(\mathbb{P}\mathbb{T}\cong {\mathbb {R}}^4\times {\mathbb {P}}^1\) and the incidence relations can be inverted

$$\begin{aligned} x^{\alpha {{\dot{\alpha }}}}=\frac{\hat{\mu }^{{{\dot{\alpha }}}}\,\lambda ^{\alpha }-\mu ^{{{\dot{\alpha }}}}\,\hat{\lambda }^{\alpha }}{\langle \lambda \,\hat{\lambda }\rangle }, \end{aligned}$$
(4.20)

where \(\hat{\lambda }^{\alpha }=(\bar{\lambda }^1,\,-\bar{\lambda }^0)\) and \(\hat{\mu }^{{{\dot{\alpha }}}}=(\bar{\mu }^{\dot{1}},\,-\bar{\mu }^{\dot{0}})\). With these reality conditions, useful bases for the holomorphic and anti-holomorphic tangent and cotangent bundles of twistor space are given by [101]:

$$\begin{aligned} \partial _0=\frac{\hat{\lambda }_{\alpha }}{\langle \lambda \,\hat{\lambda }\rangle }\,\frac{\partial }{\partial \lambda _{\alpha }}, \quad \partial _{{{\dot{\alpha }}}}=-\frac{\hat{\lambda }^{\alpha }\,\partial _{\alpha {{\dot{\alpha }}}}}{\langle \lambda \,\hat{\lambda }\rangle }, \quad e^{0}=\langle \lambda \,\textrm{d}\lambda \rangle , \quad e^{{{\dot{\alpha }}}}=\lambda _{\alpha }\,\textrm{d}x^{\alpha {{\dot{\alpha }}}}, \end{aligned}$$
(4.21)

and

$$\begin{aligned} {\bar{\partial }}_0=-\langle \lambda \,\hat{\lambda }\rangle \,\lambda _{\alpha }\frac{\partial }{\partial \hat{\lambda }_{\alpha }}, \quad {\bar{\partial }}_{{{\dot{\alpha }}}}=\lambda ^{\alpha }\partial _{\alpha {{\dot{\alpha }}}}, \quad \bar{e}^0=\frac{\langle \hat{\lambda }\,\textrm{d}\hat{\lambda }\rangle }{\langle \lambda \,\hat{\lambda }\rangle ^2}, \quad \bar{e}^{{{\dot{\alpha }}}}=\frac{\hat{\lambda }_{\alpha }\,\textrm{d}x^{\alpha {{\dot{\alpha }}}}}{\langle \lambda \,\hat{\lambda }\rangle }, \end{aligned}$$
(4.22)

respectively. With this in mind, we define a holomorphic Chern–Simons form

$$\begin{aligned} \textrm{hCS}[\textsf{a}]:= & {} \textrm{tr}\left( \textsf{a}\wedge {\bar{\partial }}\textsf{a}+\frac{2}{3}\,\textsf{a}\wedge [\![\textsf{a}\wedge \textsf{a}]\!]\right) \nonumber \\= & {} \sum _{s=1}^{\infty }\textrm{tr}\left( a^{(s)}\wedge {\bar{\partial }}a^{(s)}+\frac{2}{3}\,a^{(s)}\wedge \sum _{r+t=s+1}a^{(r)}\wedge a^{(t)}\right) \partial _0^{2s-2}, \end{aligned}$$
(4.23)

which takes values in \(\Omega ^{0,3}(\mathbb{P}\mathbb{T},(J^{\infty }_{{\mathbb {P}}^1})^2)\), where \((J^{\infty }_{{\mathbb {P}}^1})^2\) denotes the infinite jet bundle whose sections are composed of only even powers of \(\partial _0\).

To form a holomorphic Chern–Simons action, we must wedge this against a section \(\Omega \) of \(\Omega ^{3,0}(\mathbb{P}\mathbb{T},(J^{\infty \,\vee }_{{\mathbb {P}}^1})^2)\), where \(J^{\infty \,\vee }_{{\mathbb {P}}^1}\) is the dual of the infinite jet bundle, generated by \(e^0\) in (4.21). The pairing by inner product then eliminates all the generators of the infinite jet bundle and its dual, so that \(\Omega \wedge \textrm{hCS}[\textsf{a}]\) is a (3, 3)-form which makes sense to integrate over \(\mathbb{P}\mathbb{T}\).

However, since the canonical bundle of \(\mathbb{P}\mathbb{T}\) is \({\mathcal {O}}(-4)\) as a line bundle, some poles must be introduced to render \(\Omega \) weightless. There are many possible choices (cf., [105, 106]), but here we consider:

$$\begin{aligned} \Omega :=\frac{\textrm{D}^{3}Z}{\langle a\,\lambda \rangle ^{4}}\,\sum _{s=1}^{\infty }\left( \frac{e^0}{\langle a\,\lambda \rangle ^2}\right) ^{2s-2}, \end{aligned}$$
(4.24)

for \(\textrm{D}^{3}Z:=\epsilon _{ABCD}Z^{A}\textrm{d}Z^{B}\textrm{d}Z^{C}\textrm{d}Z^{D}\) the weight \(+4\) top holomorphic form on \(\mathbb{P}\mathbb{T}\). In other words, \(\Omega \) is defined by having poles (starting at fourth order) at \(A^A=(0,a_{\alpha })\in \mathbb{P}\mathbb{T}\). With this choice, the holomorphic Chern–Simons action

$$\begin{aligned} S[\textsf{a}]=\frac{1}{2\pi \textrm{i}}\int _{\mathbb{P}\mathbb{T}}\Omega \wedge \textrm{hCS}[\textsf{a}], \end{aligned}$$
(4.25)

is well defined.

Naïvely, the field equations of this action are precisely \(F^{(0,2)}[\textsf{a}]=0\), as desired. However, the poles appearing in \(\Omega \) mean that in order to have a well-defined variational problem associated with this action, the twistor gauge potential \(\textsf{a}(Z)\) must have zeros of appropriate order in each term of its infinite jet bundle expansion. In particular, we must have that

$$\begin{aligned} \textsf{a}=\sum _{s=1}^{\infty }a^{(s)}\partial _0^{s-1}=\sum _{s=1}^{\infty }\langle a\,\lambda \rangle ^{2s}\,\varphi ^{(s)}(Z)\,\partial _0^{s-1}, \end{aligned}$$
(4.26)

where

$$\begin{aligned} \varphi ^{(s)}\in \Omega ^{0,1}(\mathbb{P}\mathbb{T},{\mathcal {O}}(-2)\otimes {\mathfrak {g}}), \qquad \forall \,s\ge 1, \end{aligned}$$
(4.27)

which can be thought of as a boundary condition on \(\textsf{a}\) at the point \(A^A=(0,a_{\alpha })\in \mathbb{P}\mathbb{T}\). Likewise, infinitesimal gauge transformations of the form \({\textsf{a}}\rightarrow {\textsf{a}}+\bar{\partial }\xi +[\![{\textsf{a}},\xi ]\!]\) must obey

$$\begin{aligned} \xi =\sum _{s=1}^{\infty }\langle a\,\lambda \rangle ^{2s}\,\psi ^{(s)}(Z)\,\partial _0^{s-1}\,,\qquad \psi ^{(s)}\in \Omega ^{0}(\mathbb{P}\mathbb{T},{\mathcal {O}}(-2)\otimes {\mathfrak {g}})\,, \end{aligned}$$
(4.28)

to preserve this boundary condition.
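The matching between the poles of \(\Omega \) in (4.24) and the zeros required in (4.26) can be checked level by level. The following is only a sketch: the pairing of \((e^0)^{2s-2}\) with \(\partial _0^{2s-2}\) is taken with unit normalization, which is an assumption. The level-s term of \(\Omega \) carries a pole of order \(4+2(2s-2)=4s\), while \(a^{(s)}\wedge {\bar{\partial }}a^{(s)}=\langle a\,\lambda \rangle ^{4s}\,\varphi ^{(s)}\wedge {\bar{\partial }}\varphi ^{(s)}\) because \(\langle a\,\lambda \rangle \) is holomorphic, so

$$\begin{aligned} \frac{\textrm{D}^{3}Z}{\langle a\,\lambda \rangle ^{4s}}\,(e^0)^{2s-2}\wedge \textrm{tr}\left( a^{(s)}\wedge {\bar{\partial }}a^{(s)}\right) \partial _0^{2s-2}\;\longrightarrow \;\textrm{D}^{3}Z\wedge \textrm{tr}\left( \varphi ^{(s)}\wedge {\bar{\partial }}\varphi ^{(s)}\right) . \end{aligned}$$

The right-hand side is free of poles and has total weight \(4-2-2=0\), as required of an integrand on \(\mathbb{P}\mathbb{T}\). The cubic term at level s carries \(\langle a\,\lambda \rangle ^{2s+2r+2t}=\langle a\,\lambda \rangle ^{4s+2}\) for \(r+t=s+1\), so it retains a factor \(\langle a\,\lambda \rangle ^{2}\) after the pole is cancelled, again with total weight \(4+2-6=0\).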

Note that we can use this gauge freedom, and the generic existence of a holomorphic trivialization of the bundle \(V\rightarrow \mathbb{P}\mathbb{T}\), to make the partial connection restricted to any holomorphic curve X pure gauge:

$$\begin{aligned} \textsf{a}|_{X}=\textsf{a}_0\,\bar{e}^0=\hat{\sigma }^{-1}{\bar{\partial }}|_{X}{\hat{\sigma }}, \end{aligned}$$
(4.29)

where \(\hat{\sigma }:\mathbb{P}\mathbb{T}\rightarrow G\otimes J^{\infty }_{{\mathbb {P}}^1}\) for gauge group G, and \(\hat{\sigma }^{-1}\) is understood to be an inverse only with respect to the gauge group factor. Now, in the expansion (4.26), we can require each \(\varphi ^{(s)}\) to be harmonic upon restriction to any twistor line, which implies that [101]

$$\begin{aligned} \varphi ^{(s)}|_{X}=\bar{e}^{0}\,\phi ^{(s)}(x), \end{aligned}$$
(4.30)

for \(\{\phi ^{(s)}(x)\}\) an infinite tower of adjoint-valued functions on \({\mathbb {R}}^4\), one for each spin in the spectrum of HS-YM.

This allows us to solve for the gauge transformation \(\hat{\sigma }\) explicitly, taking

$$\begin{aligned} \hat{\sigma }=\exp \left[ -\sum _{s=1}^{\infty }\frac{\langle a\,\lambda \rangle ^{2s-1}\,\langle a\,\hat{\lambda }\rangle }{\langle \lambda \,\hat{\lambda }\rangle }\,\phi ^{(s)}\,\partial _0^{s-1}\right] , \end{aligned}$$
(4.31)

where the residual gauge freedom is fixed by the boundary condition \(\hat{\sigma }(x,a)=\textrm{id}_{G}\). The partial connection in this gauge can then be written as

$$\begin{aligned} \bar{D}={\bar{\partial }}+\textsf{a}=\hat{\sigma }^{-1}\left( {\bar{\partial }}+\textsf{a}'_{{{\dot{\alpha }}}}\,\bar{e}^{{{\dot{\alpha }}}}\right) \hat{\sigma }, \end{aligned}$$
(4.32)

where the components of the field equation \(F^{(0,2)}[\textsf{a}]=0\) along the \({\mathbb {P}}^1\) fibers of the bundle \(\mathbb{P}\mathbb{T}\rightarrow {\mathbb {R}}^{4}\) dictate that

$$\begin{aligned} \textsf{a}'_{{{\dot{\alpha }}}}=\lambda ^{\alpha }\,\sum _{s=1}^{\infty }A_{\beta (2s-2)\alpha {\dot{\alpha }}}(x)\,\lambda ^{\beta (2s-2)}\,\partial _0^{s-1}, \end{aligned}$$
(4.33)

for the set \(\{A_{\alpha (2s-1){{\dot{\alpha }}}}\}\) of adjoint-valued HS-YM gauge potentials on \({\mathbb {R}}^4\).
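As a consistency check on (4.31), one can verify at leading order in \(\phi ^{(s)}\) that it reproduces the fibre component \(\textsf{a}_0=\sum _{s}\langle a\,\lambda \rangle ^{2s}\,\phi ^{(s)}\,\partial _0^{s-1}\) implied by (4.26) and (4.30). The sketch below assumes the bracket convention \(\langle a\,b\rangle =a^{\alpha }b_{\alpha }\), so that \(\lambda _{\alpha }\,\partial \langle a\,\hat{\lambda }\rangle /\partial \hat{\lambda }_{\alpha }=\langle a\,\lambda \rangle \) and \(\lambda _{\alpha }\,\partial \langle \lambda \,\hat{\lambda }\rangle /\partial \hat{\lambda }_{\alpha }=\langle \lambda \,\lambda \rangle =0\). Using \({\bar{\partial }}_0\) from (4.22):

$$\begin{aligned} {\bar{\partial }}_0\left( \frac{\langle a\,\hat{\lambda }\rangle }{\langle \lambda \,\hat{\lambda }\rangle }\right) =-\langle a\,\lambda \rangle \quad \Rightarrow \quad {\bar{\partial }}_0\left( -\frac{\langle a\,\lambda \rangle ^{2s-1}\,\langle a\,\hat{\lambda }\rangle }{\langle \lambda \,\hat{\lambda }\rangle }\,\phi ^{(s)}\right) =\langle a\,\lambda \rangle ^{2s}\,\phi ^{(s)}. \end{aligned}$$

The exponent of \(\hat{\sigma }\) also vanishes at \(\lambda =a\), since every term carries at least one factor of \(\langle a\,\lambda \rangle \), which is consistent with the boundary condition \(\hat{\sigma }(x,a)=\textrm{id}_{G}\).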

Now, from (4.32) it follows that

$$\begin{aligned} \textsf{a}_{{{\dot{\alpha }}}}=\hat{\sigma }^{-1}\left( {\bar{\partial }}_{{{\dot{\alpha }}}}+\textsf{a}'_{{{\dot{\alpha }}}}\right) \hat{\sigma }, \end{aligned}$$
(4.34)

and a straightforward calculation using (4.31) and (4.22) leads to

$$\begin{aligned}{} & {} \textsf{a}_{{{\dot{\alpha }}}}=\hat{\sigma }^{-1}\left[ \langle \lambda \,a\rangle \,\sum _{s=1}^{\infty }\left( \langle a\,\lambda \rangle ^{2s-2}\,a^{\alpha }\partial _{\alpha {{\dot{\alpha }}}}\phi ^{(s)}-\hat{a}^{\alpha }\,A_{\alpha \beta (2s-2){{\dot{\alpha }}}}\,\lambda ^{\beta (2s-2)}\right) \,\partial _0^{s-1}\right. \nonumber \\{} & {} \quad \left. +\langle \lambda \,\hat{a}\rangle \,a^{\alpha }\,\sum _{s=1}^{\infty }A_{\alpha \beta (2s-2){{\dot{\alpha }}}}\,\lambda ^{\beta (2s-2)}\,\partial _0^{s-1}\right] \hat{\sigma } + O\big (\langle a\,\lambda \rangle ^{2s}\,\partial _0^{s-1}\big ), \end{aligned}$$
(4.35)

where “\(O(\langle a\,\lambda \rangle ^{2s}\,\partial _0^{s-1})\)” denotes terms which obey the boundary conditions (4.26). Removing the terms in (4.35) which violate the boundary conditions imposes

$$\begin{aligned} a^{\alpha }\,A_{\alpha \beta (2s-2){{\dot{\alpha }}}}=0, \qquad \hat{a}^{\alpha }\,A_{\alpha \beta (2s-2){{\dot{\alpha }}}}\,\lambda ^{\beta (2s-2)}=\langle a\,\lambda \rangle ^{2s-2}\,a^{\alpha }\,\partial _{\alpha {{\dot{\alpha }}}}\phi ^{(s)}, \end{aligned}$$
(4.36)

for each \(s\ge 1\). The solution to these constraints uniquely determines each HS-YM gauge potential:

$$\begin{aligned} A_{\alpha (2s-1){{\dot{\alpha }}}}(x)= a_{\alpha (2s-1)}\,a^{\beta }\,\partial _{\beta {{\dot{\alpha }}}}\phi ^{(s)}(x), \end{aligned}$$
(4.37)

in terms of the adjoint-valued scalar function \(\phi ^{(s)}\) at each spin.
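It is straightforward to confirm that (4.37) solves the constraints (4.36). The check below writes \(a_{\alpha (2s-1)}=a_{\alpha _1}\cdots a_{\alpha _{2s-1}}\) and assumes the normalization \(\hat{a}^{\alpha }a_{\alpha }=1\), which is not stated in this section:

$$\begin{aligned} a^{\alpha }\,A_{\alpha \beta (2s-2){{\dot{\alpha }}}}&=(a^{\alpha }a_{\alpha })\,a_{\beta (2s-2)}\,a^{\gamma }\partial _{\gamma {{\dot{\alpha }}}}\phi ^{(s)}=0, \\ \hat{a}^{\alpha }\,A_{\alpha \beta (2s-2){{\dot{\alpha }}}}\,\lambda ^{\beta (2s-2)}&=(\hat{a}^{\alpha }a_{\alpha })\,(a_{\beta }\lambda ^{\beta })^{2s-2}\,a^{\gamma }\partial _{\gamma {{\dot{\alpha }}}}\phi ^{(s)}=\langle a\,\lambda \rangle ^{2s-2}\,a^{\gamma }\partial _{\gamma {{\dot{\alpha }}}}\phi ^{(s)}. \end{aligned}$$

The first line uses \(a^{\alpha }a_{\alpha }=0\). In the second, the even power makes the result insensitive to the sign convention relating \(a_{\beta }\lambda ^{\beta }\) to \(\langle a\,\lambda \rangle \). For \(s=1\) the potential is \(A_{\alpha {{\dot{\alpha }}}}=a_{\alpha }\,a^{\beta }\partial _{\beta {{\dot{\alpha }}}}\phi ^{(1)}\).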

It is now possible to feed these expressions back into the holomorphic Chern–Simons action (4.25) and integrate along the \({\mathbb {P}}^1\) fibers of twistor space to obtain an action functional on \({\mathbb {R}}^4\). The details of this computation are exactly the same as in the pure Yang–Mills case, so we refer the interested reader to [106]; following the steps in that calculation leads to the action:

$$\begin{aligned} S[\phi ]= & {} \frac{1}{2}\,\sum _{s=1}^{\infty }\,\int _{{\mathbb {R}}^{4}}\textrm{tr}\left( \textrm{d}\phi ^{(s)}\wedge *\textrm{d}\phi ^{(s)}\right) \nonumber \\{} & {} \quad +\frac{1}{3}\sum _{s=1}^{\infty }\,\int _{{\mathbb {R}}^4}\mu _{a,a}\wedge \textrm{tr}\left( \phi ^{(s)}\,\sum _{r+t=s+1}\textrm{d}\phi ^{(r)}\wedge \textrm{d}\phi ^{(t)}\right) , \end{aligned}$$
(4.38)

where

$$\begin{aligned} \mu _{a,a}:=a_{\alpha }\,a_{\beta }\,\textrm{d}x^{\alpha {{\dot{\alpha }}}}\wedge \textrm{d}x^{\beta }{}_{{{\dot{\alpha }}}}. \end{aligned}$$
(4.39)

The equations of motion for each adjoint-valued scalar are

$$\begin{aligned} \Box \phi ^{(s)}=-2\,a^{\alpha }\,a^{\beta }\,\sum _{r+t=s+1}\left[ \partial _{\alpha }{}^{{{\dot{\alpha }}}}\phi ^{(r)},\,\partial _{\beta {{\dot{\alpha }}}}\phi ^{(t)}\right] , \end{aligned}$$
(4.40)

which correspond precisely to the requirement that the HS-YM fields (4.37) be self-dual.

This result can be summarized as follows:

Theorem 2

The SD sector of HS-YM on \({\mathbb {R}}^4\) is equivalent to an infinite tower of coupled, adjoint-valued scalars governed by the action (4.38). Furthermore, this theory is equivalent to a holomorphic Chern–Simons theory (4.25) on twistor space with volume form (4.24) and boundary conditions (4.26), in the sense that extrema of the two actions are in one-to-one correspondence (up to gauge transformations).

Observe that once again, the truncation to \(s=1\) is self-consistent, in which case the gauge potential (4.37) and action (4.38) reduce to the Chalmers–Siegel description of SD Yang–Mills in terms of an adjoint-valued scalar field [54].

We note that (4.40) are equivalent to the light-cone gauge description of SD HS-YM obtained in [37]. To see this, one simply fixes \(a^{\alpha }=(0,-1)\) and denotes (Footnote 10)

$$\begin{aligned} \partial ^{0\dot{0}}=\partial ^+\,,\quad \partial ^{1\dot{1}}=\partial ^-\,,\quad \partial ^{0\dot{1}}={\bar{\partial }}\,,\quad \partial ^{1\dot{0}}=-\partial \,, \end{aligned}$$
(4.41)

so that \(\Box =\partial ^+\partial ^-+\partial \bar{\partial }\). In this spin frame, (4.40) become

$$\begin{aligned} \Box \phi ^{(s)}=2\,\sum _{r+t=s+1}\Big ({\bar{\partial }} \phi ^{(r)}\partial ^+\phi ^{(t)}-\partial ^+\phi ^{(r)}\bar{\partial }\phi ^{(t)}\Big )\,, \end{aligned}$$
(4.42)

coinciding with the light-cone description of [37]. The presence of the transverse derivative \({\bar{\partial }}\) on the right-hand side of this equation is a hallmark of locality [109].
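To illustrate the structure of (4.42), the two lowest levels can be written out explicitly. Since \(r+t=s+1\) with \(r,t\ge 1\) forces \(r,t\le s\), the source for \(\phi ^{(s)}\) involves only fields of spin at most s:

$$\begin{aligned} \Box \phi ^{(1)}&=2\,\big [{\bar{\partial }}\phi ^{(1)},\,\partial ^+\phi ^{(1)}\big ], \\ \Box \phi ^{(2)}&=2\,\big [{\bar{\partial }}\phi ^{(1)},\,\partial ^+\phi ^{(2)}\big ]+2\,\big [{\bar{\partial }}\phi ^{(2)},\,\partial ^+\phi ^{(1)}\big ]. \end{aligned}$$

The first line involves \(\phi ^{(1)}\) alone, which is the self-consistent \(s=1\) truncation noted above. The second follows from summing the pairs \((r,t)=(1,2)\) and \((2,1)\) in (4.42) and regrouping the four terms into commutators.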

It is worth emphasizing that the action (4.38) describes the purely SD sector of HS-YM, and as such it contains no negative helicity degrees of freedom. As in the analogous description of pure SD Yang–Mills, this means that the price for obtaining such an action is broken Lorentz invariance (cf., [54, 110, 111]). This breaking of Lorentz invariance stems from the choice (4.24) of poles for the holomorphic measure on twistor space, \(\Omega \).

5 Twistor action for HS-YM

Having established the classical integrability of the SD sector of HS-YM, we now turn to describing full HS-YM using twistor theory. This is possible because HS-YM admits a perturbative expansion around the SD sector, as is evident when the theory is expressed in terms of a Lagrange multiplier field as in (2.19)–(2.20). It is fairly straightforward to construct the associated twistor action following the same recipe as for pure Yang–Mills [55, 56, 112]. With the HS-YM twistor action in hand, the tree-level MHV amplitudes are obtained by perturbatively expanding the portion of the action which encodes the non-SD interactions.

5.1 Twistor action functional

Let us now write down a twistorial description of full (classical) HS-YM by first recalling its spacetime action functional

$$\begin{aligned} S[{\textbf{A}},{\textbf{B}}]=\sum _{s=1}^{\infty }\int _{{\mathbb {M}}}\textrm{d}^{4}x\,\textrm{tr}\left( B_{\alpha (2s)}\,F^{\alpha (2s)}\right) +\frac{\textrm{g}^2}{4}\,\sum _{s=1}^{\infty }\int _{{\mathbb {M}}}\textrm{d}^{4}x\,\textrm{tr}\left( B_{\alpha (2s)}\,B^{\alpha (2s)}\right) . \end{aligned}$$
(5.1)

The first set of terms in this action describes the SD sector (and its negative helicity perturbations), while the second set describes the linear non-SD fluctuations around the SD HS-YM “background.” Using Theorem 1 and the Penrose transform, one infers that the first set corresponds to a holomorphic BF-action on twistor space:

$$\begin{aligned} S_{\textrm{SD}}[\textsf{a},\textsf{b}]=\frac{\textrm{i}}{2\pi }\int _{\mathbb{P}\mathbb{T}}\textrm{D}^{3}Z\wedge \textrm{tr}\left( \textsf{b}\wedge F^{(0,2)}[\textsf{a}]\right) , \end{aligned}$$
(5.2)

where \(\textsf{b}\in \Omega ^{0,1}(\mathbb{P}\mathbb{T},\textrm{End} E\otimes J^{\infty \,\vee }_{{\mathbb {P}}^1}\otimes {\mathcal {O}}(-4))\). Expanded in \(J^{\infty \,\vee }_{{\mathbb {P}}^1}\), the twistor field \(\textsf{b}\) is

$$\begin{aligned} \textsf{b}=\sum _{s=1}^{\infty }b^{(s)}\,(e^0)^{s-1}, \qquad b^{(s)}\in \Omega ^{0,1}(\mathbb{P}\mathbb{T},\textrm{End}E\otimes {\mathcal {O}}(-2s-2)), \end{aligned}$$
(5.3)

where we recall that \(e^0\) is “eaten” by inner product with \(\partial _0\) (i.e., \(\partial _0\lrcorner \,e^0=1\)) in (5.2), so that all projective scalings are respected. The resulting equations of motion for (5.2) on twistor space read

$$\begin{aligned} F^{(0,2)}[\textsf{a}]=0, \qquad \bar{D}\textsf{b}=0. \end{aligned}$$
(5.4)

The first of these corresponds to the SD HS-YM equations, by virtue of Theorem 1, while the second corresponds to the equations of motion of negative helicity HS-YM fields in a SD background.
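The weight assignments in (5.2)–(5.3) can be verified by a short count. The level-s component of \(F^{(0,2)}[\textsf{a}]\) has the same weight \(2s-2\) as \(a^{(s)}\) in (4.10), and \(e^0=\langle \lambda \,\textrm{d}\lambda \rangle \) has weight \(+2\), so

$$\begin{aligned} \underbrace{(+4)}_{\textrm{D}^{3}Z}+\underbrace{(-2s-2)}_{b^{(s)}}+\underbrace{(2s-2)}_{F^{(0,2)}|_{\partial _0^{s-1}}}=0, \qquad \underbrace{(-2s-2)}_{b^{(s)}}+\underbrace{(2s-2)}_{(e^0)^{s-1}}=-4, \end{aligned}$$

for every \(s\ge 1\). The first relation shows that each term of the integrand in (5.2) is weightless, and the second that \(\textsf{b}\) has total weight \(-4\), as stated. Assuming the standard Penrose transform dictionary, in which twistor data of weight \(2h-2\) describe helicity h, the fields \(a^{(s)}\) and \(b^{(s)}\) correspond to helicities \(+s\) and \(-s\), respectively.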

The latter follows from a non-abelian extension of the Penrose transform [100, 113]:

$$\begin{aligned} \left\{ {\textbf{B}}_{\alpha \beta }\in \Omega ^{2}_{-}\otimes {\mathfrak {g}}\otimes J^{\infty }_{{\mathbb {P}}^1} \text{ obeying } {\textbf{D}}^{\alpha {{\dot{\alpha }}}}{\textbf{B}}_{\alpha \beta }=0\right\} \cong H^{0,1}_{\bar{D}}(\mathbb{P}\mathbb{T},\textrm{End} V\otimes {\mathcal {O}}(-4)), \nonumber \\ \end{aligned}$$
(5.5)

where the HS-YM connection \({\textbf{D}}_{\alpha {{\dot{\alpha }}}}\) is assumed to be SD and \(H^{0,1}_{\bar{D}}\) is the Dolbeault cohomology group defined with respect to \(\bar{D}={\bar{\partial }}+\textsf{a}\) (which obeys \(\bar{D}^2=0\)). Given a cohomology class in this group, the spacetime master field is constructed by an integral formula

$$\begin{aligned} {\textbf{B}}_{\alpha \beta }(x|\lambda )=\int _{X}\textrm{D}\lambda '\wedge \lambda '_{\alpha }\,\lambda '_{\beta }\,H^{-1}(x,\lambda ')\,\textsf{b}|_{X}\,H(x,\lambda ')\,\sum _{s=1}^{\infty }\left( \langle \lambda \,\lambda '\rangle ^2\,\partial _0\,e_0'\right) ^{s-1}, \nonumber \\ \end{aligned}$$
(5.6)

where (Footnote 11) \(\textrm{D}\lambda '\equiv \langle \lambda '\,\textrm{d}\lambda '\rangle \), \(\lambda '\) is the homogeneous coordinate on \(X\cong {\mathbb {P}}^1\) (which is integrated over) and \(\lambda \) plays the role of the auxiliary parameter in the spacetime master field \({\textbf{B}}_{\alpha \beta }\). To see that this indeed solves the desired equation of motion, one uses the definition of the holomorphic frame, which implies \(\lambda '_{\alpha }{\textbf{D}}^{\alpha {{\dot{\alpha }}}}H(x,\lambda ')=0\), so that

$$\begin{aligned} {\textbf{D}}^{\alpha {{\dot{\alpha }}}}{\textbf{B}}_{\alpha \beta }(x|\lambda )= & {} \int _{X}\textrm{D}\lambda '\wedge \lambda '_{\alpha }\,\lambda '_{\beta }{\textbf{D}}^{\alpha {{\dot{\alpha }}}}\,H^{-1}(x,\lambda ')\,\textsf{b}|_{X}\,H(x,\lambda ')\,\sum _{s=1}^{\infty }\left( \langle \lambda \,\lambda '\rangle ^2\,\partial _0\,e_0'\right) ^{s-1} \nonumber \\= & {} \int _{X}\textrm{D}\lambda '\wedge \lambda '_{\beta }\,H^{-1}(x,\lambda ')\left( \lambda '_{\alpha }\partial ^{\alpha {{\dot{\alpha }}}}\textsf{b}|_{X}\right) H(x,\lambda ')\,\sum _{s=1}^{\infty }\left( \langle \lambda \,\lambda '\rangle ^2\,\partial _0\,e_0'\right) ^{s-1} \nonumber \\= & {} 0, \end{aligned}$$
(5.7)

with the final equality following because \(\lambda '_{\alpha }\partial ^{\alpha {{\dot{\alpha }}}}\textsf{b}|_{X}=0\) as a consequence of the incidence relations.

The non-abelian twistor integral formula (5.6) also suggests how to formulate the non-SD interactions of HS-YM non-locally on twistor space

$$\begin{aligned} I[\textsf{a},\textsf{b}]= & {} \int \limits _{{\mathbb {M}}\times {\mathbb {P}}^1\times {\mathbb {P}}^1}\!\!\textrm{d}^{4}x\,\textrm{D}\lambda _{1}\,\textrm{D}\lambda _2\,\langle \lambda _1\,\lambda _2\rangle ^2\,{\mathcal {P}}_{12} \nonumber \\{} & {} \quad \times \,\textrm{tr}\left[ H^{-1}(x,\lambda _1)\,\textsf{b}(x,\lambda _1)\,H(x,\lambda _1)\,H^{-1}(x,\lambda _2)\,\textsf{b}(x,\lambda _2)\,H(x,\lambda _2)\right] ,\nonumber \\ \end{aligned}$$
(5.8)

where the integral is taken over two copies of the same line in twistor space along with integration over the moduli space \({\mathbb {M}}\) of these lines. The object \({\mathcal {P}}_{12}\) is a “spin projector,” valued in \(J^{\infty }_{{\mathbb {P}}^1,1}\otimes J^{\infty }_{{\mathbb {P}}^{1},2}\) whose role is to absorb the factors of \(J^{\infty \,\vee }_{{\mathbb {P}}^1}\) associated to each insertion of \(\textsf{b}\). It is defined by the requirements that it is holomorphic and has no scaling weight in \(\lambda _1\) or \(\lambda _2\).
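These two requirements are quite restrictive, as the following sketch indicates. It assumes that \(\langle \lambda _1\,\lambda _2\rangle \) is the only holomorphic invariant available to build \({\mathcal {P}}_{12}\), with no further reference spinors. From (4.21), \(\partial _0\) has weight \(-2\) in \(\lambda \), so for a generic monomial in the jet generators

$$\begin{aligned} \langle \lambda _1\,\lambda _2\rangle ^{n}\,\partial _{0\,1}^{\,r-1}\,\partial _{0\,2}^{\,t-1}: \qquad n-(2r-2)=0 \,\,\,\text{(weight in } \lambda _1\text{)}, \qquad n-(2t-2)=0 \,\,\,\text{(weight in } \lambda _2\text{)}. \end{aligned}$$

Both conditions hold only if \(r=t=s\) and \(n=2s-2\). Under the stated assumption, the projector is therefore diagonal in spin, with level-s term proportional to \(\langle \lambda _1\,\lambda _2\rangle ^{2s-2}\,(\partial _{0\,1}\,\partial _{0\,2})^{s-1}\).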

The non-local twistor action (5.8) can be “compressed” further by denoting \(\textsf{b}_i\equiv \textsf{b}(x,\lambda _i)\) and introducing the holomorphic Wilson line [103, 114]

$$\begin{aligned} U_{X}(\lambda _i,\lambda _j):=H(x,\lambda _i)\,H^{-1}(x,\lambda _j), \end{aligned}$$
(5.9)

associated to the partial connection \(\bar{D}={\bar{\partial }}+\textsf{a}\) on the bundle \(V\rightarrow \mathbb{P}\mathbb{T}\). These holomorphic Wilson lines act by parallel transport with respect to \(\bar{D}\) for which they are formal Green’s functions on the twistor lines X:

$$\begin{aligned} U_{X}(\lambda _i,\lambda _j):\,V|_{X,\lambda _j}\rightarrow V|_{X,\lambda _i}, \qquad U_{X}(\lambda _i,\lambda _i)=\textrm{id}_{{\mathfrak {g}}}, \end{aligned}$$
(5.10)

where

$$\begin{aligned} \bar{D}|_{X_i}U_{X}(\lambda _i,\lambda _j)=\textrm{id}_{{\mathfrak {g}}}\,\bar{\delta }(\langle \lambda _i\,\lambda _j\rangle ), \qquad \bar{\delta }(z):=\frac{1}{2\pi \textrm{i}}\,{\bar{\partial }}\left( \frac{1}{z}\right) . \end{aligned}$$

Here, \(\textrm{id}_{{\mathfrak {g}}}\) is the identity in the adjoint representation of the gauge group.

In practical terms, the holomorphic Wilson line can be represented as a path-ordered exponential

$$\begin{aligned} U_{X}(\lambda _i,\lambda _j)=P\,\exp \left( -\int _{X}\omega _{ij}\wedge \textsf{a}\right) , \end{aligned}$$
(5.11)

where \(\omega _{ij}\) is a meromorphic differential on \({\mathbb {P}}^1\) valued in \(J^{\infty \,\vee }_{{\mathbb {P}}^1}\) with an infinite series of higher-order poles:

$$\begin{aligned} \omega _{ij}(\lambda ):=\frac{\textrm{D}\lambda }{2\pi \textrm{i}}\,\frac{\langle \lambda _i\,\lambda _j\rangle }{\langle \lambda _i\,\lambda \rangle \,\langle \lambda \,\lambda _j\rangle }\,\sum _{s=1}^{\infty }\left( \frac{e^0\,\langle \lambda _i\,\lambda _j\rangle }{\langle \lambda _i\,\lambda \rangle \,\langle \lambda \,\lambda _j\rangle }\right) ^{s-1}. \end{aligned}$$
(5.12)

The path-ordering symbol P means that the holomorphic Wilson line can be expanded as an infinite series

$$\begin{aligned} U_{X}(\lambda _i,\lambda _j)=\textrm{id}_{{\mathfrak {g}}}+\sum _{m=1}^{\infty }(-1)^m\,\prod _{k=1}^{m}\frac{\langle \lambda _i\,\lambda _j\rangle \,\textrm{D}\lambda _k\,\textsf{a}_k}{\langle \lambda _{k-1}\,\lambda _k\rangle \,\langle \lambda _k\,\lambda _{k+1}\rangle }\,\sum _{s_k=1}^{\infty }\left( \frac{e_k^0\,\langle \lambda _i\,\lambda _j\rangle }{\langle \lambda _{i}\,\lambda _k\rangle \,\langle \lambda _k\,\lambda _{j}\rangle }\right) ^{s_k-1},\nonumber \\ \end{aligned}$$
(5.13)

where \(\lambda _0\equiv \lambda _i\) and \(\lambda _{m+1}\equiv \lambda _j\).

With these definitions, (5.8) becomes

$$\begin{aligned} I[\textsf{a},\textsf{b}]=\int \limits _{{\mathbb {M}}\times {\mathbb {P}}^1\times {\mathbb {P}}^1}\!\!\textrm{d}^{4}x\,\textrm{D}\lambda _{1}\,\textrm{D}\lambda _2\,\langle \lambda _1\,\lambda _2\rangle ^2\,{\mathcal {P}}_{12}\,\textrm{tr}\left[ \textsf{b}_1\,U_X(\lambda _1,\lambda _2)\,\textsf{b}_2\,U_{X}(\lambda _2,\lambda _1)\right] ,\nonumber \\ \end{aligned}$$
(5.14)

and we define the full twistor action:

$$\begin{aligned} S[\textsf{a},\textsf{b}]=S_{\textrm{SD}}[\textsf{a},\textsf{b}]+\frac{\textrm{g}^2}{4}\,I[\textsf{a},\textsf{b}]. \end{aligned}$$
(5.15)

It is easy to see that this action is invariant under gauge transformations

$$\begin{aligned} \textsf{a}\rightarrow {\textbf{g}}\,\textsf{a}\,{\textbf{g}}^{-1}-{\bar{\partial }}{\textbf{g}}\,{\textbf{g}}^{-1}, \qquad \textsf{b}\rightarrow {\textbf{g}}\,\textsf{b}\,{\textbf{g}}^{-1}, \end{aligned}$$
(5.16)

for any homogeneous function \({\textbf{g}}(Z)\) on \(\mathbb{P}\mathbb{T}\) valued in \({\mathfrak {g}}\otimes J^{\infty }_{{\mathbb {P}}^1}\), since the holomorphic Wilson line transforms as

$$\begin{aligned} U_{X}(\lambda _1,\lambda _2)\rightarrow {\textbf{g}}(x,\lambda _1)\,U_{X}(\lambda _1,\lambda _2)\,{\textbf{g}}^{-1}(x,\lambda _2). \end{aligned}$$
(5.17)

The action also enjoys another local symmetry which acts only on the field \(\textsf{b}\):

$$\begin{aligned} \textsf{b}\rightarrow \textsf{b}+\bar{D}{\textbf{f}}, \qquad {\textbf{f}}\in \Omega ^0(\mathbb{P}\mathbb{T},{\mathcal {O}}(-4)\otimes {\mathfrak {g}}\otimes J^{\infty \,\vee }_{{\mathbb {P}}^1}). \end{aligned}$$
(5.18)

It is now possible to establish the following result:

Theorem 3

The twistor action (5.15) is equivalent to HS-YM theory on \({\mathbb {R}}^4\), in the sense that solutions to its field equations are in one-to-one correspondence with solutions to the field equations of HS-YM (up to spacetime gauge transformations). Furthermore, the twistor action and HS-YM actions take the same value when evaluated on corresponding field configurations.

Proof

The proof follows exactly the same steps as in the construction of the twistor action for pure Yang–Mills theory [55, 56]. The gauge freedom (5.16)–(5.18) is used to put the twistor fields \(\textsf{a},\textsf{b}\) into “harmonic” gauge

$$\begin{aligned} {\bar{\partial }}^*|_{X}\textsf{a}|_{X}=0={\bar{\partial }}^{*}|_{X}\textsf{b}|_{X}, \end{aligned}$$
(5.19)

where \({\bar{\partial }}^*|_{X}\) is the adjoint of the \({\bar{\partial }}\)-operator restricted to any twistor line. Since \(\textsf{a}\) and \(\textsf{b}\) are (0, 1)-forms on \(\mathbb{P}\mathbb{T}\), it follows on dimensional grounds that \({\bar{\partial }}|_{X}\textsf{a}|_{X}=0={\bar{\partial }}|_{X}\textsf{b}|_X\), so the gauge condition (5.19) is equivalent to

$$\begin{aligned} \Delta _{X}\textsf{a}|_X=0=\Delta _{X}\textsf{b}|_X, \end{aligned}$$
(5.20)

where \(\Delta _X\) is the Laplacian on \({\mathbb {P}}^1\). As \(H^{1}({\mathbb {P}}^1,{\mathcal {O}})=0\), this implies that \(\textsf{a}|_{X}=0\). Residual gauge transformations must then respect

$$\begin{aligned} \begin{aligned} {\bar{\partial }}^*|_{X}\,{\bar{\partial }}|_{X}{\textbf{g}}(Z)=0&\qquad \Rightarrow \quad {\textbf{g}}(Z)={\textbf{g}}(x|\lambda )\in \Omega ^0({\mathbb {R}}^4,{\mathfrak {g}}\otimes J^{\infty }_{{\mathbb {P}}^1}), \\ {\bar{\partial }}^*|_{X}\,{\bar{\partial }}|_{X}{\textbf{f}}(Z)=0&\qquad \Rightarrow \quad {\textbf{f}}(Z)=0, \end{aligned} \end{aligned}$$
(5.21)

with the last relation following from \(H^0({\mathbb {P}}^1,{\mathcal {O}}(-4))=0\). This means that the residual gauge transformations are precisely the expected spacetime gauge transformations of HS-YM.
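The cohomological facts used here are the \(s=1\) cases of the standard dimension counts for line bundles on \({\mathbb {P}}^1\), which apply level by level in the jet expansion:

$$\begin{aligned} \dim H^0({\mathbb {P}}^1,{\mathcal {O}}(n))={\left\{ \begin{array}{ll} n+1, &{} n\ge 0,\\ 0, &{} n<0, \end{array}\right. } \qquad \dim H^1({\mathbb {P}}^1,{\mathcal {O}}(n))={\left\{ \begin{array}{ll} 0, &{} n\ge -1,\\ -n-1, &{} n\le -2. \end{array}\right. } \end{aligned}$$

With \(n=2s-2\ge 0\) for \(a^{(s)}|_{X}\), the group \(H^1\) vanishes for every \(s\ge 1\), so the whole tower \(\textsf{a}|_{X}\) can be set to zero. With \(n=-2s-2\) for the level-s components of \(\textsf{b}\) and \({\textbf{f}}\), \(H^0\) vanishes, while \(\dim H^1=2s+1\) equals the number of independent components of a totally symmetric spinor \(B_{\alpha (2s)}\).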

Now, in this gauge, the twistor fields can be expanded as [101]

$$\begin{aligned} \textsf{a}&=\textsf{a}_{{{\dot{\alpha }}}}\,\bar{e}^{{{\dot{\alpha }}}}\,, \end{aligned}$$
(5.22a)
$$\begin{aligned} \textsf{b}&=\textsf{b}_{{{\dot{\alpha }}}}\,\bar{e}^{{{\dot{\alpha }}}}+\sum _{s=1}^{\infty }(2s+1)\,\frac{B_{\alpha (2s)}(x)\,\hat{\lambda }^{\alpha (2s)}}{\langle \lambda \,\hat{\lambda }\rangle ^{2s}}\,\bar{e}^0\,(e^0)^{s-1}\,, \end{aligned}$$
(5.22b)

with the components \(\textsf{a}_{{{\dot{\alpha }}}}\), \(\textsf{b}_{{{\dot{\alpha }}}}\) as yet unconstrained. First, consider the portion of the action corresponding to \(S_{\textrm{SD}}\); evaluated on the fields (5.22) in harmonic gauge this is:

$$\begin{aligned} S_{\textrm{SD}}= & {} \frac{\textrm{i}}{2\pi }\int _{\mathbb{P}\mathbb{T}}\frac{\textrm{D}^{3}Z\wedge \textrm{D}^{3}\hat{Z}}{\langle \lambda \,\hat{\lambda }\rangle ^4}\,\textrm{tr}\Bigg [\textsf{b}_{{{\dot{\alpha }}}}\,{\bar{\partial }}_0\textsf{a}^{{{\dot{\alpha }}}} \nonumber \\{} & {} \quad \left. +\sum _{s=1}^{\infty }(2s+1)\,\frac{B_{\alpha (2s)}\,\hat{\lambda }^{\alpha (2s)}}{\langle \lambda \,\hat{\lambda }\rangle ^{2s}}\left( {\bar{\partial }}_{{{\dot{\beta }}}}a^{(s)\,{{\dot{\beta }}}}-\frac{1}{2}\sum _{r+t=s+1}[a^{(r)}_{{{\dot{\beta }}}},\,a^{(t)\,{{\dot{\beta }}}}]\right) \right] .\nonumber \\ \end{aligned}$$
(5.23)

Clearly, the field components \(\textsf{b}_{{{\dot{\alpha }}}}\) enter only as Lagrange multipliers. Integrating them out imposes \({\bar{\partial }}_0\textsf{a}_{{{\dot{\alpha }}}}=0\), which by the usual extension of Liouville’s theorem implies that

$$\begin{aligned} a^{(s)}_{{{\dot{\alpha }}}}=A_{\alpha (2s-1){{\dot{\alpha }}}}(x)\,\lambda ^{\alpha (2s-1)}, \qquad \text{for all } s\ge 1. \end{aligned}$$
(5.24)
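
As a quick illustration (a sketch, assuming that \(\alpha (k)\) denotes \(k\) totally symmetrized undotted spinor indices, as in the rest of the paper), the first two cases of (5.24) read

$$\begin{aligned} a^{(1)}_{{\dot{\alpha }}}&=A_{\alpha {\dot{\alpha }}}(x)\,\lambda ^{\alpha }, \\ a^{(2)}_{{\dot{\alpha }}}&=A_{\alpha \beta \gamma {\dot{\alpha }}}(x)\,\lambda ^{\alpha }\lambda ^{\beta }\lambda ^{\gamma }. \end{aligned}$$

For general \(s\), the component \(a^{(s)}_{{\dot{\alpha }}}\) is a polynomial of degree \(2s-1\) in \(\lambda \), so it carries \(2s\) independent coefficient functions of \(x\) for each value of \({\dot{\alpha }}\), matching \(\dim H^0({\mathbb {P}}^1,{\mathcal {O}}(2s-1))=2s\).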

The \({\mathbb {P}}^1\) degrees of freedom in (5.23) can now be integrated out (cf., [56, 115]), leaving the desired

$$\begin{aligned} S_{\textrm{SD}}=\sum _{s=1}^{\infty }\int _{{\mathbb {R}}^4}\textrm{d}^{4}x\,\textrm{tr}\left( B_{\alpha (2s)}\,F^{\alpha (2s)}\right) , \end{aligned}$$
(5.25)

for the SD part of the HS-YM action.
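
The role of the factors \((2s+1)\) in (5.22b) can be illustrated with a small exact computation. The sketch below rests on assumptions that are not spelled out in the text: an affine coordinate \(\lambda _\alpha =(1,z)\) with \(\langle \lambda \,\hat{\lambda }\rangle \propto 1+|z|^2\), so that, up to convention-dependent signs and an overall normalization, each non-vanishing component of \(\int \textrm{D}\lambda \wedge \textrm{D}\hat{\lambda }\,\lambda ^{\beta (n)}\hat{\lambda }^{\alpha (n)}/\langle \lambda \,\hat{\lambda }\rangle ^{n+2}\) reduces to the radial integral \(\int _0^1 u^k(1-u)^{n-k}\,\textrm{d}u\) with \(u=|z|^2/(1+|z|^2)\), weighted by the multiplicity \(\binom{n}{k}\) of the index configuration. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def radial_integral(n: int, k: int) -> Fraction:
    """Exact value of int_0^1 u^k (1-u)^(n-k) du, via binomial expansion of (1-u)^(n-k)."""
    m = n - k
    return sum(Fraction((-1) ** j * comb(m, j), k + j + 1) for j in range(m + 1))


def normalised_component(n: int, k: int) -> Fraction:
    """Radial integral weighted by the multiplicity comb(n, k) of the index configuration."""
    return comb(n, k) * radial_integral(n, k)


# For n = 2s, collect the values over all k; each set has a single element.
for s in (1, 2, 3):
    n = 2 * s
    print(s, {normalised_component(n, k) for k in range(n + 1)})
```

Under these assumptions every component equals \(1/(2s+1)\) for \(n=2s\), independently of \(k\). This is consistent with the explicit factor \((2s+1)\) in (5.22b) being the normalization that leaves unit coefficients in (5.25).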

The non-local part of the twistor action is also easily evaluated in the harmonic gauge (5.22). Since \(\textsf{a}|_{X}=0\) in this gauge, the holomorphic Wilson lines become trivial, \(U_{X}(\lambda _1,\lambda _2)=\textrm{id}_{{\mathfrak {g}}}\), so the action reduces to

$$\begin{aligned} I=\int \limits _{{\mathbb {R}}^{4}\times {\mathbb {P}}^1\times {\mathbb {P}}^1}\textrm{d}^{4}x\,\textrm{D}\lambda _1\,\textrm{D}\lambda _2\,\langle \lambda _1\,\lambda _2\rangle ^{2}\,{\mathcal {P}}_{12}\,\textrm{tr}\left( \textsf{b}_1\,\textsf{b}_{2}\right) . \end{aligned}$$
(5.26)

Now, the requirements of homogeneity and holomorphicity uniquely fix the spin projector to be diagonal

$$\begin{aligned} {\mathcal {P}}_{12}=\sum _{s=1}^{\infty }\langle \lambda _1\,\lambda _2\rangle ^{2s-2}\,\left( \partial _{0\,1}\,\partial _{0\,2}\right) ^{s-1}, \end{aligned}$$
(5.27)

which further simplifies the action to

$$\begin{aligned} I= & {} \sum _{s=1}^{\infty }\, (2s+1)^2\,\int \limits _{{\mathbb {R}}^4\times {\mathbb {P}}^1\times {\mathbb {P}}^1} \!\!\textrm{d}^{4}x\,\frac{\textrm{D}\lambda _1\wedge \textrm{D}\hat{\lambda }_1}{\langle \lambda _1\,\hat{\lambda }_1\rangle ^{2s+2}}\,\frac{\textrm{D}\lambda _2\wedge \textrm{D}\hat{\lambda }_2}{\langle \lambda _2\,\hat{\lambda }_2\rangle ^{2s+2}}\,\langle \lambda _1\,\lambda _2\rangle ^{2s} \nonumber \\{} & {} \quad \times \,\hat{\lambda }_{1}^{\alpha (2s)}\,\hat{\lambda }_{2}^{\beta (2s)}\,\textrm{tr}\left( B_{\alpha (2s)}\,B_{\beta (2s)}\right) . \end{aligned}$$
(5.28)

Once again, the two \({\mathbb {P}}^1\) factors can be integrated out (cf., [56, 115]), leaving

$$\begin{aligned} I=\sum _{s=1}^{\infty }\int _{{\mathbb {R}}^4}\textrm{d}^{4}x\,\textrm{tr}\left( B_{\alpha (2s)}\,B^{\alpha (2s)}\right) , \end{aligned}$$
(5.29)

as desired.

This establishes that the twistor action (5.15) is equivalent to the spacetime HS-YM action in the harmonic gauge. Since the twistor action is itself gauge invariant, the equivalence holds in any gauge, which completes the proof. \(\square \)

5.2 MHV amplitudes from twistor space

It is natural to ask what a twistor description of HS-YM theory is actually good for. There are many potential answers to this question; the one we pursue here is that twistor actions provide an easy way to obtain all-multiplicity scattering amplitude formulae for the MHV sector, which is the first non-trivial scattering sector as we perturb away from self-duality. In particular, the classical generating functional for the tree-level MHV amplitudes is given by the non-local term in the twistor action, considered as a multi-linear functional of on-shell (i.e., \({\bar{\partial }}\)-closed) twistor fields; see [75,76,77] for further explanation of this fact, which applies to any twistor action of the generic form \(S_{\textrm{SD}}+(\text{coupling})^2 I\).

Using the perturbative expansion of the holomorphic Wilson line (5.13), this framework provides a twistorial formula for the n-point MHV amplitude:

$$\begin{aligned}{} & {} {\mathcal {A}}_{n}^{\textrm{MHV}}:={\mathcal {A}}_{n}(1^{+}_{s_1},\ldots ,i^{-}_{s_i},\ldots ,j^{-}_{s_j},\ldots ,n^{+}_{s_n}) \nonumber \\{} & {} \quad =\int \textrm{d}^{4}x\,{\mathcal {P}}_{ij}|_{s_i,s_j}\,\frac{\langle \lambda _i\,\lambda _j\rangle ^{6-n+\sum _{a\ne i,j}s_a}\,b^{(s_i)}_i\,b^{(s_j)}_j\,\textrm{D}\lambda _i\,\textrm{D}\lambda _j}{\langle \lambda _1\,\lambda _2\rangle \,\langle \lambda _2\,\lambda _3\rangle \cdots \langle \lambda _n\,\lambda _1\rangle } \prod _{b\ne i,j}\frac{\textrm{D}\lambda _b\,a^{(s_b)}_{b}}{\langle \lambda _i\,\lambda _b\rangle ^{s_b-1}\,\langle \lambda _b\,\lambda _j\rangle ^{s_b-1}}, \end{aligned}$$
(5.30)

where \({\mathcal {P}}_{ij}|_{s_i,s_j}\) denotes the portion of the spin projector that selects the spins \(s_i\) and \(s_j\) for the negative helicity twistor representatives, ensuring that the integrand of this expression is homogeneous of degree zero in each point on the twistor line.

Now, in the presence of additional insertions on \({\mathbb {P}}^1\), the spin projector is

$$\begin{aligned} {\mathcal {P}}_{ij}=\sum _{s_i,s_j=1}^{\infty }\partial _{0\,i}^{s_i-1}\,\partial _{0\,j}^{s_j-1}\,\tilde{\delta }(s_i-s_j)\,\tilde{\delta }\!\left( 2-n+\sum _{a\ne i,j}s_a\right) \,\langle \lambda _i\,\lambda _j\rangle ^{2s_i-2}, \end{aligned}$$
(5.31)

as dictated by homogeneity, holomorphicity and gauge invariance in each of the positive helicity insertions. Feeding this into (5.30), the spin constraints set all of the positive helicity external states to have spin-1, leaving

$$\begin{aligned} {\mathcal {A}}_{n}^{\textrm{MHV}}=\tilde{\delta }(s_i-s_j)\,\int \textrm{d}^{4}x\,\frac{\langle \lambda _i\,\lambda _j\rangle ^{2s_i+2}\,b^{(s_i)}_i\,b^{(s_j)}_j\,\textrm{D}\lambda _i\,\textrm{D}\lambda _j}{\langle \lambda _1\,\lambda _2\rangle \,\langle \lambda _2\,\lambda _3\rangle \cdots \langle \lambda _n\,\lambda _1\rangle } \prod _{b\ne i,j}\textrm{D}\lambda _b\,a^{(1)}_{b}, \end{aligned}$$
(5.32)

as the expression of the MHV amplitude on twistor space.
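
The effect of the spin constraints in (5.31) can be made concrete with a brute-force enumeration. This is only an illustrative sketch: the function names and the cutoff `max_spin` are hypothetical, and \(\tilde{\delta }\) is treated as a Kronecker delta on integers, which is an assumption about the notation.

```python
from itertools import product


def allowed_positive_spins(n: int, max_spin: int = 4) -> list:
    """Spin assignments (each >= 1) for the n-2 positive-helicity legs
    that satisfy the constraint 2 - n + sum(s_a) = 0 from (5.31)."""
    return [
        spins
        for spins in product(range(1, max_spin + 1), repeat=n - 2)
        if 2 - n + sum(spins) == 0
    ]


def bracket_power(n: int, spins: tuple, s_i: int) -> int:
    """Total power of <lambda_i lambda_j>: the exponent 6 - n + sum(s_a) in (5.30)
    plus the 2 s_i - 2 contributed by the spin projector (5.31)."""
    return (6 - n + sum(spins)) + (2 * s_i - 2)


for n in (4, 5, 6):
    print(n, allowed_positive_spins(n))
```

For each \(n\) the only surviving assignment has every positive-helicity spin equal to 1, and on that assignment `bracket_power` returns \(2s_i+2\), the exponent appearing in (5.32).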

Now, to obtain a formula in momentum space we can simply evaluate (5.30) on momentum eigenstate representatives [116]:

$$\begin{aligned} \begin{aligned} a^{(s_b)}_{b}&=\int _{{\mathbb {C}}^*}\frac{\textrm{d}t_b}{t_b^{2s_b-1}}\,\bar{\delta }^{2}(\kappa _b-t_b\,\lambda )\,\textrm{e}^{\textrm{i}\,t_b\,[\mu \,b]}, \qquad b\ne i,j, \\ b^{(s_c)}_{c}&=\int _{{\mathbb {C}}^*}\textrm{d}t_{c}\,t_{c}^{2s_c+1}\,\bar{\delta }^{2}(\kappa _c-t_c\,\lambda )\,\textrm{e}^{\textrm{i}\,t_c\,[\mu \,c]}, \qquad c= i,j, \end{aligned} \end{aligned}$$
(5.33)

where the holomorphic delta functions are defined by

$$\begin{aligned} \bar{\delta }^{2}(z):=\frac{1}{(2\pi \textrm{i})^2}\,\bigwedge _{\alpha =0,1}{\bar{\partial }}\left( \frac{1}{z_\alpha }\right) . \end{aligned}$$
(5.34)
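
The powers of \(t_b\) and \(t_c\) in (5.33) can be cross-checked by a scaling argument. The following is a sketch, assuming that \(a^{(s)}\) and \(b^{(s)}\) are homogeneous of weight \(2s-2\) and \(-2s-2\), respectively, which is what makes the integrand of (5.30) homogeneous of degree zero. Rescaling \(Z=(\mu ,\lambda )\rightarrow r\,Z\) with \(r\in {\mathbb {C}}^*\) and then absorbing \(r\) into the integration variable through \(t\rightarrow t/r\) gives

$$\begin{aligned} a^{(s_b)}_{b}(r\,Z)&=\int _{{\mathbb {C}}^*}\frac{\textrm{d}t_b}{t_b^{2s_b-1}}\,\bar{\delta }^{2}(\kappa _b-r\,t_b\,\lambda )\,\textrm{e}^{\textrm{i}\,r\,t_b\,[\mu \,b]}=r^{2s_b-2}\,a^{(s_b)}_{b}(Z), \\ b^{(s_c)}_{c}(r\,Z)&=\int _{{\mathbb {C}}^*}\textrm{d}t_{c}\,t_{c}^{2s_c+1}\,\bar{\delta }^{2}(\kappa _c-r\,t_c\,\lambda )\,\textrm{e}^{\textrm{i}\,r\,t_c\,[\mu \,c]}=r^{-2s_c-2}\,b^{(s_c)}_{c}(Z), \end{aligned}$$

in agreement with the assumed weights.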

Inserting these representatives into (5.30) and using the explicit form of the spin projector (5.31), all of the integrals can be performed algebraically. In particular, the scale integrals in \(t_b,t_c\) and \({\mathbb {P}}^1\) integrals are all performed against the holomorphic delta functions appearing in the twistor representatives, and the spacetime integration simply results in a momentum conserving delta function. This leaves

$$\begin{aligned} {\mathcal {A}}_{n}^{\textrm{MHV}}=(2\pi )^{4}\,\delta ^{4}\!\left( \sum _{m=1}^{n}k_{m}\right) \,\tilde{\delta }(s_i-s_j)\,\frac{\langle i\,j\rangle ^{2s_i+2}}{\langle 1\,2\rangle \,\langle 2\,3\rangle \cdots \langle n\,1\rangle }, \end{aligned}$$
(5.35)

exactly matching the earlier claim (3.20) for n-point MHV scattering in HS-YM.
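
A simple consistency check on the kinematic factor in (5.35) is its little-group scaling: under \(\kappa _m\rightarrow t\,\kappa _m\), an amplitude should scale as \(t^{-2h_m}\) for a leg of helicity \(h_m\). The sketch below verifies this for the stripped factor only. Several choices here are assumptions for illustration: the bracket convention \(\langle a\,b\rangle =a_0b_1-a_1b_0\), 0-based leg labels, arbitrary rational spinors (momentum conservation is not imposed), and all function names.

```python
from fractions import Fraction


def angle(a, b):
    """Angle bracket <a b> in the assumed convention a_0 b_1 - a_1 b_0."""
    return a[0] * b[1] - a[1] * b[0]


def mhv_factor(spinors, i, j, s):
    """<i j>^(2s+2) / (<1 2><2 3>...<n 1>), with legs labelled from 0."""
    n = len(spinors)
    denom = Fraction(1)
    for m in range(n):
        denom *= angle(spinors[m], spinors[(m + 1) % n])
    return Fraction(angle(spinors[i], spinors[j])) ** (2 * s + 2) / denom


def rescale(spinors, m, t):
    """Return a copy of the spinor list with spinor m multiplied by t."""
    out = list(spinors)
    out[m] = (t * spinors[m][0], t * spinors[m][1])
    return out


spinors = [(1, 0), (1, 1), (1, 2), (1, 3), (2, 7)]
i, j, s, t = 0, 2, 2, 3
base = mhv_factor(spinors, i, j, s)

# Negative-helicity leg of spin s (h = -s): expect a factor t^(2s).
print(mhv_factor(rescale(spinors, i, t), i, j, s) / base == t ** (2 * s))
# Positive-helicity leg of spin 1 (h = +1): expect a factor t^(-2).
print(mhv_factor(rescale(spinors, 1, t), i, j, s) / base == Fraction(1, t ** 2))
```

Both comparisons hold, so the factor carries helicity \(-s\) on legs \(i,j\) and helicity \(+1\) on the remaining legs, as expected from the spin assignments in (5.32).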

6 Discussion

In this paper, we considered higher-spin Yang–Mills (HS-YM) theory: a non-abelian, chiral gauge theory with higher-spin degrees of freedom which extends previous constructions in the literature [37,38,39] away from the purely self-dual sector. The theory has a complex action in real Lorentzian Minkowski spacetime, meaning that it is non-unitary and parity-violating, and its interaction vertices never contain more than a single spacetime derivative. Remarkably, these properties are enough for the theory to have non-vanishing higher-spin tree-level scattering amplitudes. The self-dual sector of the theory is classically integrable, and we used a twistor manifestation of this fact to explicitly construct the MHV tree-amplitudes of the theory.

While the non-unitarity and parity-violation of HS-YM are physically undesirable, it is surprising that such an otherwise fairly well-behaved theory (local, with only cubic and quartic interactions) has non-trivial scattering amplitudes. This contrasts with the widespread belief that non-trivial higher-spin scattering in flat spacetime requires some element of non-locality (cf., [28, 83, 117,118,119]); it seems that the “get-out-of-jail-free” card in the case of HS-YM is the intrinsic chirality of the fields, which enables an interacting theory to be constructed with only single-derivative vertices. Metaphorically speaking, if local higher-spin theories with trivial scattering amplitudes (such as chiral higher-spin gravity [19, 20, 28] or self-dual HS-YM [37, 38]) live on a “local island” in the space of higher-spin theories, surrounded by a sea of non-local theories, then HS-YM lives on some sort of chiral buffer zone between the two. This buffer zone is characterized by taking perturbative deformations of theories which live on the island; the HS-YM studied in this paper is clearly one example of a theory in the buffer zone, and there should be other examples, such as the higher-spin gauge theory induced by the IKKT matrix model [64, 67]. It would be very interesting to generate further examples of such buffer zone theories, along with their explicitly non-vanishing scattering amplitudes.

There are many other open questions and directions to explore following on from this work. In the first instance, another perspective on the restriction to spin-1 positive helicity degrees of freedom is desirable. Formulating HS-YM on the light-cone, where the need to make explicit choices for the polarization basis is removed, could shed light on this, as well as providing an independent check on our results.

All of the considerations in this paper have been classical; we have said nothing about the quantum consistency of HS-YM. A warm-up to answering this larger question would be to consider the quantum integrability of self-dual HS-YM; it is expected that SD HS-YM will have non-vanishing all-positive helicity 1-loop scattering amplitudes that represent an anomaly to integrability, or equivalently, an anomaly in the twistor description of the self-dual sector [42, 46, 49]. In any case, this anomaly will boil down to a partition function-like calculation involving a sum over the degrees of freedom in the theory. Using zeta function regularization to treat the spectral sum (cf., [120]), this will lead to \(2\sum _{s=1}^{\infty }1=2\zeta (0)=-1\) and hence a non-vanishing anomaly. It would seem that an easy way to kill this anomaly would be to couple HS-YM with a complex scalar, with a term like \({\textbf{D}}^{\alpha {{\dot{\alpha }}}}\Phi {\textbf{D}}_{\alpha {{\dot{\alpha }}}}\bar{\Phi }\) in the Lagrangian. It would be interesting to consider this in more detail, both in spacetime and on twistor space.

It may also be interesting to explore HS-YM in the context of flat space, or celestial, holography (see [121, 122] for reviews). For pure Yang–Mills theory and gravity, it has been shown that the classical infinite-dimensional symmetry algebras associated with the self-dual sectors have a natural manifestation on the celestial sphere in terms of local operators and their operator product expansions [123, 124], and that these emerge naturally from the twistor descriptions of the self-dual sectors [125]. Recently, it has been shown that Moyal deformations of the self-dual theories lead to enhancements of these classical symmetry algebras to their “quantum” deformations—although the theories under consideration are still tree-level or 1-loop exact [126,127,128]. It has already been observed that these deformations are most naturally linked to a chiral higher-spin enhancement of the spacetime theories (rather than anything truly quantum mechanical), and self-dual HS-YM—and its twistor description—provides a first explicit realization of this fact. But more generally, it would be fascinating to explore how HS-YM and its scattering amplitudes fit into the recent proposals for higher-spin holography in asymptotically flat spacetimes [129, 130].