1 Introduction

The three-dimensional structure of the composite hadrons has widely been analyzed through the study of transverse-momentum dependent parton distribution functions (TMDs) in the framework of TMD factorization. The various TMDs can be accessed in hadronic processes with a small transverse momentum (TM), denoted by \(q_T\), of the detected final state [1,2,3]. TMDs need to be extracted from experimental data for such processes as they are intrinsically nonperturbative objects and therefore cannot be computed using perturbative QCD. So far, the majority of data allowing for the extraction of TMDs have been acquired from SIDIS and Drell–Yan measurements, two experimentally accessible processes and for which TMD factorization was proved to hold [4,5,6]. However, since such processes are primarily induced by quarks/antiquarks, they mostly provide information about the quark TMDs. Currently our knowledge of gluon TMDs is still very limited, due to the lack of data on processes that could potentially be used for extractions. More specifically, gluons inside unpolarized protons can be described at leading twist using two TMDs [7]. The first one describes unpolarized gluons, while the second one describes linearly polarized gluons. The latter correlates the spin of the gluons with their TM, and thus requires non-zero gluon TM. The presence of polarized gluons inside the unpolarized proton has effects on the cross-sections, such as modifications of the TM-spectrum and azimuthal asymmetries.

Several processes have been proposed to extract gluon TMDs, see e.g. [8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36]. Associated quarkonium production (see [37] for a recent review) has in particular a great potential to probe the gluon TMDs at the LHC, e.g. quarkonium plus photon (\(\mathcal {Q} + \gamma \)) or quarkonium-pair production. They mainly originate from gluon fusion, and can be produced via a color-singlet transition, avoiding then possible TMD-factorization-breaking effects [38,39,40]. This is indeed our working hypothesis, which is based on the fact that the produced systems are very small color dipoles, produced directly at the hard scattering and thus dominated by the color-singlet configuration. Moreover, isolation cuts may be used to reduce the risk of factorization breaking problems. In any case, comparing di-\(J/\psi \) and di-\(\Upsilon \) with \(\gamma ^*\gamma ^*\) results would provide in the longer run a good way to identify possible factorization breaking effects.

Some quarkonium states, like the \(J/\psi \) meson, are easily detected and a large number of events can be recorded. Processes with two particles in the final state offer some interesting advantages compared to those with a single detected particle. Since the TM of the final state needs to be small for the cross-section to be sensitive to TMD effects, one-particle final states are bound to stay close to the beam axis and therefore difficult to detect, as the background level is high and triggering is complicated. However, two particles that are nearly back-to-back can each have large individual transverse momenta that add up to a small one. Indeed, in general a pair of particles can have a large invariant mass and a small TM. Whereas the hard scale in a one-particle final state is only its mass, and is thus constant, the invariant mass of a two-particle final state can be tuned with their individual momenta. This allows one to study the scale evolution of the TMDs. Finally, a two-particle final state allows one to define the azimuthal angle between these two particles, hence to look for various azimuthal asymmetries. These are in fact associated to specific convolutions of gluon TMDs.

It was thus recently proposed [30] to probe the gluon TMDs using quarkonium-pair production at the LHC, and more specifically \(J/\psi + J/\psi \) production. Such a process has already been measured by LHCb, CMS and ATLAS at the LHC, as well as by D0 at the Tevatron [41,42,43,44,45]. Also it has recently been considered within the parton Reggeization approach [46]. The size of some azimuthal asymmetries associated with the linearly polarized gluon distribution are nearly maximum in this process. In [30], the unpolarized-gluon distribution was modelled by a simple Gaussian as a function of the gluon TM. In order to see the maximal effect of the linearly-polarized gluons on the yields, their distribution was taken to saturate its positivity bound [7]. The size of the resulting maximum asymmetries was found to be very large, especially at large pair invariant mass, \(M_{\mathcal{Q}\mathcal{Q}}\). Yet, more realistic estimates of the asymmetries require the inclusion of higher-order corrections in \(\alpha _s\) through TMD QCD evolution [21, 47,48,49].

Very recently, a TMD-factorization proof has been established for pseudoscalar \(\eta _{c,b}\) hadro-production at low TM [50] (see also [51]). To date, this is the only one for quarkonium hadroproduction. It was pointed out that new hadronic matrix elements are involved for quarkonium production at low TM, in addition to the TMDs. These encode the soft physics of the process. It is not known how much these new hadronic matrix elements impact the phenomenology. In this context, we build on the previous work [30] by adding TMD evolution effects to the gluon TMDs. Such evolution effects are expected to play a significant role (see e.g. [21]) and should in any case be specifically analyzed. We will proceed like in previous studies for \(H^0\) production [9, 18, 21].

In this article, we first discuss the characteristics of quarkonium-pair production at the LHC within the TMD framework, as well as the associated cross-section and observables sensitive to the gluon TMDs. We then detail the evolution formalism used in our computations and the resulting expressions for the TMD convolutions. Finally, we present our results for the \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\)-spectrum and the azimuthal asymmetries for \(J/\psi \)-pair production at the LHC as well as azimuthal asymmetries for \(\Upsilon \)-pair production.

2 \(\mathcal{Q}\)-pair production within TMD factorization

2.1 TMD factorization description of the process

TMD factorization extends collinear factorization by taking into account the intrinsic TM of the partons, usually denoted by \({\varvec{k}}_{{\scriptscriptstyle T}}\). As in collinear factorization, the hard-scattering amplitude, which can be perturbatively computed, is multiplied by parton correlators that can be parametrized in terms of parton distribution functions, but in this case \({\varvec{k}}_{{\scriptscriptstyle T}}\) dependent. The parametrization of parton correlators is an extension from that used in collinear factorization, not only because of the \({\varvec{k}}_{{\scriptscriptstyle T}}\) dependence of the distribution functions, but also because there are more distributions. The gluon correlator inside an unpolarized proton with momentum P and mass \(M_p\), denoted by \(\Phi _g^{\mu \nu }(x,{\varvec{k}}_{{\scriptscriptstyle T}})\) [7, 52, 53], can be parametrized in terms of two independent TMDs. The first one is the distribution of unpolarized gluons \(f_1^{\,g}(x,{\varvec{k}}_{{\scriptscriptstyle T}}^2)\), the second one is the distribution of linearly polarized gluons \(h_1^{\perp \,g}(x,{\varvec{k}}_{{\scriptscriptstyle T}}^2)\). Here the gluon 4-momentum is written using a Sudakov decomposition: \(k = xP + k_{{\scriptscriptstyle T}} + k^-n\) (where n is any light-like vector (\(n^2=0\)) such that \(n\cdot P \ne 0\)), where \({\varvec{k}}_{\scriptscriptstyle T}^2 = -k_{\scriptscriptstyle T}^2\) and the transverse metric is \(g^{\mu \nu }_{{\scriptscriptstyle T}} = g^{\mu \nu } - (P^{\mu } n^{\nu }+ P^\nu n^\mu )/P{\cdot } n\). For TMD factorization to hold, the hard scale of the process should be much larger than the pair TM, \(q_T\).

The process we are interested in is the fusion of two gluons coming from two colliding unpolarized protons, leading to the production of a pair of vector S-wave quarkonia: \(g(k_1) {+} g(k_2) \rightarrow \mathcal{Q}(P_{\mathcal{Q},1}) {+} \mathcal{Q}(P_{\mathcal{Q},2})\,\). The cross-section for this reaction involves the contraction of two gluon correlators [30], \(\Phi _g^{\mu \nu }(x_1,{\varvec{k}}_{1{\scriptscriptstyle T}})\) and \(\Phi _g^{\rho \sigma }(x_2,{\varvec{k}}_{2{\scriptscriptstyle T}})\), with the squared amplitude \(\mathcal {M}^{\mu \rho }(\mathcal {M}^{\nu \sigma })^*\) of the partonic scattering, integrated over the gluon momenta. The expression of the tree-level partonic amplitude \(\mathcal {M}\) is available in Ref. [54], although the earliest computations date back to 1983 [55, 56]. The hadronization process, i.e. the transition from a heavy-quark pair to a quarkonium bound state, is described in our study using the color singlet model (CSM) [57,58,59] or in this case equivalently non-relativistic QCD (NRQCD) [60] at LO in the velocity v of the heavy quarks in the bound-state rest frame. Figure 1 represents the complete reaction with a typical Feynman diagram depicting the partonic subprocess.

Fig. 1
figure 1

Representative Feynman graph for \(p(P_1) {+} p(P_2) \rightarrow {\mathcal{Q}} (P_{\mathcal{Q},1}) {+} {\mathcal{Q}} (P_{\mathcal{Q},2}) {+}X\) via gluon fusion at LO in TMD factorisation

2.2 Other contributions to quarkonium-pair production

The leading contribution to the hadronization of a \(Q\overline{Q}\) pair into a bound state in NRQCD is the color-singlet (CS) transition, for which the perturbatively-produced heavy-quark pair has the same quantum numbers as the quarkonium and directly binds without any extra soft interaction. Corrections to this leading contribution involving higher color-octet (CO) fock states are suppressed by powers of v, which is meant to be much smaller than unity for heavy quarkonia.

The CS over CO dominance normally follows from this power suppression in v encoded in the so-called NRQCD long distance matrix elements (LDMEs). More precisely one expects a relative suppression on the order of \(v^4\) [60,61,62] (see [37, 63,64,65] for reviews) per quarkonium. For di-\(J/\psi \) production with \(v^2_c\simeq 0.25\) and for which both the CO and the CS yields are produced at \(\alpha _s^4\), the CO/CS yield ratio, which thus scales as \(v_c^8\), likely lies below the percent level. Explicit computations [66,67,68,69,70] indeed show corrections from the CO states below the percent level except in some corners of the phase space (e.g. large rapidity separation \(\varDelta y\)) where some CO contributions can be kinematically enhanced, but these can safely be avoided with appropriate kinematical cuts. More details can be found in [70].

It is important for the applicability of TMD factorization that the CS contributions dominate. Soft gluon interactions between the hadrons and a colored initial or final state of the hard scattering can be encapsulated within the definition of the TMD through the use of Wilson lines. However, if both initial and final states are subject to soft gluon interactions, the resulting color entanglement may break TMD factorization [38,39,40]. The dominance of the CS contributions should therefore be ensured.

It is also important to take into account \(\alpha _s\) corrections. In the TMD region, \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\ll M_{\mathcal{Q}\mathcal{Q}}\), these introduce a renormalization scale (\(\mu \)) dependence in the TMD correlators and a rapidity scale \(\zeta \) dependence [4,5,6]. At larger \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) one has to match onto the collinear factorization expression (see e.g. [71]), which is calculated by taking real-gluon emissions into account [68, 72,73,74]. At finite \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\), such single real-gluon emissions occur at \(\alpha _s^5\) and the quarkonium pair effectively recoils against this hard gluon, increasing the pair TM. In this paper we will restrict to \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}< M_{\mathcal{Q}\mathcal{Q}}/2\) in order to stay away from the matching region.

Thus far, we have focused our discussion on the single parton scattering (SPS) case. However, since we look at a two-particle final state, we should also consider the case where the quarkonia are created in two separate hard scatterings, i.e. double parton scattering (DPS). At LHC energies, the gluon densities are typically high and the likelihood for two hard gluon fusions to take place during the same proton-proton scattering cannot be neglected.

In the case of di-\(J/\psi \), it has been already anticipated in 2011 [75] that DPS contributions may be dominant at large rapidity difference \(\varDelta y\) (thus large invariant masses with same individual TM). This was corroborated [68] by the CMS data [42] with an excess above the SPS predictions at large \(\varDelta y\). ATLAS further [45] confirmed the DPS relevance in di-\(J/\psi \) production with a dedicated DPS study. One expectsFootnote 1 [45, 68] that DPS contributions lie below 10% for \(\varDelta y \sim 0\) in the CMS and ATLAS samples (characterized by a \(P_{\mathcal{Q}T}\) cut away from the threshold \(M_{\psi \psi }\simeq 2M_\psi \)) and that they only matter at large \(\varDelta y\). However, in the LHCb acceptance (where \(M_{\psi \psi }\simeq 2M_\psi \)), DPS contributions cannot a priori be neglected, but can be subtracted [44] if one assumes that, for the DPS sample, the kinematics of both \(J/\psi \)’s is uncorrelated. This yield a precise (yet, unnormalized) prediction of the kinematical distributions.

Another source of quarkonium pairs is the feed-down from excited states. For \(J/\psi \)-pair production, the main feed-down sources are the \(\chi _c\) and \(\psi '\). The feed-down from \(\chi _c\) is expected to be small, as \(gg \rightarrow J/\psi + \chi _c\) and \(gg \rightarrow \psi ' + \chi _c\) are suppressed [68] at LO by C-parity and the vanishing of the \(\chi _c\) wave function at the origin, and \(gg \rightarrow \chi _c + \chi _c\) is suppressed by the squared \(\chi _c \rightarrow J/\psi \) branching ratio. The production of a \(\psi '\) with a \(J/\psi \) likely contributes [37] 50% of the \(J/\psi \)-pair samples, owing to the large branching ratio for \(\psi '\rightarrow J/\psi \) (\(\mathcal{O}(60)\%\)) and symmetry factors. Yet, \(\psi '+J/\psi \) pairs are produced exactly like \(J/\psi +J/\psi \) pairs and thus generate the same TMD observables.Footnote 2

In the case of \(\Upsilon \)-pair production, the main feed-down is from \(\Upsilon (2,\!3S)\). According to Ref. [88], less than 30% of the produced pairs would originate from feed-down at \(\sqrt{s}\) = 8 TeV. As in the \(J/\psi \) case, C-parity suppresses the \(\Upsilon +\chi _b\) reaction at leading order, and \(\chi _b+\chi _b\) is suppressed by the squared branching ratio. Regarding the CO contributions, the relative velocity of the quarks inside the \(\Upsilon \) is smaller than for the \(J/\psi \), meaning the NRQCD expansion used to describe the hadronization has a better convergence. Therefore it is highly unlikely that the CO channels overcome the CS ones in the reachable phase space. The fraction of DPS events is also expected to be less than 5% at low \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) and central rapidity [37], making it an overall cleaner process.

2.3 The TMD differential cross-section

The general structure of the TMD-based differential cross-section describing quarkonium-pair production from gluon fusion reads [30]:

$$\begin{aligned}&\frac{\mathrm {d}\sigma }{\mathrm {d}M_{\mathcal{Q}\mathcal{Q}} \mathrm {d}Y_{\mathcal{Q}\mathcal{Q}} \mathrm {d}^2 \varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\mathrm {d}\varOmega } = \frac{\sqrt{M_{\mathcal{Q}\mathcal{Q}}^2 - 4 M_\mathcal {Q}^2}}{(2\pi )^2 8 s\, M_{\mathcal{Q}\mathcal{Q}}^2}\nonumber \\&\quad \Bigg \{F_1(M_{\mathcal{Q}\mathcal{Q}},\theta _{\mathrm{CS}})\ \mathcal {C} \Big [f_1^gf_1^g\Big ](x_{1,2},\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}) \nonumber \\&\quad + F_2(M_{\mathcal{Q}\mathcal{Q}},\theta _{\mathrm{CS}})\ \mathcal {C} \Big [w_2h_1^{\perp g}h_1^{\perp g}\Big ](x_{1,2},\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}})\nonumber \\&\quad + \Bigg (F_3(M_{\mathcal{Q}\mathcal{Q}},\theta _{\mathrm{CS}}) \ \mathcal {C} \Big [w_3 f_1^g h_1^{\perp g}\Big ](x_{1,2},\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}) \nonumber \\&\quad +F'_3(M_{\mathcal{Q}\mathcal{Q}},\theta _{\mathrm{CS}})\ \mathcal {C} \Big [w'_3 h_1^{\perp g} f_1^g \Big ](x_{1,2},\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}})\Bigg ) \cos 2\phi _{\mathrm{CS}} \nonumber \\&\quad + F_4(M_{\mathcal{Q}\mathcal{Q}},\theta _{\mathrm{CS}})\ \mathcal {C} \left[ w_4 h_1^{\perp g}h_1^{\perp g}\right] (x_{1,2},\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}) \cos 4\phi _{\mathrm{CS}} \Bigg \} \,, \end{aligned}$$
(1)

with \(\mathrm {d}\varOmega =\mathrm {d}\cos \theta _{\mathrm{CS}}\mathrm {d}\phi _{\mathrm{CS}}\), \(\{\theta _{\mathrm{CS}},\phi _{\mathrm{CS}}\}\) being the Collins-Soper (CS) angles [89], and \(Y_{\mathcal{Q}\mathcal{Q}}\) is the rapidity of the pair. \(x_{1,2} = M_{\mathcal{Q}\mathcal{Q}}\,e^{\pm Y_{\mathcal{Q}\mathcal{Q}}}/\sqrt{s}\), with \(s = (P_1 + P_2)^2\). Here \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) (\(\equiv q_T\)) and \(Y_{\mathcal{Q}\mathcal{Q}}\) are defined in the hadron c.m.s. The quarkonia move along (in the opposite direction) \(\mathbf {e}=(\sin \theta _{\mathrm{CS}}\cos \phi _{\mathrm{CS}}, \sin \theta _{\mathrm{CS}}\sin \phi _{\mathrm{CS}}, \cos \theta _{\mathrm{CS}})\) in the CS frame. The kinematical pre-factor is specific to the mass of the quarkonia and the considered differential cross-sections, while the hard-scattering coefficients \(F_i\) only depend on \(\theta _{\mathrm{CS}}\) and the invariant mass of the system, here \(M_{\mathcal{Q}\mathcal{Q}}\). Their expression for quarkonium-pair production can be found at tree level in Ref. [30]. When \(P_{\mathcal{Q}T} \gg M_\mathcal{Q}\), small values of \(\cos \theta _{\mathrm{CS}}\) correspond to small values of \(\varDelta y\) in the hadron c.m.s.

The TMD convolutions appearing in Eq. (1) are defined as follows:

$$\begin{aligned}&\mathcal {C}[w\, f\, g](x_{1,2},\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}) \nonumber \\&\quad \equiv \int \mathrm {d}^{2}{\varvec{k}}_{1{\scriptscriptstyle T}} \int \mathrm {d}^{2}{\varvec{k}}_{2{\scriptscriptstyle T}}\, \delta ^{2}({\varvec{k}}_{1{\scriptscriptstyle T}}+{\varvec{k}}_{2{\scriptscriptstyle T}}-{\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}}) \nonumber \\&\qquad \times \, w({\varvec{k}}_{1{\scriptscriptstyle T}},{\varvec{k}}_{2{\scriptscriptstyle T}})\, f(x_1,{\varvec{k}}_{1{\scriptscriptstyle T}}^{2})\, g(x_2,{\varvec{k}}_{2{\scriptscriptstyle T}}^{2}) \,, \end{aligned}$$
(2)

where \(w({\varvec{k}}_{1{\scriptscriptstyle T}},{\varvec{k}}_{2{\scriptscriptstyle T}})\) denotes a TMD weight. The weights in Eq. (1) are common to all gluon-fusion processes originating from unpolarized proton collisions. They can be found in Ref. [25]. Our aim in the present study is to study the impact of QCD evolution effects in the above TMD convolutions. Having at our disposal the computation of the hard-scattering coefficients, the measurements of differential yields in principle allow one to extract these TMD convolutions evolved up to the natural scale of the process, on the order of \(M_{\mathcal{Q}\mathcal{Q}}\) here.

In practice, one looks at specific observables sensitive to these convolutions. First we note that when the cross-section is integrated over the azimuthal angle \(\phi _{\mathrm{CS}}\), the terms with a \(\cos (2,\!4\phi _{\mathrm{CS}})\)-dependence drop out from Eq. (1) such that

$$\begin{aligned} \frac{1}{2\pi }&\int d\phi _{\mathrm{CS}} \frac{d\sigma }{d M_{\mathcal{Q}\mathcal{Q}} d Y_{\mathcal{Q}\mathcal{Q}} d^2 \varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}d \varOmega } \nonumber \\&=F_1\, \mathcal {C} \Big [f_1^{\,g}f_1^{\,g}\Big ]+F_2\, \mathcal {C} \Big [w_2h_1^{\perp \, g}h_1^{\perp \, g}\Big ] \,, \end{aligned}$$
(3)

giving direct access to \(\mathcal {C} \Big [f_1^{\,g}f_1^{\,g}\Big ]\) and \(\mathcal {C} \Big [w_2h_1^{\perp \, g}h_1^{\perp \, g}\Big ]\).

Furthermore, one can define, at fixed \(\{Y,\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}},\theta _{\mathrm{CS}},M_{\mathcal{Q}\mathcal{Q}}\}\), \(\cos (n\phi _{\mathrm{CS}})\)-weighted differential cross-sections, integrated over \(\phi _{\mathrm{CS}}\) and normalized by their azimuthally-independent component:

$$\begin{aligned}&\langle \cos (n\phi _{\mathrm{CS}}) \rangle \nonumber \\&\quad =\frac{\displaystyle \int d\phi _{\mathrm{CS}} \cos (n\phi _{\mathrm{CS}})\, \frac{\displaystyle \mathrm {d}\sigma }{d M_{\mathcal{Q}\mathcal{Q}} d Y_{\mathcal{Q}\mathcal{Q}} d^2 \varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}d \varOmega }}{\displaystyle \int d\phi _{\mathrm{CS}} \frac{d\sigma }{d M_{\mathcal{Q}\mathcal{Q}} d Y_{\mathcal{Q}\mathcal{Q}} d^2 \varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}d \varOmega }}\, . \end{aligned}$$
(4)

Such a variable, computed for n = 2 or 4 in our case, corresponds to (half of) the relative size of the \(\cos (2,\!4\phi _{\mathrm{CS}})\)-modulations present in the TMD cross-section in comparison to its \(\phi _{\mathrm{CS}}\)-independent component:

$$\begin{aligned} \langle \cos 2\phi _{\mathrm{CS}}\rangle= & {} \frac{1}{2}\;\frac{F_3 \mathcal {C} \Big [w_3 f_1^{\,g} h_1^{\perp \, g}\Big ] + F'_3 \mathcal {C} \Big [w'_3 h_1^{\perp \, g} f_1^{\,g} \Big ]}{F_1\, \mathcal {C} \Big [f_1^{\,g}f_1^{\,g}\Big ]+F_2\, \mathcal {C} \Big [w_2h_1^{\perp \, g}h_1^{\perp \, g}\Big ]} \, ,\nonumber \\ \langle \cos 4\phi _{\mathrm{CS}}\rangle= & {} \frac{1}{2}\;\frac{F_4 \mathcal {C}\! \left[ w_4 h_1^{\perp \, g}h_1^{\perp \, g}\right] }{F_1\, \mathcal {C} \Big [f_1^{\,g}f_1^{\,g}\Big ]+F_2\, \mathcal {C} \Big [w_2h_1^{\perp \, g}h_1^{\perp \, g}\Big ]}\, . \end{aligned}$$
(5)

When \(\langle \cos n\phi _{\mathrm{CS}} \rangle \) is computed within a range of \(M_{\mathcal{Q}\mathcal{Q}}\), \(Y_{\mathcal{Q}\mathcal{Q}}\), \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) or \(\cos (\theta _{\mathrm{CS}})\), we define it as the ratio of corresponding integrals. Of course, the range in \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) should be such that one remains in the TMD region, i.e. \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\ll M_{\mathcal{Q}\mathcal{Q}}\).

For positive Gaussian \(h_1^{\perp \, g}\) the \(\langle \cos (2\phi _{CS}) \rangle \) asymmetry will be positive (note that in Ref. [30] the \(\langle \cos (2\phi _{CS}) \rangle \) plots miss an overall minus sign).

3 TMD evolution formalism

TMD evolution has been considered in an increasing number of TMD observables. It is usually implemented by Fourier transforming to \(b_T\)-space, with \(b_T\) being the conjugate variable to \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\). When evolution effects are considered, the TMDs acquire a dependence on two scales: a renormalization scale \(\mu \) and a rapidity scale \(\zeta \) (whose evolution is governed by the Collins-Soper equation). Below we present in a simple way the results needed to perform the TMD evolution. For more details, we refer to e.g. [21, 47,48,49].

When TMD evolution is incorporated to the gluon TMDs in the tree-level result in Eq. (1), the convolutions take the form

$$\begin{aligned} \mathcal {C}&[w\, f\, g](x_{1,2},\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}};\mu ) \nonumber \\&\equiv \int \mathrm {d}^{2}{\varvec{k}}_{1{\scriptscriptstyle T}} \int \mathrm {d}^{2}{\varvec{k}}_{2{\scriptscriptstyle T}}\, \delta ^{2}({\varvec{k}}_{1{\scriptscriptstyle T}}+{\varvec{k}}_{2{\scriptscriptstyle T}}-{\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}}) \nonumber \\&\quad \times \, w({\varvec{k}}_{1{\scriptscriptstyle T}},{\varvec{k}}_{2{\scriptscriptstyle T}})\,f(x_1,{\varvec{k}}_{1{\scriptscriptstyle T}}^{2};\zeta _1,\mu )\, g(x_2,{\varvec{k}}_{2{\scriptscriptstyle T}}^{2};\zeta _2,\mu ) \,, \end{aligned}$$
(6)

where the two rapidity scales should fulfill the constraint \(\zeta _1 \zeta _2=M_{\mathcal{Q}\mathcal{Q}}^4\). While the renormalization scale \(\mu \) in the hard-scattering coefficients \(F_i\) should be set here to \(\mu \sim M_{\mathcal{Q}\mathcal{Q}}\) in order to avoid large logarithms, the TMDs should be evaluated at their natural scale \(\mu \sim \sqrt{\zeta }\sim \mu _b=b_0/b_T\) (with \(b_0=2e^{-\gamma _E}\)), in order to minimize both logarithms of \(\mu b_T\) and \(\zeta b_T^2\), and then evolved up to \(\mu \sim \sqrt{\zeta }\sim M_{\mathcal{Q}\mathcal{Q}}\). The solution of the evolution equations results in the introduction of the following Sudakov factor \(S_A\):

$$\begin{aligned}&\tilde{f}_1^{\,g}(x_1,b_T^2;\zeta ,\mu ) = e^{-\frac{1}{2}S_A(b_T;\zeta ,\mu )}\tilde{f}_1^{\,g}(x,b_T^2;\mu _b^2,\mu _b) \,, \nonumber \\&\tilde{h}_1^{\perp \, g}(x_1,b_T^2;\zeta ,\mu ) = e^{-\frac{1}{2}S_A(b_T;\zeta ,\mu )}\tilde{h}_1^{\perp \, g}(x,b_T^2;\mu _b^2,\mu _b) \end{aligned}$$
(7)

where the Fourier-transformed TMDs are

$$\begin{aligned}&\tilde{f}_1^{\,g}(x,{\varvec{b}}_T^2;\zeta ,\mu ) = \int \!d^2{\varvec{k}}_T\, e^{-i{\varvec{b}}_T \cdot \,{\varvec{k}}_T}f_1^{\,g}(x,{\varvec{k}}_T^2;\zeta ,\mu ) \,, \nonumber \\&\tilde{h}_1^{\perp \, g}(x,{\varvec{b}}_T^2;\zeta ,\mu ) = \int d^2{\varvec{k}}_T\, \frac{({\varvec{b}}_T\cdot \,{\varvec{k}}_T)^2-\frac{1}{2}{\varvec{b}}_T^2 {\varvec{k}}_T^2}{{\varvec{b}}_T^2 M_p^2} \nonumber \\&\qquad \qquad \qquad \times \, e^{-i{\varvec{b}}_T\cdot \,{\varvec{k}}_T }h_1^{\perp \, g}(x,{\varvec{k}}_T^2;\zeta ,\mu ), \end{aligned}$$
(8)

and the perturbative Sudakov factor (applicable for sufficiently small \(b_T\)) is given by

$$\begin{aligned}&S_A(b_T;\zeta ,\mu ) = 2D(\mu _b^2)\,\ln \frac{\zeta }{\mu _b^2}\nonumber \\&\qquad +2\int _{\mu _b}^{\mu }\frac{d\bar{\mu }}{\bar{\mu }} \Bigg [ \Gamma (\alpha _s(\bar{\mu }^2))\ln \frac{\zeta }{\bar{\mu }^2} + \gamma (\alpha _s(\bar{\mu }^2))\Bigg ] \,. \end{aligned}$$
(9)

We consider here the resummation at next-to-leading-logarithmic accuracy, for which the Collins-Soper kernel D and the non-cusp anomalous dimension \(\gamma \) need to be taken at leading-order, while the cusp anomalous dimension \(\Gamma \) at next-to-leading-order (see [90] for a recent detailed analysis of the two-dimensional evolution of TMDs). The perturbative Sudakov factor then takes the form

$$\begin{aligned}&S_A(b_T;\zeta ,\mu )= 2\frac{C_A}{\pi } \int _{\mu _b}^{\mu }\frac{d\bar{\mu }}{\bar{\mu }}\ln \left( \frac{\zeta }{\bar{\mu }^2}\right) \end{aligned}$$
(10)
$$\begin{aligned}&\quad \times \Bigg [ \alpha _s(\bar{\mu }^2) + \bigg (\big (\frac{67}{9}-\frac{\pi ^2}{3}\big ) -\frac{20T_fn_f}{9}\bigg )\frac{\alpha _s^2(\bar{\mu }^2)}{4\pi } \Bigg ] \nonumber \\&\quad + 2\frac{C_A}{\pi }\int _{\mu _b}^{\mu }\frac{d\bar{\mu }}{\bar{\mu }} \alpha _s(\bar{\mu }^2)\left[ -\frac{11-2n_f/C_A}{6} \right] \,, \end{aligned}$$
(11)

with \(C_A=3\), \(T_f=1/2\) and \(n_f\) the number of flavors (we will use \(n_f=4\) for di-\(J/\psi \) and \(n_f=5\) for di-\(\Upsilon \) production). The running of \(\alpha _s\) is implemented at one loop. We note that the Sudakov factor \(S_A\) is spin independent, and thus the same for all (un)polarized TMDs [21, 49].

The perturbative component of the TMDs for small \(b_T\) can be computed at a given order in \(\alpha _s\). At leading order, \(\tilde{f}_1^{\,g}\) is given by the integrated PDF:

$$\begin{aligned} \tilde{f}_1^{\,g}(x,b_T^{2};\zeta ,\mu ) = f_{g/P}(x;\mu ) +\mathcal {O}(\alpha _s) +\mathcal {O}(b_T\varLambda _\mathrm{QCD}) \,. \end{aligned}$$
(12)

As said above, \(h_1^{\perp \, g}\) describes the correlation between the gluon polarization and its TM (\(k_T\)) inside the unpolarized proton. It requires a helicity flip and therefore an additional gluon exchange. Consequently, its perturbative expansion starts at \(\mathcal {O}(\alpha _s)\) [9] (the NLO result was recently obtained in Ref. [91]):

$$\begin{aligned}&\tilde{h}_1^{\,\perp g}(x,b_T^{2};\zeta ,\mu )\nonumber \\&\quad = -\left( \frac{\alpha _s(\mu )C_A}{\pi }\int _x^1 \frac{d\hat{x}}{\hat{x}}\left( \frac{\hat{x}}{x}-1\right) f_{g/P}(\hat{x};\mu )\right. \nonumber \\&\qquad +\frac{\alpha _s(\mu )C_F}{\pi }\sum _{i=q,\bar{q}}\int _x^1 \left. \frac{d\hat{x}}{\hat{x}}\left( \frac{\hat{x}}{x}-1\right) f_{i/P}(\hat{x};\mu )\right) \nonumber \\&\qquad +\mathcal {O}(\alpha _s^2) +\mathcal {O}(b_T\varLambda _\mathrm{QCD}) \,, \end{aligned}$$
(13)

The above equations in principle allow one to derive a perturbative expression of these TMDs. However, they are strictly applicable only in a restricted \(b_T\) range, whereas we need an expression for them from small to large \(b_T\) in order to perform the corresponding Fourier transform.

For large \(b_T\), one indeed leaves the domain of perturbation theory. On the contrary, when \(b_T\) gets too small, \(\mu _b\) becomes larger than \(M_{\mathcal{Q}\mathcal{Q}}\) and the evolution should stop. The above perturbative expression for the Sudakov factor should thus not be used as it is.

One of the common solutions to continue to use the above expressions consists in replacing \(b_T\) by a function of \(b_T\) which freezes in both these limits such that one is not sensitive to the physics there. For our numerical studies we use the following \(b_T\) prescription [92]:

$$\begin{aligned} b_T^*\big (b_c(b_T)\big )&= \frac{b_c(b_T)}{\sqrt{1+\Big (\frac{b_c(b_T)}{b_{T_{\max }}}\Big )^2}} \end{aligned}$$
(14)

where

$$\begin{aligned} b_c(b_T ) = \sqrt{b_T^2+\Big (\frac{b_0}{M_{\mathcal{Q}\mathcal{Q}}}\Big )^2} \,, \end{aligned}$$
(15)

such that \(\mu _b=\frac{b_0}{b_T^*(b_c)}\) always lies between \(b_0/b_{T_{\max }}\) (reached when \(b_T\rightarrow \infty \)) and \(M_{\mathcal{Q}\mathcal{Q}}\) (reached when \(b_T\rightarrow 0\)). This prescription is of course not unique, as it entails e.g. some particular assumptions on the transition from the hard to the soft regime. The ambiguity in the choice of this prescription can however be absorbed in the nonperturbative modelling of the TMDs, anyhow needed in the large \(b_T\) region, which we discuss next.

Schematically each TMD convolution can be written in \(b_T\)-space as

$$\begin{aligned} \mathcal {C}[w\, f\, g] = \int _0^\infty \!\! \frac{db_T}{2\pi }\, b_T^n J_m(b_T q_T)\,\tilde{W}(b_T,Q)\,, \end{aligned}$$
(16)

for some integers n and m. Here \(\tilde{W}\) is a simple product of Fourier-transformed TMDs. The nonperturbative Sudakov factor \(S_{\mathrm{NP}}\) is now defined through \(\tilde{W}(b_T,Q)=\tilde{W}(b_T^*,Q)\, e^{-S_{\mathrm{NP}}(b_T,Q)}\), where by construction \(\tilde{W}(b_T^*,Q)\) is perturbatively calculable for all \(b_T\) values. The value of \(b_{T\max }\) in Eq. (14) (roughly) sets the separation between the perturbative and nonperturbative domains. Its optimal value depends on many factors, such as the functional form chosen for \(b_T^*\) and the parametrization of the nonperturbative Sudakov factor \(S_{\mathrm{NP}}\). For our numerical studies we take \(b_{T\max }=1.5\) GeV\(^{-1}\), inspired by previous fits from Drell–Yan and WZ production [93,94,95,96,97].

The functional form of \(S_{\mathrm{NP}}\) has been subject of debate, but is usually chosen to be proportional to \(b_T^2\) for all \(b_T\). By definition \(e^{-S_{\mathrm{NP}}(b_T,Q)}\) has to be equal to 1 for \(b_T=0\) and for large \(b_T\) it has to vanish, at the very least to ensure convergence of the results. It is usually assumed to be a monotonically decreasing function of \(b_T\) and its change from 1 to 0 is assumed to happen within the confinement distance. Lacking experimental constraints, here we will assume a simple Gaussian form (of varying widths). In order to assess the importance of the nonperturbative Sudakov factor for the size of the asymmetries and to perform a first error estimate, we consider several functions. For this purpose, we take a simple formula for the nonperturbative Sudakov factor that encapsulates the expected \(M_{\mathcal{Q}\mathcal{Q}}\)-dependence [98] and the assumed \(b_T\)-Gaussian behavior:

$$\begin{aligned} S_{\mathrm{NP}}\big (b_c(b_T)\big )= A \ln \Big (\frac{M_{\mathcal{Q}\mathcal{Q}}}{Q_\mathrm{NP}}\Big )\, b_c^2(b_T) \,, \qquad Q_\mathrm{NP}=1~\mathrm{GeV} \,. \end{aligned}$$
(17)

From this nonperturbative Sudakov factors a value \(b_{T\lim }\) is defined at which \(e^{-S_{\mathrm{NP}}}\) becomes negligible, to be specific, where it becomes \(\sim \) 10\(^{-3}\). From this we furthermore define a corresponding characteristic radius \(r=\tfrac{1}{2} b_{T\lim }\) (considering \(b_{T\lim }\) the diameter, since it is conjugate to \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}= {\varvec{k}}_{1T}+{\varvec{k}}_{2T}\)), which delimits the range over which the interactions occur from the center of the proton. To estimate the uncertainty associated with the largely unknown nonperturbative Sudakov factor, we will consider three cases: \(b_{T\lim } = 2\), 4 and 8 GeV\(^{-1}\). This spans roughly from \(b_{T\max }=1.5\) GeV\(^{-1}\) to the charge radius of the proton, thus giving a generous but sensible estimation of the nonperturbative uncertainty. The corresponding values of the parameter A and r for \(M_{\mathcal{Q}\mathcal{Q}}=12\) GeV are given in Table 1.

Table 1 Values of the parameter A used in Eq. (17) for \(e^{-S_{\mathrm{NP}}}\), along with the corresponding \(b_{T\lim }\) and r at \(M_{\mathcal{Q}\mathcal{Q}}=12\) GeV

The value \(M_{\mathcal{Q}\mathcal{Q}}=12\) GeV is considered because the ratio \(F_3/F_1\) peaks there (for \(J/\psi \) pair production), but we will also consider larger values later on. When \(M_{\mathcal{Q}\mathcal{Q}}\) increases, the interaction radius r decreases. Figure 2 depicts \(e^{-S_{\mathrm{NP}}}\) as a function of \(b_T\) for the three values of A previously mentioned and for \(M_{\mathcal{Q}\mathcal{Q}}\) ranging from 12 to 30 GeV.

Fig. 2
figure 2

\(e^{-S_{\mathrm{NP}}}\) from Eq. (17) vs \(b_T\) for A \(=\) 0.04 (purple), 0.16 (orange) and 0.64 (magenta) GeV\(^2\), for values of \(M_{\mathcal{Q}\mathcal{Q}}\) ranging from 12 to 30 GeV. The boundaries around the bands depict the exponential at \(M_{\mathcal{Q}\mathcal{Q}}\) = 12 GeV (solid line) and at \(M_{\mathcal{Q}\mathcal{Q}}\) = 30 GeV (dotted line)

We point out that the nonperturbative Sudakov factor as fitted by Aybat and Rogers [95] to low-energy SIDIS as well as high-energy Drell–Yan and \(Z^0\) production data, rescaled by a color factor \(C_A/C_F\) to account for the different color representation between quarks and gluons, is very close to the case \(b_{T\lim }\) \(=\) 2 GeV\(^{-1}\). It is also very close to the Fourier transform of the Gaussian model for \(f_1^g(x,k_T^2)\) with \(\langle k_T^{\, 2} \rangle = 3.3 \pm 0.8\) GeV\(^2\) as extracted in Ref. [30] from a LO fit to \(J/\psi \)-pair-production data from LHCb [44] from which the DPS contributions was however approximately subtracted.

We end this section by providing the expressions for the TMD convolutions in \(b_T\)-space, which we actually use in the numerical predictions in the next section:

$$\begin{aligned}&\mathcal {C}\Big [f_1^{\, g}f_1^{\, g}\Big ]\nonumber \\&\quad = \int _0^\infty \frac{db_T}{2\pi }\, b_T J_0(b_T q_T)\, e^{-S_A(b_T^*;M_{\mathcal{Q}\mathcal{Q}}^2,M_{\mathcal{Q}\mathcal{Q}})} \,\nonumber \\&\qquad \times e^{- S_{\mathrm{NP}}(b_c)} \tilde{f}_1^{\,g}(x_1,b_T^{*\, 2};\mu _b^2,\mu _b)\, \tilde{f}_1^{\,g}(x_2,b_T^{*\, 2};\mu _b^2,\mu _b) \,,\nonumber \\&\mathcal {C}\Big [w_2\,h_1^{\perp \, g}h_1^{\perp \, g}\Big ] \nonumber \\&\quad = \int _0^\infty \frac{db_T}{2\pi }\, b_T J_0(b_T q_T)\, e^{-S_A(b_T^*;M_{\mathcal{Q}\mathcal{Q}}^2,M_{\mathcal{Q}\mathcal{Q}})} \,\nonumber \\&\qquad \times e^{- S_{\mathrm{NP}}(b_c)} \tilde{h}_1^{\perp \, g}(x_1,b_T^{*\, 2};\mu _b^2,\mu _b)\, \tilde{h}_1^{\perp \, g}(x_2,b_T^{*\, 2};\mu _b^2,\mu _b) \,,\nonumber \\&\mathcal {C}\Big [w_3\,f_1^{\, g}h_1^{\perp \, g}\Big ]\nonumber \\&\quad = \int _0^\infty \frac{db_T}{2\pi }\, b_T J_2(b_T q_T)\, e^{-S_A(b_T^*;M_{\mathcal{Q}\mathcal{Q}}^2,M_{\mathcal{Q}\mathcal{Q}})} \,\nonumber \\&\qquad \times e^{- S_{\mathrm{NP}}(b_c)} \tilde{f}_1^{\,g}(x_1,b_T^{*\, 2};\mu _b^2,\mu _b)\, \tilde{h}_1^{\perp \, g}(x_2,b_T^{*\, 2};\mu _b^2,\mu _b) \,,\nonumber \\&\mathcal {C}\Big [w_4\,h_1^{\perp \, g}h_1^{\perp \, g}\Big ] \nonumber \\&\quad = \int _0^\infty \frac{db_T}{2\pi }\, b_T J_4(b_T q_T)\, e^{-S_A(b_T^*;M_{\mathcal{Q}\mathcal{Q}}^2,M_{\mathcal{Q}\mathcal{Q}})} \,\nonumber \\&\qquad \times e^{- S_{\mathrm{NP}}(b_c)} \tilde{h}_1^{\perp \, g}(x_1,b_T^{*\, 2};\mu _b^2,\mu _b)\, \tilde{h}_1^{\perp \, g}(x_2,b_T^{*\, 2};\mu _b^2,\mu _b) \,. \end{aligned}$$
(18)

4 The TM spectrum and the azimuthal asymmetries

4.1 \(J/\psi \)-pair production

As said, after integration over the azimuthal angle \(\phi _{CS}\), one gets to a good approximation \(d\sigma /dq_T\propto q_T\, \mathcal {C}[f_1^{\, g}f_1^{\, g}]\). In Fig. 3a we compare \(q_T\, \mathcal {C}[f_1^{\, g}f_1^{\, g}]\) evaluated using the non-evolved Gaussian TMD model of [30] with the evolved TMD computed along the lines described in the previous section for \(M_{\mathcal{Q}\mathcal{Q}}\) = 8 GeV using the range of \(b_{T\lim }\) between 2 and 8 GeV\(^{-1}\). The main difference one can observe is the broadening of the \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\)-spectrum when including evolution effects. The curves are given as functions of \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) in the range from 0 up to \(M_{\mathcal{Q}\mathcal{Q}}\)/2, to be in the validity range of TMD factorization.

Fig. 3
figure 3

(a) The normalised \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\)-spectrum for \(J/\psi \)-pair production at \(M_{\psi \psi }\) = 8 GeV using two gluon TMDs. The first is a Gaussian Ansatz with \(\langle k_T^{\, 2} \rangle = 3.3 \pm 0.8\) GeV\(^2\) obtained from the LHCb data [30] (the red curve shows the central value and the gray band the associated uncertainty). The second is the result of our present study with TMD evolution. The green band results from the uncertainty on the \(b_T\)-width of the nonperturbative Sudakov factor \(S_{\mathrm{NP}}\). The estimated DPS contribution has been subtracted from the LHCb data (black crosses) which were also normalized over the interval. (b) The \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\)-spectrum using our evolved gluon TMDs at \(M_{\mathcal{Q}\mathcal{Q}}\) = 12, 20 and 30 GeV for the same uncertainty on the \(b_T\)-width

The momentum fractions of the initial gluons, \(x_1\) and \(x_2\), are both fixed to 10\(^{-3}\). Varying the momentum fractions does not have any significant impact on the shape of the \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\)-spectrum or the azimuthal asymmetries. The size of the asymmetries varies by a few percent with x. As such variations do not change the conclusions of our analysis, we will keep the values \(x_1=x_2=10^{-3}\) throughout this paper. This is also convenient for an experimental study, as a binning of the data in \(Y_{\mathcal{Q}\mathcal{Q}}\) is not necessary to be able to compare them with predictions.

In Fig. 3b, we show evolved results for \(M_{\mathcal{Q}\mathcal{Q}}\) = 12, 20 and 30 GeV within the same \(b_{T\lim }\) range as Fig. 3b. The broadening of the \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\)-spectrum for increasing \(M_{\mathcal{Q}\mathcal{Q}}\) is then explicit.

The azimuthal asymmetries presented in Eq. (5) depend on a rather complex ratio of TMD convolutions and hard-scattering coefficients. In the case of \(J/\psi \)-pair production, these expressions simplify for several reasons. The first one, already mentioned previously, is that because \(F_2\) is small, the denominator can be approximated to be \(F_1 \mathcal {C}\Big [f_1^{\, g}f_1^{\, g}\Big ]\). Moreover, because of the symmetry of the final state, one finds the coefficients \(F_3\) and \(F_3'\) to be equal, simplifying the numerator of \(\langle \cos (2\phi _{CS}) \rangle \) to be \(F_3\Big (\mathcal {C}\Big [w_3\,f_1^{\, g}h_1^{\perp \, g}\Big ]+\mathcal {C}\Big [w_3'\,h_1^{\perp \, g}f_1^{\, g}\Big ]\Big )\). Finally, when one takes the initial-parton-momentum fractions to be equal, i.e. \(x_1=x_2\), these two convolutions become equal as well. Since the \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\)-dependence of the cross-section is contained inside the convolutions, the \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\)-dependence of the asymmetries can be studied via the convolution ratios \(\mathcal {C}\Big [w_3\,f_1^{\, g}h_1^{\perp \, g}\Big ]/\mathcal {C}\Big [f_1^{\, g}f_1^{\, g}\Big ]\) and \(\mathcal {C}\Big [w_4\,h_1^{\perp \, g}h_1^{\perp \, g}\Big ]/\mathcal {C}\Big [f_1^{\, g}f_1^{\, g}\Big ]\) for \(\langle \cos (2\phi _{CS}) \rangle \) and \(\langle \cos (4\phi _{CS}) \rangle \), respectively.

The difference between both convolutions depends on the kind of TMDs they contain, but also the type of Bessel function generated by the angular integral and the weights. Because \(\tilde{h}_1^{\perp \, g}\) is of order \(\alpha _s\), it is naturally suppressed in comparison to \(f_1^{\, g}\). Moreover, \(\alpha _s(\mu _b)\) is growing with \(b_T\) (up to its bound \(\alpha _s(b_0/b_{T\max })\)) and \(\tilde{h}_1^{\perp \, g}\) is also broader in \(b_T\) than \(f_1^{\, g}\). The presence of \(\tilde{h}_1^{\perp \, g}\) in a given convolution therefore contributes to reduce the magnitude of the integrand, and to its \(b_T\)-broadening. These effects contribute to strongly suppress \(\mathcal {C}\Big [w_2\,h_1^{\perp \, g}h_1^{\perp \, g}\Big ]\)with respect to \(\mathcal {C}\Big [f_1^{\, g}f_1^{\, g}\Big ]\). \(\mathcal {C}\Big [w_2\,h_1^{\perp \, g}h_1^{\perp \, g}\Big ]\) is of order \(\alpha _s^2\) and its integrand is significantly broadened in \(b_T\), meaning it falls faster than \(\mathcal {C}\Big [f_1^{\, g}f_1^{\, g}\Big ]\)with increasing \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\). Indeed, as a consequence of the \(b_T\)-broadening, more oscillations of the \(J_0\) Bessel function occur in the integrand of \(\mathcal {C}\Big [w_2\,h_1^{\perp \, g}h_1^{\perp \, g}\Big ]\) than of \(\mathcal {C}\Big [f_1^{\, g}f_1^{\, g}\Big ]\), before being dampened by the Sudakov factors at large \(b_T\). Each additional oscillation in the integrand brings the convolution value closer to zero. More oscillations are packed in a given \(b_T\)-range when \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) increases, widening the gap between the two convolutions, and effectively making the ratio fall with \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\). This additional effect renders the \(F_2 \, \mathcal {C}\Big [w_2\,h_1^{\perp \, g}h_1^{\perp \, g}\Big ]\) term truly negligible in the cross-section for \(J/\psi \)-pair production. It also means that in other processes where the hard-scattering coefficient \(F_2\) may be large, the convolution itself would remain relatively small at scales larger than a few GeV. Besides, its influence on the cross-section will be strongest at the smallest TM.

The situation is different for the azimuthal asymmetries, which involve convolutions in the numerator that contain either the \(J_2\) or \(J_4\) Bessel functions. Such functions are 0 at \(b_T\)=0 and then grow in magnitude. The consequence is that the \(b_T\)-integrals containing such functions benefit from unsuppressed intermediate \(b_T\) values. At some point, undampened large-\(b_T\) oscillations will bring the integral value down toward 0 in a similar way as for \(\mathcal {C}\Big [f_1^{\, g}f_1^{\, g}\Big ]\) and \(\mathcal {C}\Big [w_2\,h_1^{\perp \, g}h_1^{\perp \, g}\Big ]\). Therefore, the \(\mathcal {C}\Big [w_3\,f_1^{\, g}h_1^{\perp \, g}\Big ]\) and \(\mathcal {C}\Big [w_4\,h_1^{\perp \, g}h_1^{\perp \, g}\Big ]\) convolutions first grow with \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) up to a peak maximum, and then decrease in value like \(\mathcal {C}\Big [f_1^{\, g}f_1^{\, g}\Big ]\) does. Another crucial difference is that the envelopes of \(J_2\) and \(J_4\) tend slower toward 0 than the \(J_0\) one with increasing \(b_T\). The consequence is that \(\mathcal {C}\Big [w_3\,f_1^{\, g}h_1^{\perp \, g}\Big ]\) and \(\mathcal {C}\Big [w_4\,h_1^{\perp \, g}h_1^{\perp \, g}\Big ]\) fall slower than \(\mathcal {C}\Big [f_1^{\, g}f_1^{\, g}\Big ]\) with \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\). Hence the convolution ratios, and the azimuthal asymmetries, always grow with \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\), as can be seen in Fig. 4. In addition, as the large \(b_T\) values are less suppressed than in \(\mathcal {C}\Big [f_1^{\, g}f_1^{\, g}\Big ]\), the azimuthal asymmetries are also more sensitive to the variations of the nonperturbative Sudakov \(S_{\mathrm{NP}}\). The effect is more pronounced for \(\mathcal {C}\Big [w_4\,h_1^{\perp \, g}h_1^{\perp \, g}\Big ]\) since it contains \(\tilde{h}_1^{\perp \, g}\) twice and a broader Bessel function.

Fig. 4
figure 4

The azimuthal asymmetries for di-\(J/\psi \) production as functions of \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\). The different plots show \(2\langle \cos (2\phi _{CS}) \rangle \) (a,b) and \(2\langle \cos (4\phi _{CS}) \rangle \) (c,d), at \(|\cos (\theta _{CS})|<0.25\) (a,c) and at \(0.25<|\cos (\theta _{CS})|<0.5\) (b,d). Results are presented for \(M_{\psi \psi }\) = 12, 21 and 30 GeV, and for \(b_{T\lim }\) = 2, 4 and 8 GeV\(^{-1}\)

Figure 4b displays the \(\cos (2\phi _{CS})\) asymmetry as a function of \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) in the forward single \(J/\psi \) rapidity region (larger \(\cos (\theta _{CS})\)) while 4c displays the \(\cos (4\phi _{CS})\) asymmetry in the central rapidity region (small \(\cos (\theta _{CS})\) with \(x_1\simeq x_2\)). Such choices maximize the size of the asymmetries as the associated hard-scattering coefficients are larger in these regions, without modifying the shapes of the asymmetries in \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) (see [30] for a comparison between the two rapidity regions for each asymmetry). The uncertainty band associated with the width of \(S_{\mathrm{NP}}\) narrows with increasing \(M_{\mathcal{Q}\mathcal{Q}}\) as in Fig. 3; the uncertainty remains larger for \(\langle \cos (4\phi _{CS}) \rangle \) as \(\mathcal {C}\Big [w_4\,h_1^{\perp \, g}h_1^{\perp \, g}\Big ]\) is more affected by \(S_{\mathrm{NP}}\). The curves for \(b_{T\lim }\) = 8 GeV\(^{-1}\) (large dashes) are quite close to the ones using \(b_{T\lim }\) = 4 GeV\(^{-1}\) (solid line). Indeed, when \(S_{\mathrm{NP}}\) is already significantly wider than \(S_A\), an additional increase in its width will not affect the asymmetries anymore. Both convolutions in the ratios are larger with a wide nonperturbative Sudakov factor, yet this benefits the numerator \(\Big (\mathcal {C}\Big [w_3\,f_1^{\, g}h_1^{\perp \, g}\Big ]\) or \(\mathcal {C}\Big [w_4\,h_1^{\perp \, g}h_1^{\perp \, g}\Big ]\Big )\) more than the denominator \(\Big (\mathcal {C}\Big [f_1^{\, g}f_1^{\, g}\Big ]\Big )\), and the asymmetries are of a greater size for a wider \(S_{\mathrm{NP}}\).

Fig. 5
figure 5

The azimuthal asymmetries for di-\(J/\psi \) production as functions of \(M_{\psi \psi }\). The different plots show \(2\langle \cos (2\phi _{CS}) \rangle \) (a,b) and \(2\langle \cos (4\phi _{CS}) \rangle \) (c,d), at \(|\cos (\theta _{CS})|<0.25\) (a,c) and at \(0.25<|\cos (\theta _{CS})|<0.5\) (b,d). Results are presented for \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) = 4, 7 and 10 GeV, and for \(b_{T\lim }\) = 2, 4 and 8 GeV\(^{-1}\)

We recall that the size of the asymmetries is also influenced by the ratio of the hard-scattering coefficients which are \(M_{\mathcal{Q}\mathcal{Q}}\)-dependent. \(F_3/F_1\) peaks around \(M_{\mathcal{Q}\mathcal{Q}}\) = 12 GeV which explains why the \(\cos (2\phi _{CS})\) asymmetry is largest near this value. As discussed in Ref. [30], the ratio \(F_4/F_1\) keeps growing with \(M_{\mathcal{Q}\mathcal{Q}}\), approaching 1 at sufficiently large values. Yet the \(\cos (4\phi _{CS})\) asymmetry gets smaller with larger \(M_{\mathcal{Q}\mathcal{Q}}\). This can be better seen in Fig. 5 which depicts the same asymmetries as functions of \(M_{\mathcal{Q}\mathcal{Q}}=M_{\psi \psi }\).

One first observes that, at large \(M_{\mathcal{Q}\mathcal{Q}}\), the growth of the asymmetries with \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) is slower. Indeed, in such a situation, the Sudakov factors broaden the \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\)-shapes of the convolutions, hence the ratio varies slower. This slower increase is compensated by the fact that larger values of \(M_{\mathcal{Q}\mathcal{Q}}\) allow for an extended growth of the asymmetry over a greater \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\)-range of validity for the TMD formalism. Secondly, the convolution ratios at a fixed value of \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) also evolve with \(M_{\mathcal{Q}\mathcal{Q}}\). The computable \(M_{\mathcal{Q}\mathcal{Q}}\)-dependence is encoded in the perturbative Sudakov factor \(S_A\), while \(S_{\mathrm{NP}}\) is also logarithmically varying with \(M_{\mathcal{Q}\mathcal{Q}}\) [98]. Both \(S_A\) and \(S_{\mathrm{NP}}\) get narrower in \(b_T\) with increasing \(M_{\mathcal{Q}\mathcal{Q}}\), leading to a decrease of the value of the convolutions. \(\mathcal {C}\Big [w_3\,f_1^{\, g}h_1^{\perp \, g}\Big ]\) and \(\mathcal {C}\Big [w_4\,h_1^{\perp \, g}h_1^{\perp \, g}\Big ]\) are more sensitive to the large \(b_T\)-value dampening and therefore fall faster with \(M_{\mathcal{Q}\mathcal{Q}}\) than \(\mathcal {C}\Big [f_1^{\, g}f_1^{\, g}\Big ]\). This results in decreasing convolution ratios, with a steeper fall for \(\mathcal {C}\Big [w_4\,h_1^{\perp \, g}h_1^{\perp \, g}\Big ]\). However the azimuthal asymmetries also depend on the evolution with \(M_{\mathcal{Q}\mathcal{Q}}\) of the hard-scattering coefficients ratios. Since \(F_4/F_1\) keeps growing while \(F_3/F_1\) falls after peaking at \(M_{\mathcal{Q}\mathcal{Q}} \simeq \) 12 GeV, \(\langle \cos (4\phi _{CS}) \rangle \) will actually decrease slower than \(\langle \cos (2\phi _{CS}) \rangle \).

The large variations of the width of \(S_{\mathrm{NP}}\) generate moderate uncertainties on the size of the asymmetries. The latter, although consequently smaller than when computed in a bound-saturating model [30], still reach reasonable sizes, up to 5%-10%. We used the same nonperturbative Sudakov factor for all TMD convolutions in these computations, but the \(M_{\mathcal{Q}\mathcal{Q}}\)-independent part is actually expected to be non-universal. We checked that individually changing the width of \(S_{\mathrm{NP}}\) within the \(b_{T\lim }\)-ranges used in this study inside the different types of convolutions, does not bring any significant modification on the observables.

So far, there are still no experimental data allowing for an extraction of the gluon TMDs inside unpolarized protons. We believe that the numerous \(J/\psi \)-pair-production events recorded at the LHC can give us access to information about the nonperturbative components of \(f_1^{\, g}\) and \(h_1^{\perp \, g}\), provided the events are selected with kinematics within the validity range of TMD factorization, \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}<M_{\mathcal{Q}\mathcal{Q}}/2\).

4.2 \(\Upsilon \)-pair production

It is also of interest to look at \(\Upsilon \)-pair production. The partonic subprocess is identical to that of di-\(J/\psi \) production. In the non-relativistic limit, where \(M_\Upsilon =2 m_b\), the main difference comes from the mass of the heavy quark. We note that the value of the non-relativistic wave function at the origin (or equivalently the NRQCD LDME for the CS transition) also differs but cancels in the ratios which we consider. The feed-down pattern is also clearly different. However, as announced, we will neglect the resulting (small) feed-down effects.

Owing to this larger mass, such a process probes the evolution at generally higher scales. The coupling constant \(\alpha _s\) is also smaller which increases the precision of the perturbative expansion. Higher scales also mean that the process is less sensitive to the (large \(b_T\)) nonperturbative behavior of the gluon TMDs. Hence, it is also less affected by the uncertainties associated with this unconstrained component.

On the experimental side, \(\Upsilon \)-pair production is admittedly a rare process. Yet, it starts to be accessible at the LHC. The first analysis by the CMS collaboration at \(\sqrt{s}\) = 8 TeV only comprised a 40-event sample [99] but a second one is forthcoming. During the future high luminosity LHC runs, it will definitely be possible to record a sufficient number of events for a TMD analysis of both the \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) and azimuthal dependences of the yield. Figure 6 depicts the azimuthal modulations for \(\Upsilon \)-pair production as functions of \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) up to \(M_{\mathcal{Q}\mathcal{Q}}/2\), for values of \(M_{\mathcal{Q}\mathcal{Q}}\) of 30, 40 and 50 GeV.

Fig. 6
figure 6

The azimuthal asymmetries for di-\(\Upsilon \) production as functions of \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\). The different plots show \(2\langle \cos (2\phi _{CS}) \rangle \) (top) and \(2\langle \cos (4\phi _{CS}) \rangle \) (bottom), at \(|\cos (\theta _{CS})|<0.25\) (left) and at \(0.25<|\cos (\theta _{CS})|<0.5\) (right). Results are presented for \(M_{\Upsilon \Upsilon }\) = 30, 40 and 50 GeV, and for \(b_{T\lim }\) = 2, 4 and 8 GeV\(^{-1}\). Results for \(M_{\Upsilon \Upsilon }\) = 30 GeV are not included in (d) as they are below percent level

The uncertainty bands associated with the width of \(S_{\mathrm{NP}}\) are clearly narrower than in the \(J/\psi \) case. The \(\cos (2\phi _{CS})\) asymmetry in Fig. 6b reaches 10% at \(M_{\mathcal{Q}\mathcal{Q}}\) = 40 GeV, which is the value for which the corresponding hard-scattering coefficient ratio \(F_3/F_1\) peaks for \(\Upsilon \)-pair production. Moreover, the decrease of the hard-scattering coefficient past the peak is slower, allowing the asymmetry to remain of similar size at \(M_{\mathcal{Q}\mathcal{Q}}\) = 40 and 50 GeV.

5 Conclusions

In this paper we discussed the potential of double \(J/\psi \) and \(\Upsilon \) production for the study of the gluon TMDs inside unpolarized protons at the LHC. We presented the advantages of quarkonia as probes of these TMDs. We improved on previous results [30] by including TMD evolution effects, rendering the results more realistic and effectively taking into account QCD corrections that describe the evolution with the invariant mass \(M_{\mathcal{Q}\mathcal{Q}}\) of the quarkonium pair. We used a simple \(b_T\)-Gaussian of variable width to parametrize the nonperturbative Sudakov factor \(S_{\mathrm{NP}}\) in order to estimate how important its impact is on the predicted yield and asymmetries, as it currently remains unconstrained in the gluon case.

We discussed the broadening of the \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\)-spectrum due to the evolution in the case of double-\(J/\psi \) production, as well as the uncertainty associated with a variation of the width of \(S_{\mathrm{NP}}\) between 2 and 8 GeV\(^{-1}\). As expected, we found that its influence decreases at large \(M_{\mathcal{Q}\mathcal{Q}}\) as the perturbative component of the TMDs becomes dominant. We also computed the \(\langle \cos (2,\!4\phi _{CS}) \rangle \) asymmetries as functions of \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) and \(M_{\mathcal{Q}\mathcal{Q}}\). We found a notable suppression of the asymmetries in comparison to [30], caused by the fact that \(h_1^{\perp \, g}\) appears at order \(\alpha _s\) in the evolution formalism. We nevertheless found that such asymmetries still reach reasonable sizes for larger \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) values and could be observed in the events already collected and to be recorded in the future. We found that the size of the asymmetries increases with \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\). Such a behavior is explained by the relative slower fall in \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) of the TMD convolutions containing \(h_1^{\perp \, g}\).

TMD factorization needs to be matched onto its collinear counterpart when \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) approaches \(M_{\mathcal{Q}\mathcal{Q}}\). Since the latter generates no asymmetries at leading twist, a Y-term becomes necessary at some point in order to neutralize the growth of the asymmetries and force them toward zero. We also observed that, in spite of the hard-scattering coefficient ratio \(F_4/F_1\) approaching 1 at large energy, the \(\cos (4\phi )\) asymmetry actually falls with \(M_{\mathcal{Q}\mathcal{Q}}\).

Overall we conclude that \(J/\psi \)-pair production is a promising process to measure azimuthal asymmetries related to gluon TMDs as well as the effect of the evolution on the \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\)-spectrum. The energy threshold for this process is relatively low, making it sensitive to the nonperturbative component of the TMDs. The large event sample to be collected by the different collaborations at the LHC should give enough statistics to constrain them. \(\Upsilon \)-pair production presents the interesting opportunity to measure sizeable asymmetries at scales where perturbative contributions dominate, with a reduced necessity to include higher-order corrections. We also presented predictions for the asymmetries as functions of \(\varvec{P}_{\mathcal{Q}\mathcal{Q}{\scriptscriptstyle T}}\) for \(\Upsilon \)-pair production. With sufficient data to come, it would allow for a complementary extraction of the gluon TMDs, while the expected size of asymmetries remain similar. Although \(\Upsilon \) pairs remain extremely rare at the LHC, the future high-luminosity runs will make it possible to acquire enough statistics.

Accessing information about the gluon TMDs can thus already be done at the LHC using quarkonium production, although more efforts in the direction of Ref. [50] are needed in order to obtain rigorous factorization theorems and expressions beyond tree level. It would give us a preview of what we can expect to find at a future electron-ion collider [100] or fixed-target experiments at the LHC [101,102,103,104,105], where these distributions should be accessible through different reactions. Because of the fundamental differences in these experimental setups, it is of great interest to measure the same TMDs using both of them, in order to be able to check fundamental predictions of the formalism such as the evolution and the universality.