1 Introduction

The study of heavy quarkonium bound states (\(c\bar{c}\) and \(b\bar{b}\)) in ultra-relativistic heavy-ion collisions [1, 2] has been a subject of intense theoretical and experimental efforts since it was initially proposed by Matsui and Satz [3] as a probe to study a deconfined quark–gluon plasma (QGP) created in nucleus–nucleus (A+A) collisions. In order to understand quarkonium yields in A+A collisions it is necessary to disentangle effects due to interaction between quarkonium and the QGP medium from those that can be ascribed to cold nuclear matter (CNM). In proton (deuteron)–nucleus collisions, p(d)+A, the formation of a large region of deconfined and hot QGP matter was not expected to occur. Therefore, the observed suppression of quarkonium yields in these systems with respect to pp collisions [4,5,6,7] has traditionally been attributed to CNM effects.

Among the CNM effects, three primary initial-state effects are: modifications of the nuclear parton distribution functions [8,9,10,11], parton saturation effects in the incident nucleus [12], and parton energy loss through interactions with the nuclear medium [13, 14]. On the other hand, the absorption of the heavy quark–antiquark pair through interactions with the co-moving nuclear medium [15,16,17,18] is considered to be a final-state effect. In proton–lead (\(p\)+Pb) collisions, the modification of quarkonium production with respect to that in pp collisions may be quantified by the nuclear modification factor, \(R_{p\mathrm {Pb}}\), which is defined as the ratio of the quarkonium production cross section in \(p\)+Pb collisions to the cross section measured in pp collisions at the same centre-of-mass energy, scaled by the number of nucleons in the lead nucleus:

$$\begin{aligned} R_{p\mathrm {Pb}} = \frac{1}{208}\frac{\sigma _{p+\text{ Pb }}^{\mathcal {O}(n\text {S})} }{\sigma _{pp}^{\mathcal {O}(n\text {S})} }, \end{aligned}$$

where \(\mathcal {O}(n\text {S})\) represents one of five measured quarkonium states, \(J/\psi \), \(\psi (\text {2S}) \), \(\varUpsilon (\text {1S}) \), \(\varUpsilon (\text {2S}) \) and \(\varUpsilon (\text {3S}) \). Several measurements of CNM effects in quarkonium production were performed with \(p\)+Pb data collected in 2013 at the LHC at a centre-of-mass energy per nucleon pair \(\sqrt{s_{_\text {NN}}} =5.02~\mathrm {TeV}\). Measurements of the \(J/\psi \) nuclear modification factor and forward (p beam direction) to backward (\(\mathrm {Pb}\) beam direction) cross-section ratio by the ALICE [19, 20] and LHCb [21] experiments show strong suppression at large rapidity and low transverse momentum. However, no strong modification of \(J/\psi \) production is observed at small rapidities and high transverse momentum by the ATLAS [22] or CMS [23] experiments indicating that the CNM effects have strong rapidity and/or transverse momentum dependence. The CNM effects in excited quarkonium states with respect to the ground state can be quantified by the double ratio, \(\rho _{p\mathrm {Pb}}^{\mathcal {O}(n\text {S})/\mathcal {O}(\text {1S})}\), defined as:

$$\begin{aligned} \rho _{p\mathrm {Pb}}^{\mathcal {O}(n\text {S})/\mathcal {O}(\text {1S})} = \frac{ R_{p\mathrm {Pb}} (\mathcal {O}(n\text {S})) }{ R_{p\mathrm {Pb}} (\mathcal {O(\text {1S})}) } = \frac{ \sigma _{p+\text{ Pb }}^{\mathcal {O}(n\text {S})} }{\sigma _{p+\text{ Pb }}^{\mathcal {O(\text {1S})}}} / \frac{ \sigma _{pp}^{\mathcal {O}(n\text {S})} }{\sigma _{pp}^{\mathcal {O(\text {1S})}}}, \end{aligned}$$

where \(n = 2\) for charmonium and \(n = 2~\text {or}~3\) for bottomonium. In the double ratio, most sources of detector systematic uncertainty cancel out, and measurements of this quantity by different experiments can easily be compared. The initial-state effects are expected to be largely cancelled out in double ratio due to the same modifications affecting partons before the formation of the quarkonium state, so measuring the relative suppression of different quarkonium states should help in understanding the properties of the final-state effects separately from the initial ones. The PHENIX experiment at RHIC has presented measurements of \(\psi (\text {2S}) \) suppression at mid-rapidity for d+Au interactions at \(\sqrt{s_{_\text {NN}}} = 200~\text {GeV}\), showing that the charmonium double ratio is smaller than unity, and decreases from peripheral to central collisions [24]. At the LHC, inclusive \(J/\psi \) [19] and \(\psi (\text {2S}) \) [25] production has been measured by the ALICE experiment in \(p\)+Pb collisions at \(\sqrt{s_{_\text {NN}}} =5.02~\mathrm {TeV}\) at forward rapidity. Those measurements show a significantly larger suppression of the \(\psi (\text {2S}) \) compared to that measured for \(J/\psi \).

The \(\varUpsilon (n\text {S}) \) (\(n=2,3\)) to \(\varUpsilon (\text {1S}) \) double ratios are both found to be less than unity by the CMS experiment in \(p\)+Pb collisions at \(\sqrt{s_{_\text {NN}}} =5.02~\mathrm {TeV}\) [26]. A double ratio which is smaller than unity suggests the presence of final-state interactions that affect the excited states more strongly than the ground state, since initial-state effects are expected to cancel. The CMS \(p\)+Pb results indicate that the CNM effect partially contributes to the strong relative suppression found in previous CMS measurements [27,28,29] of Pb+Pb collisions at \(\sqrt{s_{_\text {NN}}} = 2.76~\text {TeV}\).

In this paper, four classes of experimental measurements are presented. The first class of measurements is differential production cross sections of \(J/\psi \), \(\psi (\text {2S}) \), and \(\varUpsilon (n\text {S}) \) (\(n = 1, 2, 3\)) in pp collisions at \(\sqrt{s}=5.02~\mathrm {TeV}\) and \(p\)+Pb collisions at \(\sqrt{s_{_\text {NN}}} =5.02~\mathrm {TeV}\). The second is the centre-of-mass rapidity dependence and transverse momentum dependence of \(J/\psi \) and \(\varUpsilon (\text {1S}) \) nuclear modification factors, \(R_{p\mathrm {Pb}} \). The third is the evolution of the quarkonium yields with \(p\)+Pb collision centrality [30] studied using ratios of the yields of quarkonia to that of Z bosons and the correlation between quarkonium yields and event activity, where both are normalised by their average values over all events. The fourth is the charmonium and bottomonium double ratios, \(\rho _{p\mathrm {Pb}}^{\mathcal {O}(n\text {S})/\mathcal {O}(\text {1S})}\), presented as a function of centre-of-mass rapidity and centrality.

2 ATLAS detector

The ATLAS experiment [31] at the LHC is a multi-purpose detector with a forward-backward symmetric cylindrical geometry and a nearly \(4\pi \) coverage in solid angle.Footnote 1 It consists of an inner tracking detector (ID) surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, electromagnetic and hadron calorimeters, and a muon spectrometer (MS). The ID covers the pseudorapidity range \(|\eta | < 2.5\). It consists of silicon pixel, silicon micro-strip, and gaseous transition radiation tracking detectors. A new innermost insertable B-layer [32, 33] installed during the first LHC long shutdown (2013 to 2015) has been operating as a part of the silicon pixel detector since 2015. The calorimeter system covers the pseudorapidity range \(|\eta | < 4.9\). Within the region \(|\eta | < 3.2\), electromagnetic calorimetry is provided by barrel and endcap high-granularity lead/liquid-argon (LAr) electromagnetic calorimeters, with an additional thin LAr presampler covering \(|\eta | < 1.8\) to correct for energy loss in material upstream of the calorimeters. Hadronic calorimetry is provided by a steel/scintillator tile calorimeter, segmented into three barrel structures within \(|\eta | < 1.7\), and two copper/LAr hadronic endcap calorimeters. The solid angle coverage is completed with forward copper/LAr and tungsten/LAr calorimeter (FCal) modules optimised for electromagnetic and hadronic measurements respectively. The MS comprises separate trigger and high-precision tracking chambers measuring the deflection of muons in a magnetic field generated by superconducting air-core toroids. Monitored drift tubes and cathode strip chambers are designed to provide precise position measurements in the bending plane in the range \(|\eta | < 2.7\). Resistive plate chambers (RPCs) and thin gap chambers (TGCs) with a coarse position resolution but a fast response time are used primarily to trigger on muons in the ranges \(|\eta | < 1.05\) and \(1.05< |\eta | < 2.4\) respectively.

The ATLAS trigger system [34, 35] is separated into two levels: the hardware-based level-1 (L1) trigger and the software-based high level trigger (HLT), which reduce the proton–proton/lead collision rate to several-hundred Hz of events of interest for data recording to mass storage. The L1 muon trigger requires coincidences between hits on different RPC or TGC planes, which are used as a seed for the HLT algorithms. The HLT uses dedicated algorithms to incorporate information from both the MS and the ID, achieving position and momentum resolution close to that provided by the offline muon reconstruction, as shown in Ref. [34]. During the first LHC long shutdown additional RPCs were installed to cover the acceptance holes at the bottom of the MS and additional TGC coincidence logic was implemented for the region \(1.3< |\eta | < 1.9\) to reduce backgrounds. More details about the improvement in the trigger system during the long shutdown can be found in Ref. [35].

3 Datasets and Monte Carlo samples

This analysis includes data from \(p\)+Pb collisions recorded at the LHC in 2013 and \(pp\) collisions recorded in 2015, both at a centre-of-mass energy of \(5.02~\text {TeV}\) per nucleon pair. These data samples correspond to a total integrated luminosity of \(28~\text{ nb }^{-1}\) and \(25~\text{ pb }^{-1}\) for \(p\)+Pb and pp collisions respectively.

The \(p\)+Pb collisions result from the interactions of a proton beam with an energy of \(4~\text {TeV}\) and a lead beam with an energy of \(1.58~\text {TeV}\) per nucleon. The usual rapidity, y, in the laboratory frame is defined as \(y = 0.5 \ln [(E + p_z) / (E - p_z)]\), where E and \(p_z\) refer to energy and longitudinal momentum respectively. In the \(p\)+Pb collision configuration, the proton–nucleon centre-of-mass rapidity, \(y^*\), had a shift of \(\Delta y = 0.465\) with respect to y in the laboratory frame. After \(60\%\) of the data were recorded the directions of the proton and lead beams were reversed. In this paper, all data from both periods are presented in \(y^*\), using an additional convention that the proton beam always travels in the direction of positive \(y^*\).

Monte Carlo (MC) simulations [36] of \(p\)+Pb and pp collision events are used to study muon trigger and reconstruction efficiencies, and quarkonium signal yields extraction. Events were generated using Pythia 8 [37] with the CTEQ61L [38] parton distribution functions. In each event, one of the five quarkonium states, \(J/\psi \), \(\psi (\text {2S}) \), and \(\varUpsilon (n\text {S}) \) (\(n = 1, 2, 3\)), was produced unpolarised, as motivated by previous measurements at the LHC energy [39,40,41], and forced to decay via the dimuon channel. The response of the ATLAS detector was simulated using Geant 4 [42]. The simulated events were reconstructed with the same algorithms used for data.

4 Event selection

Candidate events in \(p\)+Pb collisions were collected with a dimuon trigger which requires one muon to pass the identification requirement at L1. In addition, the L1 muon candidate must be confirmed in the HLT as a muon with \(p_{\text {T}} > 2~\text {GeV}\), and at least one more muon candidate with \(p_{\text {T}} > 2~\text {GeV}\) must be found in a search over the full MS system. In pp collisions the candidate events were collected with a different dimuon trigger which requires at least two L1 muon candidates with \(p_{\text {T}} > 4~\text {GeV}\). Subsequently in the HLT, the two L1 candidates must be confirmed as muons from a common vertex with opposite-sign charges.

In the offline analysis, events are required to have at least one reconstructed primary vertex with at least four tracks and at least two muons originating from a common vertex, each with \(p_{\text {T}} > 4~\text {GeV}\) and matching an HLT muon candidate associated with the event trigger. The selected muons are required to be Combined [43] and Tight [44] in \(p\)+Pb and \(pp\) collisions respectively, where Combined implies that a muon results from a track in the ID which can be combined with one in the MS, and Tight requires a strict compatibility between the two segments. To ensure high-quality triggering and accurate track measurement, each muon track is further restricted to \(|\eta | < 2.4\). Pairs of muon candidates satisfying these quality requirements, and with opposite charges, are selected as quarkonia candidate pairs. All candidate pairs that satisfy the criteria discussed above, including those events with additional interactions in the same bunch crossing (known as “pile-up” events), are used for the cross-section measurements.

In order to characterise the \(p\)+Pb collision geometry, each event is assigned to a centrality class based on the total transverse energy measured in the FCal on the Pb-going side (backwards). Collisions with more (fewer) participating nucleons are referred to as central (peripheral). Following Ref. [30], the centrality classes used for this analysis, in order from most central to most peripheral, are 0–5, 5–10, 10–20, 20–30, 30–40, 40–60, and 60–\(90\%\).

5 Analysis

5.1 Cross-section determination

The double-differential cross section multiplied by the dimuon decay branching fraction is calculated for each measurement interval as:

$$\begin{aligned} \frac{\mathrm {d}^2\sigma _{\mathcal {O}(n\text {S})}}{\mathrm {d}p_{\text {T}} \mathrm {d}y^*}\times B(\mathcal {O}(n\text {S})\rightarrow \mu ^+\mu ^-) = \frac{N_{\mathcal {O}(n\text {S})}}{\Delta p_{\text {T}} \times \Delta y\times L}, \end{aligned}$$
(1)

where L is the integrated luminosity, \(\Delta p_{\text {T}} \) and \(\Delta y\) are the interval sizes in terms of dimuon transverse momentum and centre-of-mass rapidity respectively, and \(N_{\mathcal {O}(n\text {S})}\) is the observed quarkonium yield in the kinematic interval under study, extracted from fits and corrected for acceptance, trigger and reconstruction efficiencies. The total correction weight assigned to each selected dimuon candidate is given by:

$$\begin{aligned} w_{\mathrm {total}}^{-1} = \mathcal {A}(\mathcal {O}(n\text {S})) \cdot \varepsilon _{\mathrm {reco}}\cdot \varepsilon _{\mathrm {trig}}, \end{aligned}$$
(2)

where \(\mathcal {A}(\mathcal {O}(n\text {S}))\) is the acceptance of the dimuon system for one of the five quarkonium states, \(\varepsilon _{\mathrm {reco}}\) is the dimuon reconstruction efficiency and \(\varepsilon _{\mathrm {trig}}\) is the trigger efficiency.

5.2 Acceptance

The acceptance of quarkonium decays into muon pairs is defined as the probability of both muons from the decay falling in the fiducial region (\(p_{\text {T}} (\mu ^{\pm }) > 4~\text {GeV}, |\eta (\mu ^{\pm })|<2.4\)). The acceptance depends on transverse momentum, rapidity, invariant mass and the spin-alignment of the quarkonium state. The invariant mass of each state is taken to be the generator-level mass. Previous measurements in pp collisions [39,40,41] indicate that decays of quarkonia produced at LHC energies are consistent with the assumption that they are unpolarised. Based on this assumption, and with a further assumption that the nuclear medium does not modify the average polarisation of produced quarkonia, all quarkonium states in both the \(p\)+Pb and pp collisions are considered to be produced unpolarised in this paper. The acceptance, \(\mathcal {A}(\mathcal {O}(n\text {S}))\), for each of \(J/\psi \), \(\psi (\text {2S}) \), and \(\varUpsilon (n\text {S}) \) (\(n = 1, 2, 3\)) as a function of quarkonium transverse momentum and |y| is calculated using generator-level MC, applying cuts on the \(p_{\text {T}} \) and \(\eta \) of the muons to emulate the fiducial volume as described in Refs. [45, 46]. The reconstructed dimuon transverse momentum, \(p_{\text {T}} ^{\mu \mu }\), is used for obtaining the acceptance correction for a given event. However, the reconstructed dimuon transverse momentum and the quarkonium transverse momentum could be different due to final-state radiation from muons. Corrections for final-state radiation are obtained by comparing acceptances calculated from generator-level muons with those after full detector simulation. The final-state radiation corrections as a function of \(p_{\text {T}} ^{\mu \mu }\) are applied to the acceptance corrections. The correction factors are different for charmonium and bottomonium states but are the same for ground and excited states. Finally, the same correction factors are used in \(pp\) and \(p\)+Pb data.

5.3 Muon reconstruction and trigger efficiency

The single muon reconstruction efficiency in \(p\)+Pb data is determined directly from data using \(J/\psi \rightarrow \mu ^{+} \mu ^{-} \) tag-and-probe method as used in Ref. [22, 43], in which the tag muon is required to match with the trigger used to select the sample such that the probe muon is unbiased from the sample selection trigger, and the purity of the probe is guaranteed by background subtraction based on \(J/\psi \rightarrow \mu ^{+} \mu ^{-} \) decay. The dimuon trigger efficiency in \(p\)+Pb data is factorised into single-muon trigger efficiencies for reconstructed muons at the L1 and HLT, with the correlation between the L1 and HLT trigger algorithms taken into account. The single-muon trigger efficiencies are obtained from data, based on \(J/\psi \rightarrow \mu ^{+} \mu ^{-} \) tag-and-probe method as described in Ref. [22], in intervals of \(p_{\text {T}} (\mu )\) and \(q\times \eta (\mu )\), where q is the charge of the muon.

For the pp data, the same \(J/\psi \rightarrow \mu ^{+} \mu ^{-} \) tag-and-probe technique as for \(p\)+Pb data is used to determine single muon reconstruction and trigger efficiencies. The dimuon trigger efficiency in the pp data consists of two components. The first part represents the trigger efficiency for a single muon in intervals of \(p_{\text {T}} (\mu )\) and \(q\times \eta (\mu )\). The second part is a dimuon correction term to account for reductions in the trigger efficiency due to close-by muon pairs identified as single muon candidates at L1. The dimuon correction term, which is determined separately for charmonium and bottomonium candidates, also accounts for inefficiency due to the vertex-quality requirement and opposite-sign charge requirement on the two online candidates. The efficiency and the dimuon correction term obtained from the MC simulation are used to correct data in order to suppress the statistical fluctuations of measured corrections. The measured average single-muon trigger efficiency is about \(80\%\) (\(95\%\)) in the range \(|\eta (\mu )| < 1.05\) (\(1.05< |\eta (\mu )| < 2.4\)). In addition to the main corrections derived from simulation, data-to-simulation scale factors, which are simple linear factors to account for the differences between data and MC simulation, are also applied. The resulting scale factor is found to be about \(92\%\) in the range \(|\eta (\mu )| < 1.05\) and about \(98\%\) in the range \(1.05< |\eta (\mu )| < 2.4\) without apparent \(p_{\text {T}} \) dependence in both \(\eta \) regions.

5.4 Yield extraction

Charmonium

The charmonium yield determination decomposes the yields into two sources of muon pairs referred to as “prompt” and “non-prompt”. The prompt \(J/\psi \) and \(\psi (\text {2S}) \) signal originates from the strong production of short-lived particles, including feed-down from other short-lived charmonium states, while non-prompt refers to \(J/\psi \) and \(\psi (\text {2S}) \) mesons which are the decay products of b-hadrons. To distinguish between these prompt and non-prompt processes, the pseudo-proper lifetime, \(\tau _{\mu \mu } = (L_{xy}m_{\mu \mu })/p_{\text {T}} ^{\mu \mu }\), is used. The transverse displacement, \(L_{xy}\), is the distance of the dimuon secondary vertex from the primary vertex along the dimuon momentum direction in the transverse plane. Two-dimensional unbinned maximum-likelihood fits, as used in a previous ATLAS measurement [47], are performed on weighted distributions of the dimuon invariant mass (\(m_{\mu \mu }\)) and pseudo-proper lifetime (\(\tau _{\mu \mu }\)) to extract prompt and non-prompt signal yields, in intervals of \(p_{\text {T}} ^{\mu \mu }\), rapidity and centrality. The event weight is given by Eq. (2). To obtain the acceptance corrections, \(J/\psi \) acceptance is applied to events with \(m_{\mu \mu } < 3.2~\text {GeV}\), \(\psi (\text {2S}) \) acceptance is applied to events with \(m_{\mu \mu } > 3.5~\text {GeV}\) and a linear interpolation of the two acceptances is used for events with \(3.2< m_{\mu \mu } < 3.5~\text {GeV}\). Each interval of \(p_{\text {T}} ^{\mu \mu }\), rapidity and centrality is fitted independently in the RooFit framework [48]. The two-dimensional probability density function (PDF) in \(m_{\mu \mu }\) and \(\tau _{\mu \mu }\) for the fit model is defined as:

$$\begin{aligned} \text {PDF}(m_{\mu \mu },\tau _{\mu \mu }) = \sum ^7_{i=1}\kappa _i f_i(m_{\mu \mu }) \cdot h_i(\tau _{\mu \mu }) \otimes g(\tau _{\mu \mu }), \end{aligned}$$

where \(\otimes \) implies a convolution, \(\kappa _i\) is the normalisation factor of each component and \(g(\tau _{\mu \mu })\) is a double Gaussian \(\tau _{\mu \mu }\) resolution function. The two Gaussian components share a fixed mean at \(\tau _{\mu \mu } = 0\). One of the two widths in the resolution function is free, while the other width is fixed at the first one multiplied by a constant factor, determined from MC simulation. The relative fraction of the two Gaussian components is a free parameter. The details of the seven components in the nominal fit model are summarised in Table 1 and described below.

Table 1 Probability density functions for individual components in the central fit model used to extract the prompt and non-prompt contributions for charmonium signals and backgrounds. The composite PDF terms are defined as follows: CB Crystal Ball; G Gaussian; E Exponential; F constant distribution; \(\delta \) delta function. The parameter \(\omega _i\) is the fraction of CB component in signal

The signal charmonium line shape in \(m_{\mu \mu }\) is described by the sum of a Crystal Ball shape (CB) [49] and a single Gaussian function with a common mean. The width parameter in the CB function is free, while the Gaussian width is fixed with respect to the CB width by a constant factor motivated by the ratio of muon transverse momentum resolutions in different parts of the detector. The rest of the parameters in the CB function are fixed to values obtained from MC simulation. The mean and width of the \(\psi (\text {2S}) \) are fixed to those of the \(J/\psi \) multiplied by a factor equal to the ratio of the measured masses of the \(\psi (\text {2S}) \) and the \(J/\psi \) [50]. The relative fraction of the CB and Gaussian components is considered to be a free parameter, but one that is common to both the \(J/\psi \) and \(\psi (\text {2S}) \). The prompt charmonium line shapes in \(\tau _{\mu \mu }\) are described by a \(\delta \) function convolved with the resolution function \(g(\tau _{\mu \mu })\), whereas the non-prompt charmonium signals have pseudo-proper lifetime line shapes given by an exponential function convolved with \(g(\tau _{\mu \mu })\).

The background contribution contains one prompt component and two non-prompt components. The prompt background is given by a \(\delta \) function convolved with \(g(\tau _{\mu \mu })\) in \(\tau _{\mu \mu }\) and a constant distribution in \(m_{\mu \mu }\). One of the non-prompt background contributions is described by a single-sided exponential function convolved with \(g(\tau _{\mu \mu })\) (for positive \(\tau _{\mu \mu }\) only), and the other non-prompt background contribution is described by a double-sided exponential function convolved with \(g(\tau _{\mu \mu })\) accounting for misreconstructed or combinatoric dimuon pairs. The two non-prompt backgrounds are parameterised as two independent exponential functions in \(m_{\mu \mu }\).

There are in total seventeen free parameters in the charmonium fit model. The normalisation factor \(\kappa _i \) of each component is extracted from each fit. From these parameters, and the weighted sum of events, all measured values are calculated. Figure 1 shows examples of charmonium fit projections onto invariant mass and pseudo-proper lifetime axes. The fit projections are shown for the total prompt signal, total non-prompt signal and total background contributions.

Fig. 1
figure 1

Projections of the charmonium fit results onto dimuon invariant mass \(m_{\mu \mu }\) (left) and pseudo-proper lifetime \(\tau _{\mu \mu }\) (right) for pp collisions at \(\sqrt{s}=5.02~\mathrm {TeV}\) (top) for the kinematic ranges \(10< p_{\text {T}} ^{\mu \mu } < 11~\text {GeV}\) and \(|y| < 2.0\), and \(p\)+Pb collisions at \(\sqrt{s_{_\text {NN}}} =5.02~\mathrm {TeV}\) (bottom) for the kinematic ranges \(10< p_{\text {T}} ^{\mu \mu } < 11~\text {GeV}\) and \(-2.0< y^* < 1.5\). The goodnesses of the invariant mass fits with \(\text {ndof} = 63\) and the pseudo-proper lifetime fits with \(\text {ndof} = 153\) are also presented

Bottomonium

The yields of bottomonium states are obtained by performing unbinned maximum likelihood fits of the weighted invariant mass distribution, in intervals of \(p_{\text {T}} ^{\mu \mu }\), rapidity and centrality. Due to overlaps between the invariant mass peaks of different bottomonium states, the linear acceptance interpolation used for the charmonium states is not appropriate to the bottomonium states. Instead, each interval is fitted three times to extract the corrected yields of the three different bottomonium states, each time with the acceptance weight of one of the three states assigned to all candidates. Each of the fits in each interval of \(p_{\text {T}} ^{\mu \mu }\), rapidity and centrality is independent of all the others.

The bottomonium signal invariant mass model is essentially the same as the charmonium model. The mean and width of the \(\varUpsilon (\text {1S}) \) is free, while the means and widths of \(\varUpsilon (\text {2S}) \) and \(\varUpsilon (\text {3S}) \) are fixed with respect to parameters of \(\varUpsilon (\text {1S}) \) with a constant scaling factor equal to the \(\varUpsilon (n\text {S}) \) to \(\varUpsilon (\text {1S}) \) mass ratio taken from Ref. [50]. The bottomonium background parameterisation varies with \(p_{\text {T}} ^{\mu \mu }\). At low \(p_{\text {T}} ^{\mu \mu }\) (\(p_{\text {T}} ^{\mu \mu } < 6~\text {GeV}\)), and for all rapidity intervals, an error function multiplied by an exponential function is used to model the \(m_{\mu \mu }\) turn-on effects due to decreasing acceptance with decreasing invariant mass, which originates from the \(p_{\text {T}}\) selection applied to each muon. At low \(p_{\text {T}} ^{\mu \mu }\), the background model’s parameters are constrained by using a background control sample. The control sample is selected from dimuon events in which at least one of the muons has a transverse impact parameter with respect to the primary vertex larger than \(0.2~\mathrm {mm}\). This criterion causes the control sample to be dominated by muon pairs from the decay of b-hadrons. For candidates with higher \(p_{\text {T}} ^{\mu \mu }\), a second-order polynomial is used to describe the background contribution. At low \(p_{\text {T}} ^{\mu \mu }\), the background model is first fitted to the control sample, then the parameters of the error function are fixed at their fitted values, and finally the full fit model with the constrained background is applied to the data sample. Some selected bottomonium fits are shown in Fig. 2.

Fig. 2
figure 2

Bottomonium fit results in dimuon invariant mass \(m_{\mu \mu }\) for pp collisions at \(\sqrt{s}=5.02~\mathrm {TeV}\) (top) and \(p\)+Pb collisions at \(\sqrt{s_{_\text {NN}}} =5.02~\mathrm {TeV}\) (bottom) for one typical low \(p_{\text {T}} ^{\mu \mu }\) interval of \(1.5<p_{\text {T}} ^{\mu \mu }<3~\text {GeV}\) (left) and one high \(p_{\text {T}} ^{\mu \mu }\) interval of \(14<p_{\text {T}} ^{\mu \mu }<20~\text {GeV}\) (right). For all the shown fits, \(\varUpsilon (\text {1S}) \) acceptance weights are assigned. The goodness of the bottomonium fit with \(\text {ndof} = 24\) is also presented

6 Systematic uncertainties

The sources of systematic uncertainty in the quarkonium yields include acceptance, muon reconstruction and trigger efficiency corrections, the fit model parameterisation and bin migration corrections and the luminosity. For the ratio measurements the systematic uncertainties are assessed in the same manner as for the yields, except that in the ratios the correlated systematic uncertainties, such as the luminosity uncertainty, cancel out.

Luminosity

The uncertainty in integrated luminosity is \(2.7\%\) (\(5.4\%\)) for 2013 \(p\)+Pb (2015 pp) data-taking. The luminosity calibration is based on data from dedicated beam-separation scans, also known as van der Meer scans, as described in Ref. [51].

Acceptance

A systematic uncertainty for the final-state radiation corrections is assigned to cover the differences between correction factors obtained for ground and excited states of quarkonium and for different rapidity slices. The systematic uncertainties fully cancel out in ratio measurements in the same datasets and between different datasets.

Muon reconstruction and trigger efficiencies in \(p\)+Pb collisions

The dominant source of uncertainty in the muon reconstruction and trigger efficiency in \(p\)+Pb collisions is statistical. Therefore, the uncertainty in each bin is treated as uncorrelated and the corresponding uncertainties are propagated to the measured observables by using pseudo-experiments as in previous ATLAS measurements [47]. For each pseudo-experiment a new efficiency map is created by varying independently the content of each bin according to a Gaussian distribution. The mean and width parameters of the Gaussian distribution are respectively the value and uncertainty of the bin in the original map. In each pseudo-experiment, the total weight is recalculated for each dimuon kinematic interval of the analysis. A distribution of total weight is obtained from repeating pseudo-experiments for 200 times, which is sufficient to suppress the statistical fluctuation of the sample used in each experiment. For each efficiency type, the RMS of the total weight distributions is assigned as the systematic uncertainty.

An additional uncertainty of \(1\%\) is applied to cover the small muon reconstruction inefficiency in the inner detector in \(p\)+Pb collisions. The dimuon trigger efficiency factorisation is tested in simulation, and a bias of at most \(4\%\) is found in yield observables. The bias stems from the imperfect approximation of the correlation between trigger algorithms at different levels in the dimuon trigger factorisation. An additional correlated uncertainty of \(4\%\) is added to cover this bias. This uncertainty is applied to quarkonium yields in \(p\)+Pb collisions, but is assumed to cancel in ratios measured in the same datasets.

Muon reconstruction and trigger efficiencies in \(pp \) collisions

For the pp measurements, the efficiency maps are determined from MC simulation and corrected with measured data-to-simulation scale factors as detailed in Ref. [44]. The statistical uncertainty associated with efficiency scale factor is evaluated using random replicas of the efficiency maps as for \(p\)+Pb and the different sources of uncertainty described below are treated as correlated. The systematic uncertainty in reconstruction efficiency is obtained by varying the signal and background models in the fits used to extract the efficiency in data, and taking the difference between the reconstruction efficiency calculated using generator-level information and the value obtained with the tag-and-probe method in MC simulation. An additional \(1\%\) correlated uncertainty is added to cover a systematic variation due to a small misalignment in the ID. For the trigger efficiency, the following variations of the analysis are studied and the effects are combined in quadrature:

  • variations of signal and background fit model used to extract the data efficiency;

  • variations of the matching criteria between a muon and a trigger element;

  • using dimuon correction terms determined at positive (or negative) rapidity for whole rapidity range.

A test of the approximation of muon–muon correlation at L1 in the \(pp\) dimuon trigger factorisation in MC simulation results in a bias of at most \(4\%\), which is the same size as the factorisation bias of the \(p\)+Pb trigger but with totally different origins. An additional \(4\%\) correlated uncertainty is added to quarkonium yields to cover the bias. This uncertainty cancels out in ratio observables that are measured in the same datasets.

Bin migrations

Corrections due to bin migration factors were evaluated in Refs. [46, 47] and are determined to be less than \(0.5\%\) of the measured values. For this reason, bin migration correction factors and their uncertainties are neglected in this analysis.

Charmonium fit

The uncertainty from the signal and background line shapes is estimated from variations of the fit model. To remove the statistical component, each variation is repeated with pseudo-experiments, generated using the bootstrap method [52]. First, for each toy sample, every event from the original data is filled into the toy sample n times, where n is a random integer obtained from a Poisson distribution with a mean of one. Then the central model and a set of ‘variation’ models are fitted to the toy sample, and all measured quantities are recalculated. The difference between the central model and a given variation model is extracted and recorded. After repeating the pseudo-experiment 100 times, the systematic uncertainty due to the line shape is defined as the mean difference of a given variation model from the nominal model. Up to ten variation models are considered for the charmonium fit model, categorised into four groups:

  • Signal tail due to final-state radiation Evaluated by replacing the CB plus Gaussian model with a double Gaussian function, and varying the tail parameters of the CB model, which are originally fixed.

  • Pseudo-proper lifetime resolution Evaluated by replacing the double Gaussian function with a single Gaussian function to model pseudo-proper lifetime resolution.

  • Signal pseudo-proper lifetime shape Evaluated by using a double exponential function to describe the pseudo-proper lifetime distribution of the signal.

  • Background mass shapes Evaluated by using a second-order Chebyshev polynomial to describe the prompt, non-prompt and double-sided background terms.

The total systematic uncertainty from the line shape fit is determined by combining the maximum variation found in each of the four groups in quadrature. In order to estimate the possible bias introduced by the line shape assumptions in the nominal fit model parameterisation, the nominal model is tested using the \(J/\psi \rightarrow \mu ^{+} \mu ^{-} \) MC sample in which random numbers of prompt and non-prompt \(J/\psi \) are mixed. About \(1\%\) difference between the random input and fit model output is found for the yield and non-prompt fraction in the MC test. An additional systematic uncertainty of \(1\%\) is assigned to charmonium yields and non-prompt fractions to cover the nominal fit model parameterisation bias. This uncertainty cancels out in the \(\psi (\text {2S}) \) to \(J/\psi \) yield ratio.

Bottomonium fit

The systematic uncertainty from varying the fit model is estimated based on the same method as used for the charmonium fit uncertainty, and there are six variation models for bottomonium categorised into three groups:

  • Signal resolution Evaluated by replacing the CB plus Gaussian model with a single CB function and a triple Gaussian function, and varying the constant width scaling term between the CB function and the Gaussian function.

  • Signal tail due to final-state radiation Evaluated by replacing the CB plus Gaussian model with a double Gaussian function, and treating the tail parameters in the CB function as free parameters.

  • Background shapes Evaluated by replacing the low \(p_{\text {T}} \) background distribution with a fourth-order Chebyshev polynomial, and replacing the high \(p_{\text {T}} \) distribution by an exponential function or a second-order Chebyshev polynomial.

The total systematic uncertainty from the line shape fit is given by combining the maximum variation found in each of the three groups in quadrature. An additional systematic uncertainty of \(1.5\%\) is assigned to \(\varUpsilon (\text {1S}) \) yields and \(2\%\) for \(\varUpsilon (n\text {S}) \) yields and \(\varUpsilon (n\text {S}) \) to \(\varUpsilon (\text {1S}) \) ratios (\(n=2,3\)), in order to cover the bias of the nominal model found in MC tests similar to those for the charmonium fit model.

Table 2 summaries the systematic uncertainties in the ground-state and excited-state yields and their ratio. The dominant sources of systematic uncertainty for the yields are the fit model and muon trigger efficiency. The ranges of uncertainties shown in the table indicate the minimum and maximum values found in all \(p_{\text {T}} ^{\mu \mu }\), rapidity and centrality intervals. The large range of bottomonium fit systematic uncertainty is due to the different modelling of the background at low \(p_{\text {T}} ^{\mu \mu }\) (\(p_{\text {T}} ^{\mu \mu } < 6~\text {GeV}\)) and high \(p_{\text {T}} ^{\mu \mu }\). The systematic uncertainty from fit model variations is much larger at low \(p_{\text {T}} ^{\mu \mu }\) than at high \(p_{\text {T}} ^{\mu \mu }\). For the ratios measured in the same datasets, most sources of systematic uncertainty including the trigger efficiency largely cancel out.

Table 2 Summary of systematic uncertainties in the charmonium and bottomonium ground-state and excited-state yields and their ratio. The ranges of uncertainties indicate the minimum and maximum values found in all kinematic slices. Symbol “−” in the ratio observable column indicates the uncertainty fully cancels out

The luminosity systematic uncertainties in \(pp\) and \(p\)+Pb collisions are considered to be totally uncorrelated. The acceptance systematic uncertainties in \(pp\) and \(p\)+Pb collisions are fully correlated. The reconstruction efficiency systematic uncertainties are treated as uncorrelated due to different muon selection criteria in \(pp\) and \(p\)+Pb collisions. The uncertainties in the trigger efficiencies are also treated as uncorrelated since different efficiency determination strategies are used in \(pp\) and \(p\)+Pb collisions and the factorisation biases originate from different types of trigger correlations. The fit model variation systematic uncertainties are found to be partially correlated and their effects on \(R_{p\mathrm {Pb}} \) and \(\rho _{p\mathrm {Pb}}^{\mathcal {O}(n\text {S})/\mathcal {O}(\text {1S})} \) are determined by studying these ratios obtained from simultaneous fits to \(pp\) and \(p\)+Pb collision data for each variation.

7 Results

7.1 Production cross sections

Following the yield correction and signal extraction, the cross sections of five quarkonium states are measured differentially in transverse momentumFootnote 2 and rapidity, as described in Eq. (1).

The results for non-prompt \(J/\psi \) and \(\psi (\text {2S}) \) cross sections in pp collisions at \(\sqrt{s}=5.02~\mathrm {TeV}\) compared to fixed-order next-to-leading-logarithm (FONLL) predictions [53] are shown in intervals of \(p_{\text {T}} \) for different rapidity slices in Fig. 3. The FONLL uncertainties include renormalisation and factorisation scale variations, charm quark mass and parton distribution functions uncertainties as detailed in Ref. [53]. The measured non-prompt charmonium production cross sections agree with the FONLL predictions within uncertainties over the measured \(p_{\text {T}} \) range.

Fig. 3
figure 3

The differential non-prompt production cross section times dimuon branching fraction of \(J/\psi \) (left) and \(\psi (\text {2S}) \) (right) as a function of transverse momentum \(p_{\text {T}} \) for three intervals of rapidity y in pp collisions at \(5.02~\text {TeV}\). For each increasing rapidity slice, an additional scaling factor of 10 is applied to the plotted points for visual clarity. The horizontal position of each data point indicates the mean of the weighted \(p_{\text {T}} \) distribution. The horizontal bars represent the range of \(p_{\text {T}} \) for the bin, and the vertical error bars correspond to the combined statistical and systematic uncertainties. The FONLL theory predictions (see text) are also shown, and the error bands in the prediction correspond to the combined factorisation scale, quark mass and parton distribution functions uncertainties

The measured prompt \(J/\psi \) and \(\psi (\text {2S}) \) cross sections in pp collisions at \(\sqrt{s}=5.02~\mathrm {TeV}\) are shown in Fig. 4 in \(p_{\text {T}}\) and rapidity intervals, compared with non-relativistic QCD (NRQCD) predictions. The theory predictions are based on the long-distance matrix elements (LDMEs) from Refs. [54, 55], with uncertainties originating from the choice of scale, charm quark mass and LDMEs (see Refs. [54, 55] for more details). Figures 5 and 6 show the production cross section of \(\varUpsilon (n\text {S}) \) in pp collisions compared to similar NRQCD model calculations [56]. As stated in Ref. [56], the LDMEs for bottomonium production are only extracted from fitting experiment data at \(p_{\text {T}} > 15~\text {GeV}\). At lower \(p_{\text {T}} \), there might be non-perturbative effects which break the NRQCD factorization and perturbation expansion. As a consequence of its construction, the bottomonium NRQCD model gives a relatively good description of the measured \(\varUpsilon (n\text {S}) \) production cross section at \(p_{\text {T}} > 15~\text {GeV}\), while overestimates the production cross section at lower \(p_{\text {T}} \).

Fig. 4
figure 4

The differential prompt production cross section times dimuon branching fraction of \(J/\psi \) (left) and \(\psi (\text {2S}) \) (right) as a function of transverse momentum \(p_{\text {T}} \) for three intervals of rapidity y in pp collisions at \(5.02~\text {TeV}\). For each increasing rapidity slice, an additional scaling factor of 10 is applied to the plotted points for visual clarity. The horizontal position of each data point indicates the mean of the weighted \(p_{\text {T}} \) distribution. The horizontal bars represent the range of \(p_{\text {T}} \) for the bin, and the vertical error bars correspond to the combined statistical and systematic uncertainties. The NRQCD theory predictions (see text) are also shown, and the error bands in the prediction correspond to the combined scale, quark mass and LDMEs uncertainties

Fig. 5
figure 5

The differential production cross section times dimuon branching fraction of \(\varUpsilon (\text {1S}) \) as a function of transverse momentum \(p_{\text {T}} \) for three intervals of rapidity y in pp collisions at \(5.02~\text {TeV}\). For each increasing rapidity slice, an additional scaling factor of 10 is applied to the plotted points for visual clarity. The horizontal position of each data point indicates the mean of the weighted \(p_{\text {T}} \) distribution. The horizontal bars represent the range of \(p_{\text {T}} \) for the bin, and the vertical error bars correspond to the combined statistical and systematic uncertainties. The NRQCD theory predictions (see text) are also shown, and the error bands in the prediction correspond to the combined scale, quark mass and LDMEs uncertainties

Fig. 6
figure 6

The differential production cross section times dimuon branching fraction of \(\varUpsilon (\text {2S}) \) (left) and \(\varUpsilon (\text {3S}) \) (right) as a function of transverse momentum \(p_{\text {T}} \) for three intervals of rapidity y in pp collisions at \(5.02~\text {TeV}\). For each increasing rapidity slice, an additional scaling factor of 10 is applied to the plotted points for visual clarity. The horizontal position of each data point indicates the mean of the weighted \(p_{\text {T}} \) distribution. The horizontal bars represent the range of \(p_{\text {T}} \) for the bin, and the vertical error bars correspond to the combined statistical and systematic uncertainties. The NRQCD theory predictions (see text) are also shown, and the error bands in the prediction correspond to the combined scale, quark mass and LDMEs uncertainties

Fig. 7
figure 7

The differential cross section times dimuon branching fraction of prompt and non-prompt \(J/\psi \) (left) and \(\psi (\text {2S}) \) (right) as a function of transverse momentum \(p_{\text {T}} \) in \(p\)+Pb collisions at \(\sqrt{s_{_\text {NN}}} =5.02~\mathrm {TeV}\). The horizontal position of each data point indicates the mean of the weighted \(p_{\text {T}} \) distribution. The horizontal bars represent the range of \(p_{\text {T}} \) for the bin, and the vertical error bars correspond to the combined statistical and systematic uncertainties

Fig. 8
figure 8

The differential cross section times dimuon branching fraction of prompt and non-prompt \(J/\psi \) (left) and \(\psi (\text {2S}) \) (right) as a function of centre-of-mass rapidity \(y^*\) in \(p\)+Pb collisions at \(\sqrt{s_{_\text {NN}}} =5.02~\mathrm {TeV}\). The horizontal position of each data point indicates the mean of the weighted \(y^*\) distribution. The vertical error bars correspond to statistical uncertainties. The vertical sizes of coloured boxes around the data points represent the systematic uncertainties, and the horizontal sizes of coloured boxes represent the range of \(y^*\) for the bin

Fig. 9
figure 9

Left: the differential cross section times dimuon branching fraction of \(\varUpsilon (n\text {S}) \) as a function of transverse momentum \(p_{\text {T}} \) in \(p\)+Pb collisions at \(\sqrt{s_{_\text {NN}}} =5.02~\mathrm {TeV}\). The horizontal position of each data point indicates the mean of the weighted \(p_{\text {T}} \) distribution. The horizontal bars represent the range of \(p_{\text {T}} \) for the bin, and the vertical error bars correspond to the combined statistical and systematic uncertainties. Right: the differential cross section times dimuon branching fraction of \(\varUpsilon (n\text {S}) \) as a function of centre-of-mass rapidity \(y^*\) in \(p\)+Pb collisions. The horizontal position of each data point indicates the mean of the weighted \(y^*\) distribution. The vertical error bars correspond to statistical uncertainties. The vertical sizes of coloured boxes around the data points represent the systematic uncertainties, and the horizontal sizes of coloured boxes represent the range of \(y^*\) for the bin

The results for prompt and non-prompt production cross sections of \(J/\psi \) and \(\psi (\text {2S}) \) in \(p\)+Pb collisions at \(\sqrt{s_{_\text {NN}}} =5.02~\mathrm {TeV}\) are shown in intervals of \(p_{\text {T}} \) in Fig. 7. The results for prompt and non-prompt production cross sections of \(J/\psi \) and \(\psi (\text {2S}) \) in \(p\)+Pb collisions in intervals of \(y^*\) are shown in Fig. 8. Compared to the previous ATLAS measurement [22], improved muon trigger corrections which are smaller by \(6\%\) in central value and \(4\%\) in uncertainty and a more comprehensive fit model involving a wider mass range are used in the \(J/\psi \) cross-section measurements. The measured \(J/\psi \) cross sections are consistent with previous results within uncertainties. The measured differential production cross section of \(\varUpsilon (n\text {S}) \) in \(p\)+Pb collisions is shown in Fig. 9. Due to difficulties in separating \(\varUpsilon (\text {2S}) \) and \(\varUpsilon (\text {3S}) \) at forward and backward \(y^*\) intervals in \(p\)+Pb collisions, they are combined to obtain stable rapidity dependence of the production cross section.

7.2 Nuclear modification factor

The \(p_{\text {T}} \) dependence of \(R_{p\mathrm{Pb}}\) for the prompt and non-prompt \(J/\psi \) is shown in Fig. 10. Taking into account the correlated and uncorrelated uncertainties, both the prompt and non-prompt \(J/\psi \) \(R_{p\mathrm{Pb}}\) are consistent with unity across the \(p_{\text {T}} \) range from 8 to \(40~\text {GeV}\). The rapidity dependence of prompt and non-prompt \(J/\psi \) \(R_{p\mathrm{Pb}}\) is shown in Fig. 11. No significant rapidity dependence is observed. The \(p_{\text {T}} \) and rapidity dependence of \(\varUpsilon (\text {1S}) \) \(R_{p\mathrm{Pb}}\) is shown in Fig. 12. The \(\varUpsilon (\text {1S}) \) production in \(p\)+Pb collisions is found to be suppressed compared to \(pp\) collisions at low \(p_{\text {T}} \) (\(p_{\text {T}} < 15~\text {GeV}\)), and increases with \(p_{\text {T}}\). Low \(p_{\text {T}}\) \(\varUpsilon (\text {1S}) \) can probe smaller Bjorken-x region compared to \(J/\psi \) measured in \(8< p_{\text {T}} < 40~\text {GeV}\) [57], so the observed suppression of \(\varUpsilon (\text {1S}) \) production at low \(p_{\text {T}}\) may come from the reduction of hard-scattering cross sections due to stronger nPDF shadowing at smaller Bjorken-x. No significant rapidity dependence is observed, which qualitatively agrees with a prediction of weak rapidity dependence for central rapidities.

Fig. 10
figure 10

The nuclear modification factor, \(R_{p\mathrm {Pb}} \), as a function of transverse momentum \(p_{\text {T}} \) for prompt \(J/\psi \) (left) and non-prompt \(J/\psi \) (right). The horizontal position of each data point indicates the mean of the weighted \(p_{\text {T}} \) distribution. The vertical error bars correspond to the statistical uncertainties. The vertical sizes of coloured boxes around the data points represent the uncorrelated systematic uncertainties, and the horizontal sizes of coloured boxes represent the \(p_{\text {T}} \) bin sizes. The vertical sizes of the leftmost grey boxes around \(R_{p\mathrm {Pb}} = 1\) represent the correlated systematic uncertainty

Fig. 11
figure 11

The nuclear modification factor, \(R_{p\mathrm {Pb}}\), as a function of centre-of-mass rapidity \(y^*\) for prompt \(J/\psi \) (left) and non-prompt \(J/\psi \) (right). The horizontal position of each data point indicates the mean of the weighted \(y^*\) distribution. The vertical error bars correspond to the statistical uncertainties. The vertical sizes of coloured boxes around the data points represent the uncorrelated systematic uncertainties, and the horizontal sizes of coloured boxes represent the \(y^*\) bin sizes. The vertical sizes of the leftmost grey boxes around \(R_{p\mathrm {Pb}} = 1\) represent the correlated systematic uncertainty

Fig. 12
figure 12

The nuclear modification factor, \(R_{p\mathrm {Pb}}\), as a function of transverse momentum \(p_{\text {T}} \) (left) and centre-of-mass rapidity \(y^*\) (right) for \(\varUpsilon (\text {1S}) \). The horizontal position of each data point indicates the mean of the weighted \(p_{\text {T}} \) or \(y^*\) distribution. The vertical error bars correspond to the statistical uncertainties. The vertical sizes of coloured boxes around the data points represent the uncorrelated systematic uncertainties, and the horizontal sizes of coloured boxes represent the bin sizes. The vertical size of the rightmost (left) and leftmost (right) grey boxes around \(R_{p\mathrm {Pb}} = 1\) represent the correlated systematic uncertainty

The Z boson does not interact with the nuclear medium via the strong interaction, so it is considered a good reference process in \(p\)+Pb collisions for studying the centrality dependence of quarkonium production in a model-independent way. The quarkonium yield is compared to the Z boson yield from Ref. [58] in intervals of centrality. The ratio of quarkonium to the Z boson yield is defined as:

$$\begin{aligned} R_{p\mathrm {Pb}} ^{Z} (\mathcal {O}(n\text {S})) = \dfrac{ N_{\mathcal {O}(n\text {S})}^{\text {cent}} / N_Z^{\text {cent}} }{ N_{\mathcal {O}(n\text {S}) }^{0-90\%} / N_Z^{0-90\%} } \end{aligned}$$

where \(N_{\mathcal {O}(n\text {S})}^{\mathrm {cent}}\) (\(N_Z^{\text {cent}}\)) is the corrected quarkonium (Z boson) yield for one centrality class. The resulting \(R_{p\mathrm {Pb}} ^{Z} (\mathcal {O}(n\text {S}))\) is shown in Fig. 13 for the different quarkonium states in intervals of centrality. In each centrality interval, \(R_{p\mathrm {Pb}} ^{Z} (\mathcal {O}(n\text {S}))\) is normalised to the ratio integrated in the centrality range 0–\(90\%\) such that the normalised yield ratio is independent of production cross sections of the different quarkonium states. The prompt and non-prompt \(J/\psi \) are found to behave in a way very similar to the Z boson. The Z boson production is found to scale with the number of binary collisions in \(p\)+Pb collisions after applying the centrality bias correction factor [58, 59]. The centrality bias correction factor proposed in Ref. [59] does not depend on the physics process, so the measured \(R_{p\mathrm {Pb}} ^{Z} (J/\psi )\) suggests that centrality-bias-corrected \(J/\psi \) production also scales with the number of binary collisions. The measured \(R_{p\mathrm {Pb}} ^{Z} (\varUpsilon (\text {1S}))\) is consistent with being a constant except for the measurement in the most peripheral (60–\(90\%\)) \(p\)+Pb collisions, which is about two to three standard deviations away from the value observed in more central collisions. The current precision of \(R_{p\mathrm {Pb}} ^{Z} (\psi (\text {2S}))\) does not allow conclusions to be drawn about the centrality dependence of prompt \(\psi (\text {2S}) \) production with respect to Z bosons.

Fig. 13
figure 13

Prompt and non-prompt \(J/\psi \), prompt \(\psi (\text {2S}) \) and \(\varUpsilon (\text {1S}) \) to Z boson yield ratio, \(R_{p\mathrm {Pb}} ^Z\), as a function of event centrality in \(p\)+Pb collisions. The ratio is normalised to the ratio integrated in the centrality range 0–\(90\%\). The error bars represent statistical uncertainties. The vertical sizes of coloured boxes around the data points represent the uncorrelated systematic uncertainties, and the vertical size of the leftmost grey box around \(R_{p\mathrm {Pb}} ^Z = 1\) represents the correlated systematic uncertainty

Quarkonium self-normalised yields, \({ \mathcal {O}(n\text {S}) }/{\langle \mathcal {O}(n\text {S}) \rangle }\), are defined as the per-event yields of quarkonium in each centrality class normalised by the yield in the 0–\(90\%\) centrality interval. The correlation of quarkonium production with the underlying event is traced by comparing the self-normalised quarkonium yields with the respective self-normalised event activity. The event activity is characterised by the total transverse energy deposition in the backward FCal (\(3.1< |\eta | < 4.9\)), \(\Sigma E_{\text {T}} ^{\mathrm {Backwards}}\), on the Pb-going side, and it is determined in a minimum-bias data sample as used in Ref. [30]. The self-normalised quantities \({ \mathcal {O}(n\text {S}) }/{\langle \mathcal {O}(n\text {S}) \rangle }\) and \({\Sigma E_{\text {T}} ^{\mathrm {Backwards}}}/{\langle \Sigma E_{\text {T}} ^{\mathrm {Backwards}}\rangle }\) are defined as:

$$\begin{aligned} \frac{\mathcal {O}(n\text {S})}{\langle \mathcal {O}(n\text {S}) \rangle } \equiv \frac{N_{\mathcal {O}(n\text {S})}^{\mathrm {cent}}/N_{\mathrm {evt}}^{\mathrm {cent}} }{ N_{\mathcal {O}(n\text {S})}^{~0-90\%}/N_{\mathrm {evt}}^{~0-90\%} },\\ \frac{ \Sigma E_{\text {T}} ^{\mathrm {Backwards}} }{\langle \Sigma E_{\text {T}} ^{\mathrm {Backwards}} \rangle } = \frac{\langle \Sigma E_{\text {T}} ^{\mathrm {Backwards}} \rangle _{~\mathrm {cent}}}{\langle \Sigma E_{\text {T}} ^{\mathrm {Backwards}} \rangle _{~0-90\%}}, \end{aligned}$$

where \(N_{\mathrm {evt}}^{\mathrm {cent}}\) is the number of events in the minimum-bias sample for one centrality class. The measured self-normalised yields for prompt \(J/\psi \), non-prompt \(J/\psi \) and \(\varUpsilon (\text {1S}) \) in \(p\)+Pb collisions are shown in Fig. 14 in comparison with the same observable for \(\varUpsilon (\text {1S}) \) in a previous CMS measurement [26]. The event activity is determined in the range \(4.0< |\eta | < 5.2\) in CMS. The \(\varUpsilon (\text {1S}) \) self-normalised yields from ATLAS and CMS show a consistent trend. In the events with the highest event activity, a two-standard-deviation departure from the linear trend is observed. Since the same centrality dependence is found for ground-state quarkonium states and Z bosons as seen in Fig. 13, the deviation at highest event activity may suggest that the \(\Sigma E_{\text {T}} ^{\mathrm {Backwards}}\) characterised event activity is not a robust scale parameter, but a more complicated geometry model is needed for instance as discussed in Ref. [58].

Fig. 14
figure 14

Self-normalised yield for prompt \(J/\psi \), non-prompt \(J/\psi \) and \(\varUpsilon (\text {1S}) \), compared to \(\varUpsilon (\text {1S}) \) self-normalised yield ratio measured by CMS [26]. The horizontal position of each data point represents the normalised mean value of the \(\Sigma E_{\text {T}} ^{\mathrm {Backwards}}\) distribution in minimum-bias data sample. The error bar represents statistical uncertainties, and the vertical size of box underneath the data point represents the systematic uncertainties. The dotted line is a linear function with a slope equal to unity

7.3 Double ratio

The prompt \(\psi (\text {2S}) \) to \(J/\psi \) production double ratio, \(\rho _{p\mathrm {Pb}}^{\psi (\text {2S})/J/\psi }\), is shown in Fig. 15 in intervals of \(y^*\). A decreasing trend with one-standard-deviation significance of the double ratio is observed from backward to forward centre-of-mass rapidity. The \(p_{\text {T}} \) and \(y^*\) integrated bottomonium double ratios, \(\rho _{p\mathrm {Pb}}^{\varUpsilon (n\text {S})/\varUpsilon (\text {1S})}\) (\(n=2,3\)) are shown in Fig. 16. Both the integrated \(\rho _{p\mathrm {Pb}}^{\varUpsilon (\text {2S})/\varUpsilon (\text {1S})}\) and \(\rho _{p\mathrm {Pb}}^{\varUpsilon (\text {3S})/\varUpsilon (\text {1S})}\) are found to be less than unity by two standard deviations, and they are consistent with each other within the uncertainties. The double ratio as a function of centrality is shown in Fig. 17. Both \(\rho _{p\mathrm {Pb}}^{\psi (\text {2S})/J/\psi }\) and \(\rho _{p\mathrm {Pb}}^{\varUpsilon (\text {2S})/\varUpsilon (\text {1S})}\) are found to decrease slightly with increasing centrality at the significance level of one-standard-deviation, while conclusions about \(\rho _{p\mathrm {Pb}}^{\varUpsilon (\text {3S})/\varUpsilon (\text {1S})}\) are precluded by the size of the statistical uncertainties.

Fig. 15
figure 15

The prompt charmonium production double ratio, \(\rho _{p\mathrm {Pb}}^{\psi (\text {2S})/J/\psi } \), as a function of the centre-of-mass rapidity, \(y^*\). The vertical error bars correspond to the statistical uncertainties. The horizontal position of each data point indicates the mean of the weighted \(y^*\) distribution. The vertical sizes of coloured boxes around the data points represent the uncorrelated systematic uncertainties, and horizontal sizes of the coloured boxes represent the bin sizes

Fig. 16
figure 16

The bottomonium double ratio, \(\rho _{p\mathrm {Pb}}^{\varUpsilon (n\text {S})/\varUpsilon (\text {1S})}\), integrated in the whole measured \(p_{\text {T}} \) and \(y^*\) range. The vertical error bars correspond to the statistical uncertainties. The vertical sizes of boxes around the data points represent the systematic uncertainties

Fig. 17
figure 17

The prompt charmonium double ratio, \(\rho _{p\mathrm {Pb}}^{\psi (\text {2S})/J/\psi } \), (left) and the bottomonium double ratio, \(\rho _{p\mathrm {Pb}}^{\varUpsilon (n\text {S})/\varUpsilon (\text {1S})} \), (right) as a function of event centrality in \(p\)+Pb collisions . The vertical error bars correspond to the statistical uncertainties, and the vertical sizes of coloured boxes around the data points represent the uncorrelated systematic uncertainties in \(p\)+Pb collisions. The vertical size of the leftmost yellow box around \(\rho _{p\mathrm {Pb}}^{\mathcal {O}(n\text {S})/\mathcal {O}(\text {1S})} = 1\) represents the total uncertainty of the pp reference which is the same for all centralities

8 Summary

The double-differential production cross sections of five quarkonium states, \(J/\psi \), \(\psi (\text {2S}) \), \(\varUpsilon (n\text {S}) \) (\(n = 1, 2, 3\)) are measured using using \(p\)+Pb (\(pp\)) collision data corresponding to an integrated luminosity of \(28~\text{ nb }^{-1}\) (\(25~\text{ pb }^{-1}\)) at a centre-of-mass energy per nucleon pair of \(5.02~\text {TeV}\) collected by the ATLAS experiment at the LHC. The measured prompt charmonium production cross section in the range of \(8< p_{\text {T}} < 40~\text {GeV}\) is found to be compatible with non-relativistic QCD predictions, while only the bottomonium results at \(p_{\text {T}} > 15~\text {GeV}\) can be described by the non-relativistic QCD predictions. The measured non-prompt production cross sections of \(J/\psi \) and \(\psi (\text {2S}) \) in \(pp\) collisions are found to be consistent with fixed-order next-to-leading-logarithm calculations.

The nuclear modification factors of prompt and non-prompt \(J/\psi \) in \(p\)+Pb collisions, \(R_{p\mathrm {Pb}} \), measured for \(8< p_{\text {T}} < 40~\text {GeV}\) are found to be consistent with unity, and no apparent dependence on \(p_{\text {T}} \) or rapidity is observed in the measured range, which indicates weak modification of \(J/\psi \) production due to cold nuclear matter effects at central rapidity and high \(p_{\text {T}} \). The \(R_{p\mathrm {Pb}} \) for \(\varUpsilon (\text {1S}) \) is measured for \(p_{\text {T}} < 40~\text {GeV}\) and is found to be smaller than unity at \(p_{\text {T}} < 15~\text {GeV}\), increasing with \(p_{\text {T}} \) and becoming compatible with unity at high \(p_{\text {T}} \). The observed suppression of \(\varUpsilon (\text {1S}) \) production in \(p\)+Pb collisions at low \(p_{\text {T}} \) suggests that the nuclear parton distribution functions are modified relative to those of the nucleon. No apparent rapidity dependence of \(\varUpsilon (\text {1S}) ~R_{p\mathrm {Pb}} \) is observed. The production ratios of prompt and non-prompt \(J/\psi \) to Z boson are found to be constant in bins of centrality. As the Z boson production in \(p\)+Pb collisions was found to scale with the number of binary collisions after applying centrality bias correction factors, the same conclusion can be drawn for \(J/\psi \) production in \(p\)+Pb collisions. The self-normalised yields of ground-state quarkonium states in \(p\)+Pb collisions are found to correlate linearly with self-normalised event activity expected for events with the highest event activity where the self-normalised yields show two-standard-deviation departure from the linear correlation trend.

The prompt charmonium double ratio is found to decrease slightly from the backward to the forward centre-of-mass rapidity. The prompt \(\psi (\text {2S})\) production is suppressed with respect to prompt \(J/\psi \) production in \(p\)+Pb collisions with a significance of one standard deviation. The production of excited bottomonium states, \(\varUpsilon (\text {2S}) \) and \(\varUpsilon (\text {3S}) \), is found to be suppressed with respect to \(\varUpsilon (\text {1S}) \) in the integrated kinematic ranges of \(p_{\text {T}} < 40~\text {GeV}\) and \(-2< y^* < 1.5\) in \(p\)+Pb collisions with significance at the level of two standard deviations. Both the prompt \(\psi (\text {2S}) \) to \(J/\psi \) and \(\varUpsilon (\text {2S}) \) to \(\varUpsilon (\text {1S}) \) double ratios show decreasing behaviour in more central collisions. The decreasing trends from peripheral to central are at the significance level of one standard deviation. A stronger cold nuclear matter effect is observed in excited quarkonium states compared to that in ground states.

This work expands the kinematic range of measured charmonium and bottomonium cross sections in pp and \(p\)+Pb collisions. It thus serves as an additional dataset for constraining different models of cold nuclear matter effects and quantifying heavy quarkonium production. In particular, the behaviour of the ground-state yields as a function of centrality is found to match that of Z bosons, while excited states are relatively suppressed in more central collisions.