1 Introduction

Measurements of two-particle correlations (2PC) in relative azimuthal angle, \(\Delta \phi =\phi ^{{\mathrm {a}}}-\phi ^{{\mathrm {b}}}\), and pseudo-rapidity separationFootnote 1 \(\Delta \eta =\eta ^{{\mathrm {a}}}-\eta ^{{\mathrm {b}}}\) in proton–proton (pp) collisions show the presence of correlations in \(\Delta \phi \) at large \(\eta \) separation [1,2,3,4].Footnote 2 Recent studies by the ATLAS Collaboration demonstrate that these long-range correlations are consistent with the presence of a cosine modulation of the single-particle azimuthal angle distributions [2, 3], similar to that seen in nucleus-nucleus (A+A) [5,6,7,8,9,10,11,12,13,14] and proton-nucleus (\((p\hbox {+A})\) ) collisions [3, 15,16,17,18,19,20]. The modulation of the single-particle azimuthal angle distributions is typically characterized using a set of Fourier coefficients \(v_n\), also called flow harmonics, that describe the relative amplitudes of the sinusoidal components of the single-particle distributions:

$$\begin{aligned} \frac{{\mathrm {d}}N}{{\mathrm {d}}\phi }\propto \left( 1+2\sum _{n=1}^{\infty }v_{n}\cos \Bigl (n(\phi -\Phi _{n})\Bigr )\right) , \end{aligned}$$
(1)

where the \(v_n\) and \(\Phi _{n}\) denote the magnitude and orientation of the \(n{\mathrm {th}}\)-order single-particle anisotropies.

The \(v_n\) in \(\hbox {A+A}\) collisions result from anisotropies of the initial collision geometry, which are subsequently transferred to the azimuthal distributions of the produced particles by the collective evolution of the medium. This transfer of the spatial anisotropies in the initial collision geometry to anisotropies in the final particle distributions is well described by relativistic hydrodynamics [21,22,23,24,25]. The ATLAS measurements  [3] show that the \(p_{\mathrm{T}}\) dependence of the second-order harmonic, \(v_2\), in pp collisions is similar to the dependence observed in \((p\hbox {+A})\) and \(\hbox {A+A}\) collisions. Additionally, the \(v_2 (p_{\mathrm{T}}\,)\) in pp collisions shows no dependence on the centre-of-mass collision energy, \(\sqrt{s}\), from 2.76 TeV to 13 TeV, similar to what is observed in \((p\hbox {+A})\) and \(\hbox {A+A}\) collisions [5,6,7]. The observation that the \(p_{\mathrm{T}}\) and \(\sqrt{s}\) dependences of \(v_2\) are each strikingly similar between pp collisions and \((p\hbox {+A})\) and \(\hbox {A+A}\) collisions indicates the possibility of collective behaviour developing in pp collisions, although alternative models exist that qualitatively reproduce the features observed in the pp 2PC [26,27,28,29,30,31,32,33,34].

One feature in which the pp \(v_2\) differs from the \(v_2\) in \(\hbox {A+A}\) collisions is that the pp \(v_2\) is observed to be independent, within uncertainties, of the event multiplicity [2, 3], while the \(\hbox {A+A}\) \(v_2\) exhibit considerable dependence on the event multiplicity [5,6,7,8]. This dependence is understood to be due to a correlation between the collision geometry and collision impact parameter (\(b\)) [35]. In collisions with small \(b\) the second-order eccentricity \(\epsilon _{2}\) [36, 37] quantifying the ellipticity of the initial collision geometry is small, resulting in a small \(v_2\). Interactions at \(b \sim R\), where R is the nuclear radius, result in an overlap region that becomes increasingly elliptic, with \(\epsilon _{2}\) increasing with \(b\). This, in turn, generates larger \(v_2\). Thus, the strong correlation between the \(v_2\) and multiplicity is in fact the result of the dependence of the collision geometry on \(b\). There are multiple theoretical studies in \((p\hbox {+A})\) and \(\hbox {A+A}\) collisions which reproduce the \(b\) dependence of the \(v_n\) quite well [24]. However, there are very few such calculations for ppcollisions. A recent study, that models the proton substructure that can induce event-by-event fluctuations in the number of final particles, showed that the eccentricities \(\epsilon _{2}\) and \(\epsilon _{3}\) of the initial entropy-density distributions in ppcollisions have no correlation with the final particle multiplicity [38].

This paper reports the long-range correlations of charged particles measured in pp interactions that contain a Z  boson decaying to dimuons. The presence of a Z  boson selects events in which a hard scattering with momentum transfer \(\hbox {Q}^2\gtrapprox (80\,\mathrm {GeV\,})^{2}\) occurred. Based on the arguments in Ref. [39], such events on average may have a lower impact parameter, \(b\), than pp events without any requirement on \(\hbox {Q}^2\) (termed inclusive events in this paper). An assumption, driven by the measurements performed in \(\hbox {A+A}\) collisions, is that if the pp \(v_2\) is related to the eccentricity of the collision geometry, then events ‘tagged’ by a Z  boson having a smaller \(b\) might also have a smaller \(v_2\) value than that measured in inclusive events. As in previous ATLAS analyses of long-range correlations in \(p\hbox {+Pb}\)  and ppcollisions [2, 3, 17, 18], the measured charged-particle multiplicity, uncorrected for detector efficiency, is used to quantify the event activity.

The data used in previous ATLAS pp studies investigating structures observed in the long-range two-particle correlations, also known as ‘ridge’ [2, 3], were recorded under conditions of low instantaneous luminosity, for which the number of collisions per bunch crossing (\(\mu \)), was \(\mu \lesssim {1}\). However, the Z-boson dataset used in the present analysis is characterized by significantly higher luminosity conditions, with a typical \(\mu \) of about 20. This large luminosity poses significant complications to the correlation analysis, as it is not possible to fully separate reconstructed tracks associated with the interaction producing the Z  boson from tracks from other interactions (pile-up) in the same bunch crossing. In order to solve the problem of pile-up tracks, a new procedure is developed that on a statistical basis corrects the multiplicity and removes the contribution of pile-up tracks from the measured 2PC.

The paper is organized as follows. Section 2 gives a brief overview of the ATLAS detector subsystems. Section 3 describes the dataset, triggers and the offline selection criteria used to select events and reconstruct charged-particle tracks used in the analysis. Section 4 gives a brief overview of the two-particle correlation method and how it is used to obtain the \(v_2\). Section 5 details the corrections applied for analysing data in the presence of background from pile-up. In Sect. 6, the two-particle correlations are calculated following procedures described in Refs.  [2, 3]. The systematic uncertainties are detailed in Sect. 7 and the results are presented and discussed in Sect. 8. Section 9 gives the summary.

2 ATLAS detector

The ATLAS detector [40] at the LHC covers nearly the entire solid angle around the collision point. It consists of an inner tracking detector surrounded by a thin superconducting solenoid, electromagnetic and hadronic calorimeters, and a muon spectrometer incorporating three large superconducting toroid magnets. The inner-detector system (ID) consisting of a silicon pixel detector, a silicon microstrip tracker and a transition radiation tracker is immersed in a 2 T axial magnetic field. The ID provides charged-particle tracking in the range \(|\eta | < 2.5\).

The high-granularity silicon pixel detector covers the ppinteraction region and typically provides three measurements per track. In the 13 TeV data samples, the number of measurements per track is increased to four because an additional silicon layer, the insertable B-layer (IBL) detector [41, 42], was installed prior to the 13 TeV data-taking. The pixel detector is followed by the silicon microstrip tracker, which typically provides measurements of four two-dimensional points per track. These silicon detectors are complemented by the transition radiation tracker, which enables radially extended track reconstruction up to \(|\eta | = 2.0\), providing around 30 hits per track.

The calorimeter system covers the pseudorapidity range \(|\eta | < 4.9\). Within the region \(|\eta |< 3.2\), electromagnetic calorimetry is provided by barrel and endcap high-granularity lead/liquid-argon (LAr) electromagnetic calorimeters, with an additional thin LAr presampler covering \(|\eta | < 1.8\), to correct for energy loss in the material upstream of the calorimeters. Hadronic calorimetry is provided by a steel/scintillating-tile calorimeter, segmented into three barrel structures within \(|\eta | < 1.7\), and two copper/LAr hadronic endcap calorimeters. The solid angle coverage is completed with forward copper/LAr and tungsten/LAr calorimeter modules optimized for electromagnetic and hadronic measurements respectively.

The muon spectrometer (MS) comprises separate trigger and high-precision tracking chambers measuring the deflection of muons in a magnetic field generated by superconducting air-core toroids. The precision chamber system covers the region \(|\eta | < 2.7\) with three layers of monitored drift tubes, complemented by cathode strip chambers in the forward region, where the background is highest. The muon trigger system covers the range \(|\eta | < 2.4\) with resistive plate chambers in the barrel, and thin gap chambers in the endcap regions.

A multi-level trigger system is used to select events of interest for recording [43, 44]. The first-level (L1) trigger is implemented in hardware and uses a subset of detector information to reduce the event rate to \(\lesssim 100\hbox { kHz}\). The subsequent, software-based high-level trigger (HLT) selects events for recording.

3 Datasets, event and track selection

The analysis presented in this paper uses a \(\sqrt{s}\,=8\) TeV pp dataset with an integrated luminosity of 19.4 \(\hbox {fb}^{-1}\)  obtained by the ATLAS experiment in 2012 and a \(\sqrt{s}\,=13\) TeV pp dataset recorded in 2015 and 2016 with integrated luminosities of 3.2 \(\hbox {fb}^{-1}\)  and 32.9 \(\hbox {fb}^{-1}\) , respectively. All data used in the analysis come from data-taking periods where the beam and detector operations were stable, and the detector subsystems relevant for this analysis were fully operational.

The primary dataset used for the measurement was collected using the dimuon or high-\(p_{\mathrm{T}}\) single-muon triggers. The primary triggers used in this analysis apply a combination of L1 and HLT muon-trigger algorithms [44, 45] to select events with muons. For the 8 TeV analysis, events are selected using a single-muon trigger requiring \(p_{\mathrm{T}}\,> 36\) GeV or a dimuon trigger requiring \(p_{\mathrm{T}}\,> 18\) GeV for the first muon and \(p_{\mathrm{T}}\,>8\) GeV for the other. For the 13 TeV analysis a single-muon trigger with a \(p_{\mathrm{T}}\) threshold of 24 GeV or a dimuon trigger with a \(p_{\mathrm{T}}\)  threshold of 14 GeV for both muons are used to select events. These triggers are complemented by other triggers depending on the running conditions over the course of the data taking. A separate ‘zero bias’ trigger is used to select events effectively at random but with the same luminosity profile as the muon triggers. The zero-bias events are used to study charged-particle backgrounds arising from pile-up. Muons are reconstructed as combined tracks spanning both the ID and the MS [46, 47]. For this analysis, muons associated with the event primary vertex [48] are selected and required to have \(p_{\mathrm{T}}\,> 20\) GeV and \(|\eta |<2.4\). Track quality requirements are imposed in both the ID and MS to suppress backgrounds. In the analysis of the 13 TeV data, muons are also isolated using track-based and calorimeter-based isolation criteria studied in Ref. [47]. Events having exactly two such muons with opposite charge and pair invariant mass between 80 and 100 GeV are considered to be Z-boson candidate events. Data sample parameters are summarized in Table 1.

Table 1 The total integrated luminosity and number of Z-tagged events in the datasets used in this analysis

All events considered in this analysis are required to have at least one reconstructed primary vertex with at least two associated tracks [48]. Charged-particle tracks are reconstructed in the ID using the methods described in Refs. [49, 50]. Tracks selected for this analysis are required to pass a set of quality requirements on the number of used and missing hits in the detector layers according to the track reconstruction model [50] and to have \(p_{\mathrm{T}}\,>0.4\) GeV and \(|\eta |<2.5\). The ID tracks produced by Z-boson decay muons are not included in the 2PC analysis.

The track reconstruction efficiencies, \(\epsilon (p_{\mathrm{T}}\,,\eta )\), are calculated as a function of \(p_{\mathrm{T}}\)  and \(\eta \) from Monte Carlo (MC) simulations of pp collisions which are processed with a Geant4-based MC simulation [51] of the ATLAS detector [52]. In the 8 TeV data, the reconstruction efficiency ranges from approximately 70% at \(p_{\mathrm{T}}\,=0.4\) GeV to 80% at \(p_{\mathrm{T}}\,=5\) GeV for tracks at mid-rapidity (\(|\eta |<0.5\)). The efficiency at forward rapidity (\(2.0<|\eta |<2.5\)) varies between 55% at \(p_{\mathrm{T}}\,=0.4\) GeV to 75% at \(p_{\mathrm{T}}\,=5\) GeV . The 13 TeV data were reconstructed with the IBL installed and this leads to a higher efficiency of 85% (75%) for mid-rapidity (forward) tracks. The 13 TeV efficiency shows only a very weak \(p_{\mathrm{T}}\)  dependence.

Tracks resulting from secondary particles and tracks produced in pile-up interactions are suppressed by requiring:

$$\begin{aligned}&|d_{0}|< 1.5~\mathrm {mm}, \,\,\, |\omega |<0.75~\mathrm {mm}, \nonumber \\&\quad \omega \equiv \left( z_{0}-z_{\mathrm{vtx}}\,\right) \sin {\theta }, \end{aligned}$$
(2)

where \(d_0\) is the distance of the closest approach of the track to the beam line in the transverse plane, \(z_0\) and \(z_{\mathrm{vtx}}\) are the longitudinal coordinates of the track at \(d_0\) and the Z-tagged collision vertex, respectively, and \(\theta \) is the polar angle of the track.

4 Two-particle correlations

The study of two-particle correlations in this paper follows previous ATLAS measurements in pp collisions [2, 3] , with the additional complication of handling the pile-up, which is discussed later in Sect. 5. The two-particle correlations are measured as a function of the relative azimuthal angle \(\Delta \phi \equiv \phi ^a-\phi ^b\) for particles separated by \(|\Delta \eta |>2\). This pseudorapidity gap is used to study the long-range component of the correlations [2, 3]. The labels a and b denote the two particles in the pair, and in this paper are referred to as the ‘reference’ and ‘associated’ particles, respectively. The correlation function is defined as:

$$\begin{aligned} C(\Delta \phi ) =\frac{S(\Delta \phi )}{B(\Delta \phi )}, \end{aligned}$$
(3)

where S represents the pair distribution constructed using all particle pairs that can be formed from tracks that are associated with the event containing the Z-boson candidate and pass the selection requirements. The S distribution contains both the physical correlations between particle pairs and correlations arising from detector acceptance effects. The pair-acceptance distribution \(B(\Delta \phi )\), is similarly constructed by choosing the two particles in the pair from different events. The B distribution does not contain physical correlations, but has detector acceptance effects in \(\Delta \phi \) identical to those in S. By taking the ratio, S/B in Eq. (3), the detector acceptance effects cancel out, and the resulting \(C(\Delta \phi )\) contains physical correlations only. To correct \(S(\Delta \phi )\) and \(B(\Delta \phi )\) for the individual \(\phi \)-averaged inefficiencies of particles a and b, the pairs are weighted by the inverse product of their tracking efficiencies \(1/(\epsilon _a\epsilon _b)\). Statistical uncertainties are calculated for \(C(\Delta \phi )\) using standard uncertainty propagation procedures with the statistical variance of S and B in each \(\Delta \phi \) bin taken to be \(\sum 1/(\epsilon _a\epsilon _b)^2\), where the sum runs over all of the pairs included in the bin. Since the role of the reference and associated particles in the 2PC are different, when the reference and associated particles are from overlapping \(p_{\mathrm{T}}\)  ranges, the two pairings ab and ba are considered distinct and included separately in the pair distributions. However, including both pairings correlates the statistical fluctuations at \(\Delta \phi =\phi ^a-\phi ^b\) and \(\Delta \phi =\phi ^b-\phi ^a\). Thus the statistical uncertainties in the measured pair distributions are calculated by accounting for this correlation. This is done by increasing the contribution to the statistical error in the S and B distributions for such correlated pairs by \(\sqrt{2}\). The two-particle correlations are used only to study the shape of the correlations in \(\Delta \phi \), and their overall normalization does not matter. In this paper, the normalization of \(C(\Delta \phi )\) is chosen such that the \(\Delta \phi \)-averaged value of \(C(\Delta \phi )\) is unity.

Fig. 1
figure 1

Distribution of parameters: a vertex position \(z_{\mathrm{vtx}}\) , b instantaneous luminosity parameter measured as the number of interactions per bunch crossing \(\mu \), c the average number of pile-up tracks accepted in the analysis \(\nu \), in the three data-taking periods. The vertical dashed line in the right plot at \(\nu =7.5\) indicates the criterion below which the events are selected for the analysis

The strength of the long-range correlation can be quantified by extracting Fourier moments of the 2PC. The Fourier coefficients of the 2PC are denoted \(v_{n,n}\) and defined by:

$$\begin{aligned} C(\Delta \phi )=C_{0}\bigg (1+2\sum _{n}v_{n,n} \cos (n\Delta \phi )\bigg ). \end{aligned}$$
(4)

The \(v_{n,n}\) are directly related to the single-particle anisotropies \(v_n\) described in Eq. (1). In the case where the \(v_{n,n}\) entirely result from the convolution of the single particle anisotropies, for reference and associated particles with \(p_{\mathrm{T}}\,=p_{\mathrm {T}}^{a} \) and \(p_{\mathrm {T}}^{b}\) respectively, the \(v_{n,n} (p_{\mathrm {T}}^{a},p_{\mathrm {T}}^{b})\) is the product of the \(v_n (p_{\mathrm {T}}^{a})\) and \(v_n (p_{\mathrm {T}}^{b})\) [5], i.e.:

$$\begin{aligned} v_{n,n}(p_{\mathrm {T}}^{a},p_{\mathrm {T}}^{b})=v_{n}(p_{\mathrm {T}}^{a})v_{n}(p_{\mathrm {T}}^{b}). \end{aligned}$$
(5)

Thus, the \(v_n (p_{\mathrm {T}}^{a})\) can be obtained as:

$$\begin{aligned} v_n (p_{\mathrm {T}}^{a})=\frac{v_{n,n} (p_{\mathrm {T}}^{a},p_{\mathrm {T}}^{b})}{v_n (p_{\mathrm {T}}^{b})}=\frac{v_{n,n} (p_{\mathrm {T}}^{a},p_{\mathrm {T}}^{b})}{\sqrt{v_{n,n} (p_{\mathrm {T}}^{b},p_{\mathrm {T}}^{b})}}, \end{aligned}$$
(6)

where \(v_{n,n} (p_{\mathrm {T}}^{b},p_{\mathrm {T}}^{b})\) is the Fourier coefficient of the 2PC when both reference and associated particles are from the same \(p_{\mathrm{T}}\) range. This technique has been used extensively in heavy-ion collisions to obtain the flow harmonics [5]. However, in pp collisions a significant contribution to the 2PC arises from back-to-back dijets, which can correlate particles at large \(|\Delta \eta |\). These correlations must be removed before Eq. (5) or Eq. (6) can be used. In order to estimate the contribution from back-to-back dijets and other processes which correlate only a subset of all particles in the event, a template-fitting method was developed and used in two recent ATLAS measurements [2, 3]. The template-fitting procedure assumes that: (1) the jet–jet correlation has the same shape in \(\Delta \phi \) in low-multiplicity and in higher-multiplicity events; the only change is in the relative contribution of the dijets to the 2PC, (2) at low-multiplicity most of the structure of the 2PC arises from back-to-back dijets, i.e. the shape of the dijet correlation can be obtained from low-multiplicity events. With the above assumptions, the correlation in higher-multiplicity events \(C(\Delta \phi )\), is then described by a template fit, \(C^{\mathrm {templ}}(\Delta \phi )\) consisting of two components: 1) the correlation that accounts for the dijet contribution, \(C^{\mathrm {periph}}(\Delta \phi )\), measured in low-multiplicity events and scaled by a factor F, and 2) a long-range harmonic modulation, \(C^{\mathrm {ridge}}(\Delta \phi )\):

$$\begin{aligned} C^{\mathrm {templ}}(\Delta \phi )= & {} FC^{\mathrm {periph}}(\Delta \phi )+G\left( 1+2\sum _{n=2}v_{n,n}\cos (n\Delta \phi )\right) \nonumber \\\end{aligned}$$
(7)
$$\begin{aligned}\equiv & {} FC^{\mathrm {periph}}(\Delta \phi )+C^{\mathrm {ridge}}(\Delta \phi ), \end{aligned}$$
(8)

where the coefficient F and the \(v_{n,n}\) are fit parameters adjusted to reproduce the \(C(\Delta \phi )\). The coefficient G is not a free parameter, but is fixed by the requirement that the integrals of the \(C^{\mathrm {templ}}(\Delta \phi )\) and \(C(\Delta \phi )\) over the full \(\Delta \phi \) range are equal.

In this analysis, the \(C^{\mathrm {periph}}(\Delta \phi )\) is obtained for the 20–30 multiplicity interval, where the multiplicity is evaluated using tracks satisfying the selection criteria described in Sect. 3 and corrected for pile-up as described below in Sect. 5. This choice of peripheral reference is different from the analysis in Refs.  [2, 3] where the 0–20 interval was used. This change is due to the relative rarity of events having less than 20 tracks in the Z-tagged sample, which would impair the statistical precision of the peripheral reference. The systematic uncertainty associated with choosing a higher-multiplicity peripheral reference is evaluated by comparing \(v_2\) results obtained when using other peripheral intervals, including the 0–20 track multiplicity interval.

5 Pile-up subtraction

Selected events from all three data-taking periods contain significant pile-up, which has a direct impact on the measurement of the two-particle correlations. The tracks used in the analysis, selected using the requirements in Eq. (2), are associated with the collision vertex that includes the Z  boson. The residual contribution from pile-up tracks to the measured distributions is evaluated and corrected on a statistical basis. The correction procedure, based on an event mixing technique, is explained in this section. The main parameters affecting the pile-up are described below. Track categories used in the analysis are introduced in Sect. 5.1, and a description of the event mixing technique and its performance can be found in Sect. 5.2. Section 5.3 introduces the parameter \(\nu \), the average number of pile-up tracks expected in the event. The parameter \(\nu \) fully defines properties of the residual pile-up as discussed in Sect. 5.4 and therefore can be used to correct the measured multiplicity as explained in Sect. 5.5. Section 5.6 derives the algorithm in which the additional event sample obtained with the mixing procedure is used in the measurement of the two-particle correlations.

The two main time-dependent characteristics which primarily define the pile-up contributions to the measured events are the distribution of the Z-boson interaction longitudinal vertex position, \(z_{\mathrm{vtx}}\) , and the instantaneous luminosity which is characterized by the per-crossing number of collisions, \(\mu \). Distributions of \(z_{\mathrm{vtx}}\) and \(\mu \) are shown in panels (a) and (b) of Fig. 1, respectively, for the three data-taking periods used in the measurement. The mean values of the \(z_{\mathrm{vtx}}\) distributions are close to the centre of the ATLAS detector and are slightly negative. The RMS of the \(z_{\mathrm{vtx}}\) distributions vary period by period from approximately 48 mm to 35 mm. The instantaneous luminosity conditions yield an average number of interactions per bunch crossing \(\langle \mu \rangle \approx 20\), 15 and 26 in the years 2012, 2015 and 2016, respectively.

The \(z_{\mathrm{vtx}}\) position and the instantaneous luminosity define the parameter that is used to characterize pile-up in the analysis. This parameter, denoted \(\nu \), is the average number of background tracks per event from pile-up interactions that enter the analysis. Its distribution is shown in panel (c) of Fig. 1 and derivation is given in Sect. 5.2. The mean values of \(\nu \) over the datasets, denoted by \(\langle \nu \rangle \), are about 4 in the \(\sqrt{s}\,=8\) TeV data and above 7 in the \(\sqrt{s}\,=13\) TeV data. The 2015 data sample is only 10% as large as the 2016 sample, but the pile-up condition \(\langle \nu \rangle \) in this sample is less than half as large. The 2015 contribution forms the lower peak in the distribution of \(\nu \) shown in panel (c) of Fig. 1.

5.1 Event categories

Tracks and track pairs that pass the selections described in Sect. 3 and belong to a single event are referred to as Direct. The Direct contribution consists of tracks and pairs arising from the same interaction as the Z  boson – referred to as Signal – and from pile-up interactions – referred to as Background. The presence of the Background contribution in the Direct data affects both the number of measured tracks (\(n_{\text {trk}}\)) and the two-particle correlations. To extract the Signal, the contribution of the Background to the Direct data needs to be subtracted. For this purpose, a sample of events – referred to as Mixed and, ideally, equivalent to the Background events – is constructed using a random selection procedure. In the following sections, the numbers of tracks in the different event categories are denoted by \(n_{\mathrm {trk}}^{\mathrm {dir}}\), \(n_{\mathrm {trk}}^{\mathrm {sig}}\), \(n_{\mathrm {trk}}^{\mathrm {bkg} }\) and \(n_{\mathrm {trk}}^{\mathrm {mix}}\).

5.2 Mixed event sample

The Mixed event sample is constructed using a random selection procedure which is an extension of a technique used in Ref. [53]. It constructs an event that is similar to the Direct event, but contains no Signal component. It is done by requiring the longitudinal impact parameter of the track in one event to be within 0.75 mm of the \(z_{\mathrm{vtx}}\) measured in another event (Eq. (2)) taken during the same beam fill of the LHC. To account for differences between \(z_{\mathrm{vtx}}\) distributions from different LHC fills during the data taking, the analysis uses reduced values of \(\mu \) and \(z_{\mathrm{vtx}}\) that are:

$$\begin{aligned} ({\bar{\mu }},{\bar{z}}_{\text {vtx}})=\biggl (\frac{\mu }{\sqrt{2\pi }\,\mathrm {RMS}(z_{\mathrm{vtx}}\,)},\frac{z_{\mathrm{vtx}}\,-\langle z_{\mathrm{vtx}}\,\rangle }{\mathrm {RMS}(z_{\mathrm{vtx}}\,)}\biggr ), \end{aligned}$$
(9)

where \(\langle z_{\mathrm{vtx}}\,\rangle \) and \(\mathrm {RMS}(z_{\mathrm{vtx}}\,)\) are the mean and width of the \(z_{\mathrm{vtx}}\) distribution parameterized as a function of time during the data taking, and \(\sqrt{2\pi }\) comes from the normalization of a Gaussian probability distribution. Direct and Mixed events are required to have \({\bar{\mu }}\) values within 0.01 \(\hbox {mm}^{-1}\) of each other, a parameter chosen in the analysis to be small enough to ensure the same instantaneous luminosity condition for both events.

Two event samples can be used by the random selection procedure to construct Mixed events, one obtained with a random trigger (zero bias sample), and the other obtained with the same trigger as the Direct event sample. In the latter case, an additional condition must be used that requires the distance between the \(z_{\mathrm{vtx}}\) positions in two events to be \(|\Delta z_{\mathrm{vtx}}\,|>15\) mm. This is to ensure that the interaction that triggered the event recording and has particle counts and kinematics different from the inclusive (pile-up) interactions does not contribute to the Mixed event which aims to reproduce only the Background component. Mixed events constructed with both samples yield identical results, so the analysis uses the data sample with Z  bosons, which automatically ensures identical data-taking conditions in Direct and Mixed events. The procedure is validated using a MC simulation sample where Z-boson events from 8 TeV ppcollisions are generated with the Sherpa event-generator [54] and reconstructed with pile-up conditions corresponding to the 8 TeV dataset used in this analysis. This pile-up is simulated using Pythia v8.165 [55] with parameter values set according to the A2 tune [56] and the MSTW2008LO PDF set [57]. Implementing the procedue in the MC sample shows that the distributions found in Mixed events are equivalent to those in the Background events. To suppress undesired statistical fluctuations in the Mixed event sample the random selection procedure is performed 20 times for each Direct event.

Figure 2 shows the average track density in Direct and Mixed events for different values of \({\bar{z}}_{\text {vtx}}\) and \({\bar{\mu }}\). The three panels correspond to three different intervals of \({\bar{z}}_{\text {vtx}}\) position and the different markers denote different \({\bar{\mu }}\) intervals. For the distributions corresponding to Direct events, the contribution from the Signal tracks forms the peak at \(\omega =0\), and the contribution from the Background tracks produces a slowly changing distribution outside and under the peak. The vertical axis in Fig. 2 is restricted to low values in order to clearly show the contribution from Mixed events, so the peaks at \(\omega =0\) are truncated. The solid lines are parabolic fits to the Direct track distributions outside the peak regions and then interpolated under the peaks. There is good agreement between the results of the fits and the results of Mixed events in the region under the peak. At values of \(|\omega |>2.45\) mm, Mixed curves in all \(({\bar{\mu }}, {\bar{z}}_{\text {vtx}})\) intervals depart from the Direct ones. This is due to the contribution from collisions that fired the trigger. The \(n_{\text {trk}}\) in them is larger than in the pile-up interaction and causes the excess. However, due to the requirement that \(|\Delta z_{\mathrm{vtx}}\,|\) between the Direct event and the event used by the random selection procedure must be greater than 15 mm, no tracks from triggered collisions can affect the region of \(|\omega |<0.75\) mm, where agreement between the fitted Direct and Mixed events is good for the purpose of the analysis.

Fig. 2
figure 2

The number of tracks per mm as a function of \(\omega \), defined by Eq. (2), for Direct (solid markers) and Mixed events (open markers). The three panels show results in different intervals of the reduced vertex \({\bar{z}}_{\text {vtx}}\) position and different marker colours correspond to several intervals of reduced \({\bar{\mu }}\). The solid lines are parabolic fits to Mixed events in the region \(|\omega |<3\) mm and the vertical dashed lines show the acceptance window \(|\omega |<0.75\) mm. The vertical axis is restricted to low values in order to show the Mixed events, so the peaks at \(\omega = 0\) are truncated

Based on the level of agreement shown in Fig. 2, and on the MC simulation studies, this analysis uses the approximation that the features of the Mixed events (momentum, pseudorapidity distributions of tracks and two particle correlations) are equivalent to those of the Background events.

5.3 Background estimator

The Mixed track density under the peak (\(|\omega |<0.75~\hbox {mm}\)) shown in Fig. 2 for Mixed events is plotted in the left panel of Fig. 3 as a function of \({\bar{\mu }}\) for different \({\bar{z}}_{\text {vtx}}\).

Fig. 3
figure 3

Left: The number of tracks in Mixed events per mm at \(\omega =0\) as a function of \({\bar{\mu }}\). Different marker colours correspond to selected \({\bar{z}}_{\text {vtx}}\) intervals. Not all intervals are shown for figure clarity. Solid lines are fits assuming scaling of track density with \({\bar{\mu }}\). Right: Slopes of the lines shown in the left panel as a function of \({\bar{z}}_{\text {vtx}}\) fitted to a Gaussian shape

Fig. 4
figure 4

The upper panels show the probability distributions for the \(n_{\text {trk}}^{\mathrm {mix} }\) measured without any restriction on \({\bar{z}}_{\text {vtx}}\) (filled markers) as well as in three different \({\bar{z}}_{\text {vtx}}\) intervals (open markers). From left to right, the panels correspond to \(\nu \) ranges of a \(\nu <0.5\), b \(3<\nu <3.5\) and c \(7<\nu <7.5\). The lower panels show the ratio of the \(P_{\mathrm {mix}}\) in the \({\bar{z}}_{\text {vtx}}\) intervals to the \(P_{\mathrm {mix}}\) distribution obtained without any restriction on \({\bar{z}}_{\text {vtx}}\). Vertical bars are the statistical uncertainty

Only intervals in \({\bar{z}}_{\text {vtx}} <0\) are plotted since there is a symmetry around \({\bar{z}}_{\text {vtx}} =0\). The distribution of \(\mathrm {d}n_{\text {trk}}^{\mathrm {mix} }/\mathrm {d}\omega \) evaluated as a function of \({\bar{\mu }}\) shows that track density is proportional to the interaction density: \(\mathrm {d}n_{\text {trk}}^{\mathrm {mix} }/\mathrm {d}\omega \propto {\bar{\mu }} \). The proportionality coefficients, \(\mathrm {d^{2}}n_{\text {trk}}^{\mathrm {mix} }/(\mathrm {d}\omega \mathrm {d}{\bar{\mu }})\), are determined by fitting a linear function to the \(\mathrm {d}n_{\text {trk}}^{\mathrm {mix} }/\mathrm {d}\omega ({\bar{\mu }})\) distribution. The small residual deviations from this linear fit are taken into account while estimating systematic uncertainties; they are primarily present in the regions of \(({\bar{\mu }},{\bar{z}}_{\text {vtx}})\) that are not used in the analysis. The dependence of these coefficients on \({\bar{z}}_{\text {vtx}}\) is shown in the right panel of Fig. 3. One can see that \(\mathrm {d^{2}}n_{\text {trk}}^{\mathrm {mix} }/(\mathrm {d}\omega \mathrm {d}{\bar{\mu }})(z_{\mathrm{vtx}}\,)\) is Gaussian with mean at zero and width very close to unity. This is expected as the \({\bar{z}}_{\text {vtx}}\), according to Eq. (9), is already a reduced parameter. Using the equivalence \(Background \equiv Mixed \), the average number of Background tracks can be expressed as:

$$\begin{aligned} \nu \equiv \big \langle n_{\mathrm {trk}}^{\mathrm {bkg}}\big \rangle = 2\omega _{0}\;\frac{\mathrm {d}^{2}n_{\text {trk}}^{\mathrm {mix} }}{\mathrm {d}\omega ~\mathrm {d}{\bar{\mu }}}\bigg |_{{\bar{z}}_{\text {vtx}} =0}\mathrm {Gauss}({\bar{z}}_{\text {vtx}}) {\bar{\mu }}, \end{aligned}$$
(10)

where \(\omega _{0}=0.75\) mm is half of the width of the track acceptance window, \(\mathrm {d}^{2}n_{\text {trk}}^{\mathrm {mix} }/(\mathrm {d}\omega \mathrm {d}{\bar{\mu }})|_{{\bar{z}}_{\text {vtx}} =0}\) is the coefficient defined by particle production in inclusive pp collisions and by the detector rapidity coverage and efficiency, and \(\mathrm {Gauss}({\bar{z}}_{\text {vtx}})\) is a Gaussian function with mean equal to 0 and a variance of 1.0.

5.4 Properties of mixed events

The parameters \({\bar{\mu }}\) and \({\bar{z}}_{\text {vtx}}\) factorize in Eq. (10). There is only a scaling coefficient between \(\nu \) and the interaction density \(\mathrm {Gauss}({\bar{z}}_{\text {vtx}}){\bar{\mu }} \), such that the same \(\nu \) can be reached at low instantaneous luminosity and close to the centre of the \({\bar{z}}_{\text {vtx}}\) interval, or at high instantaneous luminosity and large \({\bar{z}}_{\text {vtx}}\). Using the MC simulations and Mixed events taken at different \(({\bar{\mu }}, {\bar{z}}_{\text {vtx}})\) one can find that not only the average value, but also the shape of the \(n_{\text {trk}}^{\mathrm {bkg} }\) distribution are the same for the same interaction density \(\mathrm {Gauss}({\bar{z}}_{\text {vtx}}){\bar{\mu }} \) and consequently for the same \(\nu \). Events are therefore fully characterized with respect to their background conditions by \(\nu \), calculated using Eq. (10). This is demonstrated in Fig. 4 for three intervals: \(\nu <0.5\), \(3<\nu <3.5\) and \(7<\nu <7.5\). For each interval the probability distributions of Mixed tracks \(P_{\mathrm {mix}}\) obtained without any restriction on \({\bar{z}}_{\text {vtx}}\), are compared with the \(P_{\mathrm {mix}}\) distributions obtained when restricting the \({\bar{z}}_{\text {vtx}}\) to three different intervals of \(|{\bar{z}}_{\text {vtx}} |<0.2\), \(0.2<{\bar{z}}_{\text {vtx}} <0.8\), and \(0.8<{\bar{z}}_{\text {vtx}} <3\). Although no constraint is imposed on \({\bar{\mu }}\), its value varies over a different range for each \({\bar{z}}_{\text {vtx}}\) interval to provide \(\nu \) according to Eq. (10). Some distributions are not shown because it is impossible to find low-\(\nu \) conditions at the centre of the \(z_{\mathrm{vtx}}\) distribution at any \(\mu \) shown in Fig. 1. The upper panels of the figure show the \(P_{\mathrm {mix}}\) distributions and the lower panels show the ratios of the \(P_{\mathrm {mix}}\) distributions in each \({\bar{z}}_{\text {vtx}}\) interval to the \(P_{\mathrm {mix}}\) distribution measured without any restriction on \({\bar{z}}_{\text {vtx}}\). The ratios in the lower panels are consistent with unity within 5% in most cases, demonstrating that for a given \(\nu \) the shape of the \(P_{\mathrm {mix}}\) distribution does not depend on \({\bar{z}}_{\text {vtx}}\) or \({\bar{\mu }}\). Residual deviations are due to tracking efficiency variation along the beam axis, accuracy of determining \(\mu \), and deviations from the parameterizations used in Eq. (10).

The probability distributions for the \(n_{\text {trk}}\) found under different \(\nu \) conditions are shown in Fig. 5. The left and right panels display probabilities \(P_{\mathrm {dir}}\) and \(P_{\mathrm {mix}}\) for the Direct and Mixed events respectively. The continuous lines are the fits to the data points to smooth the statistical fluctuations at high \(n_{\text {trk}}\).

Fig. 5
figure 5

Probability distributions for the \(n_{\text {trk}}\) in Direct events (left) and Mixed events (right). The different coloured markers correspond to different values of \(\nu \). The grey distribution in each panel indicates the distributions from the other panel averaged over the sample, and is shown for comparison. The lines are fits to data points. The x-axis ranges are different in the two panels

Figure 5 shows that the Background tracks affect Direct distributions differently, depending on the \(n_{\text {trk}}\) regions. Assuming that the lower \(n_{\text {trk}}^{\mathrm {dir}}\) distribution, shown with black markers \(\big (\nu <0.5\big )\), resembles the no pile-up condition, Fig. 5 implies that at \(n_{\mathrm {trk}}^{\mathrm {dir}} >100\) the Direct event distributions at high \(\nu \) are dominated by the Background tracks, rising by an order of magnitude relative to black markers for the highest \(\nu \) measured in the event sample. Averaged over the sample, the distribution for \(n_{\text {trk}}^{\mathrm {dir}}\) is shown in the right panel and for \(n_{\text {trk}}^{\mathrm {mix} }\) in the left panel for comparison with the distributions of the opposite type. The mean numbers of tracks in those distributions are 30 and 4 respectively.

5.5 Correction of \(n_{\text {trk}}\) distribution

The \(n_{\text {trk}}^{\mathrm {sig}}\) distributions are derived by unfolding the \(n_{\text {trk}}^{\mathrm {dir}}\) distributions. Transition matrices required for that are constructed from the data. For the analysis of the 2PC the same matrices are used to remap the correlation coefficients measured for \(n_{\text {trk}}^{\mathrm {dir}}\) to \(n_{\text {trk}}^{\mathrm {sig}}\) explained later in Sect. 5.6. These matrices are constructed from the data using the distributions shown in Fig. 5:

$$\begin{aligned} M\big (\nu ,n_{\text {trk}}^{\mathrm {sig}},n_{\text {trk}}^{\mathrm {dir}} \big )=P_{\mathrm {dir}}\big (\nu <0.5,n_{\text {trk}}^{\mathrm {sig}} \big )~P_{\mathrm {mix}}\big (\nu ,n_{\text {trk}}^{\mathrm {dir}}-n_{\text {trk}}^{\mathrm {sig}} \big ). \end{aligned}$$

The matrices are calculated using the \(n_{\text {trk}}^{\mathrm {dir}}\) distribution measured at the lowest \(\nu \) (\(\nu <0.5\)) as a proxy for the \(n_{\text {trk}}^{\mathrm {sig}}\) distribution and \(n_{\text {trk}}^{\mathrm {mix} }\) distributions corresponding to different intervals of \(\nu \). The probabilities to find \(n_{\text {trk}}^{\mathrm {dir}}\) shown in the left panel of Fig. 5 are multiplied by the probabilities to find \(n_{\text {trk}}^{\mathrm {mix} }\), shown in the right panel of Fig. 5. The product of the two probabilities is the matrix element for \(\big (n_{\text {trk}}^{\mathrm {sig}},n_{\text {trk}}^{\mathrm {dir}} \big )\) using the relation \(n_{\text {trk}}^{\mathrm {sig}} =n_{\text {trk}}^{\mathrm {dir}}-n_{\text {trk}}^{\mathrm {mix} } \). For high numbers of tracks, the fits shown in Fig. 5 are used to suppress statistical fluctuations. Examples of the transition matrices for two different \(\nu \) are shown in Fig. 6.

The contour lines of the matrices have a distinct ‘spinnaker’ shape with the amount of ‘drag’ increasing with \(\nu \). At high \(\nu \), the higher values of \(n_{\text {trk}}\) in Direct events become only weakly correlated with the \(n_{\text {trk}}\) in Signal events. The right panel of Fig. 6 shows that the largest number of tracks in Direct events corresponds to relatively moderate Signal \(n_{\text {trk}}\) smeared by the Background. This effect limits the range of \(n_{\text {trk}}\) values where the pile-up data samples can be analysed, and the limit depends on the value of \(\nu \).

Each Direct event with a given \(n_{\text {trk}}\) contains contributions from Signal events with any number of tracks such that \(n_{\text {trk}}^{\mathrm {sig}} \le n_{\text {trk}}^{\mathrm {dir}} \). Those contributions are calculated from the transition matrices, shown in Fig. 6, by making a projection of \(n_{\text {trk}}^{\mathrm {dir}}\) onto \(n_{\text {trk}}^{\mathrm {sig}}\) for a given value of \(n_{\text {trk}}^{\mathrm {dir}}\). These projections are shown in Fig. 7 for two intervals of \(\nu \).

Fig. 6
figure 6

Data-driven transition matrices corresponding to intervals a \(\nu <0.5\) and b \(7<\nu <7.5\) that are used for remapping

Fig. 7
figure 7

The probability of a Signal event with multiplicity \(n_{\text {trk}}^{\mathrm {sig}}\), to contribute to a Direct event with \(n_{\text {trk}}^{\mathrm {dir}} =30\), 60 and 90 (solid, dashed, and dotted-dashed), as a function of \(n_{\text {trk}}^{\mathrm {sig}}\). The shaded bands denote the horizontal range equal to the mean ± RMS value of the histogram with the corresponding colour. a \(\nu < 0.5\) and b for \(7< \nu < 7.5\)

The histograms in Fig. 7 are examples of probability distributions of the \(n_{\text {trk}}^{\mathrm {sig}}\) contributing to Direct events with \(n_{\text {trk}}^{\mathrm {dir}} =30,\,60\), and 90. At low \(\nu \), shown in the left panel, the distributions are narrow and peaked at \(n_{\text {trk}}^{\mathrm {sig}} =n_{\text {trk}}^{\mathrm {dir}} \). For this low pile-up condition, more than 85% of Direct events do not have even one Background track. The situation is different for high \(\nu \) (right panel) where the contributions to Direct events come from a wide range of Signal events with smaller \(n_{\text {trk}}\). The shaded bands shown in the plot are centred horizontally at the mean values of \(n_{\text {trk}}^{\mathrm {sig}}\) contributing to the Direct events, and have widths equal to 2\(\times \)RMS of the corresponding distributions. Distributions for high values of \(n_{\text {trk}}^{\mathrm {dir}}\) become increasingly wider as shown in the right panel of Fig. 7. This figure demonstrates that with increasing \(\nu \) it becomes impossible to accurately determine to what \(n_{\text {trk}}^{\mathrm {sig}}\) the measurement belongs. Figure 7 shows that the presence of pile-up degrades the resolution with which one can measure \(n_{\text {trk}}^{\mathrm {sig}}\). As described in Sect. 5.6, this analysis is restricted to \(\nu <7.5\) because otherwise the pile-up is too large to correct the two particle correlations.

5.6 Correction for the pair-distribution

This section describes the pile-up correction procedure for the pairs that are obtained by correlating particle pairs in Direct events. The \(\Delta \phi \) distribution of track-pairs found in one Direct event can be formally written as:

$$\begin{aligned} \frac{\mathrm {d}N_{\mathrm {pair}}^{\mathrm {dir}}}{\mathrm {d}\Delta \phi }= & {} \sum _{a}^{n_{\mathrm {trk}}^{\mathrm {dir}}} \sum _{b\ne a}^{n_{\mathrm {trk}}^{\mathrm {dir}}} \delta (\Delta \phi -\Delta \phi ^{ab})\nonumber \\= & {} \sum _{a}^{n_{\mathrm {trk}}^{\mathrm {sig}}} \sum _{b\ne a}^{n_{\mathrm {trk}}^{\mathrm {sig}}} \delta (\Delta \phi -\Delta \phi ^{ab})+ \sum _{a}^{n_{\mathrm {trk}}^{\mathrm {bkg} }}\sum _{b\ne a}^{n_{\mathrm {trk}}^{\mathrm {bkg} }} \delta (\Delta \phi -\Delta \phi ^{ab})\nonumber \\&\quad + \sum _{a}^{n_{\mathrm {trk}}^{\mathrm {bkg} }}\sum _{b}^{n_{\mathrm {trk}}^{\mathrm {sig}}} \delta (\Delta \phi -\Delta \phi ^{ab})+ \sum _{a}^{n_{\mathrm {trk}}^{\mathrm {sig}}} \sum _{b}^{n_{\mathrm {trk}}^{\mathrm {bkg} }} \delta (\Delta \phi -\Delta \phi ^{ab}), \nonumber \\ \end{aligned}$$
(11)

where the indices a and b run over tracks in a subevent of its corresponding category, \(\Delta \phi ^{ab}\) is a short-hand notation for \(\phi ^a-\phi ^b\), and the Dirac delta function \(\delta (\Delta \phi -\Delta \phi ^{ab})\) ensures that the requirement \(\Delta \phi =\phi ^a-\phi ^b\) is satisfied. Besides requiring that the index \(b \ne a\), as is made explicit in Eq. (11) above, the requirement that \(|\eta ^a-\eta ^b|>2\) is also imposed. This requirement can be imposed in Eq. (11) by including the step function \(\Theta (|\eta ^a-\eta ^b|-2)\), but for brevity, is not included explicitly. Additionally the indices a and b are restricted to the particles within the chosen \(p_{\mathrm{T}}\) -ranges for the reference and associated particles, respectively.

To take account of different pile-up conditions, the analysis is done in intervals of \(\nu \). Therefore, the expression given by Eq. (11) has to be summed over a subset of data in each \(\nu \) interval. In the following, the number of events in the interval where the number of observed tracks is \(n_{\mathrm {trk}}^{\mathrm {dir}}\), is denoted by \(n_{\mathrm {evt}}^{\mathrm {dir}}\). Averaging the first contribution in Eq. (11) over all events at fixed \(n_{\mathrm {trk}}^{\mathrm {dir}}\) and \(\nu \) yields:

$$\begin{aligned}&\frac{1}{n_{\mathrm {evt}}^{\mathrm {dir}}} \sum _{n}^{n_{\mathrm {evt}}^{\mathrm {dir}}}\sum _{a}^{n_{\mathrm {trk}}^{\mathrm {sig}}}\sum _{b\ne a}^{n_{\mathrm {trk}}^{\mathrm {sig}}} \delta (\Delta \phi -\Delta \phi ^{ab})\nonumber \\&\quad = \sum _{n_{\mathrm {trk}}^{\mathrm {sig}} =0}^{n_{\mathrm {trk}}^{\mathrm {dir}}}P\big (n_{\mathrm {trk}}^{\mathrm {sig}} \vert \nu ,n_{\mathrm {trk}}^{\mathrm {dir}} \big )\bigg \langle \frac{\mathrm {d}N_{\mathrm {pair}}^{\mathrm {sig}}}{\mathrm {d}\Delta \phi }\big (n_{\mathrm {trk}}^{\mathrm {sig}} \big )\bigg \rangle \equiv \Bigg \langle \!\bigg \langle \frac{\mathrm {d}N_{\mathrm {pair}}^{\mathrm {sig}}}{\mathrm {d}\Delta \phi }\big (n_{\mathrm {trk}}^{\mathrm {sig}} \big )\bigg \rangle \!\Bigg \rangle .\nonumber \\ \end{aligned}$$
(12)

In the presence of pile-up, the contributions to the Direct tracks come from different numbers of Signal tracks such that \(n_{\mathrm {trk}}^{\mathrm {sig}} \le n_{\mathrm {trk}}^{\mathrm {dir}} \) (as \(n_{\mathrm {trk}}^{\mathrm {sig}} +n_{\mathrm {trk}}^{\mathrm {bkg} } =n_{\mathrm {trk}}^{\mathrm {dir}} \)). Probabilities to find \(n_{\mathrm {trk}}^{\mathrm {sig}}\) in events are denoted \(P\big (n_{\mathrm {trk}}^{\mathrm {sig}} \vert \nu ,n_{\mathrm {trk}}^{\mathrm {dir}} \big )\) and are shown in Fig. 7. For clarity, the parameters that this probability depends on, i.e. \(\nu \) and \(n_{\mathrm {trk}}^{\mathrm {dir}}\), are labelled explicitly here. The averaging is done over all values of \(n_{\mathrm {trk}}^{\mathrm {sig}}\), which is reflected by the double angular bracket that appears in the equation: the average over events with fixed \(n_{\mathrm {trk}}^{\mathrm {sig}}\) is denoted by the smaller angular brackets, and the weighted average over all \(n_{\mathrm {trk}}^{\mathrm {sig}}\) for a given \(n_{\mathrm {trk}}^{\mathrm {dir}}\) in a category is denoted by larger angular brackets. In practice, only a relatively narrow region of \(n_{\mathrm {trk}}^{\mathrm {sig}}\) effectively contributes to \(\mathrm {d}N_{\mathrm {pair}}^{\mathrm {sig}}/\mathrm {d}\Delta \phi \). The width of this region depends on \(n_{\mathrm {trk}}^{\mathrm {dir}}\) and on \(\nu \).

Similarly to the first contribution, the second contribution to Eq. (11) can be written as:

$$\begin{aligned}&\frac{1}{n_{\mathrm {evt}}^{\mathrm {dir}}} \sum _{n}^{n_{\mathrm {evt}}^{\mathrm {dir}}} \sum _{a}^{n_{\mathrm {trk}}^{\mathrm {bkg} }} \sum _{b\ne a}^{n_{\mathrm {trk}}^{\mathrm {bkg} }} \delta (\Delta \phi -\Delta \phi ^{ab})= \sum _{n_{\mathrm {trk}}^{\mathrm {sig}} =0}^{n_{\mathrm {trk}}^{\mathrm {dir}}}P\big (n_{\mathrm {trk}}^{\mathrm {sig}} \vert \nu ,n_{\mathrm {trk}}^{\mathrm {dir}} \big )\nonumber \\&\quad \times \,\bigg \langle \frac{\mathrm {d}N_{\mathrm {pair}}^{\mathrm {bkg}}}{\mathrm {d}\Delta \phi }\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big )\bigg \rangle =\Bigg \langle \!\bigg \langle \frac{\mathrm {d}N_{\mathrm {pair}}^{\mathrm {bkg}}}{\mathrm {d}\Delta \phi }\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big )\bigg \rangle \!\Bigg \rangle . \end{aligned}$$
(13)

Averaging the last two terms in Eq. (11) over the event sample eliminates any \(\Delta \phi \) dependence except a constant one, because the Background tracks cannot be correlated with Signal tracks since they originate from different interactions. The third term in Eq. (11) can be written as:

$$\begin{aligned}&\frac{1}{n_{\mathrm {evt}}^{\mathrm {dir}}}\sum _{n}^{n_{\mathrm {evt}}^{\mathrm {dir}}} \bigg (\sum _{a}^{n_{\mathrm {trk}}^{\mathrm {bkg} }}\sum _{b}^{n_{\mathrm {trk}}^{\mathrm {sig}}} \delta (\Delta \phi -\Delta \phi ^{ab})\bigg ) \nonumber \\&\quad =\sum _{n_{\mathrm {trk}}^{\mathrm {sig}} =0}^{n_{\mathrm {trk}}^{\mathrm {dir}}}P\big (n_{\mathrm {trk}}^{\mathrm {sig}} \vert \nu ,n_{\mathrm {trk}}^{\mathrm {dir}} \big )\iint \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {bkg}}}{\mathrm {d}\phi ^{a}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big )\bigg \rangle \nonumber \\&\qquad \times \, \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {sig}}}{\mathrm {d}\phi ^{b}}\big (n_{\mathrm {trk}}^{\mathrm {sig}} \big ) \bigg \rangle \delta (\Delta \phi -\Delta \phi ^{ab})\mathrm {d}\phi ^a\mathrm {d}\phi ^b \nonumber \\&\quad =\Bigg \langle \ \iint \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {bkg}}}{\mathrm {d}\phi ^{a}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big ) \bigg \rangle \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {sig}}}{\mathrm {d}\phi ^{b}}\big (n_{\mathrm {trk}}^{\mathrm {sig}} \big )\bigg \rangle \ \nonumber \Bigg \rangle \\&\qquad \times \, {\delta (\Delta \phi -\Delta \phi ^{ab})\mathrm {d}\phi ^a\mathrm {d}\phi ^b}, \end{aligned}$$
(14)

where \(\big \langle \mathrm {d}N_{\mathrm {trk}}/\Delta \phi \big \rangle \) are the single-particle angular track densities averaged over many events. Equation (14) states that averaged over many events, the pair distribution involving Signal and Background tracks can be replaced by the convolution of the individual single-particle distributions. The fourth term in Eq. (11) gives an expression identical to Eq. (14) except that the indices a and b interchanged. Substituting Eqs. (12)–(14) into Eq. (11) and rearranging gives:

$$\begin{aligned}&\Bigg \langle \!\bigg \langle \frac{\mathrm {d}N_{\mathrm {pair}}^{\mathrm {sig}}}{\mathrm {d}\Delta \phi }\big (n_{\mathrm {trk}}^{\mathrm {sig}} \big ) \bigg \rangle \!\Bigg \rangle = \bigg \langle \frac{\mathrm {d}N_{\mathrm {pair}}^{\mathrm {dir}}}{\mathrm {d}\Delta \phi }\big (n_{\mathrm {trk}}^{\mathrm {dir}} \big ) \bigg \rangle -\ \Bigg \langle \!\bigg \langle \frac{\mathrm {d}N_{\mathrm {pair}}^{\mathrm {bkg}}}{\mathrm {d}\Delta \phi }\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big )\bigg \rangle \!\Bigg \rangle \nonumber \\&\quad -\, \Bigg \langle \ \iint \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {bkg}}}{\mathrm {d}\phi ^{a}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big )\bigg \rangle \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {sig}}}{\mathrm {d}\phi ^{b}}\big (n_{\mathrm {trk}}^{\mathrm {sig}} \big )\bigg \rangle \ \nonumber \Bigg \rangle \\&\quad \times {\delta (\Delta \phi -\Delta \phi ^{ab})\mathrm {d}\phi ^a\mathrm {d}\phi ^b} \nonumber \\&\quad -\, \Bigg \langle \ \iint \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {sig}}}{\mathrm {d}\phi ^{a}}\big (n_{\mathrm {trk}}^{\mathrm {sig}} \big )\bigg \rangle \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {bkg}}}{\mathrm {d}\phi ^{b}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big )\bigg \rangle \ \nonumber \Bigg \rangle \\&\quad \times \,{\delta (\Delta \phi -\Delta \phi ^{ab})\mathrm {d}\phi ^a\mathrm {d}\phi ^b}. \end{aligned}$$
(15)

So far, no approximations are made in the derivation of Eq. (15). However, in implementing the pile-up subtraction in the analysis, some approximations are necessary.

The first approximation relies on \(Background \equiv Mixed \), which is established earlier in the analysis. Then the Background terms in Eq. (15) can be replaced by the corresponding Mixed terms. Additionally the single-track distributions for Signal tracks on the second line of Eq. (15) can be written as:

$$\begin{aligned} \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {sig}}}{\mathrm {d}\phi ^{a}}\big (n_{\mathrm {trk}}^{\mathrm {sig}} \big )\bigg \rangle = \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {dir}}}{\mathrm {d}\phi ^{a}}\big (n_{\mathrm {trk}}^{\mathrm {dir}} \big )\bigg \rangle - \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {mix}}}{\mathrm {d}\phi ^{a}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big )\bigg \rangle . \end{aligned}$$

With these substitutions, Eq. (15) becomes:

$$\begin{aligned}&\Bigg \langle \!\bigg \langle \frac{\mathrm {d}N_{\mathrm {pair}}^{\mathrm {sig}}}{\mathrm {d}\Delta \phi }\big (n_{\mathrm {trk}}^{\mathrm {sig}} \big ) \bigg \rangle \!\Bigg \rangle = \bigg \langle \frac{\mathrm {d}N_{\mathrm {pair}}^{\mathrm {dir}}}{\mathrm {d}\Delta \phi }\big (n_{\mathrm {trk}}^{\mathrm {dir}} \big ) \bigg \rangle -\ \Bigg \langle \!\bigg \langle \frac{\mathrm {d}N_{\mathrm {pair}}^{\mathrm {mix}}}{\mathrm {d}\Delta \phi }\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big ) \bigg \rangle \!\Bigg \rangle \nonumber \\&\quad -\, \Bigg \langle \ \iint \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {mix}}}{\mathrm {d}\phi ^{a}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big )\bigg \rangle \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {dir}}}{\mathrm {d}\phi ^{b}}\big (n_{\mathrm {trk}}^{\mathrm {dir}} \big ) \bigg \rangle \ \nonumber \Bigg \rangle \\&\quad \times \,{\delta (\Delta \phi -\Delta \phi ^{ab})\mathrm {d}\phi ^a\mathrm {d}\phi ^b} \nonumber \\&\quad -\, \Bigg \langle \ \iint \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {dir}}}{\mathrm {d}\phi ^{a}}\big (n_{\mathrm {trk}}^{\mathrm {dir}} \big )\bigg \rangle \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {mix}}}{\mathrm {d}\phi ^{b}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big ) \bigg \rangle \ \nonumber \Bigg \rangle \\&\quad \times \,{\delta (\Delta \phi -\Delta \phi ^{ab})\mathrm {d}\phi ^a\mathrm {d}\phi ^b} \nonumber \\&\quad + \Bigg \langle \ \iint \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {mix}}}{\mathrm {d}\phi ^{a}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big )\bigg \rangle \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {mix}}}{\mathrm {d}\phi ^{b}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big ) \bigg \rangle \ \nonumber \Bigg \rangle \\&\quad \times \,{\delta (\Delta \phi -\Delta \phi ^{ab})\mathrm {d}\phi ^a\mathrm {d}\phi ^b} \nonumber \\&\quad +\, \Bigg \langle \ \iint \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {mix}}}{\mathrm {d}\phi ^{a}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big )\bigg \rangle \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {mix}}}{\mathrm {d}\phi ^{b}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big ) \bigg \rangle \ \nonumber \Bigg \rangle \\&\quad \times \,{\delta (\Delta \phi -\Delta \phi ^{ab})\mathrm {d}\phi ^a\mathrm {d}\phi ^b}. \end{aligned}$$
(16)

The second approximation requires that \(\mathrm {d}N_{\mathrm {pair}}^{\mathrm {sig}}/\mathrm {d}\Delta \phi \) changes slowly with \(n_{\mathrm {trk}}^{\mathrm {sig}}\), i.e. that the correlations do not change significantly over the range of \(n_{\mathrm {trk}}^{\mathrm {sig}}\) that contributes to a given \(n_{\mathrm {trk}}^{\mathrm {dir}}\). In other words, this assumption requires that the analysed correlation does not change significantly over an effective range of \(n_{\mathrm {trk}}^{\mathrm {sig}}\) that cannot be resolved in the presence of the pile-up. Those ranges are effectively the widths of the peaks shown in Fig. 7, and are fixed for a given background condition \(\nu \). By limiting the background condition to \(\nu < \nu _{\mathrm {max}}\), one can control the magnitude of this width. In the present analysis, the maximum value of the background condition is chosen to be \(\nu _{\mathrm {max}}=7.5\). This limit is shown in panel (c) of Fig. 1.

To measure the two-particle correlation as function of \(n_{\mathrm {trk}}^{\mathrm {sig}}\) in the presence of pile-up, quantities defined by Eq. (16) found at fixed values of \(n_{\mathrm {trk}}^{\mathrm {dir}}\) and in different intervals of \(\nu \) have to be summed with weights as:

$$\begin{aligned}&\frac{\mathrm {d}N_{\mathrm {pair}}^{\mathrm {sig}}}{\mathrm {d}\Delta \phi }\big (n_{\mathrm {trk}}^{\mathrm {sig}} \big ) \approx \frac{1}{\sum _{\nu<\nu _{\mathrm {max}}} n_{\mathrm {evt}}^{\mathrm {dir}}}\sum _{\nu <\nu _{\mathrm {max}}}\nonumber \\&\quad \times \sum _{n_{\mathrm {trk}}^{\mathrm {dir}} \ge n_{\mathrm {trk}}^{\mathrm {sig}}} n_{\mathrm {evt}}^{\mathrm {dir}}P\big (n_{\mathrm {trk}}^{\mathrm {sig}} \vert \nu ,n_{\mathrm {trk}}^{\mathrm {dir}} \big )\Bigg \langle \!\bigg \langle \frac{\mathrm {d}N_{\mathrm {pair}}^{\mathrm {sig}}}{\mathrm {d}\Delta \phi }\big (n_{\mathrm {trk}}^{\mathrm {sig}} \big ) \bigg \rangle \!\Bigg \rangle . \end{aligned}$$
(17)

Combining Eqs. (16) and (17) the final result is obtained using the expression:

$$\begin{aligned}&\frac{\mathrm {d}N_{\mathrm {pair}}^{\mathrm {sig}}}{\mathrm {d}\Delta \phi }\big (n_{\mathrm {trk}}^{\mathrm {sig}} \big ) \approx \frac{1}{\sum _{\nu }n_{\mathrm {evt}}^{\mathrm {dir}}}\sum _{\nu }\sum _{n_{\mathrm {trk}}^{\mathrm {dir}} \ge n_{\mathrm {trk}}^{\mathrm {sig}}} n_{\mathrm {evt}}^{\mathrm {dir}}P\big (n_{\mathrm {trk}}^{\mathrm {sig}} \vert \nu ,n_{\mathrm {trk}}^{\mathrm {dir}} \big )\nonumber \\&\quad \times \,\bigg ( \bigg \langle \frac{\mathrm {d}N_{\mathrm {pair}}^{\mathrm {dir}}}{\mathrm {d}\Delta \phi }\big (n_{\mathrm {trk}}^{\mathrm {dir}} \big ) \bigg \rangle - \bigg [ \Bigg \langle \!\bigg \langle \frac{\mathrm {d}N_{\mathrm {pair}}^{\mathrm {mix}}}{\mathrm {d}\Delta \phi }\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big ) \bigg \rangle \!\Bigg \rangle \nonumber \\&\quad +\, \Bigg \langle \ \iint \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {mix}}}{\mathrm {d}\phi ^{a}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big )\bigg \rangle \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {dir}}}{\mathrm {d}\phi ^{b}}\big (n_{\mathrm {trk}}^{\mathrm {dir}} \big ) \bigg \rangle \ \nonumber \Bigg \rangle \\&\quad \times \,{\delta (\Delta \phi -\Delta \phi ^{ab})\mathrm {d}\phi ^a\mathrm {d}\phi ^b} \nonumber \\&\quad + \,\Bigg \langle \ \iint \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {dir}}}{\mathrm {d}\phi ^{a}}\big (n_{\mathrm {trk}}^{\mathrm {dir}} \big )\bigg \rangle \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {mix}}}{\mathrm {d}\phi ^{b}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big ) \bigg \rangle \ \nonumber \Bigg \rangle \\&\quad \times \,{\delta (\Delta \phi -\Delta \phi ^{ab})\mathrm {d}\phi ^a\mathrm {d}\phi ^b} \nonumber \\&\quad - \Bigg \langle \ \iint \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {mix}}}{\mathrm {d}\phi ^{a}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big )\bigg \rangle \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {mix}}}{\mathrm {d}\phi ^{b}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big ) \bigg \rangle \ \nonumber \Bigg \rangle \\&\quad \times \,{\delta (\Delta \phi -\Delta \phi ^{ab})\mathrm {d}\phi ^a\mathrm {d}\phi ^b} \nonumber \\&\quad -\, \Bigg \langle \ \iint \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {mix}}}{\mathrm {d}\phi ^{a}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big )\bigg \rangle \bigg \langle \frac{\mathrm {d}N_{\mathrm {trk }}^{\mathrm {mix}}}{\mathrm {d}\phi ^{b}}\big (n_{\mathrm {trk}}^{\mathrm {bkg} } \big ) \bigg \rangle \ \nonumber \Bigg \rangle \\&\quad \times \,{\delta (\Delta \phi -\Delta \phi ^{ab})\mathrm {d}\phi ^a\mathrm {d}\phi ^b} \bigg ] \bigg ) . \end{aligned}$$
(18)

The analysis uses Eq.  (18) in the following way. In each category of \(\nu \), the distributions of two-particle pair-distributions are built for all values of \(n_{\mathrm {trk}}^{\mathrm {dir}}\) and for all values of \(n_{\mathrm {trk}}^{\mathrm {mix}}\). They are then summed using weights \(P\big (n_{\mathrm {trk}}^{\mathrm {sig}} \vert \nu ,n_{\mathrm {trk}}^{\mathrm {dir}} \big )\) to build the background contributions, given by the square brackets in Eq. (18) for different \(n_{\mathrm {trk}}^{\mathrm {dir}}\) and \(n_{\mathrm {trk}}^{\mathrm {mix}} =n_{\mathrm {trk}}^{\mathrm {dir}}-n_{\mathrm {trk}}^{\mathrm {sig}} \) combinations. Next, these contributions are subtracted from the distribution measured in the Direct event for each \(n_{\mathrm {trk}}^{\mathrm {dir}}\), giving the expression in round brackets. Subtracted results are weighted with probabilities \(P\big (n_{\mathrm {trk}}^{\mathrm {sig}} \vert \nu ,n_{\mathrm {trk}}^{\mathrm {dir}} \big )\) and multiplied by \(n_{\mathrm {evt}}^{\mathrm {dir}}\), the number of events with any given \(n_{\mathrm {trk}}^{\mathrm {dir}}\). The resulting distributions are added to the distributions of Signal events for all values such that \(n_{\mathrm {trk}}^{\mathrm {sig}} \le n_{\mathrm {trk}}^{\mathrm {dir}} \). In the last step, the values of \(\mathrm {d}N_{\mathrm {pair}}^{\mathrm {sig}}/\mathrm {d}\Delta \phi \) in those categories of \(\nu \) that are used in the analysis are added together and normalised.

Equation (18) gives the pile-up-corrected distribution of track-pairs – \(S(\Delta \phi )\) in Eq. (3) – evaluated at fixed \(n_{\mathrm {trk}}^{\mathrm {sig}}\). The pair-acceptance distribution \(B(\Delta \phi )\) does not require any correction for pile-up as it is an estimate of the detector acceptance which is not affected by pile-up. The pile-up-corrected correlation functions \(C(\Delta \phi )\) are then built by dividing the \(S(\Delta \phi )\) by the \(B(\Delta \phi )\) and normalizing to a \(\Delta \phi \)-averaged value of unity.

6 Template fits

Figure 8 shows the pile-up-corrected 2PC for several \(n_{\text {trk}}^{\mathrm {sig}}\) intervals for the 13 TeV Z-tagged data. Correlations are measured for tracks in the \(0.5<p_{\mathrm {T}}^{a,b} <5\) GeV range. In the higher track multiplicity intervals, a clear enhancement on the near-side (\(\Delta \phi =0\)) is visible. Figure 8 also shows results for the template fits (Eq. (8)) to the 2PC, with the \(n_{\text {trk}}^{\mathrm {sig}}\) interval of \(20<n_{\text {trk}}^{\mathrm {sig}} \le 30\) used as the peripheral reference. The measured correlation functions are well described by the template fits, and long-range correlations (indicated by dashed blue lines) are observed. The fits in Fig. 8 include harmonics \(n=2\)–4, however the subsequent analysis described in this paper focusses only on \(v_2 \), as the associated systematic and statistical uncertainties on the higher order harmonics are quite large.

Fig. 8
figure 8

Template fits to the pile-up-corrected \(C(\Delta \phi )\) in the 13 TeV Z-tagged data. The different panels correspond to different \(n_{\text {trk}}^{\mathrm {sig}}\) intervals. The \(n_{\text {trk}}^{\mathrm {sig}} \in (20,30]\) interval is used to determine the \(C^{\mathrm {periph}}(\Delta \phi )\), and the template fits include harmonics \(n=2\)–4. The \(FC^{\mathrm {periph}}(\Delta \phi )\) and \(C^{\mathrm {ridge}}\) terms have been shifted up by G and \(FC^{\mathrm {periph}}(0)\) respectively, for easier comparison. The plots are for \(0.5<p_{\mathrm {T}}^{a,b} <5\) GeV 

From the template fits the \(v_2\) is extracted following Eq. (6). The left panel of Fig. 9 shows the \(v_2\) values obtained from the template fits as a function of \(n_{\text {trk}}^{\mathrm {sig}}\). The \(v_2\) values before correcting for pile-up are also shown for comparison. Without the pile-up correction, a clear monotonic decrease in \(v_2\) is observed with increasing track multiplicity, corresponding to an increase in pile-up contamination.

The right panels of Fig. 9 show the ratios of uncorrected \(v_2\) to the corresponding values with the pile-up correction. The uncorrected values show a significant decrease with increasing multiplicity and are \(\sim \)25% (20%) lower than the corrected one in 8 TeV (13 TeV) data at the highest measured track multiplicity. However, after the pile-up correction, the \(v_2\) shows a significantly weaker dependence on the track multiplicity, similar to the observations in Refs.  [2, 3].

Fig. 9
figure 9

Top left panel shows the \(v_2\) values obtained from the template fits in the 8 TeV data, corrected for pile-up, plotted as a function of the \(n_{\mathrm {trk}}^{\mathrm {sig}}\) (black points). For comparison, the \(v_2\) not corrected for pile-up is also plotted. The uncorrected \(v_2\) is also plotted as a function of \(n_{\mathrm {trk}}^{\mathrm {sig}}\)– the pileup corrected multiplicity – so that the effect of the pileup correction on the \(v_n\) is compared between the same set of events. The top right panel shows the ratio of the two \(v_2\) values. Bottom row shows similar plots for the 13 TeV data. The error bars indicate statistical uncertainties and are not shown in the ratio plots. Plots are for \(0.5<p_{\mathrm {T}}^{a,b} <5\) GeV 

Figure 10 compares the \(p_{\mathrm{T}}\)  dependence of the \(v_2\) before and after correcting for pile-up. The \(p_{\mathrm{T}}\) dependence is evaluated over a broad 40–100 \(n_{\text {trk}}^{\mathrm {sig}}\) range. A dependence of the correction on the \(p_{\mathrm{T}}\) is observed. Over the 0.5–3 GeV \(p_{\mathrm{T}}\) interval, the magnitude of the correction decreases with increasing \(p_{\mathrm{T}}\) .

Fig. 10
figure 10

Top left panel compares the \(v_2\) values as a function of \(p_{\mathrm{T}}\) in the 8 TeV data, when correcting (solid-black points) and not correcting (open blue points) for pile-up pairs. The plots are for the 40–100 \(n_{\mathrm {trk}}^{\mathrm {sig}}\) interval. The blue points correspond to the Fourier coefficients of the uncorrected 2PCs for \(40<n_{\mathrm {trk}}^{\mathrm {sig}} \le 100\). The top right panel shows the ratio of the uncorrected \(v_2\) to the corrected \(v_2\). The error bars indicate statistical uncertainties and are not shown in the ratio plots. The lower panels show similar plots for the 13 TeV data

7 Systematic uncertainties

The systematic uncertainties in the \(v_2\) measurement can broadly be classified into two categories: the first category comprises systematic uncertainties that are intrinsic to the 2PC and to the template-fitting procedure and have been used in previous 2PC analyses [2, 3]. These include uncertainties from the choice of peripheral bin used in the template fits, the tracking efficiency, and the pair-acceptance. The second category comprises the uncertainties associated with the correction of the \(v_2\) that accounts for pile-up tracks; these uncertainties are specific to the present analysis.

7.1 Peripheral interval

The template-fitting procedure [2, 3] uses the \(n_{\text {trk}}^{\mathrm {sig}} \in \)(20,30] interval as the peripheral reference. To test the sensitivity of the measured \(v_2\) to any residual changes in the width of the away-side (\(\Delta \phi =\pi \)) jet peak and to the \(v_2\) present in the peripheral reference, the analysis is repeated using the 0–20, 10–20, and 30–40 multiplicity intervals as the peripheral reference. The resulting variation in the \(v_2\) when using these alternative peripheral references is included as a systematic uncertainty. The assigned uncertainties are conservatively taken to be larger of the three variations and symmetric about the nominal value. For the multiplicity dependence of the \(v_2\) measured in the integrated \(p_{\mathrm{T}}\)  interval of 0.5–5 GeV , this uncertainty varies from \(\sim \)8% at \(n_{\text {trk}}^{\mathrm {sig}} =30\) to \(\sim \)3% for \(n_{\text {trk}}^{\mathrm {sig}} >70\) in the 8 TeV data. For the 13 TeV data the uncertainty is within 4% across the entire measured multiplicity range. For the \(p_{\mathrm{T}}\) dependence, this uncertainty varies from 4% to 15% depending on the \(p_{\mathrm{T}}\) and the dataset.

7.2 Track reconstruction efficiency

In evaluating the correlation functions, each particle is weighted by a factor \(1/\epsilon (p_{\mathrm{T}}\,,\eta )\) to account for the tracking efficiency. The systematic uncertainties in the efficiency \(\epsilon (p_{\mathrm{T}}\,,\eta )\) thus need to be propagated into \(C(\Delta \phi )\) and the final \(v_{2,2}\) measurements. The \(C(\Delta \phi )\) and \(v_2\) are mostly insensitive to the tracking efficiency. This is because the \(v_{2,2}\) measures the relative variation of the yields in \(\Delta \phi \); an overall increase or decrease in the efficiency changes the yields but does not affect the \(v_2\). However, due to \(p_{\mathrm{T}}\) and \(\eta \) dependence of the tracking efficiency and its uncertainties [58], there is some residual effect on the \(v_2\). The corresponding uncertainty in the \(v_2\) is estimated by repeating the analysis while varying the efficiency within its upper and lower uncertainty values – of about 5% – in a \(p_{\mathrm{T}}\) -dependent manner. For \(v_2\) this uncertainty is estimated to be less than 1%, when studying the multiplicity dependence for the 0.5–5 GeV \(p_{\mathrm{T}}\)  interval, and less than 0.5% for the differential \(v_2 (p_{\mathrm{T}}\,)\).

7.3 Pair-acceptance

The analysis relies on the \(B(\Delta \phi )\) distribution to correct for the pair-acceptance of the detector using Eq. (3). The \(B(\Delta \phi )\) distributions are nearly flat in \(\Delta \phi \), and the effect on the \(v_2\) when correcting for the acceptance is less than 1% for all multiplicities and \(p_{\mathrm{T}}\) . Since the pair-acceptance corrections are small, the entire correction is conservatively taken as the systematic uncertainty associated with pair-acceptance effects.

7.4 Accuracy of the background estimator

This uncertainty arises due to inaccuracy in the determination of \(\mu \) during the run and stability of the \(z_{\mathrm{vtx}}\) distribution. They are estimated using the inaccuracy in the luminosity determination described in Refs. [59, 60] and stability studies performed in the analysis. Another contribution is coming from the quality of the fits. Although fits used in the functional form of Eq. (10) accurately reproduce data as shown in Figs. 2 and 3, alternative fit functions are also studied to derive an uncertainty that, together with the factors mentioned earlier, results in \(\lesssim \)1% uncertainty added to the final results.

7.5 Uncertainties in transition matrices

The transition matrices discussed in Sect. 5.5 for unfolding the \(n_{\text {trk}}\) distributions and for finding coefficients for correcting the 2PC are determined using data. The \(n_{\text {trk}}^{\mathrm {sig}}\) distribution is approximated with the Direct distributions in the lowest \(\nu \) interval (\(\nu <0.5\)). Uncertainties in the \(v_2\) values due to this approximation are estimated by repeating the analysis with the matrices calculated from the Direct distributions in the interval \(\nu <1\). The variation is less than 2% throughout, and is included as a systematic uncertainty in the value of \(v_2\).

7.6 Accuracy of the pile-up correction procedure

As described in Sect. 5, the pile-up correction procedure is implemented in intervals of \(\nu \). In order to check residual pile-up effects that are not removed by the correction procedure, a study of the pile-up-corrected \(v_2\) is performed as a function of \(\nu \), and the variation in the measured \(v_2\) is included as a systematic uncertainty. This uncertainty is determined to be \(\pm 3.5\)% for the 8 TeV data across the measured multiplicity range. For the 13 TeV data, this uncertainty is \(\pm 4\)% for \(n_{\mathrm {trk}}^{\mathrm {sig}} <100\) but increases to 15% at higher multiplicities.

An independent check of the pile-up correction procedure is done by performing an MC closure analysis using the MC sample described in Sect. 5.2. Since the MC generated events do not have any physical long-range correlations, the closure test is performed on the Fourier components of the 2PC defined by Eqs. (4)–(6) instead. The Fourier-\(v_2\) includes mostly contributions from back-to-back dijets, which is the main (physics) background that the template analysis removes. The pile-up affects the 2PC caused by the dijet and the long-range correlations in a similar way. Thus the Fourier-\(v_2\) is an ideal object to use in checking the performance of the pile-up correction, because it has a non-zero value in the simulated events. The MC closure test is performed as follows. The generator-level \(v_2\) is obtained from 2PCs using only reconstructed tracks that are known to be associated with the Z-boson vertex. The generated \(v_2\) is then compared to the reconstruction-level Fourier-\(v_2\) obtained when applying the pile-up correction procedure used in the analysis of the real data. The corrected and generated \(v_2\) are found to be consistent within the systematic uncertainties associated with the pile-up correction procedure.

Table 2 summarizes the systematic uncertainties in the multiplicity dependence of the measured \(v_2\). Table 3 summarizes the uncertainties for the \(p_{\mathrm{T}}\) dependence of \(v_2\). The dominant systematic uncertainties arise from the choice of peripheral reference and from the \(\nu \) dependence.

Table 2 Systematic uncertainties for the multiplicity dependence of the \(v_2\) integrated over the 0.5–5 GeV \(p_{\mathrm{T}}\) interval. Where ranges are provided for both multiplicity and the uncertainty, the uncertainty varies from the first value to the second value as the multiplicity varies from the lower to upper limits of the range. The listed uncertainties are taken to be symmetric about the nominal \(v_2\) values
Table 3 Systematic uncertainties for the \(v_2 (p_{\mathrm{T}}\,)\)

8 Results

Figure 11 shows the final results for the multiplicity dependence of the \(v_2\) with all systematic uncertainties included. The results are corrected to account for pile-up and detector efficiency effects, and are plotted as a function of the measured multiplicity (\(n_{\text {trk}}^{\mathrm {sig}}\)), which is corrected for pile-up, but not for detector efficiency effects.Footnote 3 The left panel compares the final \(v_2\) values obtained from the template fit in the 8 and 13 TeV Z-tagged samples, to the \(v_2\) values obtained in 5 and 13 TeV inclusive pp collisions from Ref.  [3]. The right panel shows the ratio of the \(v_2\) in the 8 and 13 TeV Z-tagged samples to the \(v_2\) in 13 TeV inclusive pp collisions. The systematic uncertainties in the measured \(v_2\) for a given dataset are to some extent correlated across the different multiplicity intervals shown in Fig. 11. The Z-tagged \(v_2\) values show no significant dependence on the multiplicity, similar to the results obtained from the inclusive samples, and are consistent with each other as well as the inclusive measurements, within 1–2\(\sigma \) systematic uncertainties.

Figure 12 compares the \(p_{\mathrm{T}}\) dependence of the template-\(v_2\) between the Z-tagged and inclusive measurements. The \(p_{\mathrm{T}}\)  dependence is evaluated over the 40–100 \(n_{\text {trk}}^{\mathrm {sig}}\) range. The upper limit of 100 in the chosen multiplicity range is because the 8 TeV Z-tagged measurements are done upto this multiplicity. The lower limit of 40 tracks arises because the 30–40 track multiplicity range is used as the peripheral reference in the systematic uncertainty estimates, and thus events with less than 40 tracks are excluded from the measurement. As seen for the multiplicity dependence, the \(p_{\mathrm{T}}\) dependence is also consistent between the 8 and 13 TeV Z-tagged samples. The Z-tagged \(v_2 (p_{\mathrm{T}}\,)\) values are also consistent with the inclusive measurements within 1–1.5\(\sigma \) systematic uncertainties.

Fig. 11
figure 11

Left panel: the pile-up-corrected \(v_2\) values obtained from the template fits as a function of \(n_{\text {trk}}^{\mathrm {sig}}\). For comparison, the \(v_2\) values obtained in 5 and 13 TeV inclusive pp data from Ref.  [3] are also shown. The error bars and shaded bands indicate statistical and systematic uncertainties, respectively. Right panel: the ratio of the \(v_2\) in 8 and 13 TeV Z-tagged samples to the \(v_2\) in inclusive 13 TeV pp collisions as a function of \(n_{\text {trk}}^{\mathrm {sig}}\). The horizontal dotted line indicates unity and is intended to guide the eye. Results are ploted for \(0.5<p_{\mathrm {T}}^{a,b} <5\) GeV 

Fig. 12
figure 12

Left panel: the pile-up-corrected \(v_2\) values obtained from the template fits as a function of \(p_{\mathrm{T}}\) . For comparison, the \(v_2\) values obtained in 5 and 13 TeV inclusive pp data are also shown. The error bars and shaded bands indicate statistical and systematic uncertainties, respectively. Right panel: the ratio of the \(v_2\) in 8 and 13 TeV Z-tagged samples to the \(v_2\) in inclusive 13 TeV pp collisions as a function of \(p_{\mathrm{T}}\) . The horizontal dotted line indicates unity and is kept to guide the eye. Results are plotted for the \(40<n_{\text {trk}}^{\mathrm {sig}} \le 100\) multiplicity interval

9 Summary

In this analysis, the long-range component (\(|\Delta \eta |>2\)) of two-particle correlations (2PC) in \(\sqrt{s}\,=8\) and 13 TeV pp collisions containing a Z  boson is studied using data collected by the ATLAS detector at the LHC. The datasets correspond to an integrated luminosity of 19.4 \(\mathrm {fb^{-1}}\) for the 8 TeV data and 36.1 \(\mathrm {fb^{-1}}\) for the 13 TeV data.

The correlations are studied using a template-fitting procedure that separates the true long-range correlation from the dijet contribution. Due to the high-luminosity conditions, a significant contribution to the 2PC from pile-up events is observed that contaminates the measured correlations. A new pile-up correction procedure is developed to remove the contribution of pile-up tracks.

The pile-up-corrected 2PC are measured across a large range of track multiplicities and over the 0.5–5 GeV \(p_{\mathrm{T}}\) range. The second-order Fourier coefficient of the single-particle anisotropy, \(v_2\), is extracted and its multiplicity and \(p_{\mathrm{T}}\)  dependence is compared to that observed in inclusive pp collisions. The pile-up-corrected \(v_2\) values show no significant dependence on the event multiplicity, similar to that observed in inclusive pp collisions. The magnitude of the observed \(v_2\) as a function of multiplicity and \(p_{\mathrm{T}}\)  is found to be consistent with that observed in inclusive pp collisions. These measurements demonstrate that in pp collisions, the long-range correlation involving soft particles is not significantly altered by the presence of a hard-scattering process. This result is an important contribution towards a better understanding of the origin of the long-range correlations observed in pp collisions.