1 Introduction

The energetic proton–proton (\(pp\)) collisions produced by the Large Hadron Collider (LHC) yield final states that are predominantly characterized by jets, or collimated sprays of charged and neutral hadrons. Jets constitute an essential piece of the physics programme carried out using the ATLAS detector due to their presence in the signal processes being measured and searched for, the various background processes that hide those signals, and the additional activity due to simultaneous \(pp\) collisions. Measurements of the energy scale and resolution of these complex objects, as well as their associated systematic uncertainties, are therefore essential both for precision measurements of the Standard Model (SM) and for sensitive searches for new physics beyond it. This paper presents the strategy used for the determination of the jet energy scale (JES) and resolution (JER) by the ATLAS experiment and its implementation as it pertains to the analysis of data from Run 2 of the LHC. Results for the JES and JER are presented using data collected during 2015–2017, corresponding to integrated luminosities in the range 36–81 \({\hbox {fb}}^{-1}\), depending on the analysis method and its goals. This publication focuses on calibrating jets reconstructed with the anti-\(k_{t}\) [1] algorithm with radius parameter \(R=0.4\).

The ATLAS Collaboration has published previous calibrations and uncertainties of the energy scale and resolution for this jet definition with data taken in 2010 [2,3,4], 2011 [5], 2012 [6], and 2015 [7]. Additionally, some ATLAS publications have targeted different jet definitions. In particular, the Run 1 papers include dedicated calibrations of jets reconstructed with the anti-\(k_{t}\) algorithm with \(R=0.6\) and \(R=1.0\), and a dedicated in situ calibration of large-radius jets has also been completed in Run 2 data [9]. This publication extends and improves on previous calibrations of anti-\(k_{t}\) \(R=0.4\) jets, taking full advantage of the larger dataset recorded over the period of 2015–2017. The significant increase in the number of proton collisions per bunch crossing in 2016 and 2017 data-taking leads to a correspondingly more difficult environment for jet reconstruction, and this result presents new jet energy scale and resolution measurements in these unique high pile-up conditions.

Section 2 describes the ATLAS detector, and Sect. 3 describes the recorded data and the Monte Carlo (MC) simulation samples used in this paper. Section 4 presents the inputs and algorithms used to reconstruct the jets. Sections 5 and 6 present the methods, results, and systematic uncertainties of the JES and JER calibrations, respectively.

2 The ATLAS detector

The ATLAS detector [10] at the LHC covers nearly the entire solid angle around the collision point. It consists of an inner tracking detector surrounded by a thin superconducting solenoid, electromagnetic and hadronic calorimeters, and a muon spectrometer incorporating three large superconducting toroidal magnets. The inner-detector system (ID) is immersed in a 2 T axial magnetic field and provides charged-particle tracking in the range \(|\eta | < 2.5\).

The silicon pixel detector covers the vertex region and typically provides four measurements per track, with the innermost space-point provided by the insertable B-layer that was installed before Run 2 [11, 12]. The pixel detector is followed by the silicon microstrip tracker, which usually yields eight measurements per track. The silicon-based detectors are complemented by the transition radiation tracker (TRT), which enables radially extended track reconstruction up to \(|\eta | = 2.0\). The TRT also provides electron identification information based on the fraction of hits (typically 30 in total) above a higher energy-deposit threshold corresponding to transition radiation.

The calorimeter system covers the pseudorapidity range \(|\eta | < 4.9\). Within the region \(|\eta |< 3.2\), high-granularity lead/liquid-argon (LAr) calorimeters with both barrel and endcap sections provide electromagnetic calorimetry. An additional thin LAr presampler covers \(|\eta | < 1.8\), and is used to correct for energy loss in materials traversed by particles prior to reaching the calorimeters. Hadronic calorimetry is provided by the steel/scintillator-tile calorimeter, segmented into three barrel structures within \(|\eta | < 1.7\), and two copper/LAr hadronic endcap calorimeters cover the range \(1.5< |\eta |< 3.2\). The solid angle coverage between \(3.2< |\eta |< 4.9\) is completed with forward copper/LAr and tungsten/LAr calorimeter modules optimized for electromagnetic and hadronic measurements respectively. Interfaces that exist between each of these components, in particular between the barrel and endcap regions, provide for space to route various services and infrastructure, such as electrical and fiber-optic cabling, cooling, and support structures. However, these so-called transition regions also create discontinuities in the response of the calorimeter to both charged and neutral particles due to energy absorption in the inactive materials and changes in the geometry of the active materials of the calorimeters. The calibrated response and resolution of the calorimeter must therefore either correct for these features, or account for them when establishing systematic uncertainties. Figure 1 shows the many components of the calorimeter system, with reference pseudorapidities and various relevant transition regions marked as well [10, 13, 14].

Fig. 1

Layout of the ATLAS calorimeters with pseudorapidity (\(\eta \)) values marked for reference. The inner detector systems can be seen in black-and-white in the center of the diagram; tracking is provided up to \(\eta = 2.5\). The electromagnetic (EM) barrel and endcap calorimeters are shown in green. The EM barrel has consistent performance throughout, but has a seam in the construction at \(\eta =0\) which can impact jet energy resolution. The EM endcap has a precision region marked in darker green and an extended region in light green, and the transition from one to the other around \(\eta \sim 2.5\) involves a dramatic change in the material layers. The hadronic Tile calorimeter is shown in light blue while the hadronic endcap calorimeters based on liquid argon are illustrated in light orange. The forward calorimeters are shown in dark orange. Pink filled regions represent the tile plug calorimeter, often referred to as TileGap1 and TileGap2. The thin hot pink line marks the location of the very narrow gap and cryostat scintillators (TileGap3). The regions corresponding to the transition from barrel to endcap (\(\eta \sim 1.4\)) and from endcap to forward calorimeter (\(\eta \sim 3.1\)) are given for reference

The muon spectrometer (MS) comprises separate trigger and high-precision tracking chambers measuring the deflection of muons in a magnetic field generated by superconducting air-core toroids. The field integral of the toroids ranges between 2.0 and 6.0 Tm across most of the detector. A set of precision chambers covers the region \(|\eta | < 2.7\) with three layers of monitored drift tubes, complemented by cathode-strip chambers in the forward region, where the background is highest. The muon trigger system covers the range \(|\eta | < 2.4\) with resistive-plate chambers in the barrel, and thin-gap chambers in the endcap regions.

Interesting events are selected to be recorded by the first-level trigger system implemented in custom hardware, followed by selections made by algorithms implemented in software in the high-level trigger [15]. The first-level trigger accepts events from the 40 MHz bunch crossings at a rate below 100 kHz, which the high-level trigger reduces in order to record events to disk at about 1 kHz.

3 Data and Monte Carlo simulated samples

The data used for the measurements presented here were collected in \(pp\) collisions at the LHC with a centre-of-mass energy of 13 \({\text {Te}}{\text {V}}\) and a 25 ns proton bunch crossing interval during 2015–2017. The integrated luminosities of the datasets used are in the range 36–81 \({\hbox {fb}}^{-1}\) after requiring that all detector subsystems were operational during data recording.

Additional \(pp\) collisions in the same and nearby bunch crossings are referred to as pile-up. The number of reconstructed primary vertices (\(N_{\text {PV}}\)) and the mean number of interactions per bunch crossing (\(\mu \)) are optimal observables to quantify the level of pile-up activity. The average value of \(\mu \) is 13.7, 24.9, and 37.8 in the 2015, 2016, and 2017 datasets, respectively [16]. As described below, these conditions are accounted for in the production and reconstruction of simulated data.

Table 1 List of generators used for various processes. Information is given regarding the underlying-event tunes, the PDF parameter sets, and the perturbative QCD highest-order accuracy used in the matrix element. Abbreviations in the PDF names and matrix element orders are LO (leading order), NLO (next-to-leading order), and NNLO (next-to-next-to-leading order)

Simulated dijet, multijet, \(Z\text {+jet}\), and \(\gamma \text {+jet}\) samples are used in determining the jet energy scale and its uncertainties. Table 1 summarizes the MC generators, adjustable sets of parameters (tunes), and parton distribution function (PDF) sets used for all nominal and alternative samples of the various simulated processes. The nominal samples for the majority of analyses were generated with Pythia 8.186 [17] (from now on referred to as Pythia 8) or Powheg+Pythia 8.186 [17, 20, 21]. The multijet balance analysis uses Sherpa 2.1.1 [22] as the nominal generator since it incorporates up to three jets in the matrix element and is thus more suitable for multijet processes that have more than two jets in the final state. The dijet, multijet, and \(\gamma \text {+jet}\) nominal samples use the NNPDF2.3LO PDF set [19] and the A14 set of tuned parameters [18]. For the \(Z\text {+jet}\) analysis, the dedicated AZNLO tune [26] is used instead. Alternative samples for defining systematic variations use various generators and tunes.

Stable particles, defined as those with \(c\tau > 10\) mm, output by the generators were passed through the \(\textsc {Geant} 4\)-based simulation of the ATLAS detector [27, 28]. This step simulates the interactions of the particles with matter in the detector and generates outputs which can be reconstructed in the same way as data. Hadronic showers were simulated using the FTFP BERT model as described in Ref. [29]. A set of simulated dijet events produced with the less detailed Atlfast-II (AFII) simulation is also studied to determine the difference in performance between full and fast simulation and to provide appropriate calibrations for AFII samples in analyses [27].

Pile-up is incorporated in the MC samples by overlaying simulated inelastic interactions on the generated hard-scatter interaction. The inelastic interactions were simulated in Pythia 8.210 using the A3 tune and the NNPDF2.3LO PDF set [19, 30]. To determine the number of simulated \(pp\) collisions to overlay onto a particular hard-scattering process, a random value is drawn from a Poisson distribution of the number of \(pp\) collisions per bunch crossing with a mean given by the desired average number of collisions per crossing for a particular data period. Events simulated with a particular pile-up profile are then compared with data from the corresponding data period. One set of MC samples was created using the pile-up profile of 2015 + 2016 data (average number of collisions 23.7) while a second independent set of samples used the profile of 2017 data. When data and simulation are compared in this paper, both sets of MC samples are used unless otherwise specified and are normalized to the luminosity of 2015+2016 data and 2017 data separately.
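The Poisson draw that sets the number of overlaid collisions can be illustrated with a short Python sketch. It is a toy under stated assumptions and not the ATLAS simulation software: the function names `draw_poisson` and `n_overlay_collisions` are hypothetical, a single fixed mean stands in for the full pile-up profile of a data period, and Knuth's multiplication method is chosen only because it needs nothing beyond the standard library.

```python
import math
import random


def draw_poisson(mean, rng):
    """Draw one Poisson-distributed integer with Knuth's multiplication method.

    Adequate for means of a few tens, as relevant here; exp(-mean) would
    underflow for very large means.
    """
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def n_overlay_collisions(mu_target, n_events, seed=1):
    """Number of pp collisions to overlay on each of n_events hard-scatter events."""
    rng = random.Random(seed)
    return [draw_poisson(mu_target, rng) for _ in range(n_events)]


# Illustrative usage with the 2015+2016 average quoted in the text.
counts = n_overlay_collisions(mu_target=23.7, n_events=20000)
sample_mean = sum(counts) / len(counts)
```

With enough events the sample mean of `counts` approaches the requested average, while individual events fluctuate around it as expected for a Poisson process.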

4 Jet reconstruction

The primary jet definition used in the majority of physics analyses by the ATLAS Collaboration and in the studies presented here is the anti-\(k_{t}\) [1] algorithm with a radius parameter \(R=0.4\) as implemented in the FastJet 3.2.2 [31, 32] software package. Four-vector objects are used as inputs to the algorithm, and may be stable particles defined by MC generators, charged-particle tracks, calorimeter energy deposits, or algorithmic combinations of the latter two, as in the case of the particle-flow reconstruction technique [33].

For use in jet reconstruction, calorimeter cells are first clustered into three-dimensional, massless, topological clusters (topo-clusters) using a nearest-neighbour algorithm [34]. Cells are added to a topo-cluster according to the ratio of the cell energy to the expected noise in each cell using thresholds that control the growth of each topo-cluster. The resulting energy of the topo-cluster is defined at the electromagnetic (EM) scale, which is the baseline calorimeter scale that correctly measures energy depositions from electromagnetic showers. Only positive-energy topo-clusters are used as inputs to the jet reconstruction. A jet produced in the hard-scatter process is expected to originate from the primary vertex, defined as the reconstructed vertex with at least two associated tracks and the largest sum of squared track momentum. Therefore, an event-by-event correction to account for the position of the primary vertex in each event – referred to as an origin correction – is applied to every topo-cluster, based on its depth within the calorimeter and pseudorapidity. This method is to be contrasted with earlier approaches [7] that applied this correction only to the jet four-momentum rather than to its constituents.
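The primary-vertex definition quoted above can be written as a minimal Python sketch. The function name and the input layout (one list of track transverse momenta per vertex) are hypothetical, and the use of transverse rather than total momentum in the sum is an assumption made for illustration.

```python
def select_primary_vertex(vertices):
    """Return the index of the vertex with at least two tracks and the
    largest sum of squared track pT, or None if no vertex qualifies.

    `vertices` is a list of lists of track pT values in GeV (hypothetical layout).
    """
    best_index = None
    best_sum = -1.0
    for index, track_pts in enumerate(vertices):
        if len(track_pts) < 2:
            continue  # a vertex needs at least two associated tracks
        sum_pt2 = sum(pt * pt for pt in track_pts)
        if sum_pt2 > best_sum:
            best_index, best_sum = index, sum_pt2
    return best_index
```

For example, a vertex with a single 50 GeV track is rejected by the two-track requirement even though its squared momentum sum exceeds that of a vertex with two 10 GeV tracks.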

Jets reconstructed using only calorimeter-based energy information use the origin-corrected EM scale topo-clusters and are referred to as EMtopo jets. This was the primary jet definition used in ATLAS physics results prior to the end of Run 2. EMtopo jets exhibit robust energy scale and resolution characteristics across a wide kinematic range, and are independent of other reconstruction algorithms such as tracking at the jet-building stage.

Hadronic final-state measurements can be improved by making more complete use of the information from both the tracking and calorimeter systems. The particle flow (PFlow) algorithm is based on Ref. [33] and updated as described below. Particle flow directly combines measurements from both the tracker and the calorimeter to form the input signals for jet reconstruction, which are intended to approximate individual particles. Specifically, energy deposited in the calorimeter by charged particles is subtracted from the observed topo-clusters and replaced by the momenta of tracks that are matched to those topo-clusters. These resulting PFlow jets exhibit improved energy and angular resolution, reconstruction efficiency, and pile-up stability compared to calorimeter jets [33]. EMtopo and PFlow jets are retained for the analyses discussed in this paper only if they have an uncalibrated \(p_{{\text {T}}} > 7\) \({\text {Ge}}{\text {V}}\) and \(|\eta | < 4.5\).

The updates to the PFlow algorithm since its description in Ref. [33] are as follows. The expected mean value of the energy deposited by pions, \(\langle E_{\text {dep}} \rangle \), and its expected standard deviation, \(\sigma (E_{\text {dep}})\), were recomputed using the updated simulation, geometry, and topo-cluster noise thresholds for Run 2 [7]. The shower profiles were similarly updated. The only algorithmic change was an improvement in the transition between using track energy and cluster energy in high-\(p_{{\text {T}}}\) jets. Since energetic particles are often in the core of jets and thus poorly isolated from nearby activity, accurate removal of the calorimeter energy associated with the track can be difficult. Therefore, the PFlow algorithm prevents energy subtraction in these cases. Formerly this was managed by applying a simple \(p_{{\text {T}}} ^{\text {trk}}<40~{\text {Ge}}{\text {V}}\) cut in the track selection. In the updated algorithm, a more sophisticated procedure is used to prevent the subtraction in cases where the advantages of the tracker are smaller and where the particle shower falls in a region with significant energy depositions from other particles. For all tracks up to \(p_{{\text {T}}} ^{\text {trk}}=100~{\text {Ge}}{\text {V}}\), if the energy \(E^{\text {clus}}\) in a cone of size \(\Delta R=0.15\) around the extrapolated particle satisfies

$$\begin{aligned} \frac{E^{\text {clus}}-\langle E_{\text {dep}} \rangle }{\sigma (E_{\text {dep}})}>33.2\times \log _{10}(40~{\text {Ge}}{\text {V}}/p_{{\text {T}}} ^{\text {trk}})\,, \end{aligned}$$
(1)

then the subtraction is not performed. With this parameterization, the subtraction is performed at lower track momenta unless the calorimeter activity measured by \(E^{\text {clus}}\) is very high, such as in very dense environments where the accuracy of the subtraction is degraded. Since the calorimeter provides a good energy measurement at high \(p_{{\text {T}}} ^{\text {trk}}\), this parameterization gradually phases out the subtraction as \(p_{{\text {T}}} ^{\text {trk}}\) increases, yet allows the subtraction to continue to be performed for a small range above the former 40 \({\text {Ge}}{\text {V}}\) cut-off even when the calorimeter energy deposition is low or near the expected value, \(\langle E_{\text {dep}} \rangle \). The momentum range up to which the subtraction is still allowed to be performed is driven by the coefficient of 33.2 in Eq. (1) and is typically about 20–50% above the 40 \({\text {Ge}}{\text {V}}\) cut-off previously used. Above \(p_{{\text {T}}} ^{\text {trk}}=100~{\text {Ge}}{\text {V}}\) no track information is used and the PFlow algorithm becomes equivalent to EMtopo, benefitting from excellent calorimeter performance at high energies. The result of the improved subtraction method detailed here is that the energy resolution of PFlow jets becomes compatible with that of EMtopo jets at high energies while remaining superior at low energies.
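The condition in Eq. (1) can be made concrete with a small Python sketch. The function names are hypothetical and the code is an illustration of the inequality only, not the ATLAS implementation. The second function inverts Eq. (1) to show how far above 40 \({\text {Ge}}{\text {V}}\) the subtraction can persist for a given significance of the calorimeter deposit.

```python
import math

PT_REFERENCE_GEV = 40.0   # former hard cut-off, appears inside the logarithm
PT_MAX_TRACK_GEV = 100.0  # above this value no track information is used
SLOPE = 33.2              # coefficient in Eq. (1)


def subtraction_performed(e_clus, e_dep_mean, e_dep_sigma, pt_trk):
    """Return True if the calorimeter energy subtraction is performed for a track."""
    if pt_trk > PT_MAX_TRACK_GEV:
        return False  # the track is not used at all
    significance = (e_clus - e_dep_mean) / e_dep_sigma
    threshold = SLOPE * math.log10(PT_REFERENCE_GEV / pt_trk)
    return not (significance > threshold)


def max_pt_for_subtraction(significance):
    """Largest track pT (GeV) at which the subtraction still happens,
    obtained by solving Eq. (1) for pT at fixed significance."""
    return min(PT_MAX_TRACK_GEV, PT_REFERENCE_GEV * 10 ** (-significance / SLOPE))
```

Under these definitions a deposit exactly at the expected value (significance 0) allows subtraction up to 40 GeV, while a deposit three standard deviations below expectation allows it up to roughly 49 GeV, about 23% above the former cut-off.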

After the subtraction, two scalings are applied. These account for the difference in response, here defined as the ratio of measured to true particle energy, between topo-clusters at the EM scale and tracks for which the energy scale is closer to the true particle energy. The first scale factor applies only when no subtraction has been performed for a selected track. In this case the PFlow object includes both the full topo-cluster energy and the track momentum. To avoid double-counting the energy while maintaining the contribution from the calorimeter measurement, the track momentum is scaled by a factor \((1 -\langle E_{\text {dep}} \rangle /p_{\text {trk}})\). The resulting PFlow object uses the desired information and has a final energy of approximately \(p_{\text {trk}}\), matching the response for the subtracted case. The second scale factor is applied in both the subtracted and non-subtracted cases for all PFlow objects created from selected tracks below 100 \({\text {Ge}}{\text {V}}\). It smooths the transition between the lower-energy PFlow objects which are at the scale of the tracks and the higher-energy objects at the electromagnetic scale of the clusters. The energy of these PFlow objects is scaled by unity for \(p_{{\text {T}}} ^{\text {trk}}\) below \(30~{\text {Ge}}{\text {V}}\), by \((1- \langle E_{\text {dep}} \rangle /p_{\text {trk}})\)  for objects with \(60~{\text {Ge}}{\text {V}}< p_{{\text {T}}} ^{\text {trk}} < 100~{\text {Ge}}{\text {V}}\), and by a linearly descending scale factor in between. This ensures that all objects are at the electromagnetic scale by 60 \({\text {Ge}}{\text {V}}\).
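A minimal Python sketch of the two scale factors follows. The function names are hypothetical, and the form of the "linearly descending" factor is an assumption: it is taken here to be a linear interpolation in \(p_{{\text {T}}} ^{\text {trk}}\) between unity at 30 GeV and \((1- \langle E_{\text {dep}} \rangle /p_{\text {trk}})\) at 60 GeV.

```python
def track_scale_no_subtraction(e_dep_mean, p_trk):
    """First factor: scales the track momentum when no subtraction was performed."""
    return 1.0 - e_dep_mean / p_trk


def pflow_energy_no_subtraction(e_clus, e_dep_mean, p_trk):
    """Energy of the PFlow object when the full cluster energy is kept."""
    return e_clus + p_trk * track_scale_no_subtraction(e_dep_mean, p_trk)


def transition_scale(pt_trk, e_dep_mean, p_trk, pt_low=30.0, pt_high=60.0):
    """Second factor for tracks below 100 GeV: moves PFlow objects from the
    track scale to the EM scale (linear interpolation assumed in between)."""
    em_scale = 1.0 - e_dep_mean / p_trk
    if pt_trk <= pt_low:
        return 1.0
    if pt_trk >= pt_high:
        return em_scale
    fraction = (pt_trk - pt_low) / (pt_high - pt_low)
    return 1.0 + fraction * (em_scale - 1.0)
```

With a cluster energy close to \(\langle E_{\text {dep}} \rangle \), `pflow_energy_no_subtraction` returns a value close to the track momentum, which is the behaviour the first scale factor is meant to ensure.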

Fig. 2

Stages of jet energy scale calibrations. Each one is applied to the four-momentum of the jet

Tracks used in PFlow objects and in deriving calibrations for both EMtopo and PFlow jets are reconstructed within the full acceptance of the inner detector (\(|\eta | < 2.5\)), required to have a \(p_{{\text {T}}} > 500\) \({\text {Me}}{\text {V}}\), and satisfy quality criteria based on the number of hits in the ID subdetectors. To suppress the effects of pile-up, tracks must satisfy \(|z_0 \sin \theta |<2\) mm, where \(z_0\) is the distance of closest approach of the track to the hard-scatter primary vertex along the z-axis and \(\theta \) is the polar angle. Tracks are matched to jets using ghost association [35], a procedure that treats them as four-vectors of infinitesimal magnitude during the jet reconstruction and assigns them to the jet with which they are clustered.
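The kinematic and pile-up suppression requirements on tracks can be summarized in a short Python sketch. The function name and argument layout are hypothetical, and the hit-based quality criteria are omitted because the text does not specify them.

```python
import math


def passes_track_selection(pt_mev, eta, z0_mm, theta):
    """Apply the acceptance, pT, and |z0 sin(theta)| requirements quoted in the text.

    pt_mev : track transverse momentum in MeV
    eta    : track pseudorapidity
    z0_mm  : distance of closest approach to the hard-scatter vertex along z, in mm
    theta  : track polar angle in radians
    """
    return (
        abs(eta) < 2.5
        and pt_mev > 500.0
        and abs(z0_mm * math.sin(theta)) < 2.0
    )
```

The factor \(\sin \theta \) projects the longitudinal displacement onto the direction transverse to the track, so a forward track with a larger \(z_0\) can still satisfy the requirement.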

MC simulation is used to determine the energy scale and resolution of jets by comparing PFlow and EMtopo jets with particle-level truth jets. Truth jets are reconstructed using stable final-state particles and exclude muons, neutrinos, and particles from pile-up interactions. Truth jets are selected with the same \(p_{{\text {T}}} > 7\) \({\text {Ge}}{\text {V}}\) and \(|\eta | < 4.5\) thresholds as EMtopo and PFlow jets, and are geometrically matched to those jets using the angular distance \(\Delta R\) with the requirement \(\Delta R < 0.3\).
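The geometric matching can be sketched in Python as follows. The names are hypothetical, jets are represented as plain \((\eta , \phi )\) tuples, and choosing the closest truth jet when several lie within the cone is an assumption, since the text specifies only the \(\Delta R < 0.3\) requirement.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def match_to_truth(reco_jet, truth_jets, max_dr=0.3):
    """Return the closest truth jet within max_dr of reco_jet, or None.

    Jets are (eta, phi) tuples (hypothetical layout).
    """
    best, best_dr = None, max_dr
    for truth in truth_jets:
        dr = delta_r(reco_jet[0], reco_jet[1], truth[0], truth[1])
        if dr < best_dr:
            best, best_dr = truth, dr
    return best
```

The wrap-around of \(\phi \) matters for jets near \(\phi = \pm \pi \), which are close in angle despite having very different \(\phi \) values.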

5 Jet energy scale calibration

The jet energy scale calibration restores the jet energy to that of jets reconstructed at the particle level. The full chain of corrections is illustrated in Fig. 2. All stages correct the four-momentum, scaling the jet \(p_{{\text {T}}}\), energy, and mass.

At the beginning of the chain, the pile-up corrections remove the excess energy due to additional proton–proton interactions within the same (in-time) or nearby (out-of-time) bunch crossings. These corrections consist of two components: a correction based on the jet area and transverse momentum density of the event, and a residual correction derived from MC simulation and parameterized as a function of the mean number of interactions per bunch crossing (\(\mu \)) and the number of reconstructed primary vertices in the event (\(N_{\text {PV}}\)). These corrections are discussed in Sect. 5.1.1. The absolute JES calibration corrects the jet so that it agrees in energy and direction with truth jets from dijet MC events, and is detailed in Sect. 5.1.2. Furthermore, the global sequential calibration (derived from dijet MC events) improves the jet \(p_{{\text {T}}}\) resolution and associated uncertainties by removing the dependence of the reconstructed jet response on observables constructed using information from the tracking, calorimeter, and muon chamber detector systems, as introduced in Sect. 5.1.3. All these calibrations are applied to both data and MC simulation. Finally, a residual in situ calibration is applied to data only to correct for remaining differences between data and MC simulation. It is derived using well-measured reference objects, including photons, \(Z\) bosons, and calibrated jets, and for the first time benefits from a low-\(p_{{\text {T}}}\) measurement using the missing-\(E_{\text {T}}\) projection fraction method for better pile-up robustness. It is described in Sect. 5.2. The full treatment and reduction of the systematic uncertainties is discussed in Sect. 5.3.

5.1 Simulation-based jet calibrations

The derivation of the calibrations derived exclusively from MC simulation samples is described below.

5.1.1 Pile-up corrections

As a result of the increase of the topo-clustering \(p_{{\text {T}}}\) thresholds (to suppress electronic and pile-up noise) and in the instantaneous luminosity, the contribution from pile-up to the JES in the 2015–2017 data-taking period differs from the one observed in 2015. The pile-up corrections are therefore evaluated using updated MC simulations of the software reconstruction and pile-up conditions. These corrections are derived using the same methods employed in 2015 [7] and are summarized in the following paragraphs.

First, a jet \(p_{{\text {T}}}\)-density-based subtraction of the per-event pile-up contribution to the jet \(p_{{\text {T}}}\) is performed. The jet area \(A\) is a measure of the susceptibility of the jet to pile-up and is calculated by determining the relative number of ghost particles associated with a jet after clustering. Next, the pile-up contribution is estimated from the median \(p_{{\text {T}}}\) density, \(\rho \), of jets in the \(y\)–\(\phi \) plane, \(\left\langle p_{{\text {T}}}/A \right\rangle \). The calculation of \(\rho \) uses jets reconstructed using the \(k_t\) algorithm [36] with radius parameter \(R=0.4\) from positive-energy topo-clusters with \(|\eta |<2\). The computation of \(\rho \) in the central region of the detector gives a more meaningful measure of the pile-up activity than the median over the entire \(\eta \) range, because \(\rho \) drops to nearly zero beyond \(|\eta |\sim 2\). This drop is due to the lower occupancy in the forward region relative to the central region, which is a result of a coarser segmentation in the forward region. The \(k_{t} \) algorithm is chosen due to its tendency to naturally reconstruct jets including a uniform soft background [35], while the median is used to reduce the bias from hard-scatter jets which populate the high-\(p_{{\text {T}}}\) tails of the distribution. The distribution of \(\rho \) in MC simulation for representative \(N_{\text {PV}}\) values is shown in Fig. 3. The ratio of the \(\rho \)-subtracted jet \(p_{{\text {T}}}\) to the uncorrected jet \(p_{{\text {T}}}\) is applied as a scale factor to the jet four-momentum and hence does not affect its direction.
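The density-based subtraction can be illustrated with a minimal Python sketch. The function names and the \((p_{{\text {T}}}, A)\) tuple layout are hypothetical, and the numbers in the usage note are toy values rather than measured ones.

```python
from statistics import median


def median_pt_density(kt_jets):
    """rho: median of pT/A over the kt jets of one event.

    kt_jets is a list of (pt, area) tuples (hypothetical layout).
    """
    return median(pt / area for pt, area in kt_jets if area > 0.0)


def area_subtraction_scale(pt_reco, area, rho):
    """Scale factor applied to the jet four-momentum: the ratio of the
    rho-subtracted pT to the uncorrected pT."""
    return (pt_reco - rho * area) / pt_reco
```

For toy \(k_t\) jets with densities of 10, 12, 14, 16, and 400 GeV per unit area, the median is 14 whereas the mean would be about 90, which shows why the median is insensitive to a single hard-scatter jet. A 50 GeV jet with \(A=0.5\) would then be scaled by 0.86.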

Fig. 3

Per-event median \(p_{{\text {T}}}\) density, \(\rho \), at \(N_{\text {PV}} =15\) (solid), \(N_{\text {PV}} =25\) (long dashed), and \(N_{\text {PV}} =35\) (short dashed) for \(37<\mu <38\) as found in MC simulation

The \(\rho \) calculation is derived from the central, lower-occupancy regions of the calorimeter and does not fully describe the pile-up sensitivity in the forward calorimeter region or in the higher-occupancy core of high-\(p_{{\text {T}}}\) jets. It is therefore observed that after this correction some dependence of the anti-\(k_{t}\) jet \(p_{{\text {T}}}\) on the pile-up activity remains, and consequently, a residual correction is derived. This residual dependence is defined as the difference between the reconstructed jet \(p_{{\text {T}}}\) and truth jet \(p_{{\text {T}}}\) and it is observed as a function of both \(N_{\text {PV}}\) and \(\mu \), which are sensitive to in-time and out-of-time pile-up respectively.

The jet \(p_{{\text {T}}}\) after all pile-up (\(p_{{\text {T}}}\)-density-based and residual) corrections is given by

$$\begin{aligned} p_{{\text {T}}} ^{\text {corr}} = p_{{\text {T}}} ^{\text {reco}}- \rho \times A - \alpha \times (N_{\text {PV}}-1) - \beta \times \mu \,, \end{aligned}$$

where \(p_{{\text {T}}} ^{\text {reco}}\) refers to the \(p_{{\text {T}}}\) of the reconstructed jet before any pile-up correction is applied. Reconstructed jets with \(p_{{\text {T}}} >7~{\text {Ge}}{\text {V}}\) are geometrically matched to truth jets within \(\Delta R = 0.3\). The residual \(p_{{\text {T}}}\) dependences on \(N_{\text {PV}}\) (\(\alpha \)) and \(\mu \) (\(\beta \)) are observed to be fairly linear and independent of one another. Independent linear fits are used to derive \(\alpha \) and \(\beta \) coefficients in bins of \(p_{{\text {T}}} ^{\text {true}}\) and \(|\eta _{\text {det}} |\), where \(p_{{\text {T}}} ^{\text {true}}\) is the \(p_{{\text {T}}}\) of the truth jet that matches the reconstructed jet. The jet \(\eta \) pointing from the geometric centre of the detector, \(\eta _{\text {det}}\), is used to remove any ambiguity as to which region of the detector is measuring the jet. Both the \(\alpha \) and \(\beta \) coefficients are seen to have a logarithmic dependence on \(p_{{\text {T}}} ^{\text {true}}\), and logarithmic fits are performed in the range \(20\,{\text {Ge}}{\text {V}}<p_{{\text {T}}} ^{\text {true}} <200\) \({\text {Ge}}{\text {V}}\) for each bin of \(|\eta _{\text {det}} |\). In each \(|\eta _{\text {det}} |\) bin, the fitted values of the \(\alpha \) and \(\beta \) coefficients at \(p_{{\text {T}}} ^{\text {true}} =25\) \({\text {Ge}}{\text {V}}\) are taken as their nominal values, reflecting their behaviour in the \(p_{{\text {T}}}\) region where pile-up effects are most relevant. The differences between the logarithmic fits over the full \(p_{{\text {T}}} ^{\text {true}}\) range and the nominal fits are used for a \(p_{{\text {T}}}\)-dependent systematic uncertainty in the residual pile-up dependence. Finally, linear fits are performed to the binned coefficients as a function of \(|\eta _{\text {det}} |\). 
This reduces the effects of statistical fluctuations and allows the \(\alpha \) and \(\beta \) coefficients to be smoothly sampled in \(|\eta _{\text {det}} |\), particularly in regions of varying dependence.
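The full pile-up correction and the linear fits used to extract the coefficients can be sketched in Python. The function names are hypothetical, the least-squares slope is a generic stand-in for the fitting procedure, and all numerical inputs in the usage note are toy values chosen for illustration.

```python
def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys versus xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def pileup_corrected_pt(pt_reco, rho, area, n_pv, mu, alpha, beta):
    """Jet pT after the density-based and residual pile-up corrections."""
    return pt_reco - rho * area - alpha * (n_pv - 1) - beta * mu
```

As a toy example, residual offsets of 0.9, 1.9, and 2.9 GeV at \(N_{\text {PV}} = 10, 20, 30\) give a slope \(\alpha = 0.1\) GeV per vertex. With \(\rho = 14\) GeV, \(A = 0.5\), \(N_{\text {PV}} = 21\), \(\mu = 30\), and a negative \(\beta = -0.02\) GeV, a 30 GeV jet is corrected to 21.6 GeV.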

The dependences of the \(p_{{\text {T}}}\)-density-based and residual corrections on \(N_{\text {PV}}\) and \(\mu \) as a function of \(|\eta _{\text {det}} |\) for PFlow jets are shown in Fig. 4. The negative dependence on \(\mu \) for out-of-time pile-up is a result of the liquid-argon calorimeter’s pulse shape, which is negative during the period shortly after registering a signal [37]. These corrections are similar to those derived for EMtopo jets, although the \(N_{\text {PV}}\)-dependent corrections for PFlow jets in the \(|\eta _{\text {det}} |<2.5\) region are reduced by about 60% relative to EMtopo due to the usage of tracks in the PFlow algorithm. For EMtopo jets, the shape of the residual corrections is comparable to that found in 2015 MC simulation, except in the forward region (\(|\eta _{\text {det}} |>2.5\)), where it is found to be smaller by 0.1 \({\text {Ge}}{\text {V}}\). This difference is primarily caused by higher topo-cluster noise thresholds used in the full Run 2 data.

Fig. 4

Dependence of PFlow jet \(p_{{\text {T}}}\) on (a) in-time pile-up (\(N_{\text {PV}}\) averaged over \(\mu \)) and (b) out-of-time pile-up (\(\mu \) averaged over \(N_{\text {PV}}\)) as a function of \(|\eta _{\text {det}} |\) for \(p_{{\text {T}}} ^{\text {true}} =25\) \({\text {Ge}}{\text {V}}\). Errors are taken from the fit results and are too small to be visible on the scale of the plot

Four systematic uncertainties are introduced to account for MC mis-modelling of \(N_{\text {PV}}\), \(\mu \), the \(\rho \) topology, and the \(p_{{\text {T}}}\) dependence of the residual pile-up corrections. The last of these is derived from the full logarithmic fits to \(\alpha \) and \(\beta \), as discussed previously. Two in situ methods are used to estimate uncertainties in the modelling of \(N_{\text {PV}}\) and \(\mu \). The first method uses jets reconstructed from tracks to provide a measure of the jet \(p_{{\text {T}}}\) independent of pile-up. This is only used for \(|\eta |<2.1\). The second method exploits the \(p_{{\text {T}}}\) balance between a reconstructed jet and a Z boson and is used for \(2.1<|\eta |<4.5\). These systematic uncertainties are described in more detail in Ref. [38]. Finally, the \(\rho \) topology uncertainty accounts for the uncertainty in the underlying event’s contribution to \(\rho \), and is discussed in detail in Sect. 5.2.4.

Fig. 5

The average energy response as a function of reconstructed jet (a) \(\eta _{\text {det}} \) and (b) energy \(E^{\text {reco}}\). Each value is obtained from the corresponding parameterized function derived with the Pythia 8 MC sample and only jets satisfying \(p_{{\text {T}}} >20\) \({\text {Ge}}{\text {V}}\) are shown

5.1.2 Jet energy scale and \(\eta \) calibration

The absolute jet energy scale and \(\eta \) calibrations correct the reconstructed jet four-momentum to the particle-level energy scale accounting for non-compensating calorimeter response, energy losses in passive material, out-of-cone effects and biases in the jet \(\eta \) reconstruction. Such biases are primarily caused by the transition between different calorimeter technologies and sudden changes in calorimeter granularity. The calibration is derived for \(R=0.4\) anti-\(k_{t}\) jets from a Pythia 8 MC simulation of dijet events after the application of the pile-up corrections. Reconstructed jets are geometrically matched to truth jets within \(\Delta R =0.3\). In addition, reconstructed (truth) jets are required to have no other reconstructed (truth) jet of \(p_{{\text {T}}} >7~{\text {Ge}}{\text {V}}\) within \(\Delta R =0.6\) (\(\Delta R =1.0\)).

The average jet energy response \({\mathcal {R}}\), defined as the mean of a Gaussian fit to the core of the \(E^{\text {reco}}/E^{\text {true}} \) distribution, is measured in \(E^{\text {true}}\) and \(\eta _{\text {det}}\) bins. The decision to calculate the response as a function of \(E^{\text {true}}\) instead of \(E^{\text {reco}}\) is motivated by the fact that for fixed \(E^{\text {true}}\) (\(E^{\text {reco}}\)) bins the response distribution is (not) Gaussian. The average response is parameterized as a function of \(E^{\text {reco}}\) using a numerical inversion procedure, as detailed in Ref. [2], and the jet calibration factor is taken as the inverse of the average energy response. The response is higher for PFlow jets than for EMtopo jets at low energies since tracking information is used. The response for PFlow jets as a function of \(E^{\text {reco}}\) (\(\eta _{\text {det}} \)) for representative \(\eta _{\text {det}}\) (\(E^{\text {reco}}\)) bins is shown in Fig. 5.
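The idea behind the numerical inversion can be sketched as follows. This is a simplified illustration for a single \(\eta _{\text {det}}\) bin, not the procedure of Ref. [2]: the function name is hypothetical, and linear interpolation in \(\log E\) stands in for the parameterized fit used in the actual calibration.

```python
import numpy as np

def build_calibration(e_true, response):
    """Turn a response measured in E_true bins into a calibration factor
    that is a function of E_reco.

    e_true   : bin centres in truth energy
    response : average response R = <E_reco / E_true> in each bin
    """
    e_true = np.asarray(e_true, dtype=float)
    response = np.asarray(response, dtype=float)
    # Average reconstructed energy that corresponds to each truth bin
    e_reco = response * e_true
    order = np.argsort(e_reco)
    log_e_reco, resp = np.log(e_reco[order]), response[order]

    def calibration_factor(e):
        # Assumption: interpolate R linearly in log(E_reco); outside the
        # measured range np.interp holds the nearest end value constant.
        return 1.0 / np.interp(np.log(e), log_e_reco, resp)

    return calibration_factor
```

Multiplying a reconstructed energy by the returned factor maps a jet sitting at the average \(E^{\text {reco}}\) of a truth bin back to that bin's \(E^{\text {true}}\).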

After the JES calibration based on the results in Fig. 5 is applied, the response deviates from 1 by a maximum of about 5% (3%, 1%) at \(p_{{\text {T}}} ^{\text {true}} = 20~(30, 50)~{\text {Ge}}{\text {V}}\). This level of non-closure is observed across the entire \(\eta _{\text {det}} \) range. These small non-closures are seen for low-\(p_{{\text {T}}}\) jets due to a slightly non-Gaussian energy response and jet reconstruction threshold effects, both of which impact the response fits. The closure in this result is an improvement with respect to the 2015 calibration and results from advances in the fitting method and parameters.

Fig. 6 The signed difference between the reconstructed and truth jet \(\eta \), denoted by \(\eta ^{\text {reco}}\) and \(\eta ^{\text {true}}\) respectively. Each value is obtained from the corresponding parameterized function derived with the Pythia 8 MC sample and only jets satisfying \(p_{{\text {T}}} >20\) \({\text {Ge}}{\text {V}}\) are shown

A bias in the reconstructed jet \(\eta \), defined as a significant deviation from zero in the signed difference between the reconstructed and truth jet \(\eta \), denoted by \(\eta ^{\text {reco}}\) and \(\eta ^{\text {true}}\) respectively, is observed and shown in Fig. 6 as a function of \(|\eta _{\text {det}} |\) for PFlow jets. The bias for EMtopo jets is similar, showing the same features. It is largest in jets that encompass two calorimeter regions with different energy responses caused by changes in calorimeter geometry or technology. This artificially increases the energy of one side of the jet relative to the other, altering the reconstructed four-momentum. The barrel–endcap (\(|\eta _{\text {det}} |\sim 1.4\)) and endcap–forward (\(|\eta _{\text {det}} |\sim 3.1\)) transition regions can be clearly seen in Fig. 5a as susceptible to this effect. A second correction is therefore derived as the difference between the reconstructed and truth \(\eta \) (\(\eta ^{\text {reco}}\) and \(\eta ^{\text {true}}\) respectively) parameterized as a function of \(E^{\text {true}}\) and \(\eta _{\text {det}}\) to remove such bias. A numerical inversion procedure is again used to derive corrections in \(E^{\text {reco}}\) from \(E^{\text {true}}\). This calibration only alters the jet \(p_{{\text {T}}}\) and \(\eta \), not the full four-momentum. EMtopo and PFlow jets calibrated with the full jet energy scale and \(\eta \) calibration are considered to be at the EM+JES scale and PFlow+JES scale, respectively.

The absolute JES and \(\eta \) calibrations are also derived for a Pythia 8 MC sample using AFII. An additional systematic uncertainty is considered for these samples to account for a small non-closure in the calibration beyond \(|\eta _{\text {det}} |\sim 3.2\), due to the approximate treatment of hadronic showers in the forward calorimeters. This uncertainty is below 0.5% for all central jets and is about 3% for a forward jet of \(p_{{\text {T}}}\) \( = 20~{\text {Ge}}{\text {V}}\), falling rapidly with increasing \(p_{{\text {T}}}\).

5.1.3 Global sequential calibration

Even after the application of the previous jet calibrations (from now on referred to as MCJES), for a given (\(p_{{\text {T}}} ^{\text {true}}\), \(\eta _{\text {det}} \)) bin, the response can vary from jet to jet depending on the flavour and energy distribution of the constituent particles, their transverse distribution, and the fluctuations of the jet development in the calorimeter. Furthermore, the average particle composition and shower shape of a jet vary between initiating particles, most notably between quark- and gluon-initiated jets. A quark-initiated jet will often include hadrons with a higher fraction of the jet \(p_{{\text {T}}}\) that penetrate further into the calorimeter, while a gluon-initiated jet will typically contain more particles of softer \(p_{{\text {T}}}\), leading to a lower calorimeter response and a wider transverse profile. The global sequential calibration (GSC), a procedure used in the 2012 [6] and 2015 [7] calibrations, is a series of multiplicative corrections to reduce the effects from these fluctuations and improve the jet resolution without changing the average jet energy response. The jet resolution \(\sigma _{{\mathcal {R}}}\) is given by the standard deviation of a Gaussian fit to the jet \(p_{{\text {T}}}\) response distribution, where the \(p_{{\text {T}}}\) response is defined similarly to jet energy response as the ratio of \(p_{{\text {T}}} ^{\text {reco}}\) to \(p_{{\text {T}}} ^{\text {true}}\).

Fig. 7 Jet response for PFlow jets in four broad \(p_{{\text {T}}} ^{\text {true}}\) ranges as a function of each of the six observables used in the GSC: (a) the fraction of the jet \(p_{{\text {T}}}\) carried by charged particles, (b) the fraction of energy in the first layer of the Tile calorimeter, (c) the fraction of energy in the third layer of the electromagnetic calorimeter, (d) the number of tracks, (e) the track width, and (f) the number of muon spectrometer track segments associated with the jet. Jets at the PFlow+JES scale with \(0.2<|\eta _{\text {det}} | < 0.3\) (except for \(n_{\text {segments}}\) which is shown for \(|\eta _{\text {det}} | < 1.3\) due to low statistics) are selected from a sample of Pythia 8 dijet MC events and the corresponding preceding GSC steps have been applied accordingly. The error bars show only the statistical uncertainty. The bottom panels show the normalized distributions of the variables

The GSC is based on global jet observables such as the longitudinal structure of the energy depositions within the calorimeters, tracking information associated with the jet, and information related to the activity in the muon chambers behind a jet. For these studies, reconstructed jets are geometrically matched to truth jets and a numerical inversion procedure is used, as explained in Sect. 5.1.2. Six observables are identified that improve the resolution of the JES through the GSC. For each observable, an independent jet four-momentum correction is derived as a function of \(p_{{\text {T}}} ^{\text {true}}\) and \(|\eta _{\text {det}} |\) by inverting the reconstructed jet response in Pythia 8 MC simulation events. Corrections for each observable are applied independently and sequentially to the jet four-momentum for jets with \(|\eta _{\text {det}} |<3.5\) (unless stated otherwise). No improvement in resolution was found from altering the sequence of the corrections.

The six stages of the GSC account for the dependence of the jet response on (in the order in which they are applied):

  • \(f_{\text {charged}}\), the fraction of the jet \(p_{{\text {T}}}\) measured from ghost-associated tracks with \(p_{{\text {T}}} >500\) \({\text {Me}}{\text {V}}\) (\(|\eta _{\text {det}} |<2.5\));

  • \(f_{\text {Tile}0}\), the fraction of jet energy measured in the first layer of the hadronic Tile calorimeter (\(|\eta _{\text {det}} |<1.7\));

  • \(f_{\text {LAr}3}\), the fraction of jet energy measured in the third layer of the electromagnetic LAr calorimeter (\(|\eta _{\text {det}} |<3.5\));

  • \(n_{\text {trk}}\), the number of tracks with \(p_{{\text {T}}} >1\) \({\text {Ge}}{\text {V}}\) ghost-associated with the jet (\(|\eta _{\text {det}} |<2.5\));

  • \(w _{\text {trk}}\), also known as track width, the average \(p_{{\text {T}}}\)-weighted transverse distance in the \(\eta \)–\(\phi \) plane between the jet axis and all tracks of \(p_{{\text {T}}} >1\) \({\text {Ge}}{\text {V}}\) ghost-associated with the jet (\(|\eta _{\text {det}} |<2.5\));

  • \(n_{\text {segments}}\), the number of muon track segments ghost-associated with the jet (\(|\eta _{\text {det}} |<2.7\)).

The first correction is only applied to PFlow jets. The \(n_{\text {segments}}\) correction, also known as the punch-through correction, reduces the tails of the response distribution caused by high-\(p_{{\text {T}}}\) jets that are not fully contained in the calorimeter. All corrections are derived as a function of jet \(p_{{\text {T}}}\), except for the punch-through correction, which is derived as a function of jet energy since this effect is more correlated with the energy escaping the calorimeters.
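A single GSC step can be illustrated with the following simplified sketch for one (\(p_{{\text {T}}} ^{\text {true}}\), \(|\eta _{\text {det}} |\)) bin. It is not the ATLAS implementation: the function name, the equal-statistics binning, the linear interpolation between bins, and the explicit rescaling that preserves the average response are all assumptions chosen for illustration, and the numerical inversion step is omitted.

```python
import numpy as np

def derive_gsc_step(x, response, n_bins=10):
    """Derive a multiplicative correction as a function of one observable x.

    x        : observable value for each jet (e.g. a track width)
    response : pT_reco / pT_true for each jet
    Returns a callable giving the correction factor for new values of x.
    """
    x = np.asarray(x, dtype=float)
    response = np.asarray(response, dtype=float)
    # Equal-statistics bins in the observable (assumes few repeated values)
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    centres = np.array([x[idx == b].mean() for b in range(n_bins)])
    mean_resp = np.array([response[idx == b].mean() for b in range(n_bins)])

    def raw_correction(x_new):
        # Inverse of the average response at this value of the observable
        return 1.0 / np.interp(x_new, centres, mean_resp)

    # Rescale so that the average response of the sample is unchanged
    scale = response.mean() / (response * raw_correction(x)).mean()
    return lambda x_new: scale * raw_correction(x_new)
```

Applying such steps one after another, each derived on jets already corrected by the preceding steps, removes the dependence on each observable in turn while narrowing the response distribution.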

The underlying distributions of these observables are shown for PFlow jets in MC simulation and bins of equal statistics in Fig. 7. Each observable has been studied in data and simulation and is found to be well modelled [6, 7, 33]. The spike at zero in the \(f_{\text {Tile}0}\) distribution at low \(p_{{\text {T}}} ^{\text {true}}\), shown in Fig. 7b, corresponds to jets that are fully contained in the electromagnetic calorimeter and do not deposit energy in the Tile calorimeter. The tail towards negative values in the \(f_{\text {Tile}0}\) and \(f_{\text {LAr}3}\) distributions at low \(p_{{\text {T}}} ^{\text {true}}\), shown in Fig. 7b, c, respectively, reflects calorimeter noise fluctuations. Slight differences with respect to data have a negligible impact on the GSC since the dependence of the average jet response on the observables is well modelled in MC simulation, as observed by an in situ dijet tag-and-probe method described in Ref. [2]. In this method, the average \(p_{{\text {T}}}\) asymmetry between back-to-back jets is measured as a function of each observable.

The average jet \(p_{{\text {T}}}\) response for PFlow jets in MC simulation as a function of each of the GSC observables is shown in Fig. 7 for representative \(p_{{\text {T}}} ^{\text {true}}\) ranges. The dependence of the jet response on each observable is reduced to less than 2% after the full GSC is applied, with small deviations from unity reflecting the correlations between observables that are unaccounted for in the corrections.

The fractional jet resolution, defined as \(\sigma _{{\mathcal {R}}}/{\mathcal {R}}\), is used to determine the size of the fluctuations in the jet energy reconstruction and is shown for PFlow jets with \(0.2<|\eta _{\text {det}} |<0.3\) in MC simulation in Fig. 8. As more corrections are applied, the fractional jet resolution improves and the jet response dependence on the jet flavour is reduced. No improvement is observed in Fig. 8 from the punch-through correction since only a small fraction of jets receive this correction; however, some analyses target regions of phase space in which a large fraction of jets receive it [39, 40].

5.2 In situ jet calibrations

Once jets are corrected to the particle level using the MCJES and GSC, they require one final calibration step to account for differences between the jet response in data and simulation. These differences are caused by imperfect simulation of both the detector materials and the physics processes involved: the hard scatter and underlying event, jet formation, pile-up, and particle interactions with the detector. The final in situ calibration measures the jet response in data and MC simulation separately and uses the ratio as an additional correction in data.

Jet response is calculated by balancing the \(p_{{\text {T}}}\) of a jet against that of a well-calibrated reference object or system. The response \({\mathcal {R}}_{{in~situ}}\) is defined as the average ratio of the jet \(p_{{\text {T}}}\) to the reference object \(p_{{\text {T}}}\) in bins of reference object \(p_{{\text {T}}}\), where that average is taken from the peak location found by fitting the distribution with a Gaussian function. \({\mathcal {R}}_{{in~situ}}\) is sensitive to effects such as the presence of additional radiative jets or the transition of energy into or out of the jet cone, although these effects can be mitigated through careful event selection (see Footnote 3). A better method is to form the double ratio from the response in data and MC simulation:

$$\begin{aligned} c = \frac{{{\mathcal {R}}}^{\text {data}}_{{in~situ}}}{{{\mathcal {R}}}^{\text {MC}}_{{in~situ}}}\,, \end{aligned}$$

which is robust to secondary effects so long as they are well modelled in simulation and is therefore a reliable measure of the jet energy scale difference between data and MC simulation. The double ratio c is transformed via numerical inversion from a function of reference object \(p_{{\text {T}}}\) to a function of jet \(p_{{\text {T}}}\) (and jet \(\eta \) where applicable). This is the final in situ calibration.
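The construction of the double ratio and its translation to a function of jet \(p_{{\text {T}}}\) can be sketched as below. The function name, the use of the average jet \(p_{{\text {T}}}\) in data as the new abscissa, and the linear interpolation between bins are assumptions of this illustration rather than details given in the text.

```python
import numpy as np

def in_situ_correction(pt_ref, r_data, r_mc):
    """Return the double ratio c as a callable of jet pT.

    pt_ref : bin centres in reference-object pT
    r_data : in situ response measured in data in each bin
    r_mc   : in situ response measured in MC simulation in each bin
    """
    pt_ref = np.asarray(pt_ref, dtype=float)
    r_data = np.asarray(r_data, dtype=float)
    r_mc = np.asarray(r_mc, dtype=float)
    c = r_data / r_mc            # double ratio in each reference-pT bin
    pt_jet = r_data * pt_ref     # assumed: average jet pT in data per bin
    order = np.argsort(pt_jet)
    # Outside the measured range np.interp keeps the end values constant
    return lambda pt: np.interp(pt, pt_jet[order], c[order])
```

Effects that shift the response equally in data and simulation cancel in `c`, which is why the ratio is less sensitive to the event topology than either response alone.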

Fig. 8 Resolution of jets at the PFlow+JES scale with \(0.2< |\eta _{\text {det}} | < 0.3\) measured in Pythia 8 dijet MC simulation after each stage of the global sequential calibration (GSC). All jet flavours, including \(b\)-jets, are considered. The lower panel shows the difference in quadrature between the resolution before any GSC correction is applied (\(\sigma \)) and after the corresponding GSC step is applied (\(\sigma '\))

There are three stages of in situ analyses. First, the \(\eta \) intercalibration analysis corrects the energy scale of forward (\(0.8 \le |\eta _{\text {det}} | < 4.5\)) jets to match those of central (\(|\eta _{\text {det}} | < 0.8\)) jets using the \(p_{{\text {T}}}\) balance in dijet events. Second, the \(Z\text {+jet}\) and \(\gamma \text {+jet}\) analyses balance the hadronic recoil in an event against the \(p_{{\text {T}}}\) of a calibrated Z boson or photon. The missing-\(E_{\text {T}}\) projection fraction (MPF) method uses the full hadronic recoil instead of a jet to compute the balance to help mitigate effects of pile-up and jet reconstruction threshold which otherwise make low-\(p_{{\text {T}}}\) measurements challenging [41]. Finally, the multijet balance (MJB) analysis uses a system of well-calibrated low-\(p_{{\text {T}}}\) jets to calibrate a single high-\(p_{{\text {T}}}\) jet [42]. The \(Z/\gamma \)+jet and MJB analyses are computed only for central jets, but are also applicable to forward jets due to the effect of the \(\eta \) intercalibration. Each measurement is translated from a function of reference object \(p_{{\text {T}}}\) into jet \(p_{{\text {T}}}\). A statistical combination of the \(Z/\gamma \)+jet and MJB analyses provides a single smooth calibration applicable across the full momentum range.

Since the three in situ analyses (\(\eta \) intercalibration, \(Z/\gamma \)+jet MPF, and MJB) are performed sequentially, systematic uncertainties are propagated from each to the next. Within each analysis, systematic uncertainties arise from three sources: modelling of physics processes in simulation, uncertainties in the measurement of the reference object, and uncertainties in the expected \(p_{{\text {T}}}\) balance due to the event’s topology. Mis-modelling is accounted for by comparing the predictions of two MC generators and taking their difference as the uncertainty. Systematic uncertainties in the measurement of the reference object are taken from the \(\pm 1 \sigma \) uncertainties in each object’s calibration and propagated through the analysis. Event topology uncertainties are estimated by varying the event selections used and observing the impact on the final MC simulation to data ratio.

A rebinning procedure is applied to each systematic uncertainty to ensure that the features represented in the final result are statistically significant and not the result of fluctuations in small numbers of simulated or data events. This is only performed where the response does not vary sharply with \(p_{{\text {T}}}\), ensuring it does not obscure real physics effects. The rebinning procedure follows a bootstrapping method: pseudo-experiment datasets are created by sampling from a Poisson distribution with a mean of one for each event in the data or MC simulation [43]. The pseudo-experiments are therefore statistically correlated yet unique, and the root mean square of the response distribution across the pseudo-experiments provides a measure of the statistical uncertainty of the analysis. The measured result for each systematic uncertainty is then rebinned as appropriate for each analysis to ensure that the final shapes are statistically significant.
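The bootstrap step can be illustrated with a minimal sketch. For simplicity the statistic here is the weighted mean of a per-event quantity rather than the fitted response used in the analyses, and the spread is taken as the standard deviation across replicas; the function name, the number of replicas, and the seed are assumptions of the example.

```python
import numpy as np

def bootstrap_stat_uncertainty(values, n_replicas=200, seed=0):
    """Estimate the statistical uncertainty of the mean of `values`.

    Each pseudo-experiment reweights every event by an integer drawn
    from a Poisson distribution with a mean of one.
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    means = np.empty(n_replicas)
    for k in range(n_replicas):
        weights = rng.poisson(1.0, size=values.size)  # one weight per event
        means[k] = np.average(values, weights=weights)
    # Spread of the statistic across the pseudo-experiments
    return means.std()
```

Because every replica is built from the same events, a systematic variation evaluated on the same set of weights is correlated with the nominal result, which is what allows the significance of a small shift to be judged.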

The \(Z/\gamma \)+jet and MJB calibrations and uncertainties are derived from the full 2015–2017 combined datasets with a total luminosity of 80 \({\hbox {fb}}^{-1}\). The \(\eta \) intercalibration analysis uses a dataset of total size 81 \({\hbox {fb}}^{-1}\), but since this analysis is more sensitive than the others to year-by-year fluctuations, the dataset is split into two blocks and a time-dependent result is computed instead. One \(\eta \) intercalibration is derived from and applies to the \(2015+2016\) dataset while a second independent calibration is derived from the 2017 dataset and applies to \(2017+2018\) data. These two data periods are treated separately due to a change in LAr calorimeter read-out that occurred between 2016 and 2017 data taking and affected jet reconstruction in the endcap regions. With no changes of similar scale made between 2017 and 2018 data taking, the 2017 calibration can be reasonably applied to 2018 as well. The post-calibration jet performance is consistent between these two different data periods and therefore a single set of uncertainties based on the \(2015+2016\) dataset is used for the \(\eta \) intercalibration in all years, with only a small localized additional uncertainty added for 2018 as described in Sect. 5.2.1.

Certain common selection criteria are applied to all three in situ analyses. Each event must have a reconstructed vertex with at least two associated tracks of \(p_{{\text {T}}} >500~{\text {Me}}{\text {V}}\). All jets must satisfy quality criteria to reject non-collision background, calorimeter noise, and cosmic rays [44]. Furthermore, each jet with \(20~{\text {Ge}}{\text {V}}< p_{{\text {T}}} < 60~{\text {Ge}}{\text {V}}\) and \(|\eta _{\text {det}} | < 2.4\) must pass jet vertex tagging, or JVT, requirements with selection criteria that are specific to the jet definition [45]. These requirements match jets to the primary vertex and are 92% efficient for EMtopo jets and 97% efficient for PFlow jets.

5.2.1 Relative calibration measurement in \(\eta \) using dijet events

The \(\eta \) intercalibration analysis produces a correction which is applied to forward jets (\(0.8 \le |\eta _{\text {det}} | < 4.5\)) to bring them to the same energy scale as central jets (\(|\eta _{\text {det}} | < 0.8\)). Jets in the central region of the detector are taken to be well-calibrated, while jets in the forward regions vary in response and must be corrected accordingly. Events are selected with exactly two jets in different \(\eta \) regions of the detector. To maximize statistics, neither jet need be in the central region: instead, all regions will be calibrated relative to one another.

For these dijet events, momentum balance requires that the transverse momenta of the two jets be equal in magnitude and opposite in direction. Therefore, the momentum asymmetry of the two jets is a metric for the response difference between the two detector regions (left and right for simplicity):

$$\begin{aligned} {\mathcal {A}} = \frac{p_{{\text {T}}} ^{\text {left}}-p_{{\text {T}}} ^{\text {right}}}{p_{{\text {T}}} ^{\text {avg}}}\,, \end{aligned}$$

where \(p_{{\text {T}}} ^{\text {avg}} = (p_{{\text {T}}} ^{\text {left}}+p_{{\text {T}}} ^{\text {right}})/2\). The response ratio \({\mathcal {R}}\) of the two jets, which fixes the ratio of the calibration factors c of the two detector regions, is then:

$$\begin{aligned} {\mathcal {R}} = \frac{c^{\text {right}}}{c^{\text {left}}} = \frac{2 + \langle {\mathcal {A}} \rangle }{2 - \langle {\mathcal {A}} \rangle } \cong \frac{p_{{\text {T}}} ^{\text {left}}}{p_{{\text {T}}} ^{\text {right}}}\,. \end{aligned}$$
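As a numerical illustration of these two definitions, the following sketch computes the asymmetry per event and converts its average into a response ratio; the function name and the array inputs are assumptions of the example.

```python
import numpy as np

def response_ratio(pt_left, pt_right):
    """Response ratio from the average dijet pT asymmetry."""
    pt_left = np.asarray(pt_left, dtype=float)
    pt_right = np.asarray(pt_right, dtype=float)
    pt_avg = 0.5 * (pt_left + pt_right)
    asymmetry = (pt_left - pt_right) / pt_avg   # A for each event
    mean_a = asymmetry.mean()                   # <A> over the events in a bin
    return (2.0 + mean_a) / (2.0 - mean_a)
```

For a single event the expression reduces exactly to the ratio of the two transverse momenta; for a sample of events it is only approximately equal to the ratio of the averages.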

The average response ratio \(\langle {\mathcal {R}} _{ijx} \rangle \) is measured in each bin i of \(\eta ^{\text {left}}\), j of \(\eta ^{\text {right}}\), and x of \(p_{{\text {T}}} ^{\text {avg}} \); \(\Delta \langle {\mathcal {R}} _{ijx} \rangle \) is the statistical uncertainty in each bin. All \(\eta \) values are in detector coordinates (\(\eta _{\text {det}}\)) rather than corrected jet coordinates, since the properties of interest correlate with specific regions of detector hardware. The following function relates the correction factors and responses in each of the N bins:

$$\begin{aligned} S(c_{1x},\ldots ,c_{Nx})= & {} \sum _{j=2}^{N}\sum _{i=1}^{j-1} \left( \frac{1}{\Delta \langle {\mathcal {R}} _{ijx} \rangle } (c_{ix}\langle {\mathcal {R}} _{ijx} \rangle - c_{jx}) \right) ^{2} \\&+ X(c_{ix})\,. \end{aligned}$$

Here, the function \(X(c_{ix})\) quadratically imposes a penalty on correction factors deviating from 1 (see Footnote 4). Minimizing this function produces the correction factors to be used in the calibration.

Previous iterations of the jet energy scale have used a fit in Minuit to minimize \(S(c_{ix})\). The current calibration instead minimizes the function analytically. Suppressing the x indices for clarity and setting the derivative of S with respect to some correction factor \(c_\alpha \) equal to zero, the following equation defines the correction factor values which minimize S:

$$\begin{aligned}&\sum _{i=1}^{\alpha -1}\left( \Big (\frac{-\langle {\mathcal {R}} _{i\alpha } \rangle }{\Delta ^2 \langle {\mathcal {R}} _{i\alpha } \rangle } + \frac{\lambda }{N^2}\Big ) c_i \right) + \left( \sum _{i=1}^{\alpha -1}\frac{1}{\Delta ^2 \langle {\mathcal {R}} _{i\alpha } \rangle } + \sum _{i=\alpha +1}^{N}\frac{\langle {\mathcal {R}} _{\alpha i} \rangle ^2}{\Delta ^2 \langle {\mathcal {R}} _{\alpha i} \rangle } + \frac{\lambda }{N^2} \right) c_{\alpha } \\&\qquad + \sum _{i=\alpha +1}^{N} \left( \Big ( \frac{- \langle {\mathcal {R}} _{\alpha i} \rangle }{\Delta ^2 \langle {\mathcal {R}} _{\alpha i} \rangle } + \frac{\lambda }{N^2} \Big ) c_i \right) - \frac{\lambda }{N} = 0~. \end{aligned} \tag{2}$$

Here \(\lambda \) is a Lagrange multiplier arising from the penalty term. Its value has no effect on the minimization result, but it prevents the trivial solution in which all the \(c_i\) are zero.

Equation (2) can then be expressed as a matrix system of linear equations. This matrix system is solved independently for each \(p_{{\text {T}}} ^{\text {avg}} \) bin x to obtain values for the correction factors \(c_{ix}\) for each \(\eta _{\text {det}}\) bin i in this momentum range. Solving analytically for the \(c_{ix}\) in this way allows the result to be found approximately a thousand times more quickly than using a fit. This large reduction in computational requirements in turn allows the analysis to use a finer binning in \(\eta _{\text {det}}\), capturing more details of the detector structure. The two methods agree well and each shows good closure when tested in simulation. Finally, the full set of correction factors is normalized such that the average correction factor in the central region \(|\eta _{\text {det}} | < 0.8\) is unity.
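The linear system defined by Eq. (2) can be assembled and solved as in the sketch below for a single \(p_{{\text {T}}} ^{\text {avg}} \) bin. The array layout (upper-triangular `R[i, j]` and `dR[i, j]` for \(i<j\)), the function names, and the default value of `lam` are assumptions of this example. The \(\lambda \) terms follow Eq. (2) and correspond to a penalty of the form \(\lambda \,(N^{-1}\sum _i c_i - 1)^2\), which is inferred here from that equation rather than quoted from the text.

```python
import numpy as np

def solve_intercalibration(R, dR, lam=1.0):
    """Solve Eq. (2) for the correction factors c_1..c_N.

    R[i, j]  : average response ratio for eta bins i (left) and j (right), i < j
    dR[i, j] : statistical uncertainty of R[i, j]
    """
    n = R.shape[0]
    M = np.full((n, n), lam / n**2)   # penalty contribution to every entry
    b = np.full(n, lam / n)           # constant term moved to the right-hand side
    for i in range(n):
        for j in range(i + 1, n):
            w = 1.0 / dR[i, j] ** 2
            M[i, i] += w * R[i, j] ** 2   # bin i enters as the "left" jet
            M[j, j] += w                  # bin j enters as the "right" jet
            M[i, j] -= w * R[i, j]
            M[j, i] -= w * R[i, j]
    return np.linalg.solve(M, b)

def normalize_central(c, central_mask):
    """Rescale so that the mean correction in the central bins is unity."""
    return c / c[central_mask].mean()
```

A direct linear solve of this kind replaces an iterative fit over N parameters, which is consistent with the large speed-up described above.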

Events are selected using a combination of single-jet triggers, with each trigger only considered in the jet \(p_{{\text {T}}}\) range for which it is at least 99% efficient [15, 46]. Events may pass either a central jet trigger or a forward jet trigger, or both. In the case that a trigger is prescaled, the passing event is weighted by the appropriate amount. Jets with \(|\eta _{\text {det}} |<2.4\) are also required to satisfy JVT criteria to minimize contributions from pile-up and must pass basic cleaning requirements [38, 44]. Each selected event must have two jets with \(p_{{\text {T}}} > 25\) \({\text {Ge}}{\text {V}}\) and \(|\eta | < 4.5\). To ensure a clean dijet topology, events are further required to have no third jet with significant \(p_{{\text {T}}}\): \(p_{{\text {T}}} ^{\text {third}}/p_{{\text {T}}} ^{\text {avg}} <0.25\), where \(p_{{\text {T}}} ^{\text {avg}} \) is the average momentum of the two leading jets. The two leading jets are required to be back-to-back in the azimuthal plane such that \(\Delta \phi > 2.5\) rad.
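The kinematic and topological requirements of this selection can be summarized in a short sketch. Trigger, prescale weighting, JVT, and jet cleaning are deliberately omitted, and the function name and the `(pt, eta, phi)` tuple layout are assumptions made for illustration.

```python
import math

def passes_dijet_selection(jets):
    """jets: list of (pt [GeV], eta, phi) tuples in any order."""
    jets = sorted(jets, key=lambda j: j[0], reverse=True)
    if len(jets) < 2:
        return False
    (pt1, eta1, phi1), (pt2, eta2, phi2) = jets[0], jets[1]
    # Two leading jets with pT > 25 GeV and |eta| < 4.5
    if min(pt1, pt2) <= 25.0 or max(abs(eta1), abs(eta2)) >= 4.5:
        return False
    # Veto on a third jet carrying a significant fraction of pT_avg
    pt_avg = 0.5 * (pt1 + pt2)
    if len(jets) > 2 and jets[2][0] / pt_avg >= 0.25:
        return False
    # Leading jets back-to-back in azimuth
    dphi = abs((phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi)
    return dphi > 2.5
```

The same function with modified thresholds could be used to produce the selection variations from which topology uncertainties are assessed.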

Like the other in situ analyses, the goal of the \(\eta \) intercalibration is to correct for data–simulation differences, so the quantity of interest is the ratio of the measured calorimeter response in MC simulation to the response in data. The nominal calibration is derived by comparison with Powheg +Pythia 8 simulated events. The analysis binning in \(p_{{\text {T}}} ^{\text {avg}}\) and \(\eta _{\text {det}}\) is selected to balance the requirements of both sufficient statistics in sparse regions and resolution of narrow detector features. As such, it varies for different values of \(\eta _{\text {det}}\). Remaining statistical fluctuations in the final calibration are smoothed using a two-dimensional Gaussian kernel with parameters selected to preserve significant structures.

Fig. 9 Relative response of jets calibrated with PFlow+JES in data (black circles) and Powheg +Pythia 8 MC simulation (red squares). Response is shown as a function of \(\eta _{\text {det}}\) for jets of (a) \(40~{\text {Ge}}{\text {V}}< p_{{\text {T}}} ^{\text {jet}} < 60~{\text {Ge}}{\text {V}}\), (b) \(85~{\text {Ge}}{\text {V}}< p_{{\text {T}}} ^{\text {jet}} < 115~{\text {Ge}}{\text {V}}\), and (c) \(270~{\text {Ge}}{\text {V}}< p_{{\text {T}}} ^{\text {jet}} < 330~{\text {Ge}}{\text {V}}\), and as a function of \(p_{{\text {T}}}\) for jets of (d) \(1.2< \eta _{\text {det}} < 1.4\), (e) \(2.6< \eta _{\text {det}} < 2.8\), and (f) \(3.0< \eta _{\text {det}} < 3.2\). The lower panel shows the response ratio of simulation to data (red squares) as well as the smoothed in situ calibration factor derived from the ratio (solid curve) which is used to perform the \(\eta \) intercalibration. Dotted lines show the extrapolation of the in situ calibration to the regions without data points. The dashed red and blue horizontal lines provide reference points for the viewer

Figure 9 shows the measured response in data and Powheg +Pythia 8 MC simulation for the 2017 dataset as a function of \(\eta _{\text {det}}\) for three different \(p_{{\text {T}}} ^{\text {avg}}\) ranges (Fig. 9a–c) and as a function of \(p_{{\text {T}}} ^{\text {avg}}\) for three different \(\eta _{\text {det}}\) ranges (Fig. 9d–f). The simulation can be seen to approximately reproduce the \(\eta _{\text {det}}\)-dependent features of the response observed in data, although the response in data is consistently higher than the response in simulation. The simulation/data response ratio as directly measured is shown in discrete points in the bottom panel, while the calibration derived from smoothing the response ratio is overlaid as the solid curve. The dashed curve shows the extrapolation to \(p_{{\text {T}}}\) ranges beyond the available data, taken from the Gaussian smoothing results. Since the smoothing is stronger in the \(p_{{\text {T}}}\) direction and weaker in \(\eta _{\text {det}} \) to preserve detector features, this sets each extrapolated value to approximately the value of the last populated bin at lower \(p_{{\text {T}}}\). Above \(p_{{\text {T}}} =2~{\text {Te}}{\text {V}}\) the value is kept constant.
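A two-dimensional Gaussian kernel smoothing of this kind can be sketched as a weighted average over the measured bins. The use of \(\log p_{{\text {T}}}\) as a coordinate, the inverse-variance weighting, the kernel widths, and the function name are assumptions of this illustration; the text states only that a two-dimensional Gaussian kernel is used and that the smoothing is stronger in \(p_{{\text {T}}}\) than in \(\eta _{\text {det}}\).

```python
import numpy as np

def smooth_2d(log_pt, eta, values, errors, query_log_pt, query_eta,
              sigma_log_pt=0.5, sigma_eta=0.1):
    """Kernel-smoothed value at one (log pT, eta) query point.

    log_pt, eta : coordinates of the measured bins
    values      : measured response ratio in each bin
    errors      : statistical uncertainty in each bin
    """
    log_pt, eta = np.asarray(log_pt, float), np.asarray(eta, float)
    values, errors = np.asarray(values, float), np.asarray(errors, float)
    # Wide kernel in log pT, narrow kernel in eta to keep detector features
    kernel = np.exp(-0.5 * ((log_pt - query_log_pt) / sigma_log_pt) ** 2
                    - 0.5 * ((eta - query_eta) / sigma_eta) ** 2)
    weights = kernel / errors**2
    return np.sum(weights * values) / np.sum(weights)
```

Evaluated well beyond the last populated bin, such a weighted average is dominated by the nearest measured points, which matches the qualitative behaviour of the extrapolation described above.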

Uncertainties are derived as a function of \(\eta _{\text {det}}\) and \(p_{{\text {T}}}\) and account for mis-modelling of physics, detector, and event topology effects on the momentum balance of dijet events. The dominant uncertainty is in MC mis-modelling and is taken to be the difference between the smoothed calibration curves derived from the Powheg +Pythia 8 and Sherpa dijet samples. Additional uncertainties in the physics and topology modelling are assessed by varying the \(p_{{\text {T}}} ^{\text {third}}\), \(\Delta \phi \), and pile-up suppression cuts and using a bootstrapping method to ensure observed shapes are statistically significant as discussed in Sect. 5.2. Similarly, the JVT uncertainty is determined by comparison with tighter and looser working points. These uncertainties can take positive or negative values. The statistical uncertainty is strictly positive and is taken from the data and MC simulation sample sizes. Finally, a non-closure uncertainty is assessed by comparing the response in data with that in Powheg +Pythia 8 after applying the derived \(\eta \) intercalibration. This uncertainty is largest for \(|\eta _{\text {det}} | \sim 2.1\)–2.6, where detector transitions make modelling of the LAr pulse shape particularly difficult [34], and for jets near the kinematic limit, where they have the maximum possible \(p_{{\text {T}}}\) for a given \(\eta _{\text {det}} \) subject to the constraint of a 13 \({\text {Te}}{\text {V}}\) centre-of-mass energy. The non-closure uncertainty is treated as three independent nuisance parameters, two covering the regions around \(\pm 2.4\) in \(\eta _{\text {det}} \) and one at the kinematic limit, since these sources of non-closure are uncorrelated.

After each is corrected with its dedicated calibration, the 2015+2016 and 2017 datasets are in good agreement, and therefore a single set of uncertainties is sufficient to cover both cases. The uncertainties calculated with the 2015+2016 dataset are selected for this role. The only dataset-dependent uncertainty is an additional small non-closure uncertainty used for 2018 data only. It covers the region around \(\eta = \pm 1.5\) to account for the difference in Tile calorimeter calibration during this year of data-taking and has a maximum size of 2%.

The method uncertainties are shown in Fig. 10. Three illustrative \(p_{{\text {T}}}\) values are selected. The uncertainties decrease slightly as a function of \(p_{{\text {T}}}\) and increase significantly as a function of \(\eta _{\text {det}} \) outside of the central detector region, while in the central region they are zero by construction. For practical use the various systematic uncertainty terms are summed in quadrature to produce one single systematic uncertainty dominated by the modelling term. In the cases where up and down variations differ, the larger absolute value of the two is used at each point. The total systematic uncertainty and the statistical uncertainty are both symmetrized in \(\eta _{\text {det}} \). The non-closure uncertainties, not included in Fig. 10 as they are not method uncertainties, are instead shown in Fig. 22 where it can be seen that they are kept asymmetric to reflect real differences in the detector.
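The combination described above can be illustrated as follows. The function names and the dictionary layout are hypothetical, and the symmetrization is implemented here as an average of the values at \(+\eta _{\text {det}}\) and \(-\eta _{\text {det}}\), which is an assumption since the text does not state how the symmetrization is performed.

```python
import numpy as np

def total_systematic(components):
    """Quadrature sum of systematic terms over eta bins.

    components : dict mapping a name to (up, down) variations per eta bin
    """
    total_sq = 0.0
    for up, down in components.values():
        up, down = np.asarray(up, float), np.asarray(down, float)
        # Where up and down differ, keep the larger absolute value
        larger = np.maximum(np.abs(up), np.abs(down))
        total_sq = total_sq + larger**2
    return np.sqrt(total_sq)

def symmetrize_in_eta(uncertainty):
    """Assumed scheme: average bins mirrored about eta = 0.

    Requires bins ordered in eta and symmetric about zero.
    """
    uncertainty = np.asarray(uncertainty, float)
    return 0.5 * (uncertainty + uncertainty[::-1])
```

With one term much larger than the others, as for the MC modelling component here, the quadrature sum stays close to that term.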

Fig. 10 Systematic uncertainties associated with the \(\eta \) intercalibration procedure as a function of \(\eta _{\text {det}} \) for PFlow+JES jets of (a) \(p_{{\text {T}}} =50~{\text {Ge}}{\text {V}}\), (b) \(p_{{\text {T}}} =100~{\text {Ge}}{\text {V}}\), and (c) \(p_{{\text {T}}} =300~{\text {Ge}}{\text {V}}\). The solid purple band shows the total systematic uncertainty, while the grey band shows the statistical uncertainty alone. Individual sources of uncertainty are marked by coloured lines. These have been smoothed to remove the impact of statistical fluctuations. Thus the visible shapes are statistically significant. The MC modelling term is the dominant source of uncertainty

The calibrations are similar in size and shape between PFlow and EMtopo jets. Systematic uncertainties are also similar in size and shape since the dominant MC modelling component does not differ meaningfully between the two jet collections.

5.2.2 Calibration measurement using \(Z\text {+jet}\) and \(\gamma \text {+jet}\) events

The next stage of the in situ calibration corrects for the differences between data and MC simulation using the momentum balance between the measured hadronic activity in the event and the \(p_{{\text {T}}}\) of a well-calibrated photon or Z boson. Only the central region of the detector (\(|\eta | < 0.8\)) is used for this analysis: the \(\eta \) intercalibration ensures that a correction derived centrally translates directly to forward jets as well.

The \(Z/\gamma \)+jet analyses rely on the energy scale of the photon or the electrons and muons from the Z decay being well measured. All three objects are cleanly measured in the ATLAS detector and the uncertainties in their energy scales are small [47, 48]. The response is calculated separately in \(Z\rightarrow e^{+}e^{-}\) and \(Z\rightarrow \mu ^{+}\mu ^{-}\) events since the sources of uncertainties propagated from e and \(\mu \) calibration are independent, and the three channels are combined at a later stage. The \(Z\text {+jet}\) response measurement is limited at moderate to high \(p_{{\text {T}}}\) by low statistics and thus covers a range in jet \(p_{{\text {T}}}\) from \(17~{\text {Ge}}{\text {V}}\) to \(1~{\text {Te}}{\text {V}}\) with large uncertainties in the final bin. The \(\gamma \text {+jet}\) response measurement benefits from much higher statistics and extends to 1.2 \({\text {Te}}{\text {V}}\) with little loss in sensitivity. However, it is limited at low jet \(p_{{\text {T}}}\) by both the trigger prescales and the prevalence of soft jets misidentified as photons and so begins at 25 \({\text {Ge}}{\text {V}}\).

The missing-\(E_{\text {T}}\) projection fraction technique is used for both of the \(Z/\gamma \)+jet analyses and balances the reference object \(p_{{\text {T}}}\) against the full hadronic recoil in an event. By doing so, it is possible to compute the calorimeter response to hadronic showers directly. This approach is robust to both pile-up and the underlying event, which each cancel out directionally on average over a large collection of events, and is not strongly affected by jet definitions since these become relevant only in the application of the calibration. The showering and topology effects in moving from a recoil-level quantity to a jet-level quantity are studied and found to be small, as discussed below. Taking \(\vec {p}_{T}^{\,\text {recoil}}\) as the total transverse momentum of the hadronic activity in a clean \(Z/\gamma \)+jet event and \(p_{{\text {T}}} ^{\text {ref}}\) as the transverse momentum of the photon or Z boson, conservation of transverse momentum means that at the particle level:

$$\begin{aligned} \vec {p}_{\text {T,truth}}^{\,\text {ref}} + \vec {p}_{\text {T,truth}}^{\,\text {recoil}} = 0\,. \end{aligned}$$
(3)

This balance could be altered by the presence of initial- or final-state radiation (see footnote 5). To suppress the effects of such additional radiation, a cut is placed on the azimuthal angle \(\Delta \phi \) between the jet and the reconstructed photon or Z boson in the event and an uncertainty due to the topology is evaluated by varying the event selection requirements. If the calorimeter response to the hadronic activity in this event is \(r_{\text {MPF}}\) and the response for the calibrated reference object is 1, and assuming any missing energy in the event is due to the low response to the hadronic recoil (\(r_{\text {MPF}} < 1\)), then at the detector level Eq. (3) becomes:

$$\begin{aligned} \vec {p}_{\text {T}}^{\,\text {ref}} + r_{\text {MPF}}\,\, \vec {p}_{\text {T,truth}}^{\,\text {recoil}} = - \vec {E}_{\text {T}}^{\,\text {miss}} \end{aligned}$$

After taking the projection of each term in the direction of the reference object, defined by a unit vector \({\hat{n}}_{\text {ref}}\), the response to the hadronic recoil is then seen to depend only on the missing energy in the event and the momentum of the reference object. The MPF response \({\mathcal {R}} _{\text {MPF}}\) is defined by measuring the average of \(r_{\text {MPF}}\) across events binned in the reference object \(p_{{\text {T}}}\). Thus,

$$\begin{aligned} {\mathcal {R}} _{\text {MPF}} = \left\langle 1 + \frac{{\hat{n}}_{\text {ref}} \cdot \vec {E}_{\text {T}}^{\,\text {miss}}}{p_{{\text {T}}} ^{\text {ref}}} \right\rangle \,. \end{aligned}$$

The peak location of the \({\mathcal {R}} _{\text {MPF}}\) distribution in each bin of reference \(p_{{\text {T}}}\) is taken to be the average response in that bin, and the response is mapped from reference to jet \(p_{{\text {T}}}\) by finding the average jet \(p_{{\text {T}}}\) in the events in each bin after \(\eta \) intercalibration but before the application of any other in situ steps.
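The projection step can be illustrated with a short sketch that evaluates the per-event quantity \(1 + {\hat{n}}_{\text {ref}} \cdot \vec {E}_{\text {T}}^{\,\text {miss}} / p_{{\text {T}}} ^{\text {ref}}\) and averages it in bins of reference \(p_{{\text {T}}}\). The function names, the event format, and the toy numbers are assumptions made for illustration. As a simplification, a plain arithmetic mean is used in each bin rather than any peak-finding procedure.

```python
import math

def r_mpf(ref_px, ref_py, met_x, met_y):
    """Per-event 1 + (n_ref . E_T^miss) / pT_ref from transverse components."""
    pt_ref = math.hypot(ref_px, ref_py)
    n_dot_met = (ref_px * met_x + ref_py * met_y) / pt_ref
    return 1.0 + n_dot_met / pt_ref

def mean_response_by_bin(events, bin_edges):
    """Arithmetic mean of r_mpf in bins of reference pT.

    events: iterable of (ref_px, ref_py, met_x, met_y) tuples (assumed format).
    Returns one value per bin, or None for an empty bin.
    """
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for ref_px, ref_py, met_x, met_y in events:
        pt_ref = math.hypot(ref_px, ref_py)
        for i in range(n_bins):
            if bin_edges[i] <= pt_ref < bin_edges[i + 1]:
                sums[i] += r_mpf(ref_px, ref_py, met_x, met_y)
                counts[i] += 1
                break
    return [s / c if c else None for s, c in zip(sums, counts)]

# Toy event: reference along +x with pT = 100, recoil measured with response 0.7,
# so the measured recoil is -70 and the missing transverse momentum is -30 along x.
toy = r_mpf(100.0, 0.0, -30.0, 0.0)
```

In this toy configuration the function recovers the response that was put in, which is the content of the projection argument in the text.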

Missing energy in each event is reconstructed from calorimeter topo-clusters in the case of EMtopo jet calibration and from particle-flow objects in the case of PFlow jet calibration, ensuring that the energy scale is consistent. The \(Z\rightarrow ee\) events are required to pass a dielectron trigger with \(p_{{\text {T}}} ^{e1,e2} > 15~{\text {Ge}}{\text {V}}\); \(Z\rightarrow \mu \mu \) events must pass a similar dimuon trigger with \(p_{{\text {T}}} ^{\mu 1,\mu 2} > 14~{\text {Ge}}{\text {V}}\) [49, 50]. Electrons entering the analysis must have \(p_{{\text {T}}} > 20~{\text {Ge}}{\text {V}}\), ensuring that the trigger is fully efficient, must be contained within the tracker such that \(|\eta _{e}| < 2.47\), and must not fall in the calorimeter transition region (\(1.37< |\eta | < 1.52\)). Muons entering the analysis are required to have \(p_{{\text {T}}} > 20~{\text {Ge}}{\text {V}}\) and to fall within \(|\eta | < 2.4\). Both electron and muon candidates must also pass loose identification and isolation requirements [47, 48]. All \(Z\text {+jet}\) events are selected such that the reconstructed mass calculated from the electron or muon pair is close to the Z boson mass: \(66~{\text {Ge}}{\text {V}}< m_{ee/\mu \mu } < 116~{\text {Ge}}{\text {V}}\). A combination of single-photon triggers is used for the \(\gamma \text {+jet}\) analysis, with the lowest trigger threshold corresponding to \(E_{{\text {T}}} ^{\gamma } > 15~{\text {Ge}}{\text {V}}\). Offline photons must have \(E_{{\text {T}}} ^{\gamma } > 25~{\text {Ge}}{\text {V}}\) and \(|\eta ^{\gamma }| < 1.37\) and must satisfy tight identification and isolation criteria [47].

Both the \(Z\text {+jet}\) and \(\gamma \text {+jet}\) analyses have further selection requirements on the jets and event topology to suppress pile-up and initial- and final-state radiation. All jets within \(\Delta R = 0.2\) of a photon or \(\Delta R = 0.35\) of a lepton are removed. Jets must satisfy basic cleaning requirements and pass the JVT selection to suppress pile-up. Selected events must have one jet with \(p_{{\text {T}}} > 10~{\text {Ge}}{\text {V}}\) and \(|\eta | < 0.8\). Additional event activity is suppressed by requiring that any subleading jet must have \(p_{{\text {T}}} < \text {max}(0.3 \times p_{{\text {T}}} ^{\text {ref}},12)~{\text {Ge}}{\text {V}}\) and that the leading jet and reference object must be relatively back-to-back with \(\Delta \phi ^{\text {ref, jet}} > 2.9\). The relatively loose \(p_{{\text {T}}}\) cut on subleading jets is shown to be acceptable for the MPF analysis due to its intrinsic robustness to pile-up effects.
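The topology requirements above can be summarised in a small sketch. The function names and the jet format are hypothetical, and two assumptions are made that the text does not state explicitly: the jets passed in have already had overlap removal, cleaning, and the JVT selection applied, and the jet with \(p_{{\text {T}}} > 10~{\text {Ge}}{\text {V}}\) and \(|\eta | < 0.8\) is taken to be the leading jet.

```python
import math

def delta_phi(phi1, phi2):
    """Absolute azimuthal separation folded into [0, pi]."""
    d = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - d if d > math.pi else d

def passes_topology(pt_ref, phi_ref, jets):
    """Check the Z/gamma+jet topology cuts quoted in the text.

    jets: list of (pt, eta, phi) in GeV and radians, sorted by descending pT
    (assumed format, after overlap removal, cleaning and JVT).
    """
    if not jets:
        return False
    lead_pt, lead_eta, lead_phi = jets[0]
    # Leading jet: pT > 10 GeV and |eta| < 0.8.
    if lead_pt <= 10.0 or abs(lead_eta) >= 0.8:
        return False
    # Leading jet and reference object back-to-back: delta phi > 2.9.
    if delta_phi(phi_ref, lead_phi) <= 2.9:
        return False
    # Subleading-jet veto: pT < max(0.3 * pT_ref, 12) GeV.
    veto = max(0.3 * pt_ref, 12.0)
    return all(pt < veto for pt, _, _ in jets[1:])
```

For a reference object of \(p_{{\text {T}}} ^{\text {ref}} = 30~{\text {Ge}}{\text {V}}\), the veto threshold in this sketch is set by the 12 \({\text {Ge}}{\text {V}}\) floor rather than by the relative term.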

Figures 11 and 12 show the MPF response calculated in \(Z\text {+jet}\) and \(\gamma \text {+jet}\) events for data and for two MC samples using different generators. The lower panels show the MC simulation to data ratio for both generators. The results using Powheg +Pythia 8 (\(Z\text {+jet}\)) and Pythia 8 (\(\gamma \text {+jet}\)) constitute the nominal calibration while Sherpa is used to define an uncertainty due to the generator choice. In the lowest \(p_{{\text {T}}}\) bin of the \(\gamma \text {+jet}\) measurement, the discrepancy between the MC predictions is caused by a generator-level cut at 35 \({\text {Ge}}{\text {V}}\) present in the Sherpa sample. This point is included in the final in situ combination, but due to its large generator uncertainty it contributes very little to the overall weighted-average-based result (see Sect. 5.2.5 and Fig. 19a) and the total effect is negligible. The \(\gamma \text {+jet}\) generator uncertainty at this point has therefore been set to its value in the second-lowest bin for display purposes in Fig. 14 to better reflect its actual contribution to the total systematic uncertainties. The apparent dip near the lowest \(p_{{\text {T}}}\) range of each measurement is due to the interplay of two factors: an asymmetry in the \({\mathcal {R}} _{\text {MPF}}\) distribution near the low \(p_{{\text {T}}}\) reconstruction threshold which causes the measured response to increase for the lowest \(p_{{\text {T}}}\) values, and the natural increase in response with higher jet \(p_{{\text {T}}}\). One motivation for the use of the MPF technique is increased resilience to this threshold effect.

Fig. 11

Average PFlow jet response as a function of reference \(p_{{\text {T}}}\) for \(Z\text {+jet}\) events where the Z boson decays into (a) electrons and (b) muons, calculated using the MPF technique. \(Z\rightarrow ee\) and \(Z \rightarrow \mu \mu \) events are combined at a later stage. The black points correspond to 2015–2017 data while the pink diamonds and blue triangles correspond to independent Monte Carlo samples from two different generators, and their error bars show the statistical uncertainties. The ratio of MC simulation to data for both generators is shown in the bottom panel and defines the in situ correction to be applied. The dotted lines at 1 and 1.05 serve as a reference

Fig. 12
Average PFlow jet response as a function of reference \(p_{{\text {T}}}\) for \(\gamma \text {+jet}\) events calculated using the MPF technique. The black points correspond to 2015–2017 data. The red and blue triangles correspond to independent Monte Carlo samples from two different generators. Error bars show the statistical uncertainties. The ratio of MC simulation to data for both generators is shown in the bottom panel and defines the in situ correction to be applied. The dotted lines at 1 and 1.05 serve as a reference

Two small correction factors are derived in simulation and use the true calorimeter response, defined as the ratio of measured energy in the calorimeter deposited by particles belonging to a particle-level jet to the total energy of the particle-level jet. The topology correction accounts for the differences in calorimeter response for sparse energy depositions versus those in the dense cores of jets, and is found by taking the average of the ratio of \({\mathcal {R}} _{\text {MPF}}\) to the true calorimeter response in each \(p_{{\text {T}}}\) bin. The showering correction accounts for the flow of particles entering or exiting across the boundaries of the jet definition and is calculated from the ratio of the true calorimeter response to the measured response of the reconstructed jet, therefore varying with the jet algorithm and size. The total correction factor is the product of the two and is found to be less than \(2\%\) for jets of \(p_{{\text {T}}} < 50~{\text {Ge}}{\text {V}}\) and negligible above that. This correction factor would in principle be applied identically to \({\mathcal {R}} _{\text {MPF}}\) in both data and simulation to better estimate jet response, but since the ratio of \({\mathcal {R}} _{\text {MPF}}\) in data and simulation is the quantity of interest for the in situ calibration, the correction would cancel out in the ratio and only the uncertainty in its derivation is relevant. This uncertainty is taken from a comparison of two different physics lists (FTFP BERT [29] and QGSP BIC [51]) in the simulation of the particle/detector interactions and is found to be \(\sim 2\%\) for jets with \(p_{{\text {T}}} < 20~{\text {Ge}}{\text {V}}\), \(\sim 0.5\%\) for jets with \(20~{\text {Ge}}{\text {V}}< p_{{\text {T}}} < 40~{\text {Ge}}{\text {V}}\) and zero for jets with \(p_{{\text {T}}} > 40~{\text {Ge}}{\text {V}}\).
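The cancellation can be made explicit in one line. The symbols \(k_{\text {topo}}\) and \(k_{\text {show}}\) are hypothetical labels, not used elsewhere in this paper, for the two simulation-derived factors, and it is assumed that their product would multiply \({\mathcal {R}} _{\text {MPF}}\) identically in data and simulation:

$$\begin{aligned} \frac{k_{\text {topo}}\, k_{\text {show}}\, {\mathcal {R}} _{\text {MPF}}^{\text {data}}}{k_{\text {topo}}\, k_{\text {show}}\, {\mathcal {R}} _{\text {MPF}}^{\text {MC}}} = \frac{{\mathcal {R}} _{\text {MPF}}^{\text {data}}}{{\mathcal {R}} _{\text {MPF}}^{\text {MC}}}\,. \end{aligned}$$

Under that assumption the value of the factor drops out of the data-to-simulation ratio, and only the precision with which it is known enters the calibration.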

Fig. 13

Systematic uncertainties for PFlow jets as a function of reference \(p_{{\text {T}}}\) for (a) \(Z\rightarrow ee+\)jet events and (b) \(Z\rightarrow \mu \mu +\)jet events calculated using the MPF technique. Uncertainties due to the JVT, second-jet veto, and \(\Delta \phi \) cuts derive from the analysis technique. Electron or muon (as appropriate) scale and resolution uncertainties are propagated through the analysis from the uncertainties associated with the individual objects. The statistical uncertainties come from the MC simulation/data ratio and reach a maximum value of 0.083 in (b) while the difference between the Pythia 8 and Sherpa samples defines the MC generator uncertainty. All uncertainties are smoothed to ensure that the visible fluctuations are statistically significant

The full set of uncertainties is shown for the \(Z \rightarrow ee + \text {jet}\) and \(Z \rightarrow \mu \mu + \text {jet}\) analyses in Fig. 13 and for the \(\gamma \text {+jet}\) analysis in Fig. 14. The dominant systematic uncertainties are due to generator differences at lower \(p_{{\text {T}}}\) and to the photon energy scale at higher \(p_{{\text {T}}}\). Uncertainties in the \(e,~\mu ,~\text {and}~\gamma \) energy scales and resolutions are taken from the calibrations provided for each physics object and are propagated through the analysis [47, 48]. The \(\Delta \phi \) and second-jet veto uncertainties are estimated by varying the cuts and comparing the resulting response measurements. As in the \(\eta \) intercalibration, the JVT uncertainty is determined by comparison with tighter and looser working points. A photon purity uncertainty is estimated for the \(\gamma \text {+jet}\) analysis using control regions dominated by dijet events where one of the jets can be misidentified as a photon. The uncertainty on the final state modelling is taken, as discussed, from the generator comparison. Limited data and MC statistics contribute to the statistical uncertainty, which is largest for the lowest and highest bins of each analysis. A bootstrapping procedure is applied to the uncertainties to suppress statistical fluctuations as previously described.

Fig. 14
Systematic uncertainties on PFlow jets as a function of reference \(p_{{\text {T}}}\) for \(\gamma \text {+jet}\) events calculated using the MPF technique. Uncertainties due to the JVT, second-jet veto, and \(\Delta \phi \) cuts derive from the analysis technique. Photon scale and resolution uncertainties are propagated through the analysis from the uncertainties associated with the individual objects. The statistical uncertainties come from the MC simulation/data ratio while the difference between the Pythia 8 and Sherpa samples defines the MC generator uncertainty. All uncertainties are smoothed to ensure that the visible fluctuations are statistically significant

Similar analyses in the \(Z/\gamma \)+jet final states but explicitly balancing the reference \(p_{{\text {T}}}\) against the \(p_{{\text {T}}}\) of a reconstructed jet (direct balance) are used to cross-check the jet energy scale calibration. The JES results computed using direct balance showed good agreement with those derived via MPF.

The innate difference in response between EMtopo and PFlow jets can be seen by comparing their measured MPF responses. Since the MPF method uses topo-clusters and PFlow objects in computing the missing energy, the measured responses are independent of the MCJES calibration and reflect the precalibration response for each jet input type. The MPF responses measured in the \(\gamma \text {+jet}\) analysis for EMtopo and PFlow jets are shown in Fig. 15. The shape of the EMtopo measurement follows the form of Groom’s function, which corresponds to the response expected from a hadronic calorimeter [52]. The PFlow measurement does not follow the same shape but instead shows an improvement over the baseline calorimeter response at low \(p_{{\text {T}}}\) thanks to the inclusion of information from tracks.

Fig. 15
Average jet response as a function of reference \(p_{{\text {T}}}\) for \(\gamma \text {+jet}\) events calculated using the MPF technique in 2015–2017 data. The solid points correspond to PFlow jets while the hollow points correspond to EMtopo jets. The ratio of PFlow response to EMtopo response is shown in the bottom panel

5.2.3 High-\(p_{{\text {T}}}\) jet calibration using multijet balance

The final stage of in situ calibration derives a correction for jets with \(p_{{\text {T}}}\) above the range of the \(Z/\gamma \)+jet analyses using the multijet balance (MJB) technique. Events are selected with a single high-\(p_{{\text {T}}}\) jet balanced against a system of lower-\(p_{{\text {T}}}\) jets (the recoil system). The jets of the recoil system are selected to ensure they are well calibrated using a combination of the \(Z/\gamma \)+jet results (Sect. 5.2.2), while the leading jet is left at the scale of the \(\eta \) intercalibration. The response of the system is defined as:

$$\begin{aligned} {\mathcal {R}}_{\text {MJB}} = \left\langle \frac{p_{{\text {T}}} ^{\text {lead}}}{p_{{\text {T}}} ^{\text {ref}}} \right\rangle \,, \end{aligned}$$

where \(p_{{\text {T}}} ^{\text {ref}}\) is taken from the vector sum of all jets in the recoil system. In a procedure parallel to that used for the \(Z/\gamma \)+jet analyses, the response is measured in bins of \(p_{{\text {T}}} ^{\text {ref}}\) and the correction is then mapped to the uncalibrated leading jet by finding the average \(p_{{\text {T}}} ^{\text {lead}}\) of the events in each bin.
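The definition of \({\mathcal {R}}_{\text {MJB}}\) can be sketched as follows. The function names and the event format are assumptions made for illustration, and the recoil jets passed in are assumed to be already calibrated and selected.

```python
import math

def recoil_pt(recoil_jets):
    """pT of the vector sum of the recoil jets, each given as (pt, phi)."""
    px = sum(pt * math.cos(phi) for pt, phi in recoil_jets)
    py = sum(pt * math.sin(phi) for pt, phi in recoil_jets)
    return math.hypot(px, py)

def mjb_response(events):
    """Mean of pT_lead / pT_ref over events.

    events: list of (lead_pt, recoil_jets) pairs (assumed format), which
    would in practice all belong to one bin of pT_ref.
    """
    ratios = [lead_pt / recoil_pt(recoil_jets) for lead_pt, recoil_jets in events]
    return sum(ratios) / len(ratios)
```

Because the reference is a vector sum, two recoil jets that are not collinear give a \(p_{{\text {T}}} ^{\text {ref}}\) smaller than the scalar sum of their transverse momenta.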

Since the MJB analysis can only include events where all jets of the recoil system can already be well-calibrated, events with very high \(p_{{\text {T}}} ^{\text {lead}}\) are often excluded as their second and third leading jets can have momenta outside the range of calibration by the \(Z/\gamma \)+jet analyses. To address this, MJB proceeds via two iterations. In the first iteration, a combination of the \(Z/\gamma \)+jet results is used to calibrate the recoil system, so only events with subleading jets of \(p_{{\text {T}}} < 1.2~{\text {Te}}{\text {V}}\) are included. In the second iteration, events with subleading jets up to \(p_{{\text {T}}} = 1.8~{\text {Te}}{\text {V}}\) are included and calibrated using the MJB results from the first iteration. This extends the range of the calibration to \(p_{{\text {T}}} ^{\text {lead}}=2.4~{\text {Te}}{\text {V}}\).

Events are selected for the MJB analysis using a variety of single-jet triggers with each corresponding to a unique range of \(p_{{\text {T}}} ^{\text {lead}}\). To suppress dijet topologies and ensure that only true multijet events are used, events must have at least three jets with \(p_{{\text {T}}} > 25~{\text {Ge}}{\text {V}}\) and \(|\eta | < 2.8\) and the subleading jet must not have a momentum above \(0.8p_{{\text {T}}} ^{\text {lead}}\). Jets are as usual required to pass JVT selections, limiting the effects of pile-up. Isolation of the leading jet from contamination by the recoil system is ensured by requiring that the azimuthal angle \(\Delta \phi \) between the leading jet and the direction of the recoil system is at least 0.3 radians and that the \(\Delta \phi \) between the leading jet and any individual jet in the recoil system with a \(p_{{\text {T}}} > 0.05 p_{{\text {T}}} ^{\text {lead}}\) is at least 1.0 radians.

The MJB response in data and in four MC samples with different generators is shown in Fig. 16a. In both data and MC simulation, the response decreases at lower \(p_{{\text {T}}}\) due to the intrinsic bias in \({\mathcal {R}}_{\text {MJB}}\) from the combined effects of the leading jet isolation and \(p_{{\text {T}}}\) asymmetry requirements. This bias is greater for lower-\(p_{{\text {T}}}\) leading jets, but is well modelled in simulation, leaving the calibration unbiased. The lower panel shows the ratio of the response of each MC sample to data. Here, the ratio of the Sherpa sample to data defines the nominal correction while the ratio based on Pythia defines an uncertainty on the generator choice. This response ratio is constant for jets above 1 \({\text {Te}}{\text {V}}\), where it corresponds to a correction of approximately 2%; below this point the calculated correction is slightly smaller.

Fig. 16

(a) Response for the leading PFlow jet in multijet events as a function of \(p_{{\text {T}}} ^{\text {ref}}\) and (b) the systematic uncertainties on the response. Subleading jets in the event are calibrated using the \(Z/\gamma \)+jet MPF corrections, while the leading jet is calibrated only up to the \(\eta \) intercalibration. The response is shown for data and for simulation using four different MC generators, and the MC simulation-to-data response ratios in the bottom panel correspond to the derived in situ calibration. The error bars show the statistical uncertainties. The nominal calibration is defined by the comparison with Sherpa; its difference from the Pythia result defines the ‘MC generator’ uncertainty in (b). This uncertainty is defined in a single-sided way by the measured response difference and therefore it is not symmetrised for display in (b) but instead its full one-sided value is shown. Other uncertainties come from the event selection and MC simulation/data statistics or are propagated from the \(Z\text {+jet}\), \(\gamma \text {+jet}\), flavour, pile-up, \(\eta \) intercalibration, and punch-through studies

All uncertainties in the MJB analysis are shown in Fig. 16b. The dominant term at low \(p_{{\text {T}}} ^{\text {lead}}\) is the uncertainty from jet flavour, derived in simulation and reflecting the difference in jet response for quark-initiated and gluon-initiated jets. Two terms contribute, one reflecting the uncertainty in the fraction of gluon-initiated jets in the sample, the other based on the difference in MC simulation-derived gluon response between generators. Other independently derived uncertainties correspond to pile-up and punch-through effects and are propagated through the MJB analysis via the recoil system. The \(Z\text {+jet}\), \(\gamma \text {+jet}\), and \(\eta \) intercalibration uncertainties are propagated from the previous stages of in situ analysis. Event selection uncertainties are determined by varying each of the analysis cuts and determining the effects on the measured response ratio. Finally, the MC generator uncertainty is derived as described above by comparing the response ratio of Sherpa with Pythia as an alternative. Results using Herwig and Powheg +Pythia are shown for reference but are not included in the uncertainty definition as they are less reliable for this measurement. All uncertainties are smoothed via the bootstrapping procedure to ensure statistical significance, and the total uncertainty is found to be below \(1.5\%\) for all considered values of \(p_{{\text {T}}} ^{\text {lead}}\). The MC generator uncertainty, which is defined in a one-sided fashion from the response ratios, is symmetrised by the in situ combination process along with the other uncertainties. However, its full one-sided size is shown in Fig. 16b for easier comparison with Fig. 16a.

For EMtopo jets the intrinsic bias at low \(p_{{\text {T}}}\) is slightly smaller and more closely tracked by simulation, leading in turn to slightly reduced systematic uncertainties for jets below \(p_{{\text {T}}} \sim 700~{\text {Ge}}{\text {V}}\). For \(p_{{\text {T}}} > 1~{\text {Te}}{\text {V}}\), in situ uncertainties propagated from lower-\(p_{{\text {T}}}\) jets have a greater impact, and therefore the uncertainty is smaller for PFlow jets than for EMtopo jets.

5.2.4 Pile-up and the in situ analyses

One of the primary changes in LHC run conditions over the course of Run 2 was an increase in pile-up. The average number of interactions per crossing (\(\mu \)) during 2015+2016 data taking was 23.7, which increased to 37.8 in 2017. The data taken during 2018 and to which the calibrations in this paper are also applied has an average of 36.1 interactions per crossing [16]. The consistency of the calibrations for events with different pile-up conditions is therefore an important feature of the methods.

Figure 17 shows individual bins in the response ratios of the \(Z\text {+jet}\) and \(\gamma \text {+jet}\) analyses separated out as a function of number of primary vertices in the event. The \(Z\text {+jet}\) results are shown for \(25~{\text {Ge}}{\text {V}}< p_{{\text {T}}} ^{\text {ref}} < 30~{\text {Ge}}{\text {V}}\) and the \(\gamma \text {+jet}\) results for \(45~{\text {Ge}}{\text {V}}< p_{{\text {T}}} ^{\text {ref}} < 65~{\text {Ge}}{\text {V}}\), in the regions where each has appropriate statistical significance. The multijet balance analysis is not shown: due to the higher \(p_{{\text {T}}}\) regime in which it operates it is more robust to pile-up effects. A linear fit to the data/simulation ratio has a slope compatible with zero within the fit uncertainties in each plot, demonstrating the stability of the in situ calibration as a function of \(N_{\text {PV}}\). This in turn illustrates the efficacy of the pile-up corrections described in Sect.  5.1.1 and shows that the inclusively derived calibration is applicable to events with a range of pile-up conditions.
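The stability check described above amounts to a straight-line fit and a comparison of the slope with its uncertainty. A minimal sketch is given below, using the closed-form weighted least-squares solution for independent Gaussian uncertainties. The function names, the one-standard-deviation compatibility threshold, and any numerical inputs are assumptions and are not taken from the analysis.

```python
import math

def fit_line(x, y, sigma):
    """Weighted least-squares fit of y = intercept + slope * x.

    Returns (slope, slope_error, intercept) for independent Gaussian
    uncertainties sigma on y.
    """
    w = [1.0 / s ** 2 for s in sigma]
    s0 = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = s0 * sxx - sx * sx
    slope = (s0 * sxy - sx * sy) / det
    intercept = (sxx * sy - sx * sxy) / det
    slope_error = math.sqrt(s0 / det)
    return slope, slope_error, intercept

def slope_compatible_with_zero(slope, slope_error, n_sigma=1.0):
    """True if |slope| lies within n_sigma fit uncertainties of zero (assumed criterion)."""
    return abs(slope) <= n_sigma * slope_error
```

Here `x` would hold the \(N_{\text {PV}}\) bin centres and `y` the data/simulation response ratios with their statistical uncertainties `sigma`.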

Fig. 17

Average PFlow MPF jet response as a function of \(N_{\text {PV}} \) for (a) \(Z\text {+jet}\) events with reference \(p_{{\text {T}}}\) derived from the reconstructed Z boson in the range \(25~{\text {Ge}}{\text {V}}< p_{{\text {T}}} ^{\text {ref}} < 30~{\text {Ge}}{\text {V}}\) and for (b) \(\gamma \text {+jet}\) events with reference \(p_{{\text {T}}}\) defined from the photon in the range \(45~{\text {Ge}}{\text {V}}< p_{{\text {T}}} ^{\text {ref}} < 65~{\text {Ge}}{\text {V}}\). For the \(Z\text {+jet}\) analysis, results from the \(Z\rightarrow ee\) and \(Z\rightarrow \mu \mu \) channels are combined to reduce statistical fluctuations. The black points correspond to 2015–2017 data while the pink and blue points correspond to Monte Carlo samples from two different generators. The error bars reflect the statistical uncertainties. The ratio of MC simulation to data for both generators is shown in the bottom panel

The in situ JES measurements can be used to calculate the dependence of the measured median \(p_{{\text {T}}}\) density \(\rho \) on the event topology in simulation and data and to derive an uncertainty, as mentioned in Sect. 5.1.1. The density \(\rho \) is computed as a function of \(\mu \) for each of the \(Z\text {+jet}\), \(\gamma \text {+jet}\), and dijet topologies as shown in the top panels of Fig. 18. Taking \(\rho _{2017}\) as the value of \(\rho \) for the average pile-up conditions during 2017 data taking and t1 and t2 as any two in situ measurement topologies out of \(Z\text {+jet}\), \(\gamma \text {+jet}\), and dijet, then the following metric of consistency can be defined:

$$\begin{aligned} \Delta = \left( \rho _{2017}^{\text {t1}} - \rho _{2017}^{\text {t2}} \right) _{\text {MC}} - \left( \rho _{2017}^{\text {t1}} - \rho _{2017}^{\text {t2}} \right) _{\text {data}}\,. \end{aligned}$$

The quantity \(\max (|\Delta |)\) is then the largest value of \(|\Delta |\) across the various topology comparisons. The total \(\rho \) topology systematic uncertainty is given by

$$\begin{aligned} \Delta _{p_{{\text {T}}}} = \max (|\Delta |) \times C^{\text {JES}}_{p_{{\text {T}}}} \times \pi R^2\,, \end{aligned}$$

where \(C^{\text {JES}}\) is the size of the MCJES correction for a jet with the relevant \(p_{{\text {T}}}\). The second panels in Fig. 18 show \(\rho _{2017}^{\text {t1}} - \rho _{2017}^{\text {t2}}\) for the comparisons (\(Z\text {+jet}\), dijet) and (\(\gamma \text {+jet}\), dijet) in both MC simulation and data. The lower panels show the difference of these two quantities between data and MC simulation, that is, \(\Delta _{Z\text {+jet},\, \text {dijet}}\) and \(\Delta _{\gamma \text {+jet}, \, \text {dijet}}\). The input to the systematic uncertainty \(\max (|\Delta |)\) is the more discrepant of the two lines in the lower panel evaluated at \(\mu =37.8\), the value in 2017 data. As Fig. 18 illustrates, this uncertainty is larger for PFlow jets than for EMtopo jets. This is understood to be due to a greater sensitivity to the underlying event when tracking information is included, which leads to greater differences among the simulated samples.
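The two formulas above can be combined in a short sketch. The function names, the dictionary keys, and all numerical values are hypothetical and serve only to illustrate the arithmetic. The default set of comparisons is taken to be the two pairs shown in Fig. 18.

```python
import math

def delta(rho_mc, rho_data, t1, t2):
    """(rho_t1 - rho_t2) in MC minus the same difference in data."""
    return (rho_mc[t1] - rho_mc[t2]) - (rho_data[t1] - rho_data[t2])

def rho_topology_uncertainty(rho_mc, rho_data, c_jes, radius=0.4,
                             pairs=(("zjet", "dijet"), ("gammajet", "dijet"))):
    """max(|Delta|) x C_JES x pi R^2 for the given topology comparisons.

    rho_mc, rho_data: dicts mapping a topology label to rho evaluated at the
    average 2017 pile-up (labels are assumptions of this sketch).
    """
    max_abs_delta = max(abs(delta(rho_mc, rho_data, t1, t2)) for t1, t2 in pairs)
    return max_abs_delta * c_jes * math.pi * radius ** 2

# Toy inputs in GeV per unit area (illustrative only, not measured values).
toy_mc = {"zjet": 20.0, "gammajet": 21.0, "dijet": 22.0}
toy_data = {"zjet": 20.5, "gammajet": 21.0, "dijet": 22.0}
toy_uncertainty = rho_topology_uncertainty(toy_mc, toy_data, c_jes=1.0)
```

The factor \(\pi R^2\) converts the density difference into a transverse momentum for a jet of radius parameter \(R\), which is why the result scales with the jet area.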

Fig. 18

Inputs to the \(\rho \) topology uncertainty derived in the \(Z\text {+jet}\), \(\gamma \text {+jet}\), and dijet in situ analyses. The error bars show the statistical uncertainties. The top panels relate the \(p_{{\text {T}}}\) density \(\rho \) to the mean number of interactions per bunch crossing \(\mu \) in data and MC simulation for the three input analyses. The second panels show the difference between the \(Z\text {+jet}\) and dijet and between the \(\gamma \text {+jet}\) and dijet measurements. The lowermost panels show the difference between the data and MC simulation lines in the second panels: this defines the size of the topology uncertainty. The two plots show (a) EMtopo and (b) PFlow jets, illustrating why this uncertainty is larger for PFlow jets than for EMtopo jets

Fig. 19

(a) The weight assigned to different techniques in the combination of in situ measurements of the relative \(p_{{\text {T}}}\) response of anti-\(k_{t}\) \(R=0.4\) particle-flow jets in data and simulation, as a function of the jet \(p_{{\text {T}}}\). For each \(p_{{\text {T}}}\) bin, the weights of the \(Z\text {+jet}\), \(\gamma \text {+jet}\), and multijet balance methods are shown. (b) The \(\chi ^2/N_{\text {dof}}\) metric, illustrating the compatibility of the in situ measurements being combined, as a function of jet \(p_{{\text {T}}}\). In the low \(p_{{\text {T}}}\) range, the combination is between three measurements (\(Z(\rightarrow ee)+\)jet, \(Z(\rightarrow \mu \mu )+\)jet, and \(\gamma \text {+jet}\)) of which the two \(Z\text {+jet}\) measurements have several correlated uncertainties, resulting in increased tension compared to previous calibrations

Fig. 20

(a) Ratio of the PFlow+JES jet response in data to that in the nominal MC event generators as a function of jet \(p_{{\text {T}}}\) for \(Z\text {+jet}\), \(\gamma \text {+jet}\), and multijet in situ calibrations. The inner horizontal ticks in the error bars give the size of the statistical uncertainty while the outer horizontal ticks indicate the total uncertainty (statistical and systematic uncertainties added in quadrature). The final correction and its statistical and total uncertainty bands are also shown, although the statistical uncertainty is too small to be visible in most regions. (b) A comparison of the combined correction and its uncertainty for PFlow+JES and EM+JES jets

5.2.5 In situ combination

The data/MC simulation response ratios,

$$\begin{aligned} \left\langle p_{{\text {T}}} ^{\text {jet}}/p_{{\text {T}}} ^{\text {ref}} \right\rangle _{\text {data}} \Big / \left\langle p_{{\text {T}}} ^{\text {jet}}/p_{{\text {T}}} ^{\text {ref}} \right\rangle _{\text {MC}}\,, \end{aligned}$$

from the four different ‘absolute’ in situ measurements of \(Z(\rightarrow ee)\)+jet, \(Z(\rightarrow \mu \mu )\)+jet, \(\gamma \text {+jet}\), and the multijet balance must be combined to produce a single calibration covering the full range of jet \(p_{{\text {T}}}\) from \(17~{\text {Ge}}{\text {V}}\) to \(2.4~{\text {Te}}{\text {V}}\). The four measurements overlap one another in various \(p_{{\text {T}}}\) ranges, so this procedure must account for their relative statistical power as well as the tension between different response ratio measurements in the same \(p_{{\text {T}}} \) range. The \(Z(\rightarrow ee)\)+jet and \(Z(\rightarrow \mu \mu )\)+jet channels, though compatible within uncertainties, are treated as separate measurements for the sake of the combination since they are affected by different systematic uncertainties.

The combination procedure is briefly summarized here; for a detailed description see Ref. [5]. Each of the absolute in situ measurements is converted from a parameterisation in terms of reference object \(p_{{\text {T}}}\) into jet \(p_{{\text {T}}}\) and divided into finer bins of 1 \({\text {Ge}}{\text {V}}\) using second-order polynomial splines. A \(\chi ^2\) minimization is performed in each bin, taking as inputs the measurements available in that \(p_{{\text {T}}}\) range and their uncertainties. This minimisation functions as a weighted average, with the weight given to each input measurement decreasing as its uncertainty grows. In this way, the measurement with the smallest statistical and systematic uncertainties dominates the estimate of the response ratio in that bin.
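As an illustration only, the per-bin \(\chi ^2\) minimization can be sketched for the simplified case of uncorrelated inputs, where it reduces to an inverse-variance weighted average. The function name and the numerical inputs below are hypothetical, and the correlated uncertainties handled by the full procedure of Ref. [5] are ignored.

```python
import numpy as np

def combine_bin(values, sigmas):
    """Inverse-variance weighted average of uncorrelated measurements in one pT bin."""
    values = np.asarray(values, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    inv_var = 1.0 / sigmas**2
    weights = inv_var / inv_var.sum()        # normalised weights, summing to 1
    mean = float(np.dot(weights, values))    # value that minimises the chi2
    sigma = float(np.sqrt(1.0 / inv_var.sum()))
    return mean, sigma, weights

# Hypothetical data/MC response ratios from three methods in a single bin
mean, sigma, weights = combine_bin([0.98, 0.97, 0.99], [0.01, 0.02, 0.02])
```

In this toy bin the first measurement has half the uncertainty of the other two and therefore receives four times their weight.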

The weights of each input measurement in this combination are shown in Fig. 19a as a function of jet \(p_{{\text {T}}}\). The \(Z\text {+jet}\) measurements dominate for jet \(p_{{\text {T}}}\) below \(\sim 500\) \({\text {Ge}}{\text {V}}\) where the statistical uncertainties on these measurements grow dramatically; the \(Z(\rightarrow \mu \mu )\)+jet is the more powerful of the two in the upper half of this range due to the size of the electron scale and resolution uncertainties affecting the \(Z(\rightarrow ee)+\)jet channel. The combination is then dominated by \(\gamma \text {+jet}\) until jet \(p_{{\text {T}}}\) of above 1 \({\text {Te}}{\text {V}}\), where the lower statistics in this channel and the decreased flavour uncertainties in the multijet balance analysis allow the latter to dominate. The final calibration curve is determined by smoothing the outputs from the minimization with a Gaussian kernel.

The \(\sqrt{\chi ^{2}/N_{\text {dof}}}\) across the measurements, before any scaling is applied, is shown in Fig. 19b. This metric shows the degree of tension between the input measurements at each point: when they are in agreement well within uncertainties the value will be below 1, while when they differ relative to their uncertainties it will be above 1. Following PDG guidelines, in bins where tension between the input measurements, quantified by \(\sqrt{\chi ^{2}/N_{\text {dof}}}\), is found to be greater than 1, the uncertainties in the measurements in that bin are scaled by the same tension factor to ensure that the overall level of agreement between methods is acceptable within uncertainties for all \(p_{{\text {T}}} \) values [53]. However, since the tensions visible at low \(p_{{\text {T}}}\) are primarily between the two \(Z\text {+jet}\) measurements, and since the MC generator and showering and topology uncertainties are fully correlated between the two channels and therefore cannot contribute to this tension, these two components are excluded from the scaling procedure. The components which are not scaled are the dominant uncertainties.
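A minimal sketch of the tension metric and the rescaling described above, assuming uncorrelated inputs in a single bin. The helper names and numbers are hypothetical, and the sketch rescales every uncertainty, whereas the calibration excludes the fully correlated components from the scaling.

```python
import numpy as np

def tension_factor(values, sigmas):
    """sqrt(chi2/N_dof) of the inputs about their inverse-variance weighted mean."""
    values = np.asarray(values, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    w = 1.0 / sigmas**2
    mean = np.sum(w * values) / np.sum(w)
    chi2 = np.sum(((values - mean) / sigmas) ** 2)
    ndof = len(values) - 1
    return float(np.sqrt(chi2 / ndof))

def rescale_uncertainties(values, sigmas):
    """Inflate the uncertainties by the tension factor only when it exceeds 1."""
    scale = max(1.0, tension_factor(values, sigmas))
    return scale, np.asarray(sigmas, dtype=float) * scale

# Two hypothetical measurements in tension, and two that agree
scale_tense, sig_tense = rescale_uncertainties([1.00, 1.04], [0.01, 0.01])
scale_ok, sig_ok = rescale_uncertainties([1.000, 1.005], [0.01, 0.01])
```

Measurements that already agree within their uncertainties are left unchanged, since the scale is never allowed to fall below 1.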

Figure 20 shows the final in situ combination as a function of jet \(p_{{\text {T}}}\). To complete the calibration, the inverse of the curve (\(R_{\text {MC}}/R_{\text {data}}\)) is taken as the scaling factor and applied to data. The combined measurement (solid line) for PFlow+JES jets is compared with each of the four absolute in situ analyses (empty shapes) in Fig. 20a. The total size of the correction is approximately 3% at low \(p_{{\text {T}}}\) and decreases to around 2% for jets above 200 \({\text {Ge}}{\text {V}}\). A comparison between the results for EM+JES and PFlow+JES jets is shown in Fig. 20b, where the overall size of both the correction and its uncertainty is seen to be slightly larger for EM+JES jets.

Each uncertainty component from the in situ analyses is individually propagated through the combination procedure. First, the relevant measured response is varied by \(1\sigma \) in the uncertainty component within its standard binning. The finer rebinning, \(\chi ^{2}\) minimization, and combination procedure is repeated, although using the weights as determined for the nominal result to prevent the varied uncertainty from decreasing the contribution of the measurement. The difference between the combined calibration curve with the systematically shifted input and the nominal calibration curve is taken as \(1\sigma \) in the varied uncertainty. Throughout this process, each individual uncertainty source is treated as fully correlated across \(\eta \) and \(p_{{\text {T}}} \) but entirely uncorrelated with all other uncertainty sources. After this step, the uncertainties from the \(Z\text {+jet}\) analyses are taken to be fully correlated with the same uncertainties propagated through the multijet balance. Other assumptions of correlation between components can similarly be made and altered after their propagation, allowing multiple different assumptions.
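The role of the frozen nominal weights can be illustrated with a single-bin sketch. It assumes the combination in that bin is a simple weighted sum and omits the rebinning and smoothing steps; the function name and inputs are hypothetical.

```python
import numpy as np

def propagate_component(values, nominal_weights, index, shift):
    """Change in the combined value when input `index` moves by its 1-sigma
    `shift`, with the weights held at their nominal values."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(nominal_weights, dtype=float)
    nominal = np.dot(weights, values)
    varied = values.copy()
    varied[index] += shift                   # shift only the affected measurement
    return float(np.dot(weights, varied) - nominal)

# Hypothetical bin: three inputs with nominal weights 2/3, 1/6, 1/6
delta = propagate_component([0.98, 0.97, 0.99], [2 / 3, 1 / 6, 1 / 6],
                            index=0, shift=0.006)
```

Because the weights are not recomputed, the propagated shift is simply the input shift multiplied by the nominal weight of the affected measurement.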

5.3 Systematic uncertainties

The full uncertainty in the jet energy scale consists of 125 individual terms derived from the in situ measurements, pile-up effects, flavour dependence, and estimates of additional effects as summarized in Table 2. The majority of the individual terms stem from the in situ measurements and cover the effects of analysis selection cuts, event topology dependence, and MC mis-modelling and statistical limitations, as well as the uncertainties associated with the calibration of the electrons, muons, and photons.

The \(\eta \) intercalibration analysis results in five nuisance parameters, with a sixth for 2018 data only, as discussed in Sect. 5.2.1: one covers systematic effects, one covers statistical uncertainty, and three (four in 2018) are used to parameterize the non-closure. Pile-up effects are described by four nuisance parameters which account for offsets and \(p_{{\text {T}}}\) dependence in \(\langle \mu \rangle \) and \(N_{\text {PV}}\) as well as event topology dependence of the density metric \(\rho \). The offset and \(p_{{\text {T}}}\) dependence terms are derived in data using a combination of \(Z\text {+jet}\) measurements and measurements comparing reconstructed jets with track-jets. The \(\rho \) topology term is the largest of the pile-up uncertainties and is determined by the maximum deviation in measured density between different in situ measurements under the same pile-up conditions.

Table 2 Sources of uncertainty in the jet energy scale

The two flavour dependence uncertainties are derived from simulation and account for relative flavour fractions and differing responses to quark- and gluon-initiated jets [5, 6]. The flavour composition uncertainty accounts for the differing response of quark- and gluon-initiated jets in a sample with some uncertainty on the fraction of gluon-initiated jets \(f_g\). Where \({\mathcal {R}} _q\) and \({\mathcal {R}} _g\) are the responses measured in Pythia and \(\sigma _g^f\) is the uncertainty on \(f_g\) in the sample, the flavour composition uncertainty is defined as:

$$\begin{aligned} \sigma _{\text {composition}} = \sigma _g^f \frac{|{\mathcal {R}} _q - {\mathcal {R}} _g|}{f_g {\mathcal {R}} _g + (1 - f_g){\mathcal {R}} _q}\,. \end{aligned}$$

The flavour response uncertainty accounts for the fact that, unlike quark-initiated jet response, gluon-initiated jet response is found to differ significantly between generators. This uncertainty is defined by comparison between the nominal Pythia sample and an alternative Herwig sample:

$$\begin{aligned} \sigma _{\text {response}} = f_g \left( {\mathcal {R}} _g^{\textsc {Pythia}} - {\mathcal {R}} _g^{\textsc {Herwig}}\right) . \end{aligned}$$
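The two flavour uncertainty definitions translate directly into code. The sketch below is illustrative: the function names and the numerical inputs are hypothetical and are not values taken from the analysis.

```python
def flavour_composition_uncertainty(f_g, sigma_fg, r_q, r_g):
    """sigma_fg * |R_q - R_g| / (f_g * R_g + (1 - f_g) * R_q)."""
    return sigma_fg * abs(r_q - r_g) / (f_g * r_g + (1.0 - f_g) * r_q)

def flavour_response_uncertainty(f_g, r_g_pythia, r_g_herwig):
    """f_g * (R_g^Pythia - R_g^Herwig)."""
    return f_g * (r_g_pythia - r_g_herwig)

# Hypothetical gluon-dominated sample
comp = flavour_composition_uncertainty(f_g=0.8, sigma_fg=0.1, r_q=1.02, r_g=0.98)
resp = flavour_response_uncertainty(f_g=0.8, r_g_pythia=0.98, r_g_herwig=0.97)
```

Both terms vanish in the appropriate limits: the composition term when the gluon fraction is known exactly or the quark and gluon responses coincide, and the response term when the sample contains no gluon-initiated jets.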

Figure 21 shows the gluon-jet response and the difference between quark-jet and gluon-jet responses using both Pythia and Herwig for PFlow jets. The samples are the same as those used for the multijet balance analysis and are dominated by gluon jets at low \(p_{{\text {T}}}\). For Herwig, \({\mathcal {R}} _q - {\mathcal {R}} _g\) becomes negative in the 90–600 \({\text {Ge}}{\text {V}}\) \(p_{{\text {T}}}\) region (which appears as a bump in the \(|{\mathcal {R}} _q - {\mathcal {R}} _g|\) curve).

Fig. 21

a Measured gluon-initiated jet response and b difference between quark- and gluon-initiated jet responses for PFlow jets using two different generators. These define the flavour response and composition uncertainties respectively

An additional uncertainty applied only to b-initiated jets covers the difference in response between jets from light- versus heavy-flavour quarks and replaces the flavour composition and response uncertainties for these heavy-flavour jets. The punch-through uncertainty accounts for mis-modelling of the GSC correction to jets which pass through the calorimeter and into the muon system, taking the difference in jet response between data and MC simulation in bins of muon detector activity as the systematic uncertainty. Both are discussed in more detail in Ref. [6]. Finally, the high-\(p_{{\text {T}}}\) ‘single particle’ uncertainty is derived from studies of the response to individual hadrons and is used to cover the region beyond 2.4 \({\text {Te}}{\text {V}}\), where the MJB analysis no longer has statistical power [29]. When calibrating MC samples simulated using AFII, an additional non-closure uncertainty is applied to account for the difference in jet response between these samples and those which used full detector simulation.

The total jet energy scale uncertainty is shown in Fig. 22a as a function of jet \(p_{{\text {T}}}\) for fixed \(\eta _{\text {jet}}=0\) and in Fig. 22b as a function of jet \(\eta \) for fixed \(p_{{\text {T}}} ^{\text {jet}} = 60~{\text {Ge}}{\text {V}}\). A dijet-like composition of the sample (that is, predominantly gluons) is assumed in computing the flavour uncertainties. The uncertainties in the \(\eta \) intercalibration analysis are labelled ‘relative in situ JES’ with the non-closure uncertainty creating the asymmetric peaks around \(\eta =\pm 2.5\). Uncertainties in all other in situ measurements are combined into the ‘absolute in situ JES’ term, which also includes the single-particle uncertainty.

Fig. 22

Fractional jet energy scale systematic uncertainty components for anti-\(k_{t}\) \(R=0.4\) jets a as a function of jet \(p_{{\text {T}}}\) at \(\eta = 0\) and b as a function of \(\eta \) at \(p_{{\text {T}}} = 60~{\text {Ge}}{\text {V}}\), reconstructed from particle-flow objects. The total uncertainty, determined as the quadrature sum of all components, is shown as a filled region topped by a solid black line. Flavour-dependent components shown here assume a dijet flavour composition

5.3.1 Uncertainty correlations and reductions

The detail contained in 125 independent nuisance parameters is far more than is required by most analyses, so it is necessary to reduce the uncertainty description to a smaller number of terms. One could imagine a single ‘Jet energy scale’ nuisance parameter constructed by adding in quadrature all of the independent components. However, a meaningful set of correlations exists between the jet energy scale uncertainties for two jets at different \(\eta \) and \(p_{{\text {T}}}\) as a result of the structures of the nuisance parameters. In the case of reduction to a single component, the entirety of this correlation information would be lost and an unrealistic assumption, that of full correlation between the jet energy scale uncertainties for any values of \(\eta \) and \(p_{{\text {T}}}\), would be enforced. In practice, a variety of reduced uncertainty schemes are provided to allow simplified descriptions with a minimum loss of correlation information.

The 98 uncertainty components stemming from the absolute in situ analyses are functions only of \(p_{{\text {T}}}\) and thus their behaviour can be easily represented by a smaller number of orthogonal terms. An eigenvector decomposition is performed on a covariance matrix of these uncertainty components and the largest of the resulting orthogonal terms are kept separate as new effective nuisance parameters [5]. The remaining terms are combined into a single residual nuisance parameter. To determine how many components to keep independently and how many to combine in the residual term, the covariance matrix for the reduced set is also computed and the difference in correlation in each jet \(\eta \) and \(p_{{\text {T}}} \) between the reduced set and the full set is calculated. This difference is taken as a measure of the information loss and the number of combined terms is adjusted so that the difference is below an acceptable bound (usually 0.05). Two different reduction schemes are produced: the global reduction combines all \(p_{{\text {T}}}\)-dependent in situ uncertainty components regardless of their sources and results in 8 reduced components for a total of 23 once the two-dimensional terms (not arising from the in situ analyses and not reduced) are included. The category reduction combines the \(p_{{\text {T}}}\)-dependent in situ uncertainty components in separate groups based on their origin (detector, statistical, modelling, or mixed) and results in 15 reduced components for a total of 30. The JES correlation matrix for the full set of nuisance parameters is shown in Fig. 23a. The bin-by-bin correlation loss between the full set of nuisance parameters and the category reduction is shown in Fig. 23b and is below 0.05 everywhere as required.
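The reduction can be sketched with NumPy under simplifying assumptions: each component is represented by its \(1\sigma \) shift in a handful of \(p_{{\text {T}}}\) bins, components are mutually uncorrelated, and the residual term is built as a quadrature sum treated as one fully correlated parameter. The function names and the toy inputs are hypothetical and do not reproduce the actual uncertainty components.

```python
import numpy as np

def reduce_components(shifts, n_keep):
    """shifts: (n_components, n_bins) array of 1-sigma shifts per pT bin.
    Returns n_keep effective terms plus one residual term."""
    shifts = np.asarray(shifts, dtype=float)
    cov = shifts.T @ shifts                       # covariance between pT bins
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]             # sort largest first
    eigvals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    eigvecs = eigvecs[:, order]
    effective = (eigvecs * np.sqrt(eigvals)).T    # one row per orthogonal term
    residual = np.sqrt(np.sum(effective[n_keep:] ** 2, axis=0))
    return np.vstack([effective[:n_keep], residual])

def correlation(shifts):
    cov = shifts.T @ shifts
    d = np.sqrt(np.diag(cov))
    return cov / np.outer(d, d)

def max_correlation_loss(full, reduced):
    return float(np.max(np.abs(correlation(full) - correlation(reduced))))

# Toy input: 98 smooth components built from three shapes in log(pT)
rng = np.random.default_rng(0)
x = np.log(np.linspace(20.0, 2000.0, 8) / 20.0)
basis = np.vstack([np.ones_like(x), x, x**2])
shifts = rng.normal(scale=0.002, size=(98, 3)) @ basis
reduced = reduce_components(shifts, n_keep=2)
loss = max_correlation_loss(shifts, reduced)
```

By construction the total uncertainty in each bin is preserved and only the bin-to-bin correlations change, so `n_keep` can be increased until `loss` falls below the chosen bound.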

Fig. 23

a The jet energy scale correlation matrix for two PFlow+JES jets at \(\eta = 0\) using the full set of 98 \(p_{{\text {T}}}\)-dependent in situ nuisance parameters and b the difference in correlation information between the full description and the category reduction. The maximum loss of correlation information is 0.02 and occurs at the (\(p_{{\text {T}}} ^{j1},p_{{\text {T}}} ^{j2}\)) location specified by the text at the bottom of the plot

While the same procedure could in principle be used for the components which depend on both \(p_{{\text {T}}}\) and \(\eta \), the complexity added by the second dimension means that nearly as many eigenvectors would be needed to adequately describe the correlations as there were original terms and so the gain would be minimal. However, many analyses still require fewer than 25 nuisance parameters and are not affected by loss of correlation information. To provide suitable uncertainties for these, a strong reduction procedure is used to group the globally reduced versions of the absolute in situ uncertainties together with the two-dimensional uncertainties into three effective nuisance parameters as detailed in Ref. [7]. The three terms of the \(\eta \) intercalibration non-closure uncertainty are kept separate because their two-dimensional shapes are especially difficult to reduce and would cause an unacceptably large correlation loss.

Four different sets (scenarios) of the three effective nuisance parameters are created by varying the combinations of terms they contain. The varied sets are chosen such that the correlation loss in each is constrained to an \(\eta \)–\(p_{{\text {T}}}\) range which is well described by a different set. The metric for assessing performance of the four scenarios is the uncovered correlation loss, defined as the minimum difference in correlation between any reduced scenario and the full set of nuisance parameters, minus the maximum difference in correlation between any two reduced scenarios. The uncovered correlation loss is calculated for a fine grid of points in \(\eta \) and \(p_{{\text {T}}}\), ensuring no small-scale structures are missed. Contents of the effective nuisance parameters are varied, keeping systematic uncertainties with similar behaviours mostly grouped together, until a set of scenarios is found in which the maximum uncovered correlation loss is kept below 0.25 and confined to sufficiently small regions that the average correlation loss in an \(\eta \)–\(p_{{\text {T}}} \) plane does not exceed 0.02. A detailed discussion of the application of strongly reduced uncertainties within physics analyses can be found in Ref. [7].

5.3.2 Uncertainties for EMtopo and PFlow jets

Although the scale of individual calibrations may vary between EMtopo and PFlow jets, the final uncertainties are similar in size. A slightly larger pile-up uncertainty contribution in PFlow jets due to the impact of the underlying event is offset by smaller in situ uncertainties, leading to a comparable total overall uncertainty. Figure 24 shows the total uncertainty in EMtopo and PFlow jets for a range of \(p_{{\text {T}}}\) values at fixed \(\eta =0\) and for a range of \(\eta \) values at fixed \(p_{{\text {T}}} = 60~{\text {Ge}}{\text {V}}\). The level of agreement is representative of other \(p_{{\text {T}}} \) and \(\eta \) ranges.

Fig. 24

Fractional jet energy scale systematic uncertainty summed across all components for anti-\(k_{t}\) \(R=0.4\) jets a as a function of jet \(p_{{\text {T}}}\) at \(\eta = 0\) and b as a function of \(\eta \) at \(p_{{\text {T}}} = 60~{\text {Ge}}{\text {V}}\). The total uncertainty is shown for both EMtopo and PFlow jets. Contributions from topology-dependent components are calculated assuming a dijet flavour composition

6 Jet energy resolution

Precise knowledge of the jet energy resolution (JER) is important for detailed measurements of SM jet production, measurements and studies of the properties of the SM particles that decay to jets (e.g. \(W/Z \) bosons, top quarks), as well as searches for physics beyond the SM involving jets. The JER also affects the missing transverse momentum, which plays an indispensable role in many searches for new physics and measurements involving particles that decay into neutrinos, and thus rely on well-reconstructed missing momentum.

The dependence of the relative JER on the transverse momentum of the jet may be parameterized using a functional form expected for calorimeter-based resolutions, with three independent contributions, namely the noise (N), stochastic (S) and constant (C) terms [54]:

$$\begin{aligned} \frac{\sigma (p_{{\text {T}}})}{p_{{\text {T}}}} = \frac{N}{p_{{\text {T}}}}\oplus \frac{S}{\sqrt{p_{{\text {T}}}}}\oplus \,C\,. \end{aligned}$$
(4)

The noise (N) term is due to the contribution of electronic noise to the signal measured by the detector front-end electronics, as well as that due to pile-up. Since both contribute directly to the energy measured in the calorimeter but are approximately independent of the energy deposited by the showering particles, the contribution to the JER scales like \(1/p_{{\text {T}}} \). The noise term is expected to be significant in the low-\(p_{{\text {T}}} \) region, below \(\sim \)30 \({\text {Ge}}{\text {V}}\). Statistical fluctuations in the amount of energy deposited are captured by the stochastic (S) term, which represents the limiting term in the resolution up to several hundred \({\text {Ge}}{\text {V}}\) in jet \(p_{{\text {T}}}\). The S term contribution to the JER scales like \(1/\sqrt{p_{{\text {T}}}}\). The constant (C) term corresponds to fluctuations that are a constant fraction of the jet \(p_{{\text {T}}}\), such as energy depositions in passive material (e.g. cryostats and solenoid coil), the starting point of the hadron showers, and non-uniformities of response across the calorimeter. The constant term is expected to dominate the high-\(p_{{\text {T}}} \) region, above approximately 400 \({\text {Ge}}{\text {V}}\).
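The parameterization of Eq. (4) can be written as a short function, where \(\oplus \) denotes a sum in quadrature. The function name and the parameter values in the example are hypothetical and are not fit results from this analysis.

```python
import math

def relative_jer(pt, noise, stochastic, constant):
    """sigma(pT)/pT = N/pT (+) S/sqrt(pT) (+) C, with (+) a quadrature sum.
    pt and noise in GeV, stochastic in sqrt(GeV), constant dimensionless."""
    return math.sqrt((noise / pt) ** 2 + stochastic**2 / pt + constant**2)

# Hypothetical parameters, evaluated at a few jet pT values (GeV)
params = dict(noise=3.0, stochastic=0.7, constant=0.03)
curve = [relative_jer(pt, **params) for pt in (20.0, 100.0, 1000.0)]
```

Evaluating the three terms separately at a given \(p_{{\text {T}}}\) shows which one limits the resolution there: the noise term falls fastest with \(p_{{\text {T}}}\), while the constant term sets the high-\(p_{{\text {T}}}\) floor.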

In order to measure the JER, jet momentum must be measured precisely. This implies that the jets must either recoil against a reference object whose momentum can be measured precisely, or be balanced against one another in a well-defined dijet system [5, 6]. Measurements using the latter approach are presented here, as well as a method for measuring the contributions to the resolution from the noise term (N) due to both pile-up and electronics. The 2017 data, corresponding to an integrated luminosity of 44 \({\hbox {fb}}^{-1}\), are used for these measurements.

6.1 Resolution measurement using dijet events

Dijet events are both plentiful and produced via a set of \(2\rightarrow 2\) processes that are theoretically well-understood. JER measurements using these events for the dijet balance method rely on the approximate scalar balance between the transverse momenta of the two leading jets. Deviations from exact balance, measured via the asymmetry, given by

$$\begin{aligned} {\mathcal {A}} \equiv \frac{p_{{\text {T}}} ^{\text {probe}}- p_{{\text {T}}} ^{\text {ref}}}{p_{{\text {T}}} ^{\text {avg}}}, \end{aligned}$$
(5)

are due to a combination of experimental resolution, the presence of additional radiation in the event, and biases due to the event selection used in the measurement. In Eq. (5), \(p_{{\text {T}}} ^{\text {ref}}\) is the \(p_{{\text {T}}}\) of a reference jet which is required to be located in a well-calibrated region of the detector (\(\eta _{\text {det}} ^{\text {ref}}\)), taken here to be the central region of the calorimeter \(0.2 \le |\eta _{\text {det}} ^{\text {ref}} | < 0.7\), where the seam at \(\eta _{\text {det}} =0\) is excluded to ensure the reference jet energy is as cleanly measured as possible. The probe jet, with transverse momentum \(p_{{\text {T}}} ^{\text {probe}}\), may be located either within this central reference region or beyond it, with \(|\eta _{\text {det}} ^{\text {probe}} | < 4.5\). The probe jet is the jet for which the resolution is to be measured and \(p_{{\text {T}}} ^{\text {avg}}\) is the mean of the probe and reference jet momenta, \(p_{{\text {T}}} ^{\text {avg}} = (p_{{\text {T}}} ^{\text {probe}} + p_{{\text {T}}} ^{\text {ref}})/2\). The standard deviation of \({\mathcal {A}}\) for a particular \((p_{{\text {T}}} ^{\text {avg}}, \eta _{\text {det}} ^{\text {probe}})\) bin is denoted by \(\sigma _{{\mathcal {A}}}\), and in the case of a measurement of the probe jet asymmetry may be expressed as

$$\begin{aligned} \sigma _{{\mathcal {A}}} ^{\text {probe}} = \frac{\sigma _{p_{{\text {T}}}} ^{\text {probe}} \oplus \, \sigma _{p_{{\text {T}}}} ^{\text {ref}}}{\langle p_{{\text {T}}} ^{\text {avg}} \rangle } = \left\langle \frac{\sigma _{p_{{\text {T}}}}}{p_{{\text {T}}}} \right\rangle ^{\text {probe}} \oplus \, \left\langle \frac{\sigma _{p_{{\text {T}}}}}{p_{{\text {T}}}} \right\rangle ^{\text {ref}}, \end{aligned}$$

where \(\sigma _{p_{{\text {T}}}} ^{\text {probe}}\) and \(\sigma _{p_{{\text {T}}}} ^{\text {ref}}\) are the standard deviations of \(p_{{\text {T}}} ^{\text {probe}}\) and \(p_{{\text {T}}} ^{\text {ref}}\), respectively, and are used to denote the JER for each of the relevant objects. For calibrated jets, \(\langle p_{{\text {T}}} ^{\text {probe}} \rangle = \langle p_{{\text {T}}} ^{\text {ref}} \rangle = \langle p_{{\text {T}}} ^{\text {avg}} \rangle \) in the reference region. The reference jet relative resolution, \(\langle \sigma _{p_{{\text {T}}}}/p_{{\text {T}}} \rangle ^{\text {ref}}\), must therefore be subtracted in quadrature from the width of the measured asymmetry distribution in order to extract the resolution of the probe jet as

$$\begin{aligned} \left\langle \frac{\sigma _{p_{{\text {T}}}}}{p_{{\text {T}}}} \right\rangle ^{\text {probe}} = \sigma _{{\mathcal {A}}} ^{\text {probe}} \ominus \, \left\langle \frac{\sigma _{p_{{\text {T}}}}}{p_{{\text {T}}}} \right\rangle ^{\text {ref}}. \end{aligned}$$
(6)

Equation (6) is valid in the probe region as well, up to a correction factor that accounts for the potential overall imbalance between the reference jet and the probe jet in that region. This correction factor is found to be negligible (\(<1\%\)) for the measurements performed here. However, the \(p_{{\text {T}}}\) balance of the measured jets, and thus the measured asymmetry distribution, is measurably affected on an event-by-event basis by physics effects such as additional radiation, non-perturbative processes including hadronization and multi-parton interactions, and others that may lead to particle losses and additions in the measured jets. Consequently, the measured dijet balance asymmetry distribution represents a convolution of the intrinsic detector resolution and the particle-level balance affected by the aforementioned effects. The determination of \(\sigma _{{\mathcal {A}}} ^{\text {probe}}\) must therefore account for such effects, for example by subtracting the particle-level quantity from the measured quantity in quadrature:

$$\begin{aligned} \left( \sigma _{{\mathcal {A}}} ^{\text {probe}}\right) _{\text {det}} = \left( \sigma _{{\mathcal {A}}} ^{\text {probe}}\right) _{\text {meas}} \ominus \, \sigma _{{\mathcal {A}}} ^{\text {truth}}. \end{aligned}$$
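The two quadrature subtractions, first of the particle-level width and then of the reference jet resolution as in Eq. (6), can be chained as in the following sketch. The helper names and the input widths are hypothetical, and this simple subtraction is only the example approach mentioned above, not a fit.

```python
import math

def quad_sub(a, b):
    """Quadrature difference sqrt(a^2 - b^2), defined only for a >= b >= 0."""
    diff = a * a - b * b
    if diff < 0.0:
        raise ValueError("second width exceeds the first; subtraction undefined")
    return math.sqrt(diff)

def probe_resolution(sigma_asym_meas, sigma_asym_truth, ref_resolution):
    """Remove the particle-level width, then the reference jet resolution."""
    sigma_det = quad_sub(sigma_asym_meas, sigma_asym_truth)
    return quad_sub(sigma_det, ref_resolution)

# Hypothetical widths: measured 0.13, particle level 0.05, reference jet 0.072
resolution = probe_resolution(0.13, 0.05, 0.072)
```

The explicit error on a negative difference is a design choice of this sketch: statistical fluctuations can make a subtracted width exceed the measured one, and silently returning a value in that case would hide the problem.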

The results presented here use an iterative fitting procedure to extract the impact of these effects and to isolate the intrinsic detector resolution, \(\left( \sigma _{{\mathcal {A}}} ^{\text {probe}}\right) _{\text {det}}\), by assuming a Gaussian convolution of detector effects with the particle-level balance. First, the asymmetry distribution measured at particle level in MC simulation is fitted with an ad hoc function \({\mathcal {A}} ^{\text {truth}}\) based on exponential curves and found to describe it well. Second, the measured asymmetry distribution, \({\mathcal {A}} ^{\text {meas}}\), is fitted by the function

$$\begin{aligned} {\mathcal {A}} ^{\text {meas}} = {\mathcal {A}} ^{\text {truth}} \otimes {\mathcal {R}} (\mu _{{\mathcal {A}}}^{\text {det}}, \sigma _{{\mathcal {A}}} ^{\text {det}}), \end{aligned}$$

taking \({\mathcal {A}} ^{\text {truth}}\) from the particle-level fit and where \({\mathcal {R}} (\mu _{{\mathcal {A}}}^{\text {det}}, \sigma _{{\mathcal {A}}} ^{\text {det}})\) is a Gaussian distribution with width \(\sigma _{{\mathcal {A}}} ^{\text {det}}\) representing the detector resolution for the probe jet and offset \(\mu _{{\mathcal {A}}}^{\text {det}}\) accounting for any residual non-closure in the JES calibration.
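A numerical sketch of the forward model only, not of the fit itself, is given below: a particle-level shape is convolved with a Gaussian response on a uniform grid. The two-sided exponential is a hypothetical stand-in for the ad hoc particle-level function, and the function name and all parameter values are assumptions made for illustration.

```python
import numpy as np

def measured_asymmetry_pdf(x, truth_pdf, mu_det, sigma_det):
    """Convolve a particle-level shape with a Gaussian response.
    The grid x must be uniform, symmetric about zero and of odd length so
    that the 'same' convolution output stays aligned with x."""
    dx = x[1] - x[0]
    gauss = np.exp(-0.5 * ((x - mu_det) / sigma_det) ** 2)
    gauss /= gauss.sum() * dx                       # normalise to unit area
    return np.convolve(truth_pdf, gauss, mode="same") * dx

x = np.linspace(-1.0, 1.0, 2001)
dx = x[1] - x[0]
b = 0.03                                            # hypothetical particle-level scale
truth = np.exp(-np.abs(x) / b) / (2.0 * b)          # two-sided exponential stand-in
meas = measured_asymmetry_pdf(x, truth, mu_det=0.01, sigma_det=0.10)

# Mean and width of the convolved shape
mean = np.sum(x * meas) * dx
width = np.sqrt(np.sum((x - mean) ** 2 * meas) * dx)
```

For this toy shape the variances add, so `width` is close to the quadrature sum of the particle-level and detector widths. In a fit, `mu_det` and `sigma_det` would be the free parameters adjusted to match the measured distribution.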

Collision data used for the dijet balance measurement are collected using specific combinations of central and forward jet triggers for each of the 11 \(p_{{\text {T}}} ^{\text {avg}}\) ranges used in the measurement. Trigger selections are required to be at least 99% efficient in the range of \(p_{{\text {T}}} ^{\text {avg}}\) in which a particular combination is used. Jets must also pass JVT selection requirements as described in Sect. 5.2.2.

Topology criteria are applied to select well-defined dijet production processes with minimal contributions due to additional radiation or higher-order processes. The azimuthal angle, \(\Delta \phi \), between the two leading jets in the event and the maximum \(p_{{\text {T}}}\) of a potential third jet, \(p_{{\text {T}}} ^{\,j_{3}}\), are constrained by the following two criteria:

$$\begin{aligned} \Delta \phi (j_{1}, j_{2})\ge & {} 2.7~\text {rad.}\,, \\ p_{{\text {T}}} ^{\,j_{3}}< & {} \max \left( 25~{\text {Ge}}{\text {V}}, 0.25\cdot p_{{\text {T}}} ^{\text {avg}} \right) . \end{aligned}$$
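The asymmetry of Eq. (5) and the two topology requirements can be expressed as simple per-event functions. This is a sketch with hypothetical function names; events without a third jet are represented by a third-jet \(p_{{\text {T}}}\) of zero, which is an assumption of the example.

```python
import math

def asymmetry(pt_probe, pt_ref):
    """A = (pT_probe - pT_ref) / pT_avg, with pT_avg the mean of the two."""
    pt_avg = 0.5 * (pt_probe + pt_ref)
    return (pt_probe - pt_ref) / pt_avg

def delta_phi(phi1, phi2):
    """Absolute azimuthal separation folded into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi

def passes_dijet_topology(phi1, phi2, pt_avg, pt_j3=0.0):
    """Back-to-back leading jets and limited third-jet activity (pT in GeV)."""
    return (delta_phi(phi1, phi2) >= 2.7
            and pt_j3 < max(25.0, 0.25 * pt_avg))

# Hypothetical events
ok = passes_dijet_topology(0.0, 3.0, pt_avg=200.0, pt_j3=30.0)
too_close = passes_dijet_topology(0.0, 2.0, pt_avg=200.0, pt_j3=10.0)
third_jet = passes_dijet_topology(0.1, -3.1, pt_avg=80.0, pt_j3=26.0)
```

The third example fails because, at low \(p_{{\text {T}}} ^{\text {avg}}\), the fixed 25 \({\text {Ge}}{\text {V}}\) threshold is larger than the relative one and therefore sets the third-jet requirement.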

Example asymmetry distributions are shown in Fig. 25 in two representative bins of \(p_{{\text {T}}} ^{\text {avg}}\) and probe jet \(\eta _{\text {det}} ^{\text {probe}}\). An iterative Gaussian fit to the core of the asymmetry distribution is used to extract the JER. The result of the measurement of the relative JER and its systematic uncertainty is shown in Fig. 26 for a single narrow range of \(\eta _{\text {det}} ^{\text {probe}}\) and as a function of \(p_{{\text {T}}} ^{\text {jet}}\). The JER is observed to be slightly underestimated by MC simulation in this central region of the detector.

Fig. 25

Asymmetry distribution measured in data and particle-level Pythia 8 for PFlow jets in two example \(p_{{\text {T}}}\) and \(\eta \) ranges. Error bars represent the statistical uncertainty. a The measured asymmetry is shown for probe jets with \(80~{\text {Ge}}{\text {V}}< p_{{\text {T}}} ^{\text {avg}} < 110~{\text {Ge}}{\text {V}}\) in the range \(0.2< |\eta _{\text {det}} ^{\text {probe}}| < 0.7\), where the distributions are symmetric by construction. b The measured asymmetry is shown for probe jets with \(300~{\text {Ge}}{\text {V}}< p_{{\text {T}}} ^{\text {avg}} < 400~{\text {Ge}}{\text {V}}\) in the range \(1.3< |\eta _{\text {det}} ^{\text {probe}}| < 1.8\). In this \(\eta _{\text {det}} ^{\text {probe}}\) range the distributions can be asymmetric. Two fits are performed iteratively: the particle-level asymmetry is modelled with an ad hoc function which is subsequently convolved with a Gaussian function in order to describe the reconstructed asymmetry. The detector resolution is then extracted from the Gaussian fit parameter

Systematic uncertainties are dominated by imprecise knowledge of the scale of the jets at low \(p_{{\text {T}}}\), which results in an approximate 1.5% uncertainty at 40 \({\text {Ge}}{\text {V}}\), whereas the non-closure of the dijet balance method itself is largely dominant at higher \(p_{{\text {T}}}\). The non-closure uncertainty is evaluated as the difference between the resolution measured using the in situ procedure applied to MC simulation and the particle-level resolution, \(\sigma (R)/R\), where \(R=p_{{\text {T}}} ^{\text {reco}}/p_{{\text {T}}} ^{\text {true}} \). Good agreement is found, resulting in an uncertainty in the relative resolution that is approximately 0.4% and generally increases with \(p_{{\text {T}}}\) due to the non-Gaussian jet response. At lower \(p_{{\text {T}}}\) the uncertainties propagated from the JES dominate. The increase in JES uncertainty around 800 \({\text {Ge}}{\text {V}}\) is a result of the single-particle uncertainty (see Sect. 5.3): the jet energy scale calibration used for the dijet energy resolution measurement is necessarily based on a smaller dataset than the one presented in this paper, allowing the two measurements to converge simultaneously, and as a result the statistics were lower and the single-particle uncertainty became dominant at a lower \(p_{{\text {T}}} ^{\text {jet}}\) value than in Fig. 22a. Additional systematic uncertainties are estimated by varying the analysis cuts and the JVT selection and by comparing the result with one obtained from an alternative MC generator (Sherpa 2.1.1).

Fig. 26

a Relative jet energy resolution and b absolute uncertainty in the relative resolution as a function of \(p_{{\text {T}}}\) for PFlow jets in the central region of the detector, measured using the dijet balance method. The resolution in data is shown in black points with error bars indicating statistical uncertainties; the resolution in detector-level simulated events is shown by the blue curve with total systematic uncertainty given by the blue band. The systematic uncertainty is dominated by terms propagated from the JES uncertainty, while additional terms arise from the analysis selection, pile-up rejection (JVT), physics modelling (comparison with alternative generator), and non-closure effects. The bump in uncertainty around 800 \({\text {Ge}}{\text {V}}\) comes from the single-particle uncertainty

Fig. 27

a The difference in the random cone sums, \(\Delta p_{{\text {T}}} ^{\text {RC}}\), measured in the central region (\(|\eta _{\text {det}} |<0.7\)) in randomly triggered data using PFlow objects. b Comparison between the pile-up noise term \(N^{\text {PU}}\) determined using the random cone method (black solid circles) and the expectation from MC simulation (orange squares) as extracted from the difference in quadrature of MC simulation with (red downward triangles) and without (blue upward triangles) pile-up. Results are shown at the PFlow+JES energy scale for jets in the central region of the detector (\(|\eta _{\text {det}} |<0.7\))

6.2 Noise measurement using random cones

Direct estimates of the noise term of Eq. (4) are obtained by measuring the fluctuations in the energy deposits due to pile-up using data samples that are collected by random unbiased triggers. These measurements are performed using the random cones method in which energy deposits in the calorimeter are summed at the energy scale of the constituents in circular areas analogous to the jet area for anti-\(k_{t}\) \(R=0.4\) jets. This approach is adopted due to its ability to account for any non-Gaussian behaviour of the noise contributions to the JER. Two such random cone sums, \(p_{{\text {T}}} ^{\text {c1}}\) and \(p_{{\text {T}}} ^{\text {c2}}\), are obtained at random \(\phi \) values and within opposite \(\pm \Delta \eta \) regions and the difference between them, \(\Delta p_{{\text {T}}} ^{\text {RC}}\), provides a measure of the random fluctuations of deposited energy. Multiple non-overlapping cones are selected within each event to maximize statistical power; this is demonstrated to cause no bias in the overall result. This random cone balance is given by

$$\begin{aligned} \Delta p_{{\text {T}}} ^{\text {RC}} = p_{{\text {T}}} ^{\text {c1}}- p_{{\text {T}}} ^{\text {c2}}, \end{aligned}$$

and the pile-up noise is estimated from the width of the central 68% confidence interval of the \(\Delta p_{{\text {T}}} ^{\text {RC}}\) distribution, denoted \(\sigma _{\text {RC}}\), which is sampled over many events as a function of both \(\eta \) and the pile-up level, as indexed by \(\mu \). Specifically, the noise term due to pile-up, \(N^{\text {PU}}\), is determined as

$$\begin{aligned} N^{\text {PU}} = \frac{\sigma _{\text {RC}}}{2\sqrt{2}}\,, \end{aligned}$$
(7)

where the width of the distribution is divided by 2 to obtain its half-width, and by \(\sqrt{2}\) to obtain the fluctuations corresponding to just a single random cone. The distribution of \(\Delta p_{{\text {T}}} ^{\text {RC}}\) is shown in Fig. 27a. The random cone method has been updated in two ways since its initial description in Ref. [6]: the restriction to a single pair of back-to-back cones is removed, since it was found to have no effect on the result, and multiple non-overlapping random cone pairs are taken per event to maximize statistical power.
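The extraction of \(N^{\text {PU}}\) from Eq. (7) can be illustrated with a minimal sketch. It assumes a toy sample in which each cone sum fluctuates as a Gaussian with a width of 3 GeV; the function names, the quantile convention and the toy values are illustrative choices and are not taken from the analysis code.

```python
import math
import random


def central_interval_width(values, coverage=0.68):
    """Full width of the central interval containing `coverage` of the values."""
    ordered = sorted(values)
    n = len(ordered)
    tail = (1.0 - coverage) / 2.0
    lower = ordered[int(math.floor(tail * (n - 1)))]
    upper = ordered[int(math.ceil((1.0 - tail) * (n - 1)))]
    return upper - lower


def pileup_noise(delta_pt_rc, coverage=0.68):
    """N^PU = sigma_RC / (2 * sqrt(2)), following Eq. (7)."""
    sigma_rc = central_interval_width(delta_pt_rc, coverage)
    return sigma_rc / (2.0 * math.sqrt(2.0))


# Toy input (assumption): each cone sum has Gaussian fluctuations of 3 GeV.
rng = random.Random(12345)
true_noise = 3.0
delta_pt_rc = [
    rng.gauss(0.0, true_noise) - rng.gauss(0.0, true_noise)
    for _ in range(50_000)
]
n_pu = pileup_noise(delta_pt_rc)
```

For purely Gaussian fluctuations the result is close to the injected single-cone width, which is the motivation for the factors of 2 and \(\sqrt{2}\). The quantile-based width remains well defined when the distribution has non-Gaussian tails.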

The energy scale of the noise estimated by \(N^{\text {PU}}\) in Eq. (7) is the constituent energy scale and not that of the jets measured in Sect. 6.1. In order to compare the measurement of the noise term \(N^{\text {PU}}\) using the random cone method with the JER measured at the fully calibrated scale (e.g. PFlow+JES) a conversion factor is required. The nominal JES calibration factor is used to perform this conversion to the appropriate energy scale. The result is an estimate of the noise due to pile-up that may be directly compared with the measured JER.

A closure test of the random cone measurements is performed by comparing the in situ measurement of the calibrated \(N^{\text {PU}}\) with the expectation from MC simulation. Results are reported here for PFlow jets. To isolate the contribution to the JER from pile-up noise in the MC simulation, the JER is determined in simulated events both with and without pile-up and a subtraction in quadrature is performed between the extracted resolutions. The two JER determinations in MC simulation events with and without pile-up are shown in Fig. 27b and their quadratic difference is compared directly to the in situ measurement from the random cones method. Each is fitted, as shown by the dotted lines in Fig. 27b: the random cone measurement is fitted with \(N/p_{{\text {T}}} \) while the quadratic difference is fitted with \(N/p_{{\text {T}}} \oplus S/\sqrt{p_{{\text {T}}}}\) to account for non-negligible stochastic contributions. The non-closure of the method is largely due to the differences in topo-cluster formation sensitivity to pile-up and electronic noise in the presence versus absence of hard-scatter particles, and is taken as a systematic uncertainty in the result. This non-closure uncertainty is the dominant uncertainty in the JER noise term, ranging from approximately 17% in the most central region to 75% in the endcap transition region (\(2.5< |\eta | < 3.2\)).
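The subtraction in quadrature used to isolate the pile-up contribution in simulation can be sketched as follows. The resolution values are placeholders chosen for illustration only and do not correspond to the measured results.

```python
import math


def quadrature_difference(with_pileup, without_pileup):
    """Return sqrt(a^2 - b^2): the part of `with_pileup` not explained by `without_pileup`."""
    diff_sq = with_pileup ** 2 - without_pileup ** 2
    if diff_sq < 0.0:
        raise ValueError("resolution without pile-up exceeds resolution with pile-up")
    return math.sqrt(diff_sq)


# Placeholder relative resolutions in three pT bins (illustrative values only).
jer_with_pileup = [0.25, 0.17, 0.13]
jer_without_pileup = [0.20, 0.15, 0.12]

pileup_contribution = [
    quadrature_difference(a, b)
    for a, b in zip(jer_with_pileup, jer_without_pileup)
]
```

The resulting per-bin values play the role of the simulation expectation that is compared with the random cone measurement in the closure test.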

Fig. 28

Noise term due to pile-up estimated using the random cone method and its uncertainties as a function of \(\eta \). The dominant uncertainty is due to non-closure in the method. Additional uncertainties address the \(\sigma _{\text {RC}}\) definition, the JER conversion factor, the differences in JER between data and MC simulation, and the fit stability in extracting the \(\mu =0\) noise term. The \(\sigma _{\text {RC}}\) definition uncertainty and \(\mu =0\) MC vs data terms are asymmetric in their upwards and downwards components, while all other uncertainties are symmetrized

The total noise contribution to the JER includes not just pile-up but also electronic noise, to which the random cones are not sensitive due to the topo-clustering process. To estimate this electronic contribution, a fit is performed to the JER measured in a dedicated MC simulation sample with \(\mu =0\) and the electronic noise term is extracted as \(N^{\mu =0}\). The total noise term used in the JER combination is therefore taken to be \(N = N^{\text {PU}} \oplus N^{\mu =0} \) and is shown as a function of \(\eta \) in Fig. 28 along with its systematic uncertainties. The dominant systematic uncertainty in the random cone measurement of \(N^{\text {PU}}\) is the previously discussed non-closure uncertainty, but additional terms arise from varying the quantile of the confidence interval used to extract \(\sigma _{\text {RC}}\) and from using a different estimate of the conversion factor to the calibrated jet energy scale. Two systematic uncertainties apply to \(N^{\mu =0}\): a 20% relative uncertainty conservatively estimating the differences in JER between data and MC simulation and an uncertainty due to the fit parameterization and stability. The systematic uncertainties enter the combined JER fit unsymmetrized in \(\eta \) but are symmetrized during the statistical combination, and so the one-sided components are symmetrized in Fig. 28 to illustrate their final contribution to the total uncertainty.

Fig. 29

a The relative jet energy resolution as a function of \(p_{{\text {T}}}\) for fully calibrated PFlow+JES jets. The error bars on points indicate the total uncertainties on the derivation of the relative resolution in dijet events, adding in quadrature statistical and systematic components. The expectation from Monte Carlo simulation is compared with the relative resolution as evaluated in data through the combination of the dijet balance and random cone techniques. b Absolute uncertainty on the relative jet energy resolution as a function of jet \(p_{{\text {T}}}\). Uncertainties from the two in situ measurements and from the data/MC simulation difference are shown separately

6.3 Combination of in situ jet energy resolution

A combined measurement of the JER is obtained by performing a fit to the dijet balance measurements (Sect. 6.1) using a constraint on the noise term (\(N\)) derived from the random cones measurement and the \(\mu =0\) simulation sample (Sect. 6.2). This statistical combination is implemented in a manner nearly identical to that for the JES (Sect. 5.2.5), propagating uncertainties from the dijet measurement in the same way and using a similar eigenvalue decomposition to reduce the final number of nuisance parameters.

Instead of using polynomial splines to interpolate across \(p_{{\text {T}}} ^{\text {jet}}\), the JER combination uses the functional form from Eq. (4). A fit to the dijet measurement data is performed, fixing the noise term to the value measured by the random cone analysis. Dijet measurement uncertainties are taken to be fully correlated between \(\eta \) bins. Uncertainties due to the random cones measurements are determined by propagating the noise term uncertainties and repeating the fit with different values of \(N\). These uncertainties are taken to be decorrelated between central (\(|\eta | < 2.5\)) and forward (\(|\eta | > 2.5\)) regions.
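A fit with a fixed noise term can be sketched under the assumption that Eq. (4) takes the standard form \(\sigma /p_{{\text {T}}} = N/p_{{\text {T}}} \oplus S/\sqrt{p_{{\text {T}}}} \oplus C\). With \(N\) fixed, the squared resolution is linear in \(1/p_{{\text {T}}}\), so an unweighted straight-line fit returns \(S\) and \(C\). This is a simplification: the actual combination uses the measurement uncertainties and nuisance parameters, and all names and toy values below are hypothetical.

```python
import math


def jer_model(pt, noise, stochastic, constant):
    """Relative resolution N/pT (+) S/sqrt(pT) (+) C, summed in quadrature."""
    return math.sqrt((noise / pt) ** 2 + stochastic ** 2 / pt + constant ** 2)


def fit_stochastic_constant(pts, resolutions, noise):
    """Fit S and C with the noise term held fixed.

    With N fixed, res^2 - (N/pT)^2 = S^2 * (1/pT) + C^2 is a straight line
    in x = 1/pT, solved here by unweighted least squares.
    """
    xs = [1.0 / pt for pt in pts]
    ys = [res ** 2 - (noise / pt) ** 2 for pt, res in zip(pts, resolutions)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Clamp at zero so that small negative fluctuations do not break the sqrt.
    return math.sqrt(max(slope, 0.0)), math.sqrt(max(intercept, 0.0))


# Toy pseudo-measurements generated from assumed parameter values.
pts = [30.0, 50.0, 100.0, 200.0, 500.0, 1000.0]
true_n, true_s, true_c = 3.0, 0.7, 0.03
resolutions = [jer_model(pt, true_n, true_s, true_c) for pt in pts]

s_fit, c_fit = fit_stochastic_constant(pts, resolutions, noise=true_n)
```

Repeating the call with `noise` shifted up or down by its uncertainty and comparing the fitted curves mirrors, in a simplified way, the propagation of the noise term uncertainties described above.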

The resulting combined measurement of the JER for PFlow+JES jets is shown in Fig. 29a. The dijet measurement data points are shown along with the total in situ combination, while the constraint on the noise term derived from random cones and included in that combination is demonstrated by plotting \(N/p_{{\text {T}}} \) and its uncertainties as a separate curve for illustrative purposes. Figure 29b shows the absolute uncertainties on the combined JER measurement. For each value of \(p_{{\text {T}}} ^{\text {jet}}\) and \(\eta _{\text {det}}\) a toy jet is created and the size of each JER nuisance parameter corresponding to it is retrieved and plotted.

Comparisons of the JER measurements for PFlow+JES and EM+JES jets, as a function of both \(p_{{\text {T}}} ^{\text {jet}} \) and \(\eta \), are provided in Fig. 30. The fit to the resolution as a function of \(p_{{\text {T}}}\) for the PFlow+JES jets shows an improvement in resolution over EM+JES jets at low \(p_{{\text {T}}}\).

Fig. 30

The relative jet energy resolution for fully calibrated PFlow+JES jets (blue curve) and EM+JES jets (green curve) a as a function of \(p_{{\text {T}}} ^{\text {jet}}\) and b as a function of \(\eta \). The fit to the resolution as a function of \(p_{{\text {T}}} ^{\text {jet}}\) for the PFlow+JES jets shows an improvement in resolution over EM+JES jets at low-\(p_{{\text {T}}}\)

Figure 31 shows the total JER uncertainty in EMtopo and PFlow jets for a range of \(p_{{\text {T}}}\) values at fixed \(\eta =0.2\) and for a range of \(\eta \) values at fixed \(p_{{\text {T}}} = 30~{\text {Ge}}{\text {V}}\). The level of agreement is representative of other \(p_{{\text {T}}} \) and \(\eta \) ranges.

Fig. 31

Fractional jet energy resolution systematic uncertainty summed across all components for anti-\(k_{t}\) \(R=0.4\) jets a as a function of jet \(p_{{\text {T}}}\) at \(\eta = 0.2\) and b as a function of \(\eta \) at \(p_{{\text {T}}} = 30~{\text {Ge}}{\text {V}}\). The total JER uncertainty is shown for both EM+JES and PFlow+JES jets

6.4 Application of JER and its systematic uncertainties

In order to ensure that the jet energy resolution in simulation matches that in data wherever possible, a smearing procedure is recommended. For regions of jet \(p_{{\text {T}}}\) in which the resolution in data is larger than in MC simulation, the simulation sample should be smeared until its average resolution matches that of data. In regions of jet \(p_{{\text {T}}}\) where the resolution is smaller in data than in MC simulation, no smearing is performed, since the data should remain unaltered.
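A minimal sketch of this nominal smearing is given below. It assumes that the additional width is the quadrature difference between the data and simulation resolutions and that the smearing is a multiplicative Gaussian applied to the jet \(p_{{\text {T}}}\); both choices, and the function names, are illustrative assumptions.

```python
import math
import random


def nominal_smearing_width(sigma_data, sigma_mc):
    """Extra relative width needed for the simulation to match data.

    Returns 0 when the resolution in data is not larger than in simulation,
    in which case no smearing is applied.
    """
    if sigma_data <= sigma_mc:
        return 0.0
    return math.sqrt(sigma_data ** 2 - sigma_mc ** 2)


def smear_pt(pt, relative_width, rng):
    """Scale pT by a Gaussian random factor with mean 1 and the given width."""
    if relative_width == 0.0:
        return pt
    return pt * rng.gauss(1.0, relative_width)


rng = random.Random(2017)
width = nominal_smearing_width(sigma_data=0.13, sigma_mc=0.12)
smeared_pt = smear_pt(50.0, width, rng)
```

With these inputs the simulation is smeared by a relative width of 0.05, so that its total resolution \(0.12 \oplus 0.05\) equals the 0.13 assumed for data.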

JER systematic uncertainties are propagated through physics analyses by smearing jets according to a Gaussian function with width \(\sigma _{\text {smear}}\). If \(\sigma _{\text {nom}}\) is the nominal JER of the sample, after MC simulation smearing if necessary, and \(\sigma _{\text {NP}}\) is the one-standard-deviation variation in the uncertainty component to be evaluated, then:

$$\begin{aligned} \sigma _{\text {smear}}^2 = (\sigma _{\text {nom}}+|\sigma _{\text {NP}}|)^2 - \sigma _{\text {nom}}^2\,. \end{aligned}$$
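The smearing width defined above can be evaluated directly. The sketch below uses placeholder values for \(\sigma _{\text {nom}}\) and \(\sigma _{\text {NP}}\), and the function name is hypothetical.

```python
import math


def systematic_smearing_width(sigma_nom, sigma_np):
    """sigma_smear from sigma_smear^2 = (sigma_nom + |sigma_NP|)^2 - sigma_nom^2."""
    return math.sqrt((sigma_nom + abs(sigma_np)) ** 2 - sigma_nom ** 2)


# Placeholder values: nominal JER of 0.15 and a one-standard-deviation shift of 0.02.
sigma_smear = systematic_smearing_width(sigma_nom=0.15, sigma_np=0.02)

# Adding sigma_smear in quadrature to the nominal JER gives the varied JER.
varied_jer = math.sqrt(0.15 ** 2 + sigma_smear ** 2)
```

Here \(\sigma _{\text {smear}} = 0.08\), and adding it in quadrature to the nominal resolution reproduces \(\sigma _{\text {nom}}+|\sigma _{\text {NP}}| = 0.17\), which is the purpose of the definition.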

Application of JER systematic uncertainties must account for two factors: first, anti-correlations across a single uncertainty component, and second, differences in resolution between data and MC simulation.

Anti-correlation becomes an issue when a single JER component is positive in some regions of phase space and negative in others. To propagate such systematic uncertainties to analyses, smearing should be applied to the simulation when \(\sigma _{\text {NP}} > 0\) and applied to the data when \(\sigma _{\text {NP}}<0\). It should be noted that the nominal data remain unchanged, as this applies only to the application of systematic uncertainties. In the case that data statistics are too low to safely smear, pseudo-data may be smeared instead.
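The choice of which sample to smear for a given component can be written as a small decision function. This is a sketch only: the function name, the string labels and the boolean flag for sufficient data statistics are hypothetical.

```python
def smearing_target(sigma_np, data_stats_sufficient=True):
    """Choose the sample to smear when evaluating one JER uncertainty component.

    Positive components smear the simulation; negative components smear the
    data, or pseudo-data when the data sample is too small to smear safely.
    """
    if sigma_np > 0:
        return "simulation"
    if sigma_np < 0:
        return "data" if data_stats_sufficient else "pseudo-data"
    return "none"


# A component that changes sign across phase space selects a different sample per region.
targets = [smearing_target(value) for value in (0.02, -0.01, 0.0)]
```

In every case the smearing is applied to copies used for the systematic variation, leaving the nominal data untouched.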

Differences in resolution between data and MC simulation are already accounted for by the application of additional smearing to the simulation when the resolution in simulation is better than in data. When the JER is smaller in data, this difference is accounted for by applying its full value as an additional systematic uncertainty:

$$\begin{aligned} \sigma _{\text {NP}} = \sigma _{\text {nom}}^{\text {data}} - \sigma _{\text {nom}}^{\text {MC}}\,. \end{aligned}$$

This term is defined by the dijet asymmetry measurements of Sect. 6.1 and is zero for the central \(\eta \) slice shown in Fig. 29b, but for some \(p_{{\text {T}}}\) ranges in more forward detector regions it can be significant. A large value of this uncertainty for PFlow jets at \(\eta \sim 3.2\) is the source of the peaks visible in Fig. 31b.

7 Conclusions

The calibration of the jet energy scale and resolution for jets reconstructed with the anti-\(k_{t}\) algorithm with radius parameter \(R=0.4\) is presented. Jets are built from either the energy deposits that form topological clusters of calorimeter cells or a combination of charged-particle tracks and topological clusters. The measurements discussed here use 36–81 \({\hbox {fb}}^{-1}\) of data recorded with the ATLAS detector during 2015–2017 in \(pp\) collisions at a centre-of-mass energy of 13 \({\text {Te}}{\text {V}}\) at the Large Hadron Collider. It is the first full calibration of PFlow jets performed by the ATLAS collaboration, the first jet energy scale measurement in the high pile-up conditions of late Run 2 data-taking, and the first jet energy resolution measurement in 13 \({\text {Te}}{\text {V}}\) data.

A sequence of simulation-based corrections removes the contribution to the jet energy from additional proton–proton interactions in the same or nearby bunch crossings, corrects the jet so that it agrees in energy and direction with particle-level jets, and improves the jet energy resolution. Any remaining difference between simulation and data is removed with in situ techniques using well-measured reference objects, including photons, \(Z\) bosons, and other jets, such that the energy scale of fully calibrated jets is unity within uncertainties. The jet energy resolution is measured in a dijet balance system, and the contribution to the resolution from the noise term due to pile-up and electronics is also measured. The relative jet energy resolution ranges from 0.25 (0.35) to 0.04 for PFlow (EMtopo) jets as a function of jet \(p_{{\text {T}}}\).

Systematic uncertainties in the jet energy scale for central jets (\(|\eta |<1.2\)) vary from 1% for a large range of high-\(p_{{\text {T}}}\) jets (\(250<p_{{\text {T}}} <2000~{\text {Ge}}{\text {V}}\)), to 5% at very low \(p_{{\text {T}}}\) (\(20~{\text {Ge}}{\text {V}}\)) and 3.5% at very high \(p_{{\text {T}}}\) (\(>2.5~{\text {Te}}{\text {V}}\)). The absolute uncertainty on the relative jet energy resolution is found to be 1.5% at 20 \({\text {Ge}}{\text {V}}\), decreasing to 0.5% at 300 \({\text {Ge}}{\text {V}}\).