1 Introduction

The mass of the top quark \(m_{\mathrm {top}}\) is an important parameter of the Standard Model (SM). Precise measurements of \(m_{\mathrm {top}}\) provide crucial information for global fits of electroweak parameters [1,2,3] which help to assess the internal consistency of the SM and probe its extensions. In addition, the value of \(m_{\mathrm {top}}\) affects the stability of the SM Higgs potential, which has cosmological implications [4,5,6].

Many measurements of \(m_{\mathrm {top}}\) in each \(t\bar{t}\) decay channel were performed by the Tevatron and LHC collaborations. The most precise measurements per experiment in the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) channel are \(m_{\mathrm {top}} =172.85\,\pm \,0.71\,\mathrm {(stat)}\,\pm \,0.84\,\mathrm {(syst)} \) \(\text {GeV}\) by CDF [7], \(m_{\mathrm {top}} =174.98\,\pm \,0.58\,\mathrm {(stat)}\,\pm \,0.49\,\mathrm {(syst)} \) \(\text {GeV}\) by D0 [8], \(m_{\mathrm {top}} =172.33\,\pm \,0.75\,\mathrm {(stat)}\,\pm \,1.03\,\mathrm {(syst)} \) \(\text {GeV}\) by ATLAS [9] and \(m_{\mathrm {top}} =172.35\,\pm \,0.16\,\mathrm {(stat)}\,\pm \,0.48\,\mathrm {(syst)} \) \(\text {GeV}\) by CMS [10]. Combinations are performed either by the individual experiments or by several Tevatron and LHC experiments [11]. In these combinations, selections of measurements from all \(t\bar{t}\) decay channels are used. The latest combinations per experiment are \(m_{\mathrm {top}} =173.16\,\pm \,0.57\,\mathrm {(stat)}\,\pm \,0.74\,\mathrm {(syst)} \) \(\text {GeV}\) by CDF [12], \(m_{\mathrm {top}} =174.95\,\pm \,0.40\,\mathrm {(stat)}\,\pm \,0.64\,\mathrm {(syst)} \) \(\text {GeV}\) by D0 [13], \(m_{\mathrm {top}} =172.84\,\pm \,0.34\,\mathrm {(stat)}\,\pm \,0.61\,\mathrm {(syst)} \) \(\text {GeV}\) by ATLAS [14] and \(m_{\mathrm {top}} =172.44\,\pm \,0.13\,\mathrm {(stat)}\,\pm \,0.47\,\mathrm {(syst)} \) \(\text {GeV}\) by CMS [10].

In this paper, an ATLAS measurement of \(m_{\mathrm {top}}\) in the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) channel is presented. The result is obtained from \(pp\) collision data recorded in 2012 at a centre-of-mass energy of \({\sqrt{s}} =8\) \(\text {TeV}\) with an integrated luminosity of about \(20.2 \) \(\mathrm {fb}^{-1} \). The analysis exploits the decay \(t\bar{t} \rightarrow W^{+} W^{-} b\bar{b} \rightarrow \ell \nu q\bar{q}^\prime b\bar{b} \), which occurs when one \(W\) boson decays into a charged lepton (\(\ell \) is e or \(\mu \) including \(\tau \rightarrow e,\mu \) decays) and a neutrino (\(\nu \)), and the other into a pair of quarks. In the analysis presented here, \(m_{\mathrm {top}}\) is obtained from the combined sample of events selected in the electron+jets and muon+jets final states. Single-top-quark events with the same reconstructed final states contain information about the top quark mass and are therefore included as signal events.

The measurement uses a template method, where simulated distributions are constructed for a chosen quantity sensitive to the physics parameter under study using a number of discrete values of that parameter. These templates are fitted to functions that interpolate between different input values of the physics parameter while fixing all other parameters of the functions. In the final step, an unbinned likelihood fit to the observed data distribution is used to obtain the value of the physics parameter that best describes the data. In this procedure, the experimental distributions are constructed such that fits to them yield unbiased estimators of the physics parameter used as input in the signal Monte Carlo (MC) samples. Consequently, the top quark mass determined in this way corresponds to the mass definition used in the MC simulation. Because of various steps in the event simulation, the mass measured in this way does not necessarily directly coincide with mass definitions within a given renormalization scheme, e.g. the top quark pole mass. Evaluating these differences is a topic of theoretical investigations [15,16,17,18,19].
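The logic of the template method can be illustrated with a deliberately simplified toy. The sketch below is not the analysis code: it assumes a single observable whose template is a Gaussian with a mean that depends linearly on the physics parameter, and all constants (`SIGMA`, `SLOPE`, `REFERENCE`), function names and the grid scan are hypothetical choices made for illustration only.

```python
import math
import random

SIGMA = 10.0       # assumed resolution of the toy observable
SLOPE = 0.8        # assumed sensitivity of the template mean to the parameter
REFERENCE = 172.5  # central value of the toy parameterization

def template_mean(m):
    # Interpolating function: the template mean depends linearly on m.
    return REFERENCE + SLOPE * (m - REFERENCE)

def negative_log_likelihood(data, m):
    # Unbinned likelihood: one Gaussian PDF evaluation per event.
    mean = template_mean(m)
    norm = math.log(SIGMA * math.sqrt(2.0 * math.pi))
    return sum(0.5 * ((x - mean) / SIGMA) ** 2 + norm for x in data)

def fit_parameter(data, lo=165.0, hi=180.0, step=0.05):
    # Plain grid scan; a real analysis would use a numerical minimizer.
    n_steps = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n_steps + 1)]
    return min(grid, key=lambda m: negative_log_likelihood(data, m))

# Pseudo-data generated at a known input value of the parameter.
rng = random.Random(1)
true_m = 174.0
pseudo_data = [rng.gauss(template_mean(true_m), SIGMA) for _ in range(2000)]
fitted_m = fit_parameter(pseudo_data)
```

Because the pseudo-data are drawn from the same parameterization that is fitted, the estimator returns the input value within statistical precision, which mirrors the statement that the fitted mass corresponds to the mass definition of the simulation.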

The measurement exploits the three-dimensional template fit technique presented in Ref. [9]. To reduce the uncertainty in \(m_{\mathrm {top}}\) stemming from the uncertainties in the jet energy scale (\(\mathrm {JES}\)) and the additional \(b\text {-jet}\) energy scale (\(\mathrm {bJES}\)), \(m_{\mathrm {top}}\) is measured together with the jet energy scale factor (\(\mathrm {JSF}\)) and the relative b-to-light-jet energy scale factor (\(\mathrm {bJSF}\)). Because the data sample is larger than that used in Ref. [9], the analysis is optimized to reject combinatorial background arising from incorrect matching of the observed jets to the daughters of the top quark decays, thereby achieving a better balance of the statistical and systematic uncertainties and reducing the total uncertainty. Given this new measurement, an update of the ATLAS combination of \(m_{\mathrm {top}}\) measurements is also presented.

This document is organized as follows. After a short description of the ATLAS detector in Sect. 2, the data and simulation samples are discussed in Sect. 3. Details of the event selection are given in Sect. 4, followed by the description of the reconstruction of the three observables used in the template fit in Sect. 5. The optimization of the event selection using a multivariate analysis approach is presented in Sect. 6. The template fits are introduced in Sect. 7. The evaluation of the systematic uncertainties and their statistical uncertainties is discussed in Sect. 8, and the measurement of \(m_{\mathrm {top}}\) is given in Sect. 9. The combination of this measurement with previous ATLAS results is discussed in Sect. 10 and compared with measurements of other experiments. The summary and conclusions are given in Sect. 11. Additional information about the optimization of the event selection and on specific uncertainties in the new measurement of \(m_{\mathrm {top}}\) in the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) channel is given in Appendix A, while Appendix B contains information about various combinations performed, together with comparisons with results from other experiments.

2 The ATLAS experiment

The ATLAS experiment [20] at the LHC is a multipurpose particle detector with a forward–backward symmetric cylindrical geometry and nearly \(4\pi \) coverage in solid angle (see Footnote 1). It consists of an inner tracking detector surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, electromagnetic and hadronic calorimeters, and a muon spectrometer. The inner tracking detector covers the pseudorapidity range \(|\eta | < 2.5\). It consists of silicon pixel, silicon microstrip, and transition radiation tracking detectors. Lead/liquid-argon (LAr) sampling calorimeters provide electromagnetic (EM) energy measurements with high granularity. A hadronic (steel/scintillator-tile) calorimeter covers the central pseudorapidity range (\(|\eta | < 1.7\)). The endcap and forward regions are instrumented with LAr calorimeters for both the EM and hadronic energy measurements up to \(|\eta | = 4.9\). The muon spectrometer surrounds the calorimeters and is based on three large air-core toroid superconducting magnets with eight coils each. Its bending power is 2.0 to 7.5 T m. It includes a system of precision tracking chambers and fast detectors for triggering.

A three-level trigger system was used to select events. The first-level trigger was implemented in hardware and used a subset of the detector information to reduce the accepted rate to at most 75 kHz. This was followed by two software-based trigger levels that together reduced the accepted event rate to 400 Hz on average, depending on the data-taking conditions during 2012.

3 Data and simulation samples

The analysis is based on \(pp\) collision data recorded by the ATLAS detector in 2012 at a centre-of-mass energy of \({\sqrt{s}} =8\) \(\text {TeV}\). The integrated luminosity is \(20.2 \) \(\mathrm {fb}^{-1} \) with an uncertainty of \(1.9\%\) [21]. The modelling of top quark pair (\(t\bar{t}\)) and single-top-quark signal events, as well as most background processes, relies on MC simulations. For the simulation of \(t\bar{t}\) and single-top-quark events, the Powheg-Box v1 [22,23,24] program was used. Within this framework, the simulations of the \(t\bar{t}\) [25] and single-top-quark production in the s- and t-channels [26] and the Wt-channel [27] used matrix elements at next-to-leading order (NLO) in the strong coupling constant \(\alpha _{\text {S}}\) with the NLO CT10 [28] parton distribution function (PDF) set and the \(h_{\mathrm {damp}}\) parameter (see Footnote 2) set to infinity. Using \(m_{\mathrm {top}}\) and the top quark transverse momentum \(p_{\text {T}}\) for the underlying leading-order Feynman diagram, the dynamic factorization and renormalization scales were set to \(\sqrt{m_{\mathrm {top}} ^2 + p_{\text {T}} ^2}\). The Pythia (v6.425) program [29] with the P2011C [30] set of tuned parameters (tune) and the corresponding CTEQ6L1 PDFs [31] provided the parton shower, hadronization and underlying-event modelling.

For \(m_{\mathrm {top}}\) hypothesis testing, the \(t\bar{t}\) and single-top-quark event samples were generated with five different assumed values of \(m_{\mathrm {top}}\) in the range from 167.5 to 177.5 \(\text {GeV}\) in steps of 2.5 \(\text {GeV}\). The integrated luminosity of the simulated \(t\bar{t}\) sample with \(m_{\mathrm {top}} =172.5\) \(\text {GeV}\) is about 360 \(\mathrm {fb}^{-1} \). Each of these MC samples is normalized according to the best available cross-section calculations. For \(m_{\mathrm {top}} =172.5\) \(\text {GeV}\), the \(t\bar{t}\) cross-section is \(\sigma _{t\bar{t}}=253^{+13}_{-15}\) \(\mathrm {pb} \), calculated at next-to-next-to-leading order (NNLO) with next-to-next-to-leading logarithmic soft gluon terms [32,33,34,35,36] with the Top++ 2.0 program [37]. The PDF- and \(\alpha _{\text {S}}\)-induced uncertainties in this cross-section were calculated using the PDF4LHC prescription [38] with the MSTW2008 \(68\%\) CL NNLO PDF [39, 40], CT10 NNLO PDF [28, 41] and NNPDF2.3 5f FFN PDF [42] and were added in quadrature with the uncertainties obtained from the variation of the factorization and renormalization scales by factors of 0.5 and 2.0. The cross-sections for single-top-quark production were calculated at NLO and are \(\sigma _\mathrm {t}=87.8\,^{+3.4}_{-1.9}\) \(\mathrm {pb} \) [43], \(\sigma _{Wt}=22.4\,\pm \,1.5\) \(\mathrm {pb} \) [44] and \(\sigma _\mathrm {s}=5.6\,\pm \,0.2\) \(\mathrm {pb} \) [45] in the t-, the Wt- and the s-channels, respectively.
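As a worked illustration of the normalization, the numbers quoted above can be combined in a few lines. The sketch below only uses the \(t\bar{t}\) cross-section, the data luminosity and the approximate luminosity of the simulated sample given in the text; the function names are hypothetical, and the yield is the number of produced events before any branching fraction, acceptance or selection efficiency.

```python
FB_TO_PB = 1000.0  # 1 fb^-1 corresponds to 1000 pb^-1

def expected_events(cross_section_pb, luminosity_fb):
    # Number of produced events: N = sigma * integrated luminosity.
    return cross_section_pb * luminosity_fb * FB_TO_PB

def mc_event_weight(data_luminosity_fb, mc_luminosity_fb):
    # Per-event weight that scales a simulated sample to the data luminosity.
    return data_luminosity_fb / mc_luminosity_fb

n_ttbar = expected_events(253.0, 20.2)   # about 5.1 million produced pairs
weight = mc_event_weight(20.2, 360.0)    # about 0.056 per simulated event
```

A weight well below one reflects that the simulated sample is much larger than the data sample, which keeps the statistical uncertainty of the templates small.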

The Alpgen (v2.13) program [46] interfaced to the Pythia6 program was used for the simulation of the production of \(W^{\pm }\) or \(Z\) bosons in association with jets. The CTEQ6L1 PDFs and the corresponding AUET2 tune [47] were used for the matrix element and parton shower settings. The \(W\)+jets and \(Z\)+jets events containing heavy-flavour (HF) quarks (Wbb+jets, Zbb+jets, Wcc+jets, Zcc+jets, and Wc+jets) were generated separately using leading-order (LO) matrix elements with massive bottom and charm quarks. Double-counting of HF quarks in the matrix element and the parton shower evolution was avoided via a HF overlap-removal procedure that used the \(\Delta R\) between the additional heavy quarks as the criterion. If the \(\Delta R\) was smaller than 0.4, the parton shower prediction was taken, while for larger values, the matrix element prediction was used. The \(Z\)+jets sample is normalized to the inclusive NNLO calculation [48]. Due to the large uncertainties in the overall \(W\)+jets normalization and the flavour composition, both are estimated using data-driven techniques as described in Sect. 4.2. Diboson production processes (WW, WZ and ZZ) were simulated using the Alpgen program with CTEQ6L1 PDFs interfaced to the Herwig (v6.520) [49] and Jimmy (v4.31) [50] programs. The samples are normalized to their predicted cross-sections at NLO [51].
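The HF overlap-removal criterion can be sketched as a simple decision on the angular distance between the two additional heavy quarks. The code below is an illustration under stated assumptions, not the procedure's actual implementation: the string labels for the origin of the quark pair are hypothetical, and the treatment of a pair at exactly \(\Delta R = 0.4\) is a choice made here because the text does not specify it.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    # Angular distance, with the azimuthal difference wrapped into [-pi, pi].
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)

def keep_heavy_quark_pair(source, dr, threshold=0.4):
    # source: "parton_shower" or "matrix_element" (hypothetical labels).
    # Below the threshold the parton shower prediction is kept,
    # otherwise the matrix element prediction is kept.
    if dr < threshold:
        return source == "parton_shower"
    return source == "matrix_element"
```

With this rule each region of phase space is populated by exactly one of the two predictions, which is what prevents the double-counting.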

All samples were simulated taking into account the effects of multiple soft \(pp\) interactions (pile-up) that are present in the 2012 data. These interactions were modelled by overlaying simulated hits from events with exactly one inelastic collision per bunch crossing with hits from minimum-bias events produced with the Pythia (v8.160) program [52] using the A2 tune [53] and the MSTW2008 LO PDF. The number of additional interactions is Poisson-distributed around the mean number of inelastic \(pp\) interactions per bunch crossing \(\mu \). For a given simulated hard-scatter event, the value of \(\mu \) depends on the instantaneous luminosity and the inelastic \(pp\) cross-section, taken to be 73 mb [21]. Finally, the simulation sample is reweighted to match the pile-up observed in data.
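The reweighting step can be pictured as a ratio of normalized distributions. The sketch below assumes that the pile-up conditions are summarized by a binned \(\mu \) distribution in data and in simulation; the binning, the function name and the handling of empty simulation bins are assumptions made for illustration, and the actual reweighting tool may differ.

```python
def pileup_weights(data_counts, mc_counts):
    # Per-bin weight: normalized data fraction divided by normalized MC fraction.
    data_total = float(sum(data_counts))
    mc_total = float(sum(mc_counts))
    weights = []
    for n_data, n_mc in zip(data_counts, mc_counts):
        if n_mc == 0:
            weights.append(0.0)  # no simulated events available in this bin
        else:
            weights.append((n_data / data_total) / (n_mc / mc_total))
    return weights

# Toy mu distributions in three bins (illustrative numbers only).
weights = pileup_weights([10, 30, 60], [50, 30, 20])
reweighted_mc = [w * n for w, n in zip(weights, [50, 30, 20])]
```

After applying the weights, the simulated \(\mu \) distribution has the same shape as the one observed in data while the total number of simulated events is preserved.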

A simulation [54] of the ATLAS detector response based on Geant4  [55] was performed on the MC events. This simulation is referred to as full simulation. The events were then processed through the same reconstruction software as the data. A number of samples used to assess systematic uncertainties were produced bypassing the highly computing-intensive full Geant4 simulation of the calorimeters. They were produced with a faster version of the simulation [56], which retained the full simulation of the tracking but used a parameterized calorimeter response based on resolution functions measured in full simulation samples. This simulation is referred to as fast simulation.

4 Object reconstruction, background estimation and event preselection

The reconstructed objects resulting from the top quark pair decay are electron and muon candidates, jets and missing transverse momentum (\(E_{\text {T}}^{\text {miss}}\)). In the simulated events, corrections are applied to these objects based on detailed data-to-simulation comparisons for many different processes, so as to match their performance in data.

4.1 Object reconstruction

Electron candidates [57] are required to have a transverse energy of \(E_{\text {T}} >25\) \(\text {GeV}\) and a pseudorapidity of the corresponding EM cluster of \(\vert \eta _\mathrm {cluster} \vert < 2.47\) with the transition region \(1.37<\vert \eta _\mathrm {cluster} \vert <1.52\) between the barrel and the endcap calorimeters excluded. Muon candidates [58] are required to have transverse momentum \(p_{\text {T}} >25\) \(\text {GeV}\) and \(\vert \eta \vert <2.5\). To reduce the contamination by leptons from HF decays inside jets or from photon conversions, referred to collectively as non-prompt (NP) leptons, strict isolation criteria are applied to the amount of activity in the vicinity of the lepton candidate [57,58,59].

Jets are built from topological clusters of calorimeter cells [60] with the anti-\(k_{t}\) jet clustering algorithm [61] using a radius parameter of \(R=0.4\). The clusters and jets are calibrated using the local cluster weighting (LCW) and the global sequential calibration (GSC) algorithms, respectively [62,63,64]. The subtraction of the contributions from pile-up is performed via the jet area method [65]. Jets are calibrated using an energy- and \(\eta \)-dependent simulation-based scheme with in situ corrections based on data [63]. Jets originating from pile-up interactions are identified via their jet vertex fraction (JVF), which is the \(p_{\text {T}}\) fraction of associated tracks stemming from the primary vertex. The requirement \(\mathrm {JVF}>0.5\) is applied solely to jets with \(p_{\text {T}} <50\) \(\text {GeV}\) and \(|\eta | <2.4\) [65]. Finally, jets are required to satisfy \(p_{\text {T}} >25\) \(\text {GeV}\) and \(|\eta | <2.5\).
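The jet requirements above combine a conditional pile-up rejection with a kinematic acceptance. A minimal sketch, assuming jets are described by their calibrated \(p_{\text {T}}\) in GeV, their \(\eta \) and their JVF value (the function name is hypothetical):

```python
def passes_jet_selection(pt, eta, jvf):
    # The JVF requirement applies only to jets with pT < 50 GeV and |eta| < 2.4.
    if pt < 50.0 and abs(eta) < 2.4 and not jvf > 0.5:
        return False
    # Kinematic acceptance applied to all jets.
    return pt > 25.0 and abs(eta) < 2.5
```

For example, a 30 GeV central jet with a JVF of 0.2 is rejected as pile-up, whereas a 60 GeV jet with the same JVF is kept because the requirement is not applied to it.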

Muons reconstructed within a \(\Delta R =0.4\) cone around the axis of a jet with \(p_{\text {T}} >25\) \(\text {GeV}\) are excluded from the analysis. In addition, the closest jet within a \(\Delta R =0.2\) cone around an electron candidate is removed, and then electrons within a \(\Delta R =0.4\) cone around any of the remaining jets are discarded.
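The order of the overlap-removal steps matters, so a short sketch may help. It assumes that each object is a dictionary with `pt`, `eta` and `phi` keys (a hypothetical data layout) and that the muon step is evaluated against the full jet collection before any jet is removed, which the text does not state explicitly.

```python
import math

def delta_r(a, b):
    # Angular distance, with the azimuthal difference wrapped into [-pi, pi].
    dphi = math.remainder(a["phi"] - b["phi"], 2.0 * math.pi)
    return math.hypot(a["eta"] - b["eta"], dphi)

def remove_overlaps(electrons, muons, jets):
    # Step 1: drop muons within dR < 0.4 of a jet with pT > 25 GeV.
    muons_kept = [m for m in muons
                  if not any(j["pt"] > 25.0 and delta_r(m, j) < 0.4 for j in jets)]
    # Step 2: for each electron, remove the closest jet within dR < 0.2.
    jets_kept = list(jets)
    for e in electrons:
        close = [j for j in jets_kept if delta_r(e, j) < 0.2]
        if close:
            closest = min(close, key=lambda j: delta_r(e, j))
            jets_kept = [j for j in jets_kept if j is not closest]
    # Step 3: drop electrons within dR < 0.4 of any remaining jet.
    electrons_kept = [e for e in electrons
                      if not any(delta_r(e, j) < 0.4 for j in jets_kept)]
    return electrons_kept, muons_kept, jets_kept
```

Removing the closest jet first avoids counting the calorimeter deposit of an electron as a jet, while the final step rejects electrons that are likely to originate from a nearby genuine jet.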

The identification of jets containing reconstructed b-hadrons, called \(b\text {-tagging}\), is used for event reconstruction and background suppression. In the following, irrespective of their origin, jets tagged by the \(b\text {-tagging}\) algorithm are referred to as \(b\text {-tagged}\) jets, whereas those not tagged are referred to as untagged jets. Similarly, whether they are tagged or not, jets containing b-hadrons in simulation are referred to as \(b\text {-jets}\) and those containing only lighter-flavour hadrons from udcs-quarks, or originating from gluons, are collectively referred to as light-jets. The working point of the neural-network-based MV1 \(b\text {-tagging}\) algorithm [66] corresponds to an average \(b\text {-tagging}\) efficiency of 70\(\%\) for \(b\text {-jets}\) in simulated \(t\bar{t}\) events and rejection factors of 5 for jets containing a c-hadron and 140 for jets containing only lighter-flavour hadrons. To match the \(b\text {-tagging}\) performance in the data, \(p_{\text {T}}\)- and \(\eta \)-dependent scale factors, obtained from dijet and \(t\bar{t} \rightarrow \mathrm {dilepton}\) events, are applied to MC jets depending on their generated quark flavour, as described in Refs. [66,67,68].

The missing transverse momentum \(E_{\text {T}}^{\text {miss}}\) is the absolute value of the vector \(\overrightarrow{E_\mathrm {T}}^\mathrm {miss} \) calculated from the negative vectorial sum of all transverse momenta. The vectorial sum takes into account all energy deposits in the calorimeters projected onto the transverse plane. The clusters are corrected using the calibrations that belong to the associated physics object. Muons are included in the calculation of the \(E_{\text {T}}^{\text {miss}}\) using their momentum reconstructed in the inner tracking detectors [69].
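The definition as a negative vectorial sum can be written compactly. The sketch below assumes that every contribution (calibrated cluster or muon) is represented by its transverse momentum and azimuthal angle; this input format and the function name are illustrative assumptions.

```python
import math

def missing_transverse_momentum(objects):
    # objects: iterable of (pt, phi) pairs for all contributions.
    # The missing transverse momentum is minus the vector sum in the transverse plane.
    px = -sum(pt * math.cos(phi) for pt, phi in objects)
    py = -sum(pt * math.sin(phi) for pt, phi in objects)
    return math.hypot(px, py), math.atan2(py, px)

# Two perpendicular deposits of 30 and 40 GeV give a magnitude of 50 GeV.
met, met_phi = missing_transverse_momentum([(30.0, 0.0), (40.0, math.pi / 2.0)])
```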

4.2 Background estimation

The contribution of events falsely reconstructed as \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) events, due to the presence of objects misidentified as leptons (fake leptons) and of NP leptons originating from HF decays, is estimated from data using the matrix method [70]. The technique uses \(\eta \)- and \(p_{\text {T}}\)-dependent efficiencies for NP/fake leptons and prompt leptons. They are measured in a background-enhanced control region with low \(E_{\text {T}}^{\text {miss}}\) and from events with dilepton masses around the \(Z\) boson peak [71], respectively. For the \(W\)+jets background, the overall normalization is estimated from data. The estimate is based on the charge-asymmetry method [72], relying on the fact that at the LHC more \(W^{+}\) than \(W^{-}\) bosons are produced. In addition, a data-driven estimate of the \(Wb\bar{b}\), \(Wc\bar{c}\), Wc and W+light-jet fractions is performed in events with exactly two jets and at least one b-tagged jet. Further details are given in Ref. [73]. The \(Z\)+jets and diboson background processes are normalized to their predicted cross-sections as described in Sect. 3.
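The idea behind the matrix method can be illustrated with its commonly used inclusive form. The sketch assumes the conventional formulation with a loose and a tight lepton selection, which is not spelled out in the text, and it uses single average efficiencies, whereas the analysis uses \(\eta \)- and \(p_{\text {T}}\)-dependent ones; the function and argument names are hypothetical.

```python
def fake_yield_in_tight(n_loose, n_tight, eff_real, eff_fake):
    # Solve the two-equation system
    #   n_loose = n_real + n_fake
    #   n_tight = eff_real * n_real + eff_fake * n_fake
    # for the NP/fake-lepton contribution to the tight sample, eff_fake * n_fake.
    if eff_real <= eff_fake:
        raise ValueError("eff_real must exceed eff_fake")
    return eff_fake / (eff_real - eff_fake) * (eff_real * n_loose - n_tight)

# Toy closure check: 900 prompt and 100 fake leptons in the loose sample.
n_fake_tight = fake_yield_in_tight(1000.0, 0.9 * 900.0 + 0.2 * 100.0, 0.9, 0.2)
```

In the toy closure check the method recovers the 20 fake-lepton events that were put into the tight sample by construction.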

4.3 Event preselection

Triggering of events is based solely on the presence of a single electron or muon, and no information from the hadronic final state is used. A logical OR of two triggers is used for each of the \(t\bar{t} \rightarrow \mathrm {electron+jets}\) and \(t\bar{t} \rightarrow \mathrm {muon+jets}\) channels. The triggers with the lower thresholds of 24 \(\text {GeV}\) for electrons or muons select isolated leptons. The triggers with the higher thresholds of 60 \(\text {GeV}\) for electrons and 36 \(\text {GeV}\) for muons do not include an isolation requirement. The further selection requirements closely follow those in Ref. [9] and are as follows:

  • Events are required to have at least one primary vertex with at least five associated tracks. Each track needs to have a minimum \(p_{\text {T}}\) of 0.4 \(\text {GeV}\). For events with more than one primary vertex, the one with the largest \(\sum p_\mathrm{T}^2\) is chosen as the vertex from the hard scattering.

  • The event must contain exactly one reconstructed charged lepton, with \(E_{\text {T}} > 25\) \(\text {GeV}\) for electrons and \(p_{\text {T}} > 25\) \(\text {GeV}\) for muons, that matches the charged lepton that fired the corresponding lepton trigger.

  • In the \(t\bar{t} \rightarrow \mathrm {muon+jets}\) channel, \(E_{\text {T}}^{\text {miss}} >20\) \(\text {GeV}\) and \(E_{\text {T}}^{\text {miss}} +m_{\mathrm {T}}^{W} >60\) \(\text {GeV}\) are required (see Footnote 3).

  • In the \(t\bar{t} \rightarrow \mathrm {electron+jets}\) channel, more stringent requirements on \(E_{\text {T}}^{\text {miss}}\) and \(m_{\mathrm {T}}^{W}\) are applied because of the higher level of NP/fake-lepton background. The requirements are \(E_{\text {T}}^{\text {miss}} > 30\) \(\text {GeV}\) and \(m_{\mathrm {T}}^{W} >30\) \(\text {GeV}\).

  • The presence of at least four jets with \(p_{\text {T}} >25\) \(\text {GeV}\) and \(\vert \eta \vert <2.5\) is required.

  • The presence of exactly two \(b\text {-tagged}\) jets is required.
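The channel-dependent requirements in the list above can be summarized in a short sketch. The transverse mass of the \(W\) boson is defined in a footnote that is not reproduced here; the code assumes the conventional definition \(m_{\mathrm {T}}^{W} = \sqrt{2\,p_{\text {T}}^{\ell }\,E_{\text {T}}^{\text {miss}}\,(1-\cos \Delta \phi )}\), and the function names and channel labels are hypothetical.

```python
import math

def transverse_w_mass(lep_pt, lep_phi, met, met_phi):
    # Conventional transverse mass of the lepton + missing-momentum system (assumed).
    return math.sqrt(2.0 * lep_pt * met * (1.0 - math.cos(lep_phi - met_phi)))

def passes_met_requirements(channel, met, mtw):
    # All quantities in GeV; thresholds as listed in the preselection.
    if channel == "muon":
        return met > 20.0 and met + mtw > 60.0
    if channel == "electron":
        return met > 30.0 and mtw > 30.0
    raise ValueError("unknown channel: " + channel)

def passes_jet_requirements(n_jets, n_btagged):
    # At least four selected jets, exactly two of them b-tagged.
    return n_jets >= 4 and n_btagged == 2
```

An event with \(E_{\text {T}}^{\text {miss}} = 25\) GeV and \(m_{\mathrm {T}}^{W} = 40\) GeV, for instance, passes in the muon+jets channel but fails the tighter electron+jets requirement.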

The resulting event sample is statistically independent of the ones used for the measurement of \(m_{\mathrm {top}}\) in the \(t\bar{t} \rightarrow \mathrm {dilepton}\) and \(t\bar{t} \rightarrow \mathrm {all\,jets}\) channels at \({\sqrt{s}} =8\) \(\text {TeV}\) [14, 74]. The observed number of events in the data after this preselection and the expected numbers of signal and background events corresponding to the same integrated luminosity as the data are given in Table 1. For all predictions, the uncertainties are estimated as the sum in quadrature of the statistical uncertainty, the uncertainty in the integrated luminosity and all systematic uncertainties assigned to the measurement of \(m_{\mathrm {top}}\) listed in Sect. 8, except for the PDF and pile-up uncertainties, which are small. The normalization uncertainties listed below are included for the predictions shown in this section, but due to their small effect on the measured top quark mass they are not included in the final measurement.

For the signal, the \(5.7\%\) uncertainty in the \(t\bar{t}\) cross-section introduced in Sect. 3 and a \(6.0\%\) uncertainty in the single-top-quark cross-section are used. The latter uncertainty is obtained from the cross-section uncertainties given in Sect. 3 and the fractions of the various single-top-quark production processes after the selection requirements. The background uncertainties contain uncertainties of \(48\%\) in the normalization of the diboson and \(Z\)+jets production processes. These uncertainties are calculated using Berends–Giele scaling [75]. Assuming a top quark mass of \(m_{\mathrm {top}} =172.5\) \(\text {GeV}\), the predicted number of events is consistent within uncertainties with the number observed in the data.

Table 1 The observed numbers of events in data after the event preselection and the \(\mathrm {BDT}\) selection (see Sect. 6). In addition, the expected numbers of signal events for \(m_{\mathrm {top}} =172.5\) \(\text {GeV}\) and background events corresponding to the same integrated luminosity as the data are given. The uncertainties in the predicted number of events take into account the statistical and systematic sources explained in Sect. 4.3. Two significant digits are used for the uncertainties in the predicted events

5 Reconstruction of the three observables

As in Ref. [9], a full kinematic reconstruction of the event is done with a likelihood fit using the KLFitter package [76, 77]. The KLFitter algorithm relates the measured kinematics of the reconstructed objects to the leading-order representation of the \(t\bar{t}\) system decay using \(t\bar{t} \rightarrow \ell \nu b_\mathrm {lep} \,q_1 q_2 b_\mathrm {had} \). In this procedure, the measured jets correspond to the quark decay products of the \(W\) boson, \(q_1\) and \(q_2\), and to the \(b\text {-quarks}\), \(b_\mathrm {lep}\) and \(b_\mathrm {had}\), produced in the semi-leptonic and hadronic top quark decays, respectively.

The event likelihood is the product of Breit–Wigner (BW) distributions for the \(W\) bosons and top quarks and transfer functions (TFs) for the energies of the reconstructed objects that are input to KLFitter. The W boson BW distributions use the world combined values of the W boson mass and decay width from Ref. [3]. A common mass parameter \(m_{\mathrm {top}} ^{\mathrm {reco}}\) is used for the BW distributions describing the semi-leptonically and hadronically decaying top quarks and is fitted event-by-event. The top quark width varies with \(m_{\mathrm {top}} ^{\mathrm {reco}}\) according to the SM prediction [3]. The TFs are derived from the Powheg+Pythia \(t\bar{t}\) signal MC simulation sample at an input mass of \(m_{\mathrm {top}} =172.5\) \(\text {GeV}\). They represent the experimental resolutions in terms of the probability that the observed energy at reconstruction level is produced by a given parton-level object for the leading-order decay topology and in the fit constrain the variations of the reconstructed objects.

The input objects to the event likelihood are the reconstructed charged lepton, the missing transverse momentum and up to six jets. These are the two \(b\text {-tagged}\) jets and the four untagged jets with the highest \(p_{\text {T}}\). The x- and y-components of the missing transverse momentum are starting values for the neutrino transverse-momentum components, and its longitudinal component \(p_{\nu ,z}\) is a free parameter in the kinematic likelihood fit. Its starting value is computed from the \(W\rightarrow \ell \nu \) mass constraint. If there are no real solutions for \(p_{\nu ,z}\), a starting value of zero is used. If there are two real solutions, the one giving the largest likelihood value is taken.
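The starting value of \(p_{\nu ,z}\) follows from a quadratic equation. The sketch below is a generic implementation of the \(W\rightarrow \ell \nu \) mass constraint for a massless lepton and is not taken from KLFitter; the value of the \(W\) boson mass is an illustrative constant, and the function names are hypothetical.

```python
import math

M_W = 80.4  # GeV, illustrative value of the W boson mass

def neutrino_pz_solutions(lep_px, lep_py, lep_pz, nu_px, nu_py, m_w=M_W):
    # Solve (p_lep + p_nu)^2 = m_w^2 for the neutrino pz, treating both
    # the lepton and the neutrino as massless.
    lep_pt2 = lep_px ** 2 + lep_py ** 2
    lep_e2 = lep_pt2 + lep_pz ** 2
    nu_pt2 = nu_px ** 2 + nu_py ** 2
    mu = 0.5 * m_w ** 2 + lep_px * nu_px + lep_py * nu_py
    disc = mu ** 2 * lep_pz ** 2 - lep_pt2 * (lep_e2 * nu_pt2 - mu ** 2)
    if disc < 0.0:
        return []  # no real solution
    root = math.sqrt(disc)
    return [(mu * lep_pz - root) / lep_pt2, (mu * lep_pz + root) / lep_pt2]

def starting_values(lep_px, lep_py, lep_pz, nu_px, nu_py, m_w=M_W):
    # Without real solutions a starting value of zero is used; with two
    # solutions both are returned so that each can be tried in the fit.
    solutions = neutrino_pz_solutions(lep_px, lep_py, lep_pz, nu_px, nu_py, m_w)
    return solutions if solutions else [0.0]
```

The case without real solutions arises when the transverse mass of the lepton and the missing transverse momentum exceeds the assumed \(W\) boson mass, typically because of resolution effects.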

Maximizing the event-by-event likelihood as a function of \(m_{\mathrm {top}} ^{\mathrm {reco}}\) establishes the best assignment of reconstructed jets to partons from the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) decay. The maximization is performed by testing all possibilities for assigning \(b\text {-tagged}\) jets to \(b\text {-quark}\) positions and untagged jets to light-quark positions. With the above settings of the reconstruction algorithm, compared with the settings (see Footnote 4) used in Ref. [9], a larger fraction of correct assignments of reconstructed jets to partons from the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) decay is achieved. The performance of the reconstruction algorithm is discussed in Sect. 6.

The value of \(m_{\mathrm {top}} ^{\mathrm {reco}}\) obtained from the kinematic likelihood fit is used as the observable primarily sensitive to the underlying \(m_{\mathrm {top}}\). The invariant mass of the hadronically decaying \(W\) boson \(m_{W}^{\mathrm {reco}}\), which is sensitive to the \(\mathrm {JES}\), is calculated from the assigned jets of the chosen permutation. Finally, an observable called \(R_{bq}^{\mathrm {reco}}\), designed to be sensitive to the \(\mathrm {bJES}\), is computed as the scalar sum of the transverse momenta of the two \(b\text {-tagged}\) jets divided by the scalar sum of the transverse momenta of the two jets associated with the hadronic \(W\) boson decay:

$$\begin{aligned} R_{bq}^{\mathrm {reco}}&= \frac{p_\mathrm {T}^{b_\mathrm{had}} + p_\mathrm {T}^{b_\mathrm{lep}}}{p_\mathrm {T}^{q_1} + p_\mathrm {T}^{q_2}}. \end{aligned}$$

The values of \(m_{W}^{\mathrm {reco}}\) and \(R_{bq}^{\mathrm {reco}}\) are computed from the jet four-vectors as given by the jet reconstruction instead of using the values obtained in the kinematic likelihood fit. This ensures the maximum sensitivity to the jet calibration for light-jets and \(b\text {-jets}\).
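The two jet-based observables are simple functions of the reconstructed jet four-vectors. A minimal sketch, assuming jets are given as `(E, px, py, pz)` tuples in GeV and that the jet-to-parton assignment of the chosen permutation is already known (function names are hypothetical):

```python
import math

def invariant_mass(jet1, jet2):
    # Invariant mass of the two-jet system from the summed four-vectors.
    e = jet1[0] + jet2[0]
    px = jet1[1] + jet2[1]
    py = jet1[2] + jet2[2]
    pz = jet1[3] + jet2[3]
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def r_bq(pt_b_had, pt_b_lep, pt_q1, pt_q2):
    # Ratio of the scalar pT sums of the b-tagged jets and the W-boson jets.
    return (pt_b_had + pt_b_lep) / (pt_q1 + pt_q2)

# Two back-to-back massless jets of 40.2 GeV each give a mass of 80.4 GeV.
m_w_reco = invariant_mass((40.2, 40.2, 0.0, 0.0), (40.2, -40.2, 0.0, 0.0))
ratio = r_bq(60.0, 40.0, 30.0, 20.0)
```

Because a common shift of all jet energies largely cancels in the ratio while a shift of only the \(b\text {-jet}\) energies does not, \(R_{bq}^{\mathrm {reco}}\) isolates the relative b-to-light-jet scale.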

Some distributions of the observed event kinematics after the event preselection and for the best permutation are shown in Fig. 1. Given the good description of the observed number of events by the prediction shown in Sect. 4.3 and that the measurement of \(m_{\mathrm {top}}\) is mostly sensitive to the shape of the distributions, the comparison of the data with the predictions is based solely on the distributions normalized to the number of events observed in data. The systematic uncertainty assigned to each bin is calculated from the sum in quadrature of all systematic uncertainties discussed in Sect. 4.3. Within uncertainties, the predictions agree with the observed distributions in Fig. 1, which shows the transverse momentum of the lepton, the average transverse momentum of the jets, the transverse momentum of the hadronically decaying top quark \(p_{\mathrm {T, had}}\), the transverse momentum of the \(t\bar{t}\) system, the logarithm of the event likelihood of the best permutation and the distance \(\Delta R\) of the two untagged jets \(q_1\) and \(q_2\) assigned to the hadronically decaying \(W\) boson. The distributions of transverse momenta predicted by the simulation, e.g. the \(p_{\mathrm {T, had}}\) distribution shown in Fig. 1c, show a slightly different trend than observed in data, with the data being softer. This difference is fully covered by the uncertainties. This trend was also observed in Ref. [14] for the \(p_{\mathrm {T}, \ell b}\) distribution in the \(t\bar{t} \rightarrow \mathrm {dilepton}\) channel and in the measurement of the differential \(t\bar{t}\) cross-section in the lepton+jets channel [78].

Fig. 1 Distributions for the events passing the preselection. The data are shown together with the signal-plus-background prediction, normalized to the number of events observed in the data. The hatched area is the uncertainty in the prediction as described in the text. The rightmost bin contains all entries with values above the lower edge of this bin; similarly, the leftmost bin contains all entries with values below the upper edge of this bin. a shows the transverse momentum of the lepton, b shows the average transverse momentum of the jets, c shows the transverse momentum of the hadronically decaying top quark, d shows the transverse momentum of the \(t\bar{t}\) system, e shows the logarithm of the event likelihood of the best permutation and f shows the distance \(\Delta R\) of the two untagged jets \(q_1\) and \(q_2\) from the hadronically decaying \(W\) boson

In anticipation of the template parameterization described in Sect. 7, the following restrictions on the three observables are applied: \(125 \le m_{\mathrm {top}} ^{\mathrm {reco}} \le 200~\text {GeV}\), \(55 \le m_{W}^{\mathrm {reco}} \le 110~\text {GeV}\), and \(0.3 \le R_{bq}^{\mathrm {reco}} \le 3\). Since in this analysis only the best permutation is considered, events that do not pass these requirements are rejected. This removes events in the tails of the three distributions, which are typically poorly reconstructed with small likelihood values and do not contain significant information about \(m_{\mathrm {top}}\). The resulting templates have simpler shapes, which are easier to model analytically with fewer parameters. The preselection with these additional requirements is referred to as the standard selection to distinguish it from the boosted decision tree (BDT) optimization for the smallest total uncertainty in \(m_{\mathrm {top}}\), discussed in the next section.

6 Multivariate analysis and BDT event selection

For the measurement of \(m_{\mathrm {top}}\), the event selection is refined to enrich the fraction of events with correct assignments of reconstruction-level objects to their generator-level counterparts. Such events should be better measured and therefore lead to smaller uncertainties. The optimization of the selection is based on the multivariate BDT algorithm implemented in the TMVA package [79]. The reconstruction-level objects are matched to the closest parton-level object within a \(\Delta R\) of 0.1 for electrons and muons and 0.3 for jets. A matched object is defined as a reconstruction-level object that falls within the relevant \(\Delta R\) of any parton-level object of that type, and a correct match means that this generator-level object is the one it originated from. Due to acceptance losses and reconstruction inefficiencies, not all reconstruction-level objects can successfully be matched to their parton-level counterparts. If any object cannot be unambiguously matched, the corresponding event is referred to as unmatched. The efficiency for correctly matched events \(\epsilon _{\mathrm {cm}}\) is the fraction of correctly matched events among all the matched events, and the selection purity \(\pi _{\mathrm {cm}}\) is the fraction of correctly matched events among all selected events, regardless of whether they could be matched or not.
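The difference between the two figures of merit lies only in the denominator, as the following sketch shows. The event counts are illustrative numbers, not results of the analysis, and the function name is hypothetical.

```python
def matching_efficiency_and_purity(n_correct, n_incorrect, n_unmatched):
    # Efficiency: correctly matched events among all matched events.
    n_matched = n_correct + n_incorrect
    # Purity: correctly matched events among all selected events.
    n_selected = n_matched + n_unmatched
    return n_correct / n_matched, n_correct / n_selected

# Toy sample: 600 correctly matched, 200 incorrectly matched, 200 unmatched.
eff_cm, pur_cm = matching_efficiency_and_purity(600, 200, 200)
```

In this toy sample the efficiency is 0.75 and the purity is 0.60; the purity is always the smaller of the two whenever unmatched events are present.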

The BDT algorithm is exploited to enrich the event sample in events that have correct jet-to-parton matching by reducing the remainder, i.e. the sum of incorrectly matched and unmatched events. Using the preselection, the BDT algorithm is trained on the simulated \(t\bar{t}\) signal sample with \(m_{\mathrm {top}} =172.5\) \(\text {GeV}\). Many variables were studied and only those with a separation (see footnote 5) larger than \(0.1\%\) are used in the training. The 13 variables chosen for the final training are given in Table 2. For all input variables to the BDT algorithm, good agreement between the MC predictions and the data is found, as shown in Fig. 1e, f for the examples of the likelihood of the chosen permutation and the opening angle \(\Delta R\) of the two untagged jets associated with the \(W\) boson decay. These two variables also have the largest separation for the correctly matched events and the remainder. The corresponding distributions for the two event classes are shown in Fig. 2a, b. These figures show a clear separation of the correctly matched events and the remainder. Half the simulation sample is used to train the algorithm and the other half to assess its performance. The significant difference between the distributions of the output value \(r_{\mathrm {BDT}}\) of the BDT classifier between the two classes of events in Fig. 2c shows their efficient separation by the BDT algorithm. In addition, reasonable agreement is found for the \(r_{\mathrm {BDT}}\) distributions in the statistically independent test and training samples. The \(r_{\mathrm {BDT}}\) distributions in simulation and data in Fig. 2d agree within the experimental uncertainties. The above findings justify the application of the BDT approach to the data.

Table 2 The input variables to the BDT algorithm sorted by their separation
Fig. 2

Input and results of the BDT training on \(t\bar{t}\) signal events for the preselection. a shows the logarithm of the event likelihood of the best permutation (\(\ln L\)) for the correctly matched events and the remainder. Similarly, b shows the distribution of the \(\Delta R\) between the two untagged jets assigned to the \(W\) boson decay. c shows the distribution of the BDT output (\(r_{\mathrm {BDT}}\)) for the two classes of events for both the training (histograms) and test samples (points with statistical uncertainties). The compatibility in terms of the \(\chi ^2\) probability is also listed. The distributions peaking at around \(r_{\mathrm {BDT}} =0.1\) are for the correctly matched events, the ones to the left are for incorrectly or unmatched events. The ratio figure shows the difference between the number of events in the training and test samples divided by the statistical uncertainty in this difference. Finally, d shows the comparison of the \(r_{\mathrm {BDT}}\) distributions observed in data and MC simulation. The hatched area includes the uncertainties as detailed in the text. The uncertainty bars correspond to the statistical uncertainties in the data

Fig. 3

Various classes of \(m_{\mathrm {top}}\) uncertainties as a function of the minimum requirement on the BDT output \(r_{\mathrm {BDT}}\) and for the standard selection. The total uncertainty (solid line) is the sum in quadrature of the statistical (dotted line) and total systematic uncertainty (short dash-dotted line). The total systematic uncertainty consists of the total experimental (dashed line) and total signal-modelling uncertainty (long dash-dotted line). The uncertainties in the background estimate are included in the total experimental uncertainty. The minimum requirement on \(r_{\mathrm {BDT}}\) defining the \(\mathrm {BDT}\) selection is indicated by the vertical black dashed line. All uncertainties are included except for the method and the pile-up uncertainties

The full \(m_{\mathrm {top}}\) analysis detailed in Sect. 8 is performed for several minimum requirements on \(r_{\mathrm {BDT}}\) in the range of \([-0.10, 0.05]\) in steps of 0.05 to find the point with the smallest total uncertainty. Only the evaluation of the small method and pile-up uncertainties described in Sect. 8 is omitted in this scan. The total uncertainty in \(m_{\mathrm {top}}\) together with the various classes of uncertainty sources as a function of \(r_{\mathrm {BDT}}\) evaluated in the \(\mathrm {BDT}\) optimization are shown in Fig. 3. The minimum requirement \(r_{\mathrm {BDT}} =-0.05\) provides the smallest total uncertainty in \(m_{\mathrm {top}}\). The resulting numbers of events for this BDT selection are given in Table 1. Compared with the preselection, \(\epsilon _{\mathrm {cm}}\) is increased from \(0.71\) to \(0.82\), albeit at the expense of a significant reduction in the number of selected events. The purity \(\pi _{\mathrm {cm}}\) is increased from \(0.28\) to \(0.41\). In addition, the intrinsic resolution in \(m_{\mathrm {top}}\) of the remaining event sample is improved: the statistical uncertainty in \(m_{\mathrm {top}}\) in Fig. 3 is almost constant as a function of \(r_{\mathrm {BDT}}\) and, in particular, does not grow as \(1/\sqrt{N}\) when the number of retained events \(N\) decreases. For the signal sample with \(m_{\mathrm {top}} =172.5\) \(\text {GeV}\), the template fit functions for the standard selection and the \(\mathrm {BDT}\) selection, together with their ratios, are shown in Fig. 12 in Appendix A. The shape of the signal-modelling uncertainty curve in Fig. 3 derives from a sum of contributions with different shapes. The curves from the signal Monte Carlo generator and colour reconnection uncertainties decrease, the one from the underlying event uncertainty is flat, the one from the initial- and final-state QCD radiation has a valley similar to the sum of all contributions, and finally the one from the hadronization uncertainty rises.
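The logic of the scan can be sketched as follows. The function names are hypothetical and the numbers in `toy_scan` are placeholders invented for illustration; they are not read off Fig. 3.

```python
import math


def total_uncertainty(stat, syst):
    """Sum in quadrature of the statistical and total systematic uncertainty."""
    return math.hypot(stat, syst)


def best_requirement(scan):
    """Return the minimum r_BDT requirement with the smallest total uncertainty.

    scan maps each minimum r_BDT requirement to a (stat, syst) pair in GeV.
    """
    return min(scan, key=lambda cut: total_uncertainty(*scan[cut]))


# Placeholder values for illustration only; NOT taken from the analysis.
toy_scan = {
    -0.10: (0.38, 0.66),
    -0.05: (0.39, 0.60),
    0.00: (0.40, 0.63),
    0.05: (0.41, 0.68),
}
print(best_requirement(toy_scan))
```

In such a scan a tighter requirement is preferred whenever the reduction in systematic uncertainty outweighs the growth of the statistical uncertainty in the quadrature sum.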

Some distributions of the observed event kinematics after the \(\mathrm {BDT}\) selection are shown in Fig. 4. Good agreement between the MC predictions and the data is found, as seen for the preselection in Fig. 1. The examples shown are the observed \(W\) boson transverse mass for the semi-leptonically decaying top quark in Fig. 4a and the three observables of the \(m_{\mathrm {top}}\) analysis (within the ranges of the template fit) in Fig. 4b–d. The sharp edge observed at 30 \(\text {GeV}\) in Fig. 4a originates from the different selection requirements for the \(W\) boson transverse mass in the electron+jets and muon+jets final states.

Fig. 4

Distributions for the events passing the BDT selection. The data are shown, together with the signal-plus-background prediction normalized to the number of events observed in the data. The hatched area is the uncertainty in the prediction described in the text. The rightmost bin contains all entries with values above the lower edge of this bin, similarly the leftmost bin contains all entries with values below the upper edge of this bin. a shows the \(W\) boson transverse mass for the semi-leptonic top quark decay. The remaining figures show the three observables used for the determination of \(m_{\mathrm {top}}\), where b shows the reconstructed top quark mass \(m_{\mathrm {top}} ^{\mathrm {reco}}\), c shows the reconstructed invariant mass of the \(W\) boson \(m_{W}^{\mathrm {reco}}\) and d shows the reconstructed ratio of jet transverse momenta \(R_{bq}^{\mathrm {reco}}\). The three distributions are shown within the ranges of the template fit

7 Template fit

This analysis uses a three-dimensional template fit technique which determines \(m_{\mathrm {top}}\) together with the jet energy scale factors \(\mathrm {JSF}\) and \(\mathrm {bJSF}\). The aim of the multi-dimensional fit to the data is to measure \(m_{\mathrm {top}}\) and, at the same time, to absorb the mean differences between the jet energy scales observed in data and MC simulated events into jet energy scale factors. By using \(\mathrm {JSF}\) and \(\mathrm {bJSF}\), most of the uncertainties in \(m_{\mathrm {top}}\) induced by \(\mathrm {JES}\) and \(\mathrm {bJES}\) uncertainties are transformed into additional statistical components caused by the higher dimensionality of the fit. This method reduces the total uncertainty in \(m_{\mathrm {top}}\) only for sufficiently large data samples. In this case, the sum in quadrature of the additional statistical uncertainty in \(m_{\mathrm {top}}\) due to the \(\mathrm {JSF}\)  (or \(\mathrm {bJSF}\)) fit and the residual \(\mathrm {JES}\)-induced (or \(\mathrm {bJES}\)-induced) systematic uncertainty is smaller than the original \(\mathrm {JES}\)-induced (or \(\mathrm {bJES}\)-induced) uncertainty in \(m_{\mathrm {top}}\). This situation was already realized for the \({\sqrt{s}} =7\) \(\text {TeV}\) data analysis [9] and is even more advantageous for the much larger data sample of the \({\sqrt{s}} =8\) \(\text {TeV}\) data analysis. Since \(\mathrm {JSF}\) and \(\mathrm {bJSF}\) are global factors, they do not completely absorb the \(\mathrm {JES}\) and \(\mathrm {bJES}\) uncertainties which have \(p_{\text {T}}\)- and \(\eta \)-dependent components.

For simultaneously determining \(m_{\mathrm {top}}\), \(\mathrm {JSF}\) and \(\mathrm {bJSF}\), templates are constructed from the MC samples. Templates of \(m_{\mathrm {top}} ^{\mathrm {reco}}\) are constructed with several input \(m_{\mathrm {top}}\) values used in the range 167.5–177.5 \(\text {GeV}\) and for the sample at \(m_{\mathrm {top}} =172.5\) \(\text {GeV}\) also with independent input values for \(\mathrm {JSF}\) and \(\mathrm {bJSF}\) in the range 0.96–1.04 in steps of 0.02. Statistically independent MC samples are used for different input values of \(m_{\mathrm {top}}\). The templates with different values of \(\mathrm {JSF}\) and \(\mathrm {bJSF}\) are constructed by scaling the energies of the jets appropriately. In this procedure, \(\mathrm {JSF}\) is applied to all jets, while \(\mathrm {bJSF}\) is solely applied to \(b\text {-jets}\) according to the generated quark flavour. The scaling is performed after the various correction steps of the jet calibration but before the event selection. This procedure results in different events passing the \(\mathrm {BDT}\) selection from one energy scale variation to another. However, many events are in all samples, resulting in a large statistical correlation of the samples with different jet scale factors. Similarly, templates of \(m_{W}^{\mathrm {reco}}\) and \(R_{bq}^{\mathrm {reco}}\) are constructed with the above listed input values of \(m_{\mathrm {top}}\), \(\mathrm {JSF}\) and \(\mathrm {bJSF}\).

Independent signal templates are derived for the three observables for all \(m_{\mathrm {top}}\)-dependent samples, consisting of the \(t\bar{t}\) signal events and single-top-quark production events. This procedure is adopted because single-top-quark production carries information about the top quark mass, and in this way, \(m_{\mathrm {top}}\)-independent background templates can be used. The signal templates are simultaneously fitted to the sum of a Gaussian and two Landau functions for \(m_{\mathrm {top}} ^{\mathrm {reco}}\), to the sum of two Gaussian functions for \(m_{W}^{\mathrm {reco}}\) and to the sum of two Gaussian and one Landau function for \(R_{bq}^{\mathrm {reco}}\). This set of functions leads to an unbiased estimate of \(m_{\mathrm {top}}\), but is not unique. For the background, the \(m_{\mathrm {top}} ^{\mathrm {reco}}\) distribution is fitted to a Landau function, while both the \(m_{W}^{\mathrm {reco}}\) and the \(R_{bq}^{\mathrm {reco}}\) distributions are fitted to the sum of two Gaussian functions.

In Fig. 5a–c, the sensitivity of \(m_{\mathrm {top}} ^{\mathrm {reco}}\) to the fit parameters \(m_{\mathrm {top}}\), \(\mathrm {JSF}\) and \(\mathrm {bJSF}\) is shown by the superposition of the signal templates and their fits for three input values per varied parameter. In a similar way, the sensitivity of \(m_{W}^{\mathrm {reco}}\) to \(\mathrm {JSF}\) is shown in Fig. 5d. The dependences of \(m_{W}^{\mathrm {reco}}\) on the input values of \(m_{\mathrm {top}}\) and \(\mathrm {bJSF}\) are negligible and are not shown. Consequently, to increase the size of the simulation sample, the fit is performed on the sum of the \(m_{W}^{\mathrm {reco}}\) distributions of the samples with different input top quark masses. Finally, the sensitivity of \(R_{bq}^{\mathrm {reco}}\) to the input values of \(m_{\mathrm {top}}\) and \(\mathrm {bJSF}\) is shown in Fig. 5e, f. The dependence of \(R_{bq}^{\mathrm {reco}}\) on \(\mathrm {JSF}\)  (not shown) is much weaker than the dependence on \(\mathrm {bJSF}\).

Fig. 5

Template parameterizations for signal events, composed of \(t\bar{t}\) and single-top-quark production events. a–c show the sensitivity of \(m_{\mathrm {top}} ^{\mathrm {reco}}\) to \(m_{\mathrm {top}}\), \(\mathrm {JSF}\) and \(\mathrm {bJSF}\), d shows the sensitivity of \(m_{W}^{\mathrm {reco}}\) to \(\mathrm {JSF}\) and e, f show the sensitivity of \(R_{bq}^{\mathrm {reco}}\) to \(m_{\mathrm {top}}\) and \(\mathrm {bJSF}\). Each template is overlaid with the corresponding probability density function from the combined fit to all templates described in the text. The ratios shown are calculated relative to the probability density function of the central sample with \(m_{\mathrm {top}} = 172.5\) \(\text {GeV}\), \(\mathrm {JSF} = 1\) and \(\mathrm {bJSF} = 1\)

For the signal, the parameters of the fitting functions for \(m_{\mathrm {top}} ^{\mathrm {reco}}\) depend linearly on \(m_{\mathrm {top}}\), \(\mathrm {JSF}\) and \(\mathrm {bJSF}\). The parameters of the fitting functions for \(m_{W}^{\mathrm {reco}}\) depend linearly on \(\mathrm {JSF}\). Finally, the parameters of the fitting functions for \(R_{bq}^{\mathrm {reco}}\) depend linearly on \(m_{\mathrm {top}}\), \(\mathrm {JSF}\) and \(\mathrm {bJSF}\). For the background, the dependences of the parameters of the fitting functions are identical to those for the signal, except that they do not depend on \(m_{\mathrm {top}}\) and that those for \(R_{bq}^{\mathrm {reco}}\) do not depend on \(\mathrm {JSF}\).
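A linear dependence of a single template-function parameter can be written as in the sketch below. It assumes, as an illustrative choice that is not stated in the text, that the expansion is performed around the central sample (\(m_{\mathrm {top}} =172.5\) \(\text {GeV}\), \(\mathrm {JSF} =1\), \(\mathrm {bJSF} =1\)); the function name and the slope arguments are hypothetical.

```python
MTOP_REF = 172.5  # GeV, central simulation sample quoted in the text


def linear_parameter(p0, s_mtop, s_jsf, s_bjsf, mtop, jsf=1.0, bjsf=1.0):
    """Value of one template-function parameter (e.g. a Gaussian mean).

    p0 is the value at the reference point; the s_* arguments are slopes.
    A slope of zero switches a dependence off, e.g. s_mtop = 0 for background.
    """
    return (
        p0
        + s_mtop * (mtop - MTOP_REF)
        + s_jsf * (jsf - 1.0)
        + s_bjsf * (bjsf - 1.0)
    )
```

With this form, the background parameterization of \(R_{bq}^{\mathrm {reco}}\) corresponds to setting both `s_mtop` and `s_jsf` to zero.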

Signal and background probability density functions \(P_{\mathrm {top}}^{\mathrm {sig}}\) and \(P_{\mathrm {top}}^{\mathrm {bkg}}\) for the \(m_{\mathrm {top}} ^{\mathrm {reco}}\), \(m_{W}^{\mathrm {reco}}\) and \(R_{bq}^{\mathrm {reco}}\) distributions are used in an unbinned likelihood fit to the data for all events, \(i=1,\dots ,N\). The likelihood function maximized is

$$\begin{aligned}&L _\mathrm {shape}^{\ell \mathrm {+jets}}(m_{\mathrm {top}}, \mathrm {JSF}, \mathrm {bJSF}, f_{\mathrm {bkg}}) \nonumber \\&\quad =\prod _{i=1}^{N} P_{\mathrm {top}} (m_{\mathrm {top}} ^{\mathrm {reco}, i} \,\vert \,m_{\mathrm {top}}, \mathrm {JSF}, \mathrm {bJSF}, f_{\mathrm {bkg}}) \nonumber \\&\qquad \times \,P_{W} (m_{W}^{\mathrm {reco}, i} \,\vert \,\mathrm {JSF}, f_{\mathrm {bkg}}) \nonumber \\&\qquad \times \,P_{R_{bq}} (R_{bq}^{\mathrm {reco}, i} \,\vert \,m_{\mathrm {top}}, \mathrm {JSF}, \mathrm {bJSF}, f_{\mathrm {bkg}}), \end{aligned}$$
(1)

with

$$\begin{aligned}&P_{\mathrm {top}} (m_{\mathrm {top}} ^{\mathrm {reco}, i} \,\vert \,m_{\mathrm {top}}, \mathrm {JSF}, \mathrm {bJSF}, f_{\mathrm {bkg}}) \\&\quad = (1-f_{\mathrm {bkg}})\cdot P_{\mathrm {top}}^{\mathrm {sig}} (m_{\mathrm {top}} ^{\mathrm {reco}, i} \,\vert \,m_{\mathrm {top}}, \mathrm {JSF}, \mathrm {bJSF}) \\&\qquad +f_{\mathrm {bkg}} \cdot P_{\mathrm {top}}^{\mathrm {bkg}} (m_{\mathrm {top}} ^{\mathrm {reco}, i} \,\vert \,\mathrm {JSF}, \mathrm {bJSF}), \\&P_{W} (m_{W}^{\mathrm {reco}, i} \,\vert \,\mathrm {JSF}, f_{\mathrm {bkg}}) \\&\quad = (1-f_{\mathrm {bkg}})\cdot P_{W}^{\mathrm {sig}} (m_{W}^{\mathrm {reco}, i} \,\vert \,\mathrm {JSF}) \\&\qquad +f_{\mathrm {bkg}} \cdot P_{W}^{\mathrm {bkg}} (m_{W}^{\mathrm {reco}, i} \,\vert \,\mathrm {JSF}),\quad \text {and} \\&P_{R_{bq}} (R_{bq}^{\mathrm {reco}, i} \,\vert \,m_{\mathrm {top}}, \mathrm {JSF}, \mathrm {bJSF}, f_{\mathrm {bkg}})\\&\quad = (1-f_{\mathrm {bkg}})\cdot P_{R_{bq}}^{\mathrm {sig}} (R_{bq}^{\mathrm {reco}, i} \,\vert \,m_{\mathrm {top}}, \mathrm {JSF}, \mathrm {bJSF}) \\&\qquad +f_{\mathrm {bkg}} \cdot P_{R_{bq}}^{\mathrm {bkg}} (R_{bq}^{\mathrm {reco}, i} \,\vert \, \mathrm {bJSF}) \end{aligned}$$

where the fraction of background events is denoted by \(f_{\mathrm {bkg}}\). The parameters determined by the fit are \(m_{\mathrm {top}}\), \(\mathrm {JSF}\) and \(\mathrm {bJSF}\), while \(f_{\mathrm {bkg}}\) is fixed to its expectation shown in Table 1. It was verified that the correlations between \(m_{\mathrm {top}} ^{\mathrm {reco}}\), \(m_{W}^{\mathrm {reco}}\) and \(R_{bq}^{\mathrm {reco}}\) of \(\rho (m_{\mathrm {top}} ^{\mathrm {reco}}, m_{W}^{\mathrm {reco}})= 0.05 \), \(\rho (m_{\mathrm {top}} ^{\mathrm {reco}}, R_{bq}^{\mathrm {reco}})= 0.18 \), and \(\rho (m_{W}^{\mathrm {reco}}, R_{bq}^{\mathrm {reco}})= -0.13 \), are small enough that formulating the likelihood in Eq. (1) as a product of three one-dimensional likelihoods does not bias the result.
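The structure of Eq. (1) can be made explicit with a minimal sketch of the corresponding negative log-likelihood. The six probability density functions are passed in as callables under hypothetical dictionary keys; their actual functional forms (sums of Gaussian and Landau functions) are not reproduced here, and the function names are assumptions for illustration.

```python
import math


def mixture(p_sig, p_bkg, f_bkg):
    """Signal-plus-background density for one observable."""
    return (1.0 - f_bkg) * p_sig + f_bkg * p_bkg


def neg_log_likelihood(events, pdfs, params, f_bkg):
    """Negative logarithm of the product in Eq. (1).

    events: iterable of (mtop_reco, mw_reco, rbq_reco) tuples
    pdfs:   dict of six callables (hypothetical keys, see below)
    params: (mtop, jsf, bjsf); f_bkg is kept fixed, as in the text
    """
    mtop, jsf, bjsf = params
    nll = 0.0
    for m_reco, w_reco, r_reco in events:
        p_top = mixture(pdfs["top_sig"](m_reco, mtop, jsf, bjsf),
                        pdfs["top_bkg"](m_reco, jsf, bjsf), f_bkg)
        p_w = mixture(pdfs["w_sig"](w_reco, jsf),
                      pdfs["w_bkg"](w_reco, jsf), f_bkg)
        p_r = mixture(pdfs["rbq_sig"](r_reco, mtop, jsf, bjsf),
                      pdfs["rbq_bkg"](r_reco, bjsf), f_bkg)
        nll -= math.log(p_top) + math.log(p_w) + math.log(p_r)
    return nll
```

Minimizing this quantity with respect to `params` is equivalent to maximizing the likelihood; the argument lists of the callables mirror the parameter dependences given in the equations above.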

Pseudo-experiments are used to verify the internal consistency of the fitting procedure and to obtain the expected statistical uncertainty for the data. For each set of parameter values, \(500\) pseudo-experiments are performed, each corresponding to the integrated luminosity of the data. To retain the correlation of the three observables for the three-dimensional fit, individual events are used. Because this exceeds the number of available MC events, results are corrected for oversampling [80]. The results of pseudo-experiments for different input values of \(m_{\mathrm {top}}\) are obtained from statistically independent samples, while the results for different \(\mathrm {JSF}\) and \(\mathrm {bJSF}\) are obtained from statistically correlated samples as explained above. For each fitted quantity and each variation of input parameters, the residual, i.e. the difference between the input value and the value obtained by the fit, is compatible with zero. The three expected statistical uncertainties are

$$\begin{aligned} \sigma _{\mathrm {stat}} (m_{\mathrm {top}})&= 0.389 \pm 0.004 ~\text {GeV}, \\ \sigma _{\mathrm {stat}} (\mathrm {JSF})&= 0.00115 \pm 0.00001,\quad \text {and}\\ \sigma _{\mathrm {stat}} (\mathrm {bJSF})&= 0.0046 \pm 0.0001, \end{aligned}$$

where the values quoted are the mean and RMS of the distribution of the statistical uncertainties in the fitted quantities from pseudo-experiments. The widths of the pull distributions are below unity for \(m_{\mathrm {top}}\) and the two jet scale factors, which results in an overestimation of the uncertainty in \(m_{\mathrm {top}}\) of up to 7\(\%\). Since this leads to a conservative estimate of the uncertainty in \(m_{\mathrm {top}}\), no attempts to mitigate this feature are made.
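The relation between the pull width and the quoted uncertainty can be sketched as follows. The sketch assumes that the pull is defined as the fitted value minus the input value, divided by the fitted uncertainty, and that the width is taken as the RMS about the mean rather than from a Gaussian fit; both are illustrative assumptions, and the function names are hypothetical.

```python
import statistics


def pull_width(fitted, sigmas, true_value):
    """RMS about the mean of the pulls (fitted - true) / sigma."""
    pulls = [(f - true_value) / s for f, s in zip(fitted, sigmas)]
    return statistics.pstdev(pulls)


def rescaled_uncertainty(sigma, width):
    """Uncertainty that would give a unit pull width.

    A width below one means the fitted sigma is larger than the actual
    spread of the fitted values, i.e. the estimate is conservative.
    """
    return sigma * width
```

In the analysis no such rescaling is applied, since a pull width below unity only makes the quoted uncertainty conservative.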

8 Uncertainties affecting the \(\mathbf {m_{\mathrm {top}}}\) determination

Table 3 Systematic uncertainties in \(m_{\mathrm {top}}\). The measured values of \(m_{\mathrm {top}}\) are given together with the statistical and systematic uncertainties in \(\text {GeV}\) for the standard and the BDT event selections. For comparison, the result in the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) channel at \({\sqrt{s}} =7\) \(\text {TeV}\) from Ref. [9] is also listed. For each systematic uncertainty listed, the first value corresponds to the uncertainty in \(m_{\mathrm {top}}\), and the second to the statistical precision in this uncertainty. An integer value of zero means that the corresponding uncertainty is negligible and therefore not evaluated. Statistical uncertainties quoted as 0.00 are smaller than 0.005. The statistical uncertainty in the total systematic uncertainty is calculated from uncertainty propagation. The last line refers to the sum in quadrature of the statistical and systematic uncertainties

This section focuses on the treatment of uncertainty sources of a systematic nature. The same systematic uncertainty sources as in Ref. [9] are investigated. If possible, the corresponding uncertainty in \(m_{\mathrm {top}}\) is evaluated by varying the respective quantities by \(\pm 1 \sigma \) from their default values, constructing the corresponding event sample and measuring the average \(m_{\mathrm {top}}\) change relative to the result from the nominal MC sample with \(500\) pseudo-experiments each, drawn from the full MC sample. In the absence of a \(\pm 1 \sigma \) variation, e.g. for the evaluation of the uncertainty induced by the choice of signal MC generator, the full observed difference is assigned as a symmetric systematic uncertainty and further treated as a variation equivalent to a \(\pm 1 \sigma \) variation. Wherever a \(\pm 1 \sigma \) variation can be performed, half the observed difference between the \(+1\sigma \) and \(-1\sigma \) variation in \(m_{\mathrm {top}}\) is assigned as an uncertainty if the \(m_{\mathrm {top}}\) values obtained from the variations lie on opposite sides of the nominal result. If they lie on the same side, the maximum observed difference is taken as a symmetric systematic uncertainty. Since the systematic uncertainties are derived from simulation or data samples with limited numbers of events, all systematic uncertainties have a corresponding statistical uncertainty, which is calculated taking into account the statistical correlation of the considered samples, as explained in Sect. 8.5. The statistical uncertainty in the total systematic uncertainty is dominated by the limited sizes of the simulation samples. The resulting systematic uncertainties are given in Table 3 independent of their statistical significance. Further information is given in Tables 8, 9, 10, 11 and 12 in Appendix A. This approach follows the suggestion in Ref. [81] and relies on the fact that, given a large enough number of considered uncertainty sources, statistical fluctuations average out (see footnote 6). The uncertainty sources are designed to be uncorrelated with each other, and thus the total uncertainty is taken as the sum in quadrature of uncertainties from all sources.

The individual uncertainties are compared in Table 3 for three cases: the standard selection for the \({\sqrt{s}} =7\) \(\text {TeV}\) [9] and 8 \(\text {TeV}\) data and the \(\mathrm {BDT}\) selection for \({\sqrt{s}} =8\) \(\text {TeV}\) data. Many uncertainties in \(m_{\mathrm {top}}\) obtained with the standard selection at the two centre-of-mass energies agree within their statistical uncertainties such that the resulting total systematic uncertainties are almost identical. Consequently, repeating the \({\sqrt{s}} =7\) \(\text {TeV}\) analysis on \({\sqrt{s}} =8\) \(\text {TeV}\) data would have only improved the statistical precision. The picture changes when comparing the uncertainties in \({\sqrt{s}} =8\) \(\text {TeV}\) data for the standard selection and the \(\mathrm {BDT}\) selection. In general, the experimental uncertainties change only slightly, with the largest reduction observed for the JES uncertainty. In contrast, a large improvement comes from the reduced uncertainties in the modelling of the \(t\bar{t}\) signal processes as shown in Table 3. This, together with the improved intrinsic resolution in \(m_{\mathrm {top}}\), more than compensates for the small loss in precision caused by the increased statistical uncertainty. The individual sources of systematic uncertainties and the evaluation of their effect on \(m_{\mathrm {top}}\) are described in the following.
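The rule for turning a pair of \(\pm 1 \sigma \) variations into one symmetric uncertainty, and the combination of uncorrelated sources, can be sketched as below. The function names are hypothetical; the treatment of a variation that coincides exactly with the nominal result is an assumption of this sketch, as the text does not address that case.

```python
import math


def symmetrized_uncertainty(nominal, up, down):
    """Symmetric uncertainty from the +1 sigma and -1 sigma results.

    Opposite sides of the nominal result: half the up-down difference.
    Same side (or one shift exactly zero, an assumption of this sketch):
    the larger of the two absolute differences from the nominal result.
    """
    d_up, d_down = up - nominal, down - nominal
    if d_up * d_down < 0.0:
        return 0.5 * abs(up - down)
    return max(abs(d_up), abs(d_down))


def quadrature_sum(uncertainties):
    """Total uncertainty from sources treated as uncorrelated."""
    return math.sqrt(sum(u * u for u in uncertainties))
```

For sources without a \(\pm 1 \sigma \) variation, such as the choice of signal MC generator, the full observed difference would enter `quadrature_sum` directly.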

8.1 Statistics and method calibration

Uncertainties related to statistical effects and the method calibration are discussed here.

Statistical: The quoted statistical uncertainty consists of three parts: a purely statistical component in \(m_{\mathrm {top}}\) and the contributions stemming from the simultaneous determination of \(\mathrm {JSF}\) and \(\mathrm {bJSF}\). The purely statistical component in \(m_{\mathrm {top}}\) is obtained from a one-dimensional template method exploiting only the \(m_{\mathrm {top}} ^{\mathrm {reco}}\) observable, while fixing the values of \(\mathrm {JSF}\) and \(\mathrm {bJSF}\) to the results of the three-dimensional analysis. The contribution to the statistical uncertainty in the fitted parameters due to the simultaneous fit of \(m_{\mathrm {top}}\) and \(\mathrm {JSF}\) is estimated as the difference in quadrature between the statistical uncertainty in a two-dimensional fit to \(m_{\mathrm {top}} ^{\mathrm {reco}}\) and \(m_{W}^{\mathrm {reco}}\) while fixing the value of \(\mathrm {bJSF}\) and the one-dimensional fit to the data described above. Analogously, the contribution of the statistical uncertainty due to the simultaneous fit of \(m_{\mathrm {top}}\) together with \(\mathrm {JSF}\) and \(\mathrm {bJSF}\) is defined as the difference in quadrature between the statistical uncertainties obtained in the three-dimensional and the two-dimensional fits to the data. This separation allows a comparison of the statistical sensitivities of the \(m_{\mathrm {top}}\) estimator used in this analysis, to those of analyses exploiting a different number of observables in the fit. In addition, the sensitivity of the estimators to the global jet energy scale factors can be compared directly. These uncertainties are treated as uncorrelated uncertainties in \(m_{\mathrm {top}}\) combinations. Together with the systematic uncertainty in the residual jet energy scale uncertainties discussed below, they directly replace the uncertainty in \(m_{\mathrm {top}}\) from the jet energy scale variations present without the in situ determination.
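The decomposition of the statistical uncertainty into three parts can be sketched with differences in quadrature. The function and argument names are hypothetical; the inputs are the statistical uncertainties in \(m_{\mathrm {top}}\) from the one-, two- and three-dimensional fits.

```python
import math


def quadrature_difference(larger, smaller):
    """Difference in quadrature, sqrt(larger^2 - smaller^2)."""
    if larger < smaller:
        raise ValueError("expected larger >= smaller")
    return math.sqrt(larger * larger - smaller * smaller)


def statistical_components(sigma_1d, sigma_2d, sigma_3d):
    """Split the 3D-fit statistical uncertainty into three parts.

    Returns (purely statistical, JSF contribution, bJSF contribution).
    """
    return (
        sigma_1d,
        quadrature_difference(sigma_2d, sigma_1d),
        quadrature_difference(sigma_3d, sigma_2d),
    )
```

By construction, the sum in quadrature of the three returned components reproduces the statistical uncertainty of the three-dimensional fit.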

Method: The residual difference between fitted and generated \(m_{\mathrm {top}}\) when analysing a template from a MC sample reflects the potential bias of the method. Consequently, the largest observed fitted \(m_{\mathrm {top}}\) residual and the largest observed statistical uncertainty in this quantity, in any of the five signal samples with different assumed values of \(m_{\mathrm {top}}\), are assigned as the method calibration uncertainty and its corresponding statistical uncertainty, respectively. This also covers effects from limited numbers of simulated events in the templates and potential deficiencies in the template parameterizations.

8.2 Modelling of signal processes

The modelling of \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) events incorporates a number of processes that have to be accurately described, resulting in systematic effects, ranging from the \(t\bar{t}\) production to the hadronization of the showered objects.

Thanks to the restrictive event-selection requirements, the contribution of non-\(t\bar{t}\) processes, comprising the single-top-quark process and the various background processes, is very low. The systematic uncertainty in \(m_{\mathrm {top}}\) from the uncertainty in the single-top-quark normalization is estimated from the corresponding uncertainty in the theoretical cross-section given in Sect. 3. The resulting systematic uncertainty is small compared with the systematic uncertainty in the \(t\bar{t}\) production and is consequently neglected. For the modelling of the signal processes, the consequence of including single-top-quark variations in the uncertainty evaluation was investigated for various uncertainty sources and found to be negligible. Therefore, the single-top-quark variations are not included in the determination of the signal event uncertainties.

Signal Monte Carlo generator: The full observed difference in fitted \(m_{\mathrm {top}}\) between the event samples produced with the Powheg-Box and MC@NLO  [82, 83] programs is quoted as a systematic uncertainty. For the renormalization and factorization scales the Powheg-Box sample uses the function given in Sect. 3, while the MC@NLO sample uses \(\mu _\mathrm {R, F} =\sqrt{m_{\mathrm {top}} ^2 + 0.5 (p_{\mathrm{T},t}^2 + p_{\mathrm{T}, \bar{t}}^2)}\). Both samples are generated with a top quark mass of \(m_{\mathrm {top}} =172.5\) \(\text {GeV}\) with the CT10 PDFs in the matrix-element calculation and use the Herwig and Jimmy programs with the ATLAS AUET2 tune [47].
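The scale choice quoted for the MC@NLO sample can be evaluated directly. The function name below is hypothetical; all inputs are in \(\text {GeV}\).

```python
import math


def mcatnlo_scale(mtop, pt_top, pt_antitop):
    """mu_R,F = sqrt(mtop^2 + 0.5 * (pT_t^2 + pT_tbar^2)), in GeV."""
    return math.sqrt(mtop ** 2 + 0.5 * (pt_top ** 2 + pt_antitop ** 2))
```

For top quarks produced at rest in the transverse plane the scale reduces to \(m_{\mathrm {top}}\), and it grows with the transverse momenta of the two top quarks.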

Hadronization: To cover the choice of parton shower and hadronization models, samples produced with the Powheg-Box program are showered with either the Pythia6 program using the P2011C tune or the Herwig and Jimmy programs using the ATLAS AUET2 tune. This includes different approaches in shower modelling, such as using a \(p_{\text {T}}\)-ordered parton showering in the Pythia program or angular-ordered parton showering in the Herwig program, the different parton shower matching scales, as well as fragmentation functions and hadronization models, such as choosing the Lund string model [84, 85] implemented in the Pythia program or the cluster fragmentation model [86] used in the Herwig program. The full observed difference between the samples is quoted as a systematic uncertainty.

As shown in Fig. 1, the distributions of transverse momenta in data are slightly softer than those in the Powheg+Pythia MC simulation samples. Similarly to what was observed in the \(t\bar{t} \rightarrow \mathrm {dilepton}\) channel for the \(p_{\mathrm {T}, \ell b}\) distribution, in the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) channel the Powheg+Herwig sample is much closer to the data for several distributions of transverse momenta. The \(p_{\mathrm {T, had}}\) distribution is much better described by the Powheg+Herwig sample as was also observed in Ref. [78]. In addition, but to a lesser extent, the MC@NLO sample used to assess the signal Monte Carlo generator uncertainty and the samples to assess the initial- and final-state QCD radiation uncertainty discussed next also lead to a softer distribution in simulation. Given this, the observed difference in the \(p_{\mathrm {T, had}}\) distribution is covered by a combination of the signal-modelling uncertainties given in Table 3.

Despite the fact that the \(\mathrm {JES}\) and \(\mathrm {bJES}\) are estimated independently using dijet and other non-\(t\bar{t}\) samples [63], some double-counting of hadronization-uncertainty-induced uncertainties in the \(\mathrm {JES}\) and \(m_{\mathrm {top}}\) cannot be excluded. This was investigated closely for the ATLAS top quark mass measurement in the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) channel at \({\sqrt{s}} =7\) \(\text {TeV}\). The results in Ref. [87] revealed that the amount of double-counting of \(\mathrm {JES}\) and hadronization effects for the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) channel is small.

Initial- and final-state QCD radiation (ISR/FSR): ISR/FSR leads to a higher jet multiplicity and different jet energies than the hard process, which affects the distributions of the three observables. The uncertainties due to ISR/FSR modelling are estimated with samples generated with the Powheg-Box program interfaced to the Pythia6 program for which the parameters of the generation are varied to span the ranges compatible with the results of measurements of \(t\bar{t}\) production in association with jets [88,89,90]. This uncertainty is evaluated by comparing two dedicated samples that differ in several parameters, namely the QCD scale \(\Lambda _\mathrm {QCD}\), the transverse momentum scale for space-like parton-shower evolution \(Q^2_\mathrm {max}\), the \(h_{\mathrm {damp}}\) parameter [91] and the P2012 RadLo and RadHi tunes [30]. In Ref. [90], it was shown that a number of final-state distributions are better accounted for by the Powheg+Pythia samples with \(h_{\mathrm {damp}} =m_{\mathrm {top}} \). Therefore, these samples are used for evaluating this uncertainty, taking half the observed difference between the up variation and the down variation sample. Because the parameterizations for the template fit to data are obtained from Powheg+Pythia samples using \(h_{\mathrm {damp}} =\infty \), it was verified that, considering the method uncertainty quoted in Table 3, applying the same functions to the \(h_{\mathrm {damp}} =m_{\mathrm {top}} \) samples leads to a result compatible with the input top quark mass.

Underlying event: To reduce statistical fluctuations in the evaluation of this systematic uncertainty, the difference in underlying-event modelling is assessed by comparing a pair of Powheg-Box samples based on the same partonic events generated with the CT10 PDFs. A sample with the P2012 tune is compared with a sample with the P2012 mpiHi tune [30], with both tunes using the same CTEQ6L1 PDFs [92] for parton showering and hadronization. The Perugia 2012 mpiHi tune provides more semi-hard multiple parton interactions and is used for this comparison with identical colour reconnection parameters in both tunes. The full observed difference is assigned as a systematic uncertainty.

Colour reconnection: This systematic uncertainty is estimated using a pair of samples with the same partonic events as for the underlying-event uncertainty evaluation but with the P2012 tune and the P2012 loCR tune [30] for parton showering and hadronization. The full observed difference is assigned as a systematic uncertainty.

Parton distribution function (PDF): The PDF systematic uncertainty is the sum in quadrature of three contributions. These are the sum in quadrature of the differences in fitted \(m_{\mathrm {top}}\) for the 26 eigenvector variations of the CT10 PDF and two differences in \(m_{\mathrm {top}}\) obtained from reweighting the central CT10 PDF set to the MSTW2008 PDF [39] and the NNPDF2.3 PDF [42].
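The quadrature sum described above can be sketched in a few lines of Python. This is only an illustration: the function name and all numerical inputs are hypothetical placeholders rather than values from this analysis, and it assumes that each CT10 eigenvector variation has already been reduced to a single difference in fitted \(m_{\mathrm {top}}\).

```python
import math

def pdf_uncertainty(eigenvector_diffs, diff_mstw, diff_nnpdf):
    """Sum in quadrature of the three PDF contributions (all in GeV)."""
    # Contribution 1: quadrature sum over the CT10 eigenvector differences.
    ct10 = math.sqrt(sum(d * d for d in eigenvector_diffs))
    # Contributions 2 and 3: differences from reweighting the central CT10
    # set to two alternative PDF sets.
    return math.sqrt(ct10**2 + diff_mstw**2 + diff_nnpdf**2)

# Hypothetical inputs: 26 eigenvector differences of 0.01 GeV each and
# reweighting differences of 0.03 GeV and 0.04 GeV.
example = pdf_uncertainty([0.01] * 26, 0.03, 0.04)
```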

8.3 Modelling of background processes

Uncertainties in the modelling of the background processes are taken into account by variations of the corresponding normalizations and shapes of the distributions.

Background normalization: The normalizations are varied for the data-driven background estimates according to their uncertainties. For the negligible contribution from diboson production, no normalization uncertainty is evaluated.

Background shape: For the \(W\)+jets background, the shape uncertainty is evaluated from the variation of the heavy-flavour fractions. The corresponding uncertainty is small. Given the very small contribution from \(Z\)+jets, diboson and NP/fake-lepton backgrounds, no shape uncertainty is evaluated for these background sources.

8.4 Detector modelling

The level of understanding of the detector response and of the particle interactions therein is reflected in numerous systematic uncertainties.

Jet energy scale (JES): The \(\mathrm {JES}\) is measured with a relative precision of about \(1\%\) to \(4\%\), typically falling with increasing jet \(p_{\text {T}}\) and rising with increasing jet \(\vert \eta \vert \) [93, 94]. The total \(\mathrm {JES}\) uncertainty consists of more than 60 subcomponents originating from the various steps in the jet calibration. The number of these nuisance parameters is reduced with a matrix diagonalization of the full \(\mathrm {JES}\) covariance matrix including all nuisance parameters for a given category of the \(\mathrm {JES}\) uncertainty components.

The analyses of \({\sqrt{s}} =7\) \(\text {TeV}\) and \({\sqrt{s}} =8\) \(\text {TeV}\) data make use of the EM+JES and LCW+GSC [93] jet calibrations, respectively. The two calibrations feature different sets of nuisance parameters, and the LCW+GSC calibration generally has smaller uncertainties than the EM+JES calibration. While the pile-up correction for the jet calibration for \({\sqrt{s}} =7\) \(\text {TeV}\) data only depends on the number of primary vertices (\(n_\mathrm {vtx}\)) and the mean number of interactions per bunch crossing (\(\mu \)), a pile-up subtraction method based on jet area is introduced for the \({\sqrt{s}} =8\) \(\text {TeV}\) data. Terms to account for uncertainties in the pile-up estimation are added. They depend on the jet \(p_{\text {T}}\) and the local transverse momentum density. In addition, the punch-through uncertainty, i.e. an uncertainty for jets that penetrate through to the muon spectrometer, is added. The final reduced number of nuisance parameters for the \({\sqrt{s}} =8\) \(\text {TeV}\) analysis is 25. The JES-uncertainty-induced uncertainty in \(m_{\mathrm {top}}\) is the dominant systematic uncertainty for all results shown in Table 3. When only a one-dimensional fit to \(m_{\mathrm {top}} ^{\mathrm {reco}}\) or a two-dimensional fit to \(m_{\mathrm {top}} ^{\mathrm {reco}}\) and \(m_{W}^{\mathrm {reco}}\) is done, this uncertainty is \(0.99\)  \(\text {GeV}\) or \(0.74\)  \(\text {GeV}\), respectively.

Relative b-to-light-jet energy scale (bJES): The \(\mathrm {bJES}\) uncertainty is an additional uncertainty for the remaining differences between \(b\text {-jets}\) and light-jets after the global \(\mathrm {JES}\) is applied, and therefore the corresponding uncertainty is uncorrelated with the \(\mathrm {JES}\) uncertainty. An additional uncertainty of \(0.2\%\) to \(1.2\%\) is assigned to \(b\text {-jets}\), with the lowest uncertainty for \(b\text {-jets}\) with high transverse momenta [63]. Due to the determination of \(\mathrm {bJSF}\), the \(\mathrm {bJES}\) uncertainty leads to a very small contribution to the uncertainty in \(m_{\mathrm {top}}\) in Table 3. However, performing only a two-dimensional fit to \(m_{\mathrm {top}} ^{\mathrm {reco}}\) and \(m_{W}^{\mathrm {reco}}\) would result in an uncertainty of \(0.47\)  \(\text {GeV}\) from this source.

Jet energy resolution (JER): The JER uncertainty is determined following an eigenvector decomposition strategy similar to the \(\mathrm {JES}\) systematic uncertainties [93, 94]. The 11 components take into account various effects evaluated from simulation-to-data comparisons including calorimeter noise terms in the forward region. The corresponding uncertainty in \(m_{\mathrm {top}}\) is the sum in quadrature of the components of the eigenvector decomposition.

Jet reconstruction efficiency (JRE): This uncertainty is evaluated by randomly removing \(0.23\%\) of the jets with \(p_{\text {T}} < 30\) \(\text {GeV}\) from the simulated events prior to the event selection to reflect the precision with which the data-to-simulation JRE ratio is known [62]. The fitted \(m_{\mathrm {top}}\) difference between the varied sample and the nominal sample is taken as a systematic uncertainty.

Jet vertex fraction (JVF): When summing the scalar \(p_{\text {T}}\) of all tracks in a jet, the JVF is the fraction contributed by tracks originating at the primary vertex. The uncertainty in \(m_{\mathrm {top}}\) is evaluated by varying the requirement on the JVF within its uncertainty [65].

\(\varvec{b}\)-tagging: Mismodelling of the \(b\text {-tagging}\) efficiency and mistag rate is accounted for by the application of jet-specific scale factors to simulated events [66]. These scale factors depend on jet \(p_{\text {T}}\), jet \(\eta \) and the underlying quark flavour. The ones used in this analysis are derived from dijet and \(t\bar{t} \rightarrow \mathrm {dilepton}\)  [66] events. They are the same as those used for the measurement of \(m_{\mathrm {top}}\) in the \(t\bar{t} \rightarrow \mathrm {dilepton}\) channel [14]. Similarly to the \(\mathrm {JES}\) uncertainties, the \(b\text {-tagging}\) uncertainties are estimated by using an eigenvector approach, based on the \(b\text {-tagging}\) calibration analysis [66,67,68]. They include the uncertainties in the \(b\text {-tagging}\), \(c/\tau \)-tagging and mistagging scale factors. This uncertainty in \(m_{\mathrm {top}}\) is derived by varying the scale factors within their uncertainties and adding the resulting fitted differences in quadrature. In this procedure, uncertainties that are considered both in the \(b\text {-tagging}\) calibration and as separate sources in the \(m_{\mathrm {top}}\) analysis are taken into account simultaneously by applying the corresponding varied \(b\text {-tagging}\) scale factors together with the varied sample when assessing the corresponding uncertainty in \(m_{\mathrm {top}}\). The final uncertainty is the sum in quadrature of these independent components. Compared with the result from \({\sqrt{s}} =7\) \(\text {TeV}\) data, this uncertainty is reduced by about one third for both the standard and \(\mathrm {BDT}\) event selections in accordance with the improvements made in the calibrations of the \(b\text {-tagging}\) algorithm [66, 67].

Leptons: The lepton uncertainties are related to the electron energy or muon momentum scale and resolution, as well as trigger, isolation and identification efficiencies. These are measured very precisely in high-purity \(J/\psi \rightarrow \ell ^{+} \ell ^{-} \) and \(Z\rightarrow \ell ^{+} \ell ^{-} \) data [57, 58, 95]. For each component, the corresponding uncertainty is propagated to the analysis by variation of the respective quantity. The changes are propagated to the \(E_{\text {T}}^{\text {miss}}\) as well.

Missing transverse momentum: The remaining contribution to the missing-transverse-momentum uncertainty stems from the uncertainties in calorimeter-cell energies associated with low-\(p_{\text {T}}\) jets (\(7~\text {GeV}< p_{\text {T}} < 20~\text {GeV}\)) without any corresponding reconstructed physics object or from pile-up interactions. They are accounted for as described in Ref. [69]. The corresponding uncertainty in \(m_{\mathrm {top}}\) is small.

Pile-up: Besides the component treated in the \(\mathrm {JES}\) uncertainty, the residual dependence of the fitted \(m_{\mathrm {top}}\) on the amount of pile-up activity and a possible mismodelling of pile-up in MC simulation is determined. For this, the \(m_{\mathrm {top}}\) dependence in bins of \(n_\mathrm {vtx}\) and \(\mu \) is determined for data and MC simulated events. Within the statistical uncertainties, the slopes of the linear dependences of \(m_{\mathrm {top}}\) observed in data and predicted by the MC simulation are compatible. The same is true for \(\mathrm {JSF}\) and \(\mathrm {bJSF}\). The final effect on the measurement is assessed by a convolution of the linear dependence with the respective \(n_\mathrm {vtx}\) and \(\mu \) distributions observed for data and MC simulated events. The maximum of the \(n_\mathrm {vtx}\) and \(\mu \) effects is assigned as an uncertainty due to pile-up. The pile-up conditions differ between the \({\sqrt{s}} =7\) and 8 \(\text {TeV}\) data. For the BDT selection of \({\sqrt{s}} =8\) \(\text {TeV}\) data used here, the average of the mean number of inelastic \(pp\) interactions per bunch crossing is \(\langle \mu \rangle =20.3\) and the average number of reconstructed primary vertices is about \(n_\mathrm {vtx} =9.4\), to be compared with \(\langle \mu \rangle =8.8\) and \(n_\mathrm {vtx} =7.0\) for \({\sqrt{s}} =7\) \(\text {TeV}\) data [65]. The corresponding uncertainty is somewhat larger than for \({\sqrt{s}} =7\) \(\text {TeV}\) data but still small.

8.5 Statistical precision of systematic uncertainties

The systematic uncertainties quoted in Table 3 carry statistical uncertainties themselves. In view of a combination with other measurements, the statistical precision \(\sigma \) from a comparison of two samples (1 and 2) is determined for each uncertainty source based on the statistical correlation \(\rho _{12}\) of the underlying samples using \(\sigma ^{2} = \sigma ^{2}_{1} +\sigma ^{2}_{2}-2\rho _{12} \sigma _{1} \sigma _{2} \). The statistical correlation is expressed as a function of the fraction of shared events of both samples \(\rho _{12} =\sqrt{N_{\mathrm {12}}/N_{\mathrm {1}} \cdot N_{\mathrm {12}}/N_{\mathrm {2}}}=N_{\mathrm {12}}/\sqrt{N_{\mathrm {1}} \cdot N_{\mathrm {2}}}\), with \(N_{\mathrm {1}} \) and \(N_{\mathrm {2}} \) being the unweighted numbers of events in the two samples and \(N_{\mathrm {12}} \) being the unweighted number of events present in both samples. The size of the MC sample at \(m_{\mathrm {top}} =172.5\) \(\text {GeV}\) results in a statistical precision in \(m_{\mathrm {top}}\) of about 0.1 \(\text {GeV}\). Most estimates are based on the same sample with only a single parameter changed, as for the lepton energy scale uncertainties. This leads to a high correlation of the central \(m_{\mathrm {top}}\) values and a correspondingly low statistical uncertainty in their difference. Others, which do not share the same generated events or exhibit other significant differences, have a lower correlation, and the corresponding statistical uncertainty is higher, such as in the case of the signal-modelling uncertainty. The statistical uncertainty in the total systematic uncertainty is calculated from the individual statistical uncertainties by the propagation of uncertainties.
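As a minimal sketch of the two formulas in this section, the following Python snippet computes the shared-event correlation and the resulting statistical precision of a difference between two samples. The function names and the event counts are assumptions chosen for illustration; only the 0.1 GeV per-sample precision is taken from the text.

```python
import math

def shared_event_correlation(n1, n2, n12):
    """rho_12 = N12 / sqrt(N1 * N2) from unweighted event counts."""
    return n12 / math.sqrt(n1 * n2)

def stat_precision_of_difference(sigma1, sigma2, rho12):
    """Statistical uncertainty in the difference of two fitted values."""
    variance = sigma1**2 + sigma2**2 - 2.0 * rho12 * sigma1 * sigma2
    return math.sqrt(max(variance, 0.0))  # guard against rounding below zero

# Hypothetical case: two samples of equal size sharing 90% of their events,
# each with a statistical precision of 0.1 GeV in m_top.
rho = shared_event_correlation(1000, 1000, 900)
sigma = stat_precision_of_difference(0.1, 0.1, rho)
```

With these assumed inputs the correlation is 0.9, and the precision of the difference is well below the 0.14 GeV that two fully independent samples would give, which is the behaviour described above for variations of a single parameter.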

9 Results

For the \(\mathrm {BDT}\) selection, the likelihood fit to the data results in

$$\begin{aligned} m_{\mathrm {top}}&= 172.08 \pm 0.39 \,\mathrm {(stat)}\,\text {GeV}, \\ \mathrm {JSF}&= 1.005 \pm 0.001 \,\mathrm {(stat)}, \quad \text {and} \\ \mathrm {bJSF}&= 1.008 \pm 0.005 \,\mathrm {(stat)}. \end{aligned}$$

The statistical uncertainties are taken from the parabolic approximation of the likelihood profiles. The expected statistical uncertainties, calculated in Sect. 7, are compatible with those. The correlation matrices of the three variables with i = 0, 1 and 2 corresponding to \(m_{\mathrm {top}}\), \(\mathrm {JSF}\) and \(\mathrm {bJSF}\) are

$$\begin{aligned} \rho _{\mathrm {stat}}&= \left( \begin{array}{rrr} 1 &{} -0.27 &{} -0.92 \\ -0.27 &{} 1 &{} -0.02 \\ -0.92 &{} -0.02 &{} 1 \end{array} \right) \quad \text{ and }\\ \rho _{\mathrm {tot}}&= \left( \begin{array}{rrr} 1 &{} -0.30 &{} -0.39 \\ -0.30 &{} 1 &{} -0.42 \\ -0.39 &{} -0.42 &{} 1 \end{array} \right) . \end{aligned}$$

The upper matrix corresponds to the correlations for statistical uncertainties only, while the lower matrix is obtained by additionally taking into account all systematic uncertainties.

Figure 6 shows the \(m_{\mathrm {top}} ^{\mathrm {reco}}\), \(m_{W}^{\mathrm {reco}}\) and \(R_{bq}^{\mathrm {reco}}\) distributions in the data with statistical uncertainties together with the corresponding fitted probability density functions for the background alone and for the sum of signal and background. The uncertainty band attached to the fit to data is obtained in the following way. At each point in \(m_{\mathrm {top}} ^{\mathrm {reco}}\), \(m_{W}^{\mathrm {reco}}\) and \(R_{bq}^{\mathrm {reco}}\), the band contains 68\(\%\) of all fit function values obtained by randomly varying \(m_{\mathrm {top}}\), \(\mathrm {JSF}\) and \(\mathrm {bJSF}\) within their total uncertainties and taking into account their correlations. The waist in the uncertainty band is caused by the usage of normalized probability density functions. The band visualises the variations of the three template fit functions caused by all the uncertainties in \(m_{\mathrm {top}}\) listed in Table 3. The total uncertainty in all three fitted parameters is dominated by their systematic uncertainty. Therefore, the band shown is much wider than the band that would be obtained by fitting to the distributions with statistical uncertainties only.
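The construction of such a band at a single point can be sketched as follows. This is an illustrative toy, not the procedure used to produce Fig. 6: the helper names, the number of toys and the total uncertainties assigned to \(\mathrm {JSF}\) and \(\mathrm {bJSF}\) are assumptions, and `fit_function` stands for any user-supplied template function. The correlation matrix is \(\rho _{\mathrm {tot}}\) as quoted above.

```python
import math
import random

RHO_TOT = [[1.0, -0.30, -0.39],
           [-0.30, 1.0, -0.42],
           [-0.39, -0.42, 1.0]]

def cholesky(matrix):
    """Lower-triangular L with L L^T = matrix (symmetric positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def band_at_point(fit_function, x, central, sigmas, rho, n_toys=2000, seed=1):
    """Central 68% interval of fit_function(x, params) over correlated toys."""
    n = len(central)
    rng = random.Random(seed)
    cov = [[rho[i][j] * sigmas[i] * sigmas[j] for j in range(n)] for i in range(n)]
    lower = cholesky(cov)
    values = []
    for _ in range(n_toys):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlated parameter set: central + L z.
        params = [central[i] + sum(lower[i][k] * z[k] for k in range(i + 1))
                  for i in range(n)]
        values.append(fit_function(x, params))
    values.sort()
    return values[int(0.16 * n_toys)], values[int(0.84 * n_toys)]
```

Repeating the call for each point of an observable traces out the band; a normalized `fit_function` would then produce the waist mentioned above.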

Fig. 6

Results of the likelihood fit to the data. The figures show the data distributions of the three observables with statistical uncertainties together with the fitted probability density function for the background alone (barely visible at the bottom of the figure) and for the sum of signal and background. The uncertainty band corresponds to the one standard deviation total uncertainty in the fit function. It is based on the total uncertainty in the three fitted parameters as explained in the text. a shows the distribution of the reconstructed top quark mass \(m_{\mathrm {top}} ^{\mathrm {reco}}\), b shows the distribution of the reconstructed \(W\) boson mass \(m_{W}^{\mathrm {reco}}\) and c shows the reconstructed ratio of jet transverse momenta \(R_{bq}^{\mathrm {reco}}\)

The measured value of \(m_{\mathrm {top}}\) in the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) channel at \({\sqrt{s}} =8\) \(\text {TeV}\) is

$$\begin{aligned} m_{\mathrm {top}}&=172.08 \,\pm \,0.39 \,\mathrm {(stat)}\,\pm \,0.82 \,\mathrm {(syst)} ~\text {GeV}\end{aligned}$$

with a total uncertainty of \(0.91\)  \(\text {GeV}\). The statistical precision of the systematic uncertainty is \(0.06\)  \(\text {GeV}\). This result corresponds to a \(19\%\) improvement on the result obtained using the standard selection on the same data. Compared with the result in the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) channel at \({\sqrt{s}} =7\) \(\text {TeV}\), the improvement is \(29\%\). On top of the smaller statistical uncertainty, the increased precision is mainly driven by smaller theory modelling uncertainties achieved by the \(\mathrm {BDT}\) selection. The larger number of events in the \({\sqrt{s}} =8\) \(\text {TeV}\) dataset is effectively traded for lower systematic uncertainties, resulting in a significant gain in total precision. The new ATLAS result in the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) channel is more precise than the result from the CDF experiment, but less precise than the CMS and D0 results, measured in the same channel, as shown in Fig. 14b in Appendix B.
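The quoted total uncertainty and the improvement relative to the \({\sqrt{s}} =7\) \(\text {TeV}\) result can be cross-checked with a short calculation. The sketch below assumes that the total uncertainty is the sum in quadrature of the statistical and systematic components and that the improvement is the relative reduction in total uncertainty. The 7 TeV inputs are those quoted for ATLAS [9] in the Introduction, and the variable names are illustrative.

```python
import math

def total_uncertainty(stat, syst):
    """Sum in quadrature of statistical and systematic uncertainties (GeV)."""
    return math.hypot(stat, syst)

total_8tev = total_uncertainty(0.39, 0.82)  # this result
total_7tev = total_uncertainty(0.75, 1.03)  # 7 TeV lepton+jets result [9]
improvement = 1.0 - total_8tev / total_7tev

print(f"{total_8tev:.2f} GeV, {total_7tev:.2f} GeV, {improvement:.0%}")
# prints "0.91 GeV, 1.27 GeV, 29%"
```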

10 Combination with previous ATLAS results

This section presents the combination of the six \(m_{\mathrm {top}}\) results of the ATLAS analyses in the \(t\bar{t} \rightarrow \mathrm {dilepton}\), \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) and \(t\bar{t} \rightarrow \mathrm {all\,jets}\) channels at centre-of-mass energies of \({\sqrt{s}} =7\) and 8 \(\text {TeV}\). The treatment of the results that are input to the combinations is described, followed by a detailed explanation of the evaluation of the estimator correlations for the various sources of systematic uncertainty. The compatibilities of the measured \(m_{\mathrm {top}}\) values are investigated using a pairwise \(\chi ^2 \) for all pairs of measurements and by evaluating the compatibility of selected combinations. Finally, the six results are combined, displaying the effect of individual results on the combined result.

10.1 Inputs to the combination and categorization of uncertainties

The measured values of the individual analyses and their statistical and systematic uncertainties are given in Table 4. For each result, the evaluated systematic uncertainties are shown together with their statistical uncertainties. The statistical uncertainties in the total systematic uncertainties and the total uncertainties are obtained from the propagation of uncertainties.\(^{7}\)

For the combinations to follow, the combined uncertainties for the previous results, namely \(t\bar{t} \rightarrow \mathrm {dilepton} \) and \(t\bar{t} \rightarrow \mathrm {lepton+jets} \) at \({\sqrt{s}} =7\) \(\text {TeV}\) from Ref. [9], \(t\bar{t} \rightarrow \mathrm {all\,jets} \) at \({\sqrt{s}} =7\) \(\text {TeV}\) from Ref. [96], \(t\bar{t} \rightarrow \mathrm {dilepton} \) at \({\sqrt{s}} =8\) \(\text {TeV}\) from Ref. [14] and \(t\bar{t} \rightarrow \mathrm {all\,jets} \) at \({\sqrt{s}} =8\) \(\text {TeV}\) from Ref. [74], were all re-evaluated. In all cases, the numbers agree to within 0.01 \(\text {GeV}\) with the original publications, which in any case is the rounding precision due to the precision of some of the inputs. On top of this, the results listed in Table 4 differ in some aspects from the original publications as explained below.

Table 4 The six measured values of \(m_{\mathrm {top}}\) (\(i=0, \ldots , 5\)) and their statistical and systematic uncertainty sources k, numbered as given in the first column. For the individual measurements, the systematic uncertainty in \(m_{\mathrm {top}}\) and its associated statistical uncertainty are given for each source of uncertainty. The last line refers to the sum in quadrature of the statistical and systematic uncertainties. The statistical uncertainties in the total systematic uncertainties and in the total uncertainties are calculated from the propagation of uncertainties. Systematic uncertainties listed as 0 are not evaluated, while an empty cell indicates an uncertainty not applicable to the corresponding measurement. Statistical uncertainties quoted as 0.00 are smaller than 0.005

The combination follows the approach developed for the combination of \({\sqrt{s}} =7\) \(\text {TeV}\) analyses in Ref. [9], including the evaluation of the correlations given in Sect. 10.2 below. The treatment of uncertainty categories for the \(t\bar{t} \rightarrow \mathrm {dilepton}\) and \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) measurements at \({\sqrt{s}} =7\) \(\text {TeV}\) exactly follows Ref. [9]. The uncertainty categorizations for the \(t\bar{t} \rightarrow \mathrm {all\,jets}\) measurements at \({\sqrt{s}} =7\) and 8 \(\text {TeV}\) from Refs. [74, 96] closely follow this categorization but have some extra, analysis-specific sources of uncertainty, as shown in Table 4. In addition, the \(t\bar{t} \rightarrow \mathrm {all\,jets}\) result at \({\sqrt{s}} =8\) \(\text {TeV}\) from Ref. [74] is based on a different treatment of the PDF-uncertainty-induced uncertainty in \(m_{\mathrm {top}}\). To allow the evaluation of the estimator correlations also for this uncertainty in \(m_{\mathrm {top}}\), for this combination, the respective uncertainty is newly evaluated according to the prescription given in Sect. 8.

For the \(t\bar{t} \rightarrow \mathrm {all\,jets}\) result at \({\sqrt{s}} =7\) \(\text {TeV}\), the statistical precisions in the systematic uncertainties were not evaluated in Ref. [96] but were calculated for this combination. For the \(t\bar{t} \rightarrow \mathrm {all\,jets}\) result at \({\sqrt{s}} =8\) \(\text {TeV}\) in Ref. [74], the statistical uncertainty in the systematic uncertainty was not evaluated for some of the sources, such that the quoted statistical uncertainty in the total systematic uncertainty is a lower limit.

For the mapping of uncertainty categories for data taken at different centre-of-mass energies, the choice of Ref. [14] is employed. The most complex cases are the uncertainties involving eigenvector decompositions, such as the \(\mathrm {JES}\) and \(b\text {-tagging}\) scale factor uncertainties, and the uncertainty categories that do not apply to all input measurements. The JES-uncertainty-induced uncertainty in \(m_{\mathrm {top}}\) is obtained from a number of \(\mathrm {JES}\) subcomponents. Some \(\mathrm {JES}\) subcomponents have an equivalent at the other centre-of-mass energy and others do not. As in Ref. [14], the \(\mathrm {JES}\) subcomponents without an equivalent at the other centre-of-mass energy are treated as independent, resulting in vanishing estimator correlations for that part of the covariance matrix. For the remaining subcomponents, the estimator correlations are partly positive and partly negative. As an example, for the flavour part of the JES-uncertainty-induced uncertainty in \(m_{\mathrm {top}}\), the two most precise results, the \(t\bar{t} \rightarrow \mathrm {dilepton}\) and \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) measurements at \({\sqrt{s}} =8\) \(\text {TeV}\), are negatively correlated. Consequently, for this pair, the resulting estimator correlation for the total JES-induced uncertainty in \(m_{\mathrm {top}}\) is also negative. At the quoted precision, the two assumptions about the equivalence of the \(\mathrm {JES}\) subcomponents between the datasets at the two centre-of-mass energies, i.e. the weak and strong correlation scenarios described in Table 10 in Appendix A, leave the combined value and uncertainty unchanged.

Following Ref. [14], the \({\sqrt{s}} =7\) and 8 \(\text {TeV}\) measurements are treated as uncorrelated for the nuisance parameters of the \(b\text {-tagging}\), \(c/\tau \)-tagging, mistagging and JER uncertainties. In Ref. [14] it was shown that a correlated treatment of the flavour-tagging nuisance parameters results in an insignificant change in the combination. For the statistical, method calibration, MC-based background shape at \({\sqrt{s}} =7\) and 8 \(\text {TeV}\), and the pile-up uncertainties in \(m_{\mathrm {top}}\), the measurements are assumed to be uncorrelated. Details of the evaluation of the correlations for all remaining systematic uncertainties are discussed below.

10.2 Mathematical framework and evaluation of estimator correlations

All combinations are performed using the best linear unbiased estimate (BLUE) method [97, 98] in a C++ implementation described in Ref. [99]. The BLUE method uses a linear combination of the inputs to combine measurements. The coefficients (BLUE weights) are determined via the minimization of the variance of the combined result. They can be used to construct measures for the importance of a given single measurement in the combination [98]. For any combination, the measured values \(x_{i}\), the list of uncertainties \(\sigma _{ik}\) and the correlations \(\rho _{ijk}\) of the estimators (ij) for each source of uncertainty (k) have to be provided. For all uncertainties, a Gaussian probability distribution function is assumed. For the uncertainties in \(m_{\mathrm {top}}\) for which the measurements are correlated, when using \(\pm 1\sigma \) variations of a systematic effect, e.g. when changing the \(\mathrm {bJES}\) by \(\pm 1\sigma \), there are two possibilities. When simultaneously applying a variation for a systematic uncertainty, e.g. \(+1\sigma \) for the \(\mathrm {bJES}\), to a pair (ij) of measurements, e.g. the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) and \(t\bar{t} \rightarrow \mathrm {dilepton}\) measurements at \({\sqrt{s}} =8\) \(\text {TeV}\), both analyses can result in a larger or smaller \(m_{\mathrm {top}}\) value than the one obtained for the nominal case (full correlation, \(\rho _{ijk} =+1\)), or one analysis can result in a larger and the other in a smaller value (full anti-correlation, \(\rho _{ijk} =-1\)). Consequently, an uncertainty from a source only consisting of a single variation, such as the bJES-uncertainty-induced uncertainty or the uncertainty related to the choice of MC generator for signal events, results in a correlation of \(\rho _{ijk} =\pm 1\). The estimator correlations for composite uncertainties are evaluated by calculating the correlation from the subcomponents. As an example, for the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) result at \({\sqrt{s}} =8\) \(\text {TeV}\), the subcomponents of the \(\mathrm {JES}\) uncertainty are shown in Table 10 in Appendix A. For any pair of measurements (ij), this evaluation is done by adding the covariance terms of the subcomponents k with \(\rho _{ijk} =\pm 1\) and dividing by the total uncertainties for that source. The resulting estimator correlation is

$$\begin{aligned} \rho _{ij}&= \frac{\sum _{k=1}^{N_\mathrm {comp}}\rho _{ijk} \sigma _{ik} \sigma _{jk}}{\sigma _{i} \sigma _{j}}. \end{aligned}$$

The quantity \(\sigma ^{2}_{i} =\sum _{k=1}^{N_\mathrm {comp}}\sigma ^{2}_{ik} \) is the sum of the single subcomponent variances in analysis i. This procedure is applied to all uncertainty sources that consist of more than one subcomponent to reduce the large list of uncertainty subcomponents per estimator of \(\mathcal O\)(100) to a suitable number of uncertainty sources, i.e. to those given in Table 4. Since the full covariance matrix is independent of how the subsets are chosen, this does not affect the combination.
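The evaluation of an estimator correlation from subcomponents can be illustrated with a short Python sketch. It assumes that, for each subcomponent, the signed shifts in \(m_{\mathrm {top}}\) of the two analyses under the same variation are available, so that \(\rho _{ijk} \sigma _{ik} \sigma _{jk}\) equals the product of the two signed shifts. The function name and the example shifts are hypothetical.

```python
import math

def estimator_correlation(shifts_i, shifts_j):
    """Combine signed per-subcomponent shifts (GeV) into one correlation."""
    # Each subcomponent contributes rho_ijk * sigma_ik * sigma_jk with
    # rho_ijk = +1 or -1, which equals the product of the signed shifts.
    covariance = sum(a * b for a, b in zip(shifts_i, shifts_j))
    sigma_i = math.sqrt(sum(a * a for a in shifts_i))
    sigma_j = math.sqrt(sum(b * b for b in shifts_j))
    if sigma_i == 0.0 or sigma_j == 0.0:
        # The correlation cannot be evaluated if one uncertainty vanishes.
        raise ValueError("correlation undefined for a vanishing uncertainty")
    return covariance / (sigma_i * sigma_j)

# Hypothetical pair: the first subcomponent is fully correlated, the second
# fully anti-correlated, giving a negative net correlation of about -0.28.
rho_ij = estimator_correlation([0.3, -0.4], [0.3, 0.4])
```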

Fig. 7

The pairwise shifts in \(m_{\mathrm {top}}\) when simultaneously varying a pair of measurements for a systematic uncertainty or a subcomponent of a systematic uncertainty. a, b show the correlations of the \(t\bar{t} \rightarrow \mathrm {dilepton}\) measurement at \({\sqrt{s}} =8\) \(\text {TeV}\) with the two other measurements at the same centre-of-mass energy for all sources of uncertainty for which the estimators are correlated. c shows the correlations of the present measurement with the \(t\bar{t} \rightarrow \mathrm {all\,jets}\) measurement at \({\sqrt{s}} =8\) \(\text {TeV}\), while d shows the correlations of the present measurement with the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) measurement at \({\sqrt{s}} =7\) \(\text {TeV}\). The crosses indicate the statistical uncertainty in the systematic uncertainties. The solid points indicate the fully correlated cases, and the open points indicate the anti-correlated ones

For the three analyses, the evaluated shifts in \(m_{\mathrm {top}}\) per uncertainty subcomponent are referred to as \(\Delta m_{\mathrm {top}}^{\mathrm {dilepton}}\), \(\Delta m_{\mathrm {top}}^{\mathrm {\ell +jets}}\) and \(\Delta m_{\mathrm {top}}^{\mathrm {all\,jets}}\). They are shown in Fig. 7 for the various uncertainty subcomponents in selected pairs of analyses. The pairs using the results from \({\sqrt{s}} =8\) \(\text {TeV}\) data are shown in Fig. 7a–c, while Fig. 7d is for the two analyses in the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) channel at the two centre-of-mass energies. Each point represents the observed shifts for a systematic uncertainty or a subcomponent of a systematic uncertainty together with a cross, indicating the corresponding statistical precision in the systematic uncertainty in the two results. The solid points indicate the fully correlated cases, and the open points indicate the anti-correlated ones.\(^{8}\)

For many significant sources of uncertainty in Fig. 7a, the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) and \(t\bar{t} \rightarrow \mathrm {dilepton}\) measurements are anti-correlated. As shown in Ref. [9], this is caused by the in situ determination of the \(\mathrm {JSF}\) and \(\mathrm {bJSF}\) in the three-dimensional \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) analysis. In contrast, for most sources of uncertainty, a positive estimator correlation is observed for the \(t\bar{t} \rightarrow \mathrm {dilepton}\) and \(t\bar{t} \rightarrow \mathrm {all\,jets}\) measurements at \({\sqrt{s}} =8\) \(\text {TeV}\), shown in Fig. 7b. The prominent exception is the hadronization-uncertainty-induced uncertainty in \(m_{\mathrm {top}}\), i.e. the single largest uncertainty in the \(t\bar{t} \rightarrow \mathrm {all\,jets}\) measurement at \({\sqrt{s}} =8\) \(\text {TeV}\), for which the two measurements are anti-correlated. On the contrary, the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) and \(t\bar{t} \rightarrow \mathrm {all\,jets}\) measurements at \({\sqrt{s}} =8\) \(\text {TeV}\), shown in Fig. 7c, are positively correlated for this uncertainty. Finally, the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) measurements at the two centre-of-mass energies in Fig. 7d show a rather low correlation. The correlations per source of uncertainty and the total estimator correlations are summarized in Table 5.

Table 5 The pairwise correlations \(\rho _{ijk}\) of the six measurements \(i, j=0, \ldots , 5\) of \(m_{\mathrm {top}}\) for each source of systematic uncertainty \(k=0, \ldots , 22\), along with the total estimator correlations and the compatibility of the measurements using \(\chi ^2_{ij}\) from Eq. (2). The indices i and j are 0 for \(t\bar{t} \rightarrow \mathrm {dilepton} \) at \({\sqrt{s}} =7\) \(\text {TeV}\), 1 for \(t\bar{t} \rightarrow \mathrm {lepton+jets} \) at \({\sqrt{s}} =7\) \(\text {TeV}\), 2 for \(t\bar{t} \rightarrow \mathrm {all\,jets} \) at \({\sqrt{s}} =7\) \(\text {TeV}\), 3 for \(t\bar{t} \rightarrow \mathrm {dilepton} \) at \({\sqrt{s}} =8\) \(\text {TeV}\), 4 for \(t\bar{t} \rightarrow \mathrm {lepton+jets} \) at \({\sqrt{s}} =8\) \(\text {TeV}\), and 5 for \(t\bar{t} \rightarrow \mathrm {all\,jets} \) at \({\sqrt{s}} =8\) \(\text {TeV}\). The correspondence of the indices \(k=0, \ldots , 22\) and the sources of systematic uncertainty are given in Table 4. Correlations that are assigned, or cannot be evaluated because one uncertainty in the covariance term is zero at the quoted precision, are given as integer values, while evaluated correlations are shown as real values

The improvement in the combination obtained by the use of evaluated correlations compared with using estimator correlations assigned solely by physics assessments (here referred to as assigned correlations) is quantified using an example. Using the choices of assigned correlations from Ref. [11] for the ATLAS results in the \(t\bar{t} \rightarrow \mathrm {dilepton}\) and \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) channels at \({\sqrt{s}} =7\) \(\text {TeV}\) listed in Table 4 gives a combined value of \(m_{\mathrm {top}} =172.91 \,\pm \,0.50 \,\mathrm {(stat)}\,\pm \,1.05 \,\mathrm {(syst)} \) \(\text {GeV}\) compared with \(m_{\mathrm {top}} =172.99 \,\pm \,0.48 \,\mathrm {(stat)}\,\pm \,0.78 \,\mathrm {(syst)} \) \(\text {GeV}\). The significant improvement in the precision of the combination demonstrates the particular importance of evaluating the correlations.
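The sensitivity of a combination to the estimator correlation can be seen in the simplest case of two measurements. The sketch below uses the standard closed-form BLUE weights for two inputs with total uncertainties and a single total correlation. It is not the C++ implementation of Ref. [99], the function name is an assumption, and the input numbers are hypothetical rather than those of Table 4. The degenerate case of identical uncertainties with \(\rho =1\) is not handled.

```python
import math

def blue_two(x1, s1, x2, s2, rho):
    """BLUE combination of two measurements with total uncertainties s1, s2."""
    denom = s1**2 + s2**2 - 2.0 * rho * s1 * s2
    # Weights minimizing the variance of w1*x1 + w2*x2 with w1 + w2 = 1.
    w1 = (s2**2 - rho * s1 * s2) / denom
    w2 = 1.0 - w1
    value = w1 * x1 + w2 * x2
    variance = w1**2 * s1**2 + w2**2 * s2**2 + 2.0 * w1 * w2 * rho * s1 * s2
    return value, math.sqrt(variance), (w1, w2)

# Hypothetical inputs: two results of equal precision (1 GeV) combined under
# three different assumptions about their correlation.
for rho in (0.5, 0.0, -0.5):
    value, sigma, weights = blue_two(173.0, 1.0, 172.0, 1.0, rho)
    print(f"rho = {rho:+.1f}: {value:.2f} +- {sigma:.2f} GeV")
```

In this toy case the combined uncertainty falls from about 0.87 GeV to 0.50 GeV as the correlation goes from \(+0.5\) to \(-0.5\), which illustrates why evaluated (and partly negative) correlations can improve the precision of a combination.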

For the combinations presented in this paper, most estimator correlations could be evaluated. The most prominent exception is for the \(b\text {-tagging}\) uncertainty, where the \(t\bar{t} \rightarrow \mathrm {all\,jets} \) measurement at \({\sqrt{s}} =8\) \(\text {TeV}\) is based on a different \(b\text {-tagging}\) algorithm and calibration than the \(t\bar{t} \rightarrow \mathrm {dilepton}\) and \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) measurements at \({\sqrt{s}} =8\) \(\text {TeV}\). It was verified that assignments of the estimator correlations of \(\rho _{i5k} \in [-1, 1]\), with \(i=3, 4\) and \(k=17\), yield insignificant differences in the full combination. Estimator correlations of \(\rho _{i5k} = 1\) are assigned for this case, as this choice gives the largest uncertainty in the combination. A similar situation arises for the data-driven all-jets background uncertainty in the two \(t\bar{t} \rightarrow \mathrm {all\,jets}\) measurements, where the method used for the background estimate is similar but not identical for the two measurements. Consequently, the conservative ad hoc assignment of \(\rho _{25k} =1\) was also made for this source \(k=11\).

10.3 Compatibility of the inputs and selected combinations

Before any combination is performed, the compatibility of the input results is verified. For each pair of results, their compatibility is expressed by the ratio of the squared difference between the pair of measured values to the squared uncertainty in this difference [98] as

$$\begin{aligned} \chi ^2_{ij}&= \frac{(x_i-x_j)^2}{\sigma _i^2+\sigma _j^2-2\rho _{ij} \sigma _i\sigma _j}. \end{aligned}$$
(2)

The corresponding values are given in Table 5. Analysing the \(\chi ^2_{ij}\) values reveals good \(\chi ^2\) probabilities, with the smallest \(\chi ^2\) probability being \(P(\chi ^2,1) =15\% \). The largest sum of \(\chi ^2_{ij}\) values by far is observed for the \(t\bar{t} \rightarrow \mathrm {all\,jets}\) result at \({\sqrt{s}} =7\) \(\text {TeV}\).
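As a minimal sketch of how Eq. (2) and the associated probability \(P(\chi ^2,1)\) can be evaluated, the following Python snippet uses only the standard library. The function names and the numerical inputs are illustrative assumptions and are not taken from Table 5; the conversion to a probability uses the exact relation for one degree of freedom.

```python
import math


def chi2_pair(x_i, x_j, sigma_i, sigma_j, rho_ij):
    """Eq. (2): squared difference divided by the variance of the difference."""
    var_diff = sigma_i**2 + sigma_j**2 - 2.0 * rho_ij * sigma_i * sigma_j
    return (x_i - x_j) ** 2 / var_diff


def chi2_prob_1dof(chi2):
    """Upper-tail chi-square probability for one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical inputs in GeV, chosen only for illustration.
chi2 = chi2_pair(173.0, 172.1, 0.85, 0.91, -0.19)
print(f"chi2 = {chi2:.2f}, P(chi2, 1) = {chi2_prob_1dof(chi2):.2f}")
```

With this relation, a probability of \(15\%\) corresponds to \(\chi ^2_{ij} \approx 2.07\).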

The dependences of the combined values and their uncertainties on the total correlation for pairwise combinations of results are analysed. The dependences for pairs of the three results from \({\sqrt{s}} =8\) \(\text {TeV}\) data are shown in Fig. 8. The largest information gain is achieved by combining the \(t\bar{t} \rightarrow \mathrm {dilepton}\) and \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) results at \({\sqrt{s}} =8\) \(\text {TeV}\), shown in Fig. 8a, b, which are anti-correlated, i.e. \(\rho =-0.19 \).

Fig. 8 The combined values (left) and uncertainties (right) of the combination of pairs of individual results at \({\sqrt{s}} =8\) \(\text {TeV}\), shown as functions of the total correlation \(\rho \) (solid lines). The combination of the \(t\bar{t} \rightarrow \mathrm {dilepton}\) and \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) results is shown in the top row. The middle row is for the combination of the \(t\bar{t} \rightarrow \mathrm {dilepton}\) and \(t\bar{t} \rightarrow \mathrm {all\,jets}\) results. Finally, the combination of the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) and \(t\bar{t} \rightarrow \mathrm {all\,jets}\) results is shown in the bottom row. For comparison, the corresponding values for the input results are also shown (dashed lines)
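The dependence of a pairwise combination on the total correlation can be illustrated with the standard closed-form BLUE expressions for two estimates of one quantity. The sketch below is not the code used for Fig. 8; the function name and the input values are hypothetical, and only the value \(\rho =-0.19\) is taken from the text.

```python
import math


def blue_pair(x1, s1, x2, s2, rho):
    """BLUE combination of two correlated estimates of the same quantity."""
    denom = s1**2 + s2**2 - 2.0 * rho * s1 * s2
    w1 = (s2**2 - rho * s1 * s2) / denom  # weight of the first estimate
    w2 = 1.0 - w1                         # weights sum to unity
    value = w1 * x1 + w2 * x2
    variance = s1**2 * s2**2 * (1.0 - rho**2) / denom
    return value, math.sqrt(variance), w1, w2


# Hypothetical inputs in GeV, scanned over a few correlation values.
for rho in (-0.5, -0.19, 0.0, 0.5, 0.9):
    value, sigma, w1, w2 = blue_pair(173.0, 0.85, 172.1, 0.91, rho)
    print(f"rho={rho:+.2f} value={value:.2f} sigma={sigma:.2f} w1={w1:.2f} w2={w2:.2f}")
```

For inputs of similar precision, the combined uncertainty shrinks as the correlation decreases, which is why an anti-correlated pair gives a large information gain.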

Based on Tables 4 and 5, selected combinations are analysed, yielding the results given in Table 6 and shown in Fig. 9. The BLUE weights and the pulls (see footnote 9) of the results are given in Table 7.

Table 6 The results for selected combinations based on the six results at \({\sqrt{s}} =7\) and 8 \(\text {TeV}\). The left two columns of results show the combination of the three results at \({\sqrt{s}} =7\) \(\text {TeV}\) (\(m_{\mathrm {top}}^{\mathrm {7TeV}}\)) or at \({\sqrt{s}} =8\) \(\text {TeV}\) (\(m_{\mathrm {top}}^{\mathrm {8TeV}}\)), both combinations neglecting the results at the respective other centre-of-mass energy. The middle three columns show the combination of the six results as if pairs of measurements determined a decay-specific top quark mass, namely \(m_{\mathrm {top}}^{\mathrm {dilepton}}\), \(m_{\mathrm {top}}^{\mathrm {\ell +jets}}\) and \(m_{\mathrm {top}}^{\mathrm {all\,jets}}\). Finally, shown on the right is the combination of the three most important results in the combination denoted by \(m_{\mathrm {top}}^{\mathrm {(3)}}\) and the combination of all results, i.e. the ATLAS result for \(m_{\mathrm {top}}\) shown in Fig. 10. For each combination, the uncertainty is given for each source of uncertainty. Uncertainties quoted as 0.00 are smaller than 0.005, while empty cells indicate uncertainties that do not apply to the respective combination. Finally, the total systematic uncertainty and the sum in quadrature of the statistical and systematic uncertainties are given. Both are quoted including the precision at which the respective uncertainty is known

Table 7 The BLUE weights and the pulls of the results for the combinations reported in Table 6. The upper part refers to the independent combinations of the three results per centre-of-mass energy resulting in uncorrelated results \(m_{\mathrm {top}}^{\mathrm {7TeV}}\) and \(m_{\mathrm {top}}^{\mathrm {8TeV}}\). The middle part is for the combination of the three observables from pairs of results per \(t\bar{t}\) decay channel, resulting in correlated results \(m_{\mathrm {top}}^{\mathrm {dilepton}}\), \(m_{\mathrm {top}}^{\mathrm {\ell +jets}}\) and \(m_{\mathrm {top}}^{\mathrm {all\,jets}}\). The lower part refers to the combination of the three most important results \(m_{\mathrm {top}}^{\mathrm {(3)}}\) and of all results \(m_{\mathrm {top}}\)

To investigate the difference in precision of combined results obtained from \({\sqrt{s}} =7\) and 8 \(\text {TeV}\) results, two independent combinations of the three results per centre-of-mass energy are performed. For each decay channel, the results at \({\sqrt{s}} =8\) \(\text {TeV}\) are significantly more precise than those at \({\sqrt{s}} =7\) \(\text {TeV}\). In addition, the two most precise results per centre-of-mass energy are significantly less correlated at \({\sqrt{s}} =8\) \(\text {TeV}\) than at \({\sqrt{s}} =7\) \(\text {TeV}\). Consequently, the size of the uncertainty of the combined result at \({\sqrt{s}} =8\) \(\text {TeV}\) (\(m_{\mathrm {top}}^{\mathrm {8TeV}}\)) is \(39\%\) smaller than the one obtained from the results at \({\sqrt{s}} =7\) \(\text {TeV}\) (\(m_{\mathrm {top}}^{\mathrm {7TeV}}\)). As shown in Fig. 13a, b in Appendix B, for both centre-of-mass energies, the combination is dominated by the results in the \(t\bar{t} \rightarrow \mathrm {dilepton}\) and \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) channels.

To investigate whether the measured \(m_{\mathrm {top}}\) depends on the \(t\bar{t}\) decay mode, a combination of the six results is performed in which the results in the three \(t\bar{t}\) decay channels are treated as determining potentially different masses, namely \(m_{\mathrm {top}}^{\mathrm {dilepton}}\), \(m_{\mathrm {top}}^{\mathrm {\ell +jets}}\) and \(m_{\mathrm {top}}^{\mathrm {all\,jets}}\). In such a combination, results obtained in one decay channel influence the combined result in another decay channel by means of their estimator correlation. Therefore, for each observable, e.g. \(m_{\mathrm {top}}^{\mathrm {dilepton}}\), by construction the sum of weights of the results in the corresponding decay channel equals unity, while for each of the other decay channels the sum of weights of the results equals zero [100]. The combination yields compatible results for the three masses listed in Table 6. Consequently, the data do not show any sign of a decay-channel-dependent \(m_{\mathrm {top}}\). The correlation matrix of the three observables 0, 1 and 2 corresponding to \(m_{\mathrm {top}}^{\mathrm {dilepton}}\), \(m_{\mathrm {top}}^{\mathrm {\ell +jets}}\) and \(m_{\mathrm {top}}^{\mathrm {all\,jets}}\) is

$$\begin{aligned} \rho _{m_{\mathrm {top}}}&= \left( \begin{array}{rrr} 1 &{} -0.14 &{} 0.43 \\ -0.14 &{} 1 &{} -0.05 \\ 0.43 &{} -0.05 &{} 1 \end{array} \right) , \end{aligned}$$

and the smallest \(\chi ^2\) probability of any pair of combined results for determining the same \(m_{\mathrm {top}}\) is \(P(\chi ^2,1) = 11\% \). As shown in Fig. 13c–e in Appendix B, for the combination of the three observables, the results based on \({\sqrt{s}} =7\) \(\text {TeV}\) data lead to significant improvements on their more precise counterparts obtained from \({\sqrt{s}} =8\) \(\text {TeV}\) data, apart from the \(t\bar{t} \rightarrow \mathrm {dilepton}\) channel.
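The constraint on the sums of weights in the three-observable combination follows from the unbiasedness condition of a generalized least-squares combination. As a sketch in notation introduced here only for illustration (the symbols \(y\), \(A\), \(V\) and \(W\) are assumptions and do not appear elsewhere in this paper), let \(y\) be the vector of the six results, \(V\) their covariance matrix, and \(A\) the \(6\times 3\) matrix with \(A_{i\alpha }=1\) if result \(i\) belongs to decay channel \(\alpha \) and \(A_{i\alpha }=0\) otherwise. The combined observables and their weights are then

$$\begin{aligned} \hat{m}&= W y, \qquad W = \left( A^{T} V^{-1} A\right) ^{-1} A^{T} V^{-1}, \qquad W A = \mathbb {1}, \end{aligned}$$

with covariance matrix \(\left( A^{T} V^{-1} A\right) ^{-1}\) for the three combined values. Written out, \(\sum _i W_{\alpha i} A_{i\beta } = \delta _{\alpha \beta }\): for observable \(\alpha \), the weights of the results in its own channel sum to unity, while those of the results in each other channel sum to zero.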

Given that no dependence of \(m_{\mathrm {top}}\) on the centre-of-mass energy or the \(t\bar{t}\) decay channel is expected, the above examples of combinations are merely additional investigations of the compatibility of the input results. The compatibility combinations are summarized in Fig. 9 and listed in Table 6. For all combinations, the values quoted in Fig. 9 are the combined value, the statistical uncertainty, the systematic uncertainty, the total uncertainty and the statistical uncertainty in the total uncertainty.

Fig. 9 The compatibility combinations performed for the six ATLAS results of \(m_{\mathrm {top}}\). The figure shows the combined results listed in Table 6, which are obtained from data taken at different centre-of-mass energies and for the three decay channels, in comparison to the new ATLAS result. The values quoted are the combined value, the statistical uncertainty, the systematic uncertainty, the total uncertainty and the uncertainty in the total uncertainty. The results are compared with the new ATLAS combination listed in the last line and shown as the grey vertical bands

10.4 The combined result of \(m_{\mathrm {top}}\)

The use of the statistical uncertainties in the systematic uncertainties has two main advantages. Firstly, it allows a determination of the uncertainties in the evaluation of the total correlations of the estimators, avoiding the need to perform ad hoc variations. Secondly, it enables the monitoring of the evolution of the combined result in relation to the precision in its uncertainty while including results, thereby evaluating their influence on the combination. The significance of the individual results in the combination is shown in Fig. 10. The individual results are shown in Fig. 10a. Their combination is displayed in Fig. 10b where, following Ref. [98], starting from the most precise result, i.e. the \(t\bar{t} \rightarrow \mathrm {dilepton}\) measurement at \({\sqrt{s}} =8\) \(\text {TeV}\), results are added to the combination one at a time according to their importance, and the combined result is reported. Each following line of this figure shows the combined result when adding the result listed to the input of the combination, indicated by the ‘+’ in front of the name of the added estimate. The last line in Fig. 10b shows the new ATLAS combined value of \(m_{\mathrm {top}}\).

Fig. 10 The combination of the six ATLAS results of \(m_{\mathrm {top}}\) according to importance, following Ref. [98]. Panel (a) shows the inputs to the combination. Panel (b) shows the results of the combination when successively adding results to the most precise one. The values quoted are the combined value, the statistical uncertainty, the systematic uncertainty, the total uncertainty and the uncertainty in the total uncertainty. In this figure, each line shows the combined result when adding the result listed to the combination indicated by a ‘+’. The new ATLAS combination is given in the last line, and shown in both panels as the vertical grey bands

The inclusion of the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) result at \({\sqrt{s}} =8\) \(\text {TeV}\) leads to the result quoted in the second line, which improves the combined uncertainty by much more than the statistical precision in the uncertainty of the most precise result. The same is found when adding the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) result at \({\sqrt{s}} =7\) \(\text {TeV}\) and comparing with the statistical uncertainty in the previous combination, albeit at a much reduced significance. The corresponding result obtained from these three results, denoted by \(m_{\mathrm {top}}^{\mathrm {(3)}}\), is also listed in Table 6.
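The successive combination displayed in Fig. 10b can be sketched as a loop that repeats a single-observable BLUE combination for the first \(k\) inputs. The snippet below is an illustration under stated assumptions: the inputs are taken to be already ordered by importance (the ordering criterion of Ref. [98] is not reproduced), the helper names are hypothetical, and all numerical values are placeholders rather than the ATLAS inputs.

```python
import math


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - known) / a[r][r]
    return x


def blue(values, cov):
    """Single-observable BLUE: weights w = V^-1 1 / (1^T V^-1 1)."""
    u = solve(cov, [1.0] * len(values))
    norm = sum(u)
    weights = [ui / norm for ui in u]
    combined = sum(w * v for w, v in zip(weights, values))
    return combined, math.sqrt(1.0 / norm), weights


def successive(values, sigmas, rho):
    """Combine the first k inputs for k = 1..n, in the order given."""
    steps = []
    for k in range(1, len(values) + 1):
        cov = [[rho[i][j] * sigmas[i] * sigmas[j] for j in range(k)] for i in range(k)]
        value, sigma, _ = blue(values[:k], cov)
        steps.append((value, sigma))
    return steps


# Placeholder inputs in GeV with an assumed total-correlation matrix.
rho = [[1.0, -0.19, 0.3], [-0.19, 1.0, 0.1], [0.3, 0.1, 1.0]]
for value, sigma in successive([173.0, 172.1, 172.3], [0.85, 0.91, 1.27], rho):
    print(f"{value:.2f} +- {sigma:.2f}")
```

Comparing the change in the combined uncertainty at each step with the precision to which that uncertainty is known then indicates whether the added result contributes significantly.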

The improvement in the combination by applying the \(\mathrm {BDT}\) selection to the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) analysis at \({\sqrt{s}} =8\) \(\text {TeV}\) is sizeable. This is seen from repeating the combination of \(m_{\mathrm {top}}^{\mathrm {(3)}}\) but using the result from the standard selection from Table 3. With this, the correlation of the \({\sqrt{s}} =8\) \(\text {TeV}\) \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) result with the \({\sqrt{s}} =8\) \(\text {TeV}\) \(t\bar{t} \rightarrow \mathrm {dilepton}\) result changes from \(-0.19\) to \(-0.02\). The resulting uncertainty in the combination is \(0.59 \,\pm \,0.05 \) \(\text {GeV}\), i.e. the combination is \(18\%\) less precise than \(m_{\mathrm {top}}^{\mathrm {(3)}}\) obtained using the result from the \(\mathrm {BDT}\) selection. Adding the remaining results reduces the quoted combined uncertainty by \(0.02\)  \(\text {GeV}\), which is smaller than the statistical precision in the uncertainty of the previously achieved result of \(m_{\mathrm {top}}^{\mathrm {(3)}}\).

The changes in statistical uncertainties in the combined value and its uncertainty due to variations of the input systematic uncertainties within their uncertainties are evaluated for two cases, namely for \(m_{\mathrm {top}}^{\mathrm {(3)}}\) and for the combination of all results. Following Ref. [14], the distributions of the combined values and uncertainties are calculated from 500 combinations, where for each combination, the sizes of the uncertainties as well as the correlations are newly evaluated. Due to the re-evaluation of the correlation, the resulting distributions are not Gaussian and are also not exactly centred around the combined value and the combined uncertainty. For \(m_{\mathrm {top}}^{\mathrm {(3)}}\), the root mean square of the distribution of the combined value is \(0.03\)  \(\text {GeV}\), and that of the distribution of its uncertainty is \(0.04\)  \(\text {GeV}\). The corresponding values for the new ATLAS combination are \(0.07\)  \(\text {GeV}\) and \(0.03\)  \(\text {GeV}\), respectively.
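The toy procedure can be illustrated for two results with a short standard-library sketch. It rests on assumptions that are not stated in this section: each systematic source is represented by a signed shift of \(m_{\mathrm {top}}\) known with a given statistical precision, the per-source correlation is taken as the sign of the product of the two shifts, and the spread is computed as a standard deviation. All names and numbers are hypothetical.

```python
import math
import random
import statistics


def combine_pair(x1, x2, stat, shifts1, shifts2):
    """BLUE of two results with covariance built from signed systematic shifts."""
    var1 = stat[0] ** 2 + sum(d * d for d in shifts1)
    var2 = stat[1] ** 2 + sum(d * d for d in shifts2)
    # The product of signed shifts carries the per-source correlation of +1 or -1.
    cov12 = sum(d1 * d2 for d1, d2 in zip(shifts1, shifts2))
    denom = var1 + var2 - 2.0 * cov12
    w1 = (var2 - cov12) / denom
    value = w1 * x1 + (1.0 - w1) * x2
    return value, math.sqrt((var1 * var2 - cov12**2) / denom)


def toy_rms(x1, x2, stat, shifts1, shifts2, prec1, prec2, n_toys=500, seed=1):
    """Spread of the combined value and of its uncertainty over toy combinations."""
    rng = random.Random(seed)
    values, sigmas = [], []
    for _ in range(n_toys):
        # Vary each shift within its statistical precision, then re-evaluate
        # both the uncertainty sizes and the correlations for this toy.
        t1 = [rng.gauss(d, p) for d, p in zip(shifts1, prec1)]
        t2 = [rng.gauss(d, p) for d, p in zip(shifts2, prec2)]
        value, sigma = combine_pair(x1, x2, stat, t1, t2)
        values.append(value)
        sigmas.append(sigma)
    return statistics.pstdev(values), statistics.pstdev(sigmas)


# Hypothetical inputs in GeV: two sources per result.
rms_value, rms_sigma = toy_rms(
    173.0, 172.1, (0.4, 0.4),
    [0.30, -0.20], [0.25, 0.10],
    [0.05, 0.15], [0.05, 0.15],
)
print(f"spread of value: {rms_value:.3f}, spread of uncertainty: {rms_sigma:.3f}")
```

Because a shift that is compatible with zero can change sign between toys, the corresponding correlation flips, which is one way the resulting distributions can become non-Gaussian.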

The full breakdown of uncertainties for the new combined ATLAS result for \(m_{\mathrm {top}}\) is reported in the last column of Table 6. The combined result is

$$\begin{aligned} m_{\mathrm {top}}&=172.69 \,\pm \,0.25 \,\mathrm {(stat)}\,\pm \,0.41 \,\mathrm {(syst)} ~\text {GeV}\end{aligned}$$

with a total uncertainty of \(0.48 \pm 0.03 \) \(\text {GeV}\), where the quoted uncertainty in this uncertainty is statistical. This means that the uncertainty in this combined result is only known to this precision, which, given its size, is fully adequate.

The \(\chi ^2\) probability of \(m_{\mathrm {top}}^{\mathrm {(3)}}\) is \(78\%\). Driven by the larger pulls of the remaining three results listed in Table 7, the \(\chi ^2\) probability of \(64\%\) for the new ATLAS combination of \(m_{\mathrm {top}}\) is lower but still good. The new ATLAS combined result of \(m_{\mathrm {top}}\) provides a \(44\%\) improvement relative to the most precise single input result, which is the \(t\bar{t} \rightarrow \mathrm {dilepton}\) analysis at \({\sqrt{s}} =8\) \(\text {TeV}\). With a relative precision of \(0.28\%\), it improves on the previous combination in Ref. [14] by \(31\%\) and supersedes it. As shown in Appendix B, the new ATLAS combined result of \(m_{\mathrm {top}}\) is more precise than the results from the CDF and D0 experiments, and has a precision similar to the CMS combined result.
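The quoted total uncertainty, relative precision and improvement over the previous combination can be cross-checked with simple arithmetic. The short sketch below uses the rounded values quoted in this paper (the previous ATLAS combination of Ref. [14] as given in the Introduction), so small differences from calculations based on unrounded inputs are to be expected; the variable names are illustrative.

```python
import math


def total(stat, syst):
    """Sum in quadrature of statistical and systematic uncertainties."""
    return math.hypot(stat, syst)


new_total = total(0.25, 0.41)    # new combined result, in GeV
prev_total = total(0.34, 0.61)   # previous combination, Ref. [14], in GeV
ljets_total = total(0.39, 0.82)  # lepton+jets result at 8 TeV, in GeV

rel_precision = new_total / 172.69
improvement = 1.0 - new_total / prev_total

print(f"total uncertainty: {new_total:.2f} GeV")
print(f"relative precision: {100 * rel_precision:.2f}%")
print(f"improvement over Ref. [14]: {100 * improvement:.0f}%")
print(f"lepton+jets total uncertainty: {ljets_total:.2f} GeV")
```

After rounding, these reproduce the quoted \(0.48\) \(\text {GeV}\), \(0.28\%\), \(31\%\) and \(0.91\) \(\text {GeV}\).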

Fig. 11 Comparison of indirect determinations and direct measurements of the top quark and \(W\) boson masses. The direct ATLAS measurements of \(m_{W}\) and \(m_{\mathrm {top}}\) are shown as the horizontal and vertical bands, respectively. Their 68\(\%\) and 95\(\%\) confidence-level (CL) contours are compared with the corresponding results from the electroweak fit

In Fig. 11, the 68\(\%\) and 95\(\%\) confidence-level contours of the indirect determination of \(m_{W}\) and \(m_{\mathrm {top}}\) from the global electroweak fit in Ref. [2] are compared with the corresponding confidence-level contours of the direct ATLAS measurements of the two masses. The top quark mass used in this figure was obtained above, while the \(W\) boson mass is taken from Ref. [101]. The electroweak fit uses as input the LHC combined result of the Higgs boson mass of \(m_{H} = 125.09 \pm 0.24\) \(\text {GeV}\) from Ref. [102]. There is good agreement between the direct ATLAS mass measurements and their indirect determinations by the electroweak fit.

11 Conclusion

The top quark mass is measured via a three-dimensional template method in the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) channel and combined with previous ATLAS \(m_{\mathrm {top}}\) measurements at the LHC.

For the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) analysis from \({\sqrt{s}} =8\) \(\text {TeV}\) proton–proton collision data with an integrated luminosity of about \(20.2 \) \(\mathrm {fb}^{-1} \), the event selection of the corresponding \({\sqrt{s}} =7\) \(\text {TeV}\) analysis is refined. An optimization employing a \(\mathrm {BDT}\) selection to efficiently suppress less-well-reconstructed events results in a significant reduction in total uncertainty, driven by a significant decrease in theory-modelling-induced uncertainties. With this approach, the measured value of \(m_{\mathrm {top}}\) is

$$\begin{aligned} m_{\mathrm {top}}&=172.08 \,\pm \,0.39 \,\mathrm {(stat)}\,\pm \,0.82 \,\mathrm {(syst)} ~\text {GeV}\end{aligned}$$

with a total uncertainty of \(0.91 \pm 0.06 \) \(\text {GeV}\), where the quoted uncertainty in the total uncertainty is statistical. The precision is limited by systematic uncertainties, mostly by uncertainties in the calibration of the jet energy scale, \(b\text {-tagging}\) and the Monte Carlo modelling of signal events. This result is more precise than the result from the CDF experiment, but less precise than the CMS and D0 results, measured in the same channel.

The correlations of six measurements of \(m_{\mathrm {top}}\), performed in the three \(t\bar{t}\) decay channels from \({\sqrt{s}} =7\) and 8 \(\text {TeV}\) ATLAS data, are evaluated for all sources of the systematic uncertainty. Using a dedicated mapping of uncertainty categories, combinations are performed, where measurements are added one at a time according to their importance. Treating the pairs of measurements in the three \(t\bar{t}\) decay channels as determining potentially different masses, namely \(m_{\mathrm {top}}^{\mathrm {dilepton}}\), \(m_{\mathrm {top}}^{\mathrm {\ell +jets}}\) and \(m_{\mathrm {top}}^{\mathrm {all\,jets}}\), yields consistent values within uncertainties, i.e. the data do not show any sign of a decay-channel-dependent \(m_{\mathrm {top}}\).

The combined result of \(m_{\mathrm {top}}\) from the six measurements is

$$\begin{aligned} m_{\mathrm {top}}&= 172.69 \,\pm \,0.25 \,\mathrm {(stat)}\,\pm \,0.41 \,\mathrm {(syst)} ~\text {GeV}\end{aligned}$$

with a total uncertainty of \(0.48 \pm 0.03 \) \(\text {GeV}\), where the quoted uncertainty in the total uncertainty is statistical. This combination is dominated by three input measurements: the measurement in the \(t\bar{t} \rightarrow \mathrm {dilepton}\) channel from \({\sqrt{s}} =8\) \(\text {TeV}\) data and the two measurements in the \(t\bar{t} \rightarrow \mathrm {lepton+jets}\) channel from \({\sqrt{s}} =8\) and 7 \(\text {TeV}\) data. With a relative precision of \(0.28\%\), this new ATLAS combination of \(m_{\mathrm {top}}\) is more precise than the results from the CDF and D0 experiments and has a precision similar to the CMS combined result. This result supersedes the previous combined ATLAS result.

With this precision in \(m_{\mathrm {top}}\) achieved, precise knowledge of the relation between the mass definition of the experimental analysis and the pole mass is becoming relevant. The combined result is mostly limited by the uncertainties in the calibration of the jet energy scales, \(b\text {-tagging}\) and in the Monte Carlo modelling of signal events.