1 Introduction

The top quark [1, 2] is the most massive known fundamental constituent of matter. Its unexplained large mass suggests an important connection to the electroweak symmetry breaking mechanism. The measurement of the top–antitop (\(t\bar{t}\)) quark production cross-section (\(\sigma_{ t\bar{t}}\)) in various decay channels allows a precision test of perturbative QCD. In addition, the \(t\bar{t}\) production process is an important background for Standard Model (SM) Higgs boson searches, and in searches for physics beyond the SM. Also, a rich set of possible new particles and interactions might appear at the Large Hadron Collider (LHC) and modify the production and/or decay of top quarks.

The inclusive \(t\bar{t}\) production cross-section has been measured by the ATLAS and CMS Collaborations with increasing precision [3–6] in a variety of channels using data collected in 2010 and 2011. The unprecedented number of available \(t\bar{t}\) events (tens of thousands) enables detailed investigations of the properties of top quark production in terms of characteristic variables of the \(t\bar{t}\) system. This paper focuses on three observables of the \(t\bar{t}\) system: the invariant mass (\(m_{ t\bar{t}}\)), the transverse momentum (\(p_{\mathrm{T}, t\bar{t}}\)) and the rapidity (\(y_{ t\bar{t}}\)). To enable direct comparisons to theoretical models, the differential distributions are unfolded for detector effects and corrected for acceptance effects. Theoretical predictions for the \(t\bar{t}\) invariant mass distribution accurate to next-to-next-to-leading logarithm (NNLL) and next-to-leading order (NLO) are currently available [7], with a typical uncertainty of around 12 % at \(m_{ t\bar {t}} \simeq 1~\mbox{TeV}\). Comparisons of mass, transverse momentum, and rapidity distributions are also made between unfolded data and NLO predictions taken from the MCFM generator [8]. In addition, the data are compared to predictions from the MC@NLO [9, 10] and ALPGEN [11] generators with particular choices of parameter settings.

The \(m_{ t\bar{t}}\) distribution is sensitive to particles beyond the SM, such as new s-channel resonances that can modify the shape of the differential production cross-section in different ways depending on their spin and colour properties [12]. In addition to Tevatron experiment searches [13–18], both the ATLAS and CMS Collaborations have performed direct searches for specific narrow and wide resonances that extend mass limits to the TeV region [19–21]. The CDF Collaboration has performed a measurement of the differential cross-section as a function of \(m_{t\bar{t}}\) [22] using the data collected in proton–antiproton (\(p\bar {p}\)) collisions at a centre-of-mass energy (\(\sqrt{s}\)) of 1.96 TeV. The result is consistent with the SM expectation as predicted by PYTHIA (version 6.216) [23]. A potentially intriguing deviation from the SM prediction is observed in the measured forward–backward angular asymmetry between t and \(\bar{t}\) quarks produced together in \(p\bar{p}\) collisions at the Tevatron [24, 25], particularly in events with large \(m_{t\bar{t}}\) [24]. Nearly all new physics scenarios that could explain this deviation should be observable at the LHC as a resonant or non-resonant enhancement with respect to the SM in \(t\bar{t}\) production at large \(m_{t\bar{t}}\) [26].

2 Detector, data and simulation samples

The ATLAS detector [27] at the LHC covers nearly the entire solid angle around the collision point. It consists of an inner tracking detector (ID) comprising a silicon pixel detector, a silicon microstrip detector, and a transition radiation tracker, providing tracking capability within pseudorapidity |η|<2.5. The ID is surrounded by a thin superconducting solenoid providing a 2 T axial magnetic field, and by liquid argon (LAr) electromagnetic (EM) sampling calorimeters with high granularity. An iron/scintillator tile calorimeter provides hadronic energy measurements in the central pseudorapidity range (|η|<1.7). The end-cap and forward regions are instrumented with LAr calorimeters for both electromagnetic and hadronic energy measurements up to |η|=4.9. The calorimeter system is surrounded by a muon spectrometer incorporating three superconducting toroid magnet assemblies.

A three-level trigger system is used to select high-\(p_{\mathrm{T}}\) events. The level-1 trigger is implemented in hardware and uses a subset of the detector information to reduce the rate to a design value of at most 75 kHz. This is followed by two software-based trigger levels, which together reduce the event rate to about 300 Hz. This analysis uses LHC proton–proton (pp) collisions at \(\sqrt{s} = 7~\mbox{TeV}\) collected by the ATLAS detector between March and August 2011, corresponding to an integrated luminosity of 2.05 fb\(^{-1}\).

Simulated top quark pair events are generated using the MC@NLO Monte Carlo (MC) generator version 3.41 with the NLO parton distribution function (PDF) set CTEQ6.6 [28], where the top quark mass is set to 172.5 GeV. Renormalization and factorization scales are set to the same value: the square root of the average of the squared transverse energies of the t and \(\bar{t}\) quarks. Parton showering and the underlying event are modelled using HERWIG [29] and JIMMY [30] using the AUET1 tune [31], respectively. The \(t\bar{t}\) sample is normalized to a cross-section of 164.6 pb, obtained with an approximate NNLO prediction [32]. Single top events are also generated using MC@NLO [33, 34], while the production of W/Z bosons in association with jets is simulated using the ALPGEN generator interfaced to HERWIG and JIMMY with CTEQ6L1 PDFs [35]. W+jets events containing \(b\bar{b}\) pairs, \(c\bar{c}\) pairs and single c-quarks (heavy flavour) are generated separately using matrix elements with massive b- and c-quarks. An overlap-removal procedure is used to avoid double counting due to heavy quarks from the parton shower. Diboson events (WW, WZ, ZZ) are generated using HERWIG with MRST LO PDFs [36].

All Monte Carlo simulation samples are generated with multiple pp interactions per bunch crossing (pile-up). These simulated events are re-weighted so that the distribution of the average number of interactions per pp bunch crossing in simulation matches that observed in the data. This average number varies between data-taking periods and ranges typically between 4 and 8. The samples are then processed through the GEANT4 [37] simulation of the ATLAS detector [38] and the standard ATLAS reconstruction software.
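The pile-up re-weighting described above can be illustrated with a minimal sketch. It assumes binned distributions of the average number of interactions per bunch crossing in data and in simulation; the function name `pileup_weights` and all histogram contents are hypothetical and are not taken from the analysis.

```python
import numpy as np

def pileup_weights(data_counts, mc_counts):
    """Per-bin weights that make the normalized MC distribution match the data."""
    data = np.asarray(data_counts, dtype=float)
    mc = np.asarray(mc_counts, dtype=float)
    data_pdf = data / data.sum()
    mc_pdf = mc / mc.sum()
    # Bins with no simulated events get weight 0 to avoid division by zero.
    return np.divide(data_pdf, mc_pdf, out=np.zeros_like(data_pdf), where=mc_pdf > 0)

# Hypothetical histograms of the average number of interactions per crossing.
data_hist = [10, 30, 40, 20]
mc_hist = [25, 25, 25, 25]

weights = pileup_weights(data_hist, mc_hist)
# Applying the weights to the normalized MC histogram reproduces the data shape.
reweighted = weights * np.asarray(mc_hist, dtype=float) / np.sum(mc_hist)
```

Each simulated event would then carry the weight of the bin it falls into, so that the simulated distribution has the same shape as the one observed in data.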

3 Event selection

Events are selected in the lepton (electron or muon) + jets channel. The reconstruction of \(t\bar{t}\) events in the detector is based on the identification and reconstruction of electrons, muons, jets and missing transverse momentum. The definitions of these objects are identical to those used in Ref. [39]. The same event selection as in Ref. [39] is used with the addition of a requirement on the kinematic likelihood resulting from the event reconstruction described in Sect. 5.

3.1 Object definitions

Electron candidates are defined as energy deposits in the EM calorimeter associated with well-reconstructed tracks of charged particles in the ID. The candidates are required to meet stringent identification criteria based on EM shower shape information, track quality variables and information from the transition radiation tracker [40]. All candidates are required to have \(E_{\mathrm{T}}>25~\mbox{GeV}\) and \(|\eta_{\mathrm{clu}}|<2.47\), where \(\eta_{\mathrm{clu}}\) is the pseudorapidity of the EM calorimeter cluster associated with the electron. Candidates in the transition region between the barrel and end-cap calorimeters, \(1.37<|\eta_{\mathrm{clu}}|<1.52\), are rejected.

Muon candidates are reconstructed by combining track segments in different layers of the muon chambers. Such segments are assembled starting from the outermost layer, with a procedure that takes material effects into account, and are then matched with tracks found in the ID. The candidates are then re-fitted, exploiting the full track information from both the muon spectrometer and the ID, and are required to have \(p_{\mathrm{T}}>20~\mbox{GeV}\) and |η|<2.5.

Jets are reconstructed with the anti-\(k_{t}\) algorithm [41] with a distance parameter of 0.4 using clusters formed from calorimeter cells with significant energy deposits (“topoclusters”) at the EM scale. The jet energy is then corrected to the hadronic scale using \(p_{\mathrm{T}}\)- and η-dependent correction factors derived from simulation and validated with data [42].

The missing transverse momentum and its magnitude \(E_{\mathrm{T}}^{\mathrm{miss}}\) are derived from topoclusters at the EM scale and corrected on the basis of the energy scale of the associated physics object, if any [43]. Contributions from muons are included using their momentum measured from the tracking and muon spectrometer systems. The remaining clusters not associated with high-\(p_{\mathrm{T}}\) objects are added at the EM scale.

Both the electron and muon candidates are required to be isolated to reduce the backgrounds from hadrons mimicking lepton signatures and leptons from heavy-flavour decays. For electron candidates, the total transverse energy deposited in the calorimeter in a cone of ΔR=0.2 around the electron candidate is required to be less than 3.5 GeV after correcting for the energy associated with the electron and for energy deposited by pile-up. For muon candidates, the isolation is defined in a cone of ΔR=0.3 around the muon direction. In that region both the sum of track transverse momenta for tracks with \(p_{\mathrm{T}}>1~\mbox{GeV}\) and the total energy deposited in the calorimeter are required to be less than 4 GeV, after subtracting the contributions from the muon itself.

Jets within ΔR=0.2 of an electron candidate are removed to avoid double counting electrons as jets. Subsequently, muons within ΔR=0.4 of the centre of a jet with \(p_{\mathrm{T}}>20~\mbox{GeV}\) are removed in order to reduce the contamination caused by muons from hadron decays.
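The two-step overlap removal can be sketched as follows. This is an illustration only: it assumes the usual definition \(\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}\), assumes that the muon step uses the jets surviving the electron step, and represents objects as plain dictionaries. The function names and example values are hypothetical.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance, with the azimuthal difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def remove_overlaps(electrons, muons, jets):
    """Two-step overlap removal; objects are dicts with pt (GeV), eta, phi."""
    # Step 1: drop jets closer than dR = 0.2 to any electron candidate.
    kept_jets = [
        j for j in jets
        if all(delta_r(j["eta"], j["phi"], e["eta"], e["phi"]) >= 0.2 for e in electrons)
    ]
    # Step 2: drop muons closer than dR = 0.4 to any surviving jet with pT > 20 GeV.
    kept_muons = [
        m for m in muons
        if not any(
            j["pt"] > 20.0 and delta_r(m["eta"], m["phi"], j["eta"], j["phi"]) < 0.4
            for j in kept_jets
        )
    ]
    return kept_jets, kept_muons

# Hypothetical objects (illustrative values only).
electrons = [{"pt": 30.0, "eta": 0.0, "phi": 0.0}]
jets = [{"pt": 40.0, "eta": 0.1, "phi": 0.0}, {"pt": 50.0, "eta": 1.0, "phi": 1.0}]
muons = [{"pt": 25.0, "eta": 1.1, "phi": 1.0}, {"pt": 25.0, "eta": -2.0, "phi": -2.0}]
kept_jets, kept_muons = remove_overlaps(electrons, muons, jets)
```

In this example the first jet is removed because it overlaps with the electron, and the first muon is removed because it lies inside the remaining jet.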

The reconstruction of \(t\bar{t}\) events is aided by the ability to tag jets from the hadronization of b-quarks using the combination of two b-tagging algorithms [44]. One b-tagger derives the properties of vertices related to b- and c-hadron decays inside jets by assuming the vertices to lie on a line connecting them to the primary vertex. A likelihood discriminant between b-, c- and light-quark jets is derived by using the number, the masses, the track energy fraction, the flight-length significances and the track multiplicities of the reconstructed vertices as inputs. The other b-tagging algorithm employs the transverse and longitudinal impact parameter significances of each track within the jet to derive a likelihood that the jet originates from a b-quark. The results of the two taggers are combined, using a neural network, into a single discriminating variable. The combined tagger operating point chosen for the present analysis corresponds to a 70 % tagging efficiency for b-jets in simulated \(t\bar{t}\) events, while light-flavour jets (c-jets) are suppressed by approximately a factor of 100 (5).

3.2 Selection of \(t\bar{t}\) candidates

The lepton + jets channel selection requires the appropriate single-electron or single-muon trigger to have fired (with thresholds at 20 GeV and 18 GeV respectively). Events passing the trigger selection are required to contain exactly one reconstructed electron (muon) with \(E_{\mathrm{T}}>25~\mbox{GeV}\) (\(p_{\mathrm{T}}>20~\mbox{GeV}\)). The events are required to have at least one reconstructed primary vertex. The primary vertex, corresponding to that with the highest \(\sum_{\mathrm{trk}} p_{\mathrm{T},\mathrm{trk}}^{2}\), is required to be reconstructed from at least five high-quality tracks. Jet quality criteria are applied to the data and events are discarded if any jet with \(p_{\mathrm{T}}>20~\mbox{GeV}\) is identified to be due to calorimeter noise or activity out of time with respect to the LHC beam crossings [42]. The \(E_{\mathrm{T}}^{\mathrm{miss}}\) is required to be larger than 20 (35) GeV in the μ+jets (e+jets) channel. The W boson transverse mass (\(m_{\mathrm {T}}^{W}\)), derived from the lepton transverse momentum and the \(E_{\mathrm{T}}^{\mathrm {miss}}\) [45], is required to be larger than \(60~\mbox{GeV} - E_{\mathrm{T}}^{\mathrm{miss}}\) (25 GeV) in the μ+jets (e+jets) channel. The requirements for the e+jets channel are more stringent in order to reduce the larger fake-lepton background. Events are required to have at least four jets with \(p_{\mathrm{T}}>25~\mbox{GeV}\) and |η|<2.5, where at least one of these jets is required to be b-tagged. Finally, events are retained only if they have a kinematic likelihood ln(L)>−52 resulting from the event reconstruction described in Sect. 5.
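The \(E_{\mathrm{T}}^{\mathrm{miss}}\) and \(m_{\mathrm{T}}^{W}\) requirements can be written out as a short sketch. It assumes the commonly used definition \(m_{\mathrm{T}}^{W} = \sqrt{2\, p_{\mathrm{T}}^{\ell} E_{\mathrm{T}}^{\mathrm{miss}} (1 - \cos\Delta\phi)}\), where \(\Delta\phi\) is the azimuthal angle between the lepton and the missing transverse momentum; the text only states that the quantity is derived from these inputs. The function names are hypothetical.

```python
import math

def w_transverse_mass(lep_pt, met, dphi):
    """Transverse mass (GeV) from lepton pT, missing ET and their azimuthal separation."""
    return math.sqrt(2.0 * lep_pt * met * (1.0 - math.cos(dphi)))

def passes_met_mtw(channel, lep_pt, met, dphi):
    """Apply the missing-ET and transverse-mass requirements for one channel."""
    mtw = w_transverse_mass(lep_pt, met, dphi)
    if channel == "mu":
        # mu+jets: MET > 20 GeV and mTW > 60 GeV - MET (a "triangular" requirement).
        return met > 20.0 and mtw > 60.0 - met
    if channel == "e":
        # e+jets: MET > 35 GeV and mTW > 25 GeV.
        return met > 35.0 and mtw > 25.0
    raise ValueError(f"unknown channel: {channel!r}")
```

For example, a lepton with 40 GeV of transverse momentum back-to-back with 40 GeV of missing transverse momentum gives \(m_{\mathrm{T}}^{W} = 80~\mbox{GeV}\) and passes the requirements in both channels.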

4 Background determination

The main expected backgrounds in the lepton + jets channel are W+jets, which can give rise to the same final state as the \(t\bar{t}\) signal, and fake leptons. They are both estimated using auxiliary measurements. The other backgrounds are of electroweak origin and are estimated from simulation. All background determination methods are identical to those used in Ref. [39].

4.1 Fake-lepton background

The multijet background with misidentified and non-prompt leptons (referred to collectively as fake leptons) in both the e+jets and μ+jets channels is evaluated with a matrix method, which relies on defining loose and tight lepton samples [3, 45] and measuring the fractions of real (\(\varepsilon_{\mathrm{real}}\)) and fake (\(\varepsilon_{\mathrm{fake}}\)) loose leptons that are selected as tight leptons. The fraction \(\varepsilon_{\mathrm{real}}\) is measured using data control samples of Z boson decays to two leptons, while \(\varepsilon_{\mathrm{fake}}\) is measured from data control regions dominated by the contributions of fake leptons. Contributions from W+jets and Z+jets backgrounds are subtracted in the control regions using Monte Carlo simulation.
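The text does not spell out the matrix-method equations, so the following sketch uses the standard two-equation form as an assumption: the loose sample is the sum of real and fake leptons, and the tight sample contains each component multiplied by its efficiency. Solving this system gives the fake-lepton yield in the tight sample. The function name and the numbers in the usage example are hypothetical.

```python
def fake_yield_in_tight(n_loose, n_tight, eff_real, eff_fake):
    """Estimate the number of fake-lepton events in the tight sample.

    Assumed system (standard matrix method):
        n_loose = n_real + n_fake
        n_tight = eff_real * n_real + eff_fake * n_fake
    """
    if eff_real <= eff_fake:
        raise ValueError("eff_real must exceed eff_fake for the method to be usable")
    # Number of fake leptons in the loose sample, from solving the 2x2 system.
    n_fake_loose = (eff_real * n_loose - n_tight) / (eff_real - eff_fake)
    # Only a fraction eff_fake of them also passes the tight selection.
    return eff_fake * n_fake_loose

# Hypothetical closure check: 900 real and 100 fake loose leptons.
n_loose = 900 + 100
n_tight = 0.9 * 900 + 0.2 * 100
estimate = fake_yield_in_tight(n_loose, n_tight, eff_real=0.9, eff_fake=0.2)
```

In the closure check the estimate recovers the 20 fake events that were put into the tight sample. In practice the efficiencies depend on the event kinematics, so a per-event weight version of the same formula is often used.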

For the μ+jets channel, the loose data sample is defined by discarding the isolation requirements in the default muon selection. The fake-muon efficiencies are derived from a low-\(m_{\mathrm{T}}^{W}\) control region, \(m_{\mathrm{T}}^{W}< 20~\mbox{GeV}\), with an additional requirement \(E_{\mathrm{T}}^{\mathrm {miss}}+ m_{\mathrm{T}}^{W}< 60~\mbox{GeV}\). The efficiencies for real and fake muons are parameterized as a function of muon |η| and of the leading jet \(p_{\mathrm{T}}\).

For the e+jets channel, the loose data sample is defined by selecting events with electrons meeting looser identification criteria. The 3.5 GeV electron isolation requirement is loosened to 6 GeV. The fake-electron efficiencies are determined using a low-\(E_{\mathrm{T}}^{\mathrm{miss}}\) control region (\(5~\mbox{GeV}< E_{\mathrm{T}}^{\mathrm{miss}}< 20~\mbox{GeV}\)). The efficiencies for real and fake electrons are parameterized as a function of electron |η|.

4.2 W+jets background estimation

The W+jets background estimation consists of three steps.

The first step is to determine the flavour composition of the W+jets background in the signal region before b-tagging. Since the theoretical prediction for heavy flavour fractions in W+jets suffers from large uncertainties, a data-driven approach was developed to constrain these fractions with inputs from MC simulation. Samples with a lower jet multiplicity, obtained from the selection described in Sect. 3.2, but requiring exactly one or two jets instead of four or more jets, are analysed.

The numbers \(W^{\mathrm{data}}_{i,\text{pre-tag}}, W^{\mathrm{data}}_{i, \mathrm{tagged}}\), of W+i-jets events in these samples (with i=1,2), before and after applying the b-tagging requirement, are calculated from the observed events by subtracting the small contributions from other Standard Model processes—electroweak (WW, WZ, ZZ, and Z+jets) and top quark (\(t\bar{t}\) and single top) processes—predicted by the simulation and by subtracting the fake-lepton background obtained as described in Sect. 4.1.

A system of three equations (expressing the number of W+1-jet events after b-tagging and W+2-jets events before and after b-tagging) can be written with eight independent flavour fractions as the unknowns, corresponding to fractions of \(Wb\bar{b}+\mathrm{jets}\), \(Wc\bar{c}+\mathrm{jets}\), Wc+jets and W+light-jets events in the one- and two-jet bins before b-tagging. In the equations involving tagged events, the simulation prediction is used to include the eight tagging probabilities of the different W+jets event types. For each flavour, the fractions in the one-jet and two-jet bins are related using the simulation’s prediction of their ratio. These predictions reduce the number of independent fractions to four. Finally, the ratio of the \(Wc\bar{c}+\mathrm{jets}\) to the \(Wb\bar{b}+\mathrm{jets}\) fractions in the two-jet bin is fixed to the value obtained from simulated events in order to obtain three independent fractions in the three equations.

The resulting scale factors for the heavy flavour fractions in simulated W+jets events are 1.63±0.76 for \(Wb\bar{b}+\mathrm{jets}\) and \(Wc\bar {c}+\mathrm{jets}\) events and 1.11±0.35 for Wc+jets events. These are applied to the relevant Monte Carlo samples. The uncertainties on these scale factors are derived from systematic variations of the inputs to the method (see Sect. 6.2). The fraction of W+light-jets events is scaled by a factor 0.83 to keep the total number of pre-tagged Monte Carlo W+jets events fixed. When applied to the signal region, an additional 25 % uncertainty is applied to these fractions, corresponding to the uncertainty in the Monte Carlo prediction for the ratio of flavour fractions in different jet multiplicities.

The second step is to determine the overall normalization of W+jets background in events with four or more jets before b-tagging. At the LHC the rate of \(W^{+}\)+jets events is larger than that of \(W^{-}\)+jets events because there are more up-type valence quarks in the proton than down-type valence quarks. The ratio of \(W^{+}\)+jets to \(W^{-}\)+jets cross-sections is predicted much more precisely than the total W+jets cross-section [46–48]. This asymmetry is used to measure the total W+jets background from the data. To a good approximation, processes other than W+jets give equal numbers of positively and negatively charged leptons. Consequently the total number of W+jets events in the selected sample can be estimated as

$$ W_{\ge4, \text{pre-tag}} = N_{W^+} + N_{W^-} = \biggl( {r_{\mathrm{MC}}+1 \over r_{\mathrm{MC}}-1} \biggr) \bigl(D^+ - D^- \bigr). $$
(1)

The charge-asymmetric single top contribution is estimated from simulation and subtracted. The values \(D^{+}\) (\(D^{-}\)) are the total numbers of events in data meeting the selection criteria described in Sect. 3.2, before the b-tagging and likelihood requirement, with positively (negatively) charged leptons. The value of \(r_{\mathrm{MC}} \equiv{N(pp\rightarrow W^{+} + X ) \over N(pp\rightarrow W^{-} + X )}\) is derived from Monte Carlo simulation, using the same event selection. The ratio \(r_{\mathrm{MC}}\) is 1.56±0.06 in the e+jets channel and 1.65±0.08 in the μ+jets channel. The largest uncertainties on \(r_{\mathrm{MC}}\) derive from uncertainties in PDFs, the jet energy scale, and the heavy-flavour fractions in W+jets events.

Finally, in the third step, the number of W+jets events passing the selection with at least one b-tagged jet is determined to be [45]

$$ W_{\ge4, \mathrm{tagged}} = W_{\ge4, \text{pre-tag}} \cdot f_{2,\mathrm{tagged}} \cdot k_{2\to\ge4}. $$
(2)

The value \(f_{2,\mathrm{tagged}}\equiv W^{\text{data}}_{2,\text{tagged}}/ W^{\text{data}}_{2, \text{pre-tag}}\) is the fraction of W+2 jets events meeting the requirement of having at least one b-tagged jet, and \(k_{2\to\ge4}\equiv f^{\mathrm{MC}}_{\ge4,\mathrm{tagged}}/f^{\mathrm{MC}}_{2,\mathrm{tagged}}\) is the ratio of the fractions of simulated W+jets events passing the requirement of at least one b-tagged jet, for at least four and exactly two jets, respectively. The value of \(f_{2,\mathrm{tagged}}\) is found to be 0.063±0.005 in the e+jets channel and 0.068±0.005 in the μ+jets channel. The ratio \(k_{2\to\ge4}\) is found to be 2.52±0.36 in the e+jets channel and 2.35±0.34 in the μ+jets channel. The uncertainties include both systematic contributions and contributions arising from the limited number of simulated events.
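Equations (1) and (2) can be evaluated with a short sketch. The function names are hypothetical, and the event counts in the usage example are invented for illustration; only the formulas themselves are taken from the text. The counts passed as `d_plus` and `d_minus` are assumed to already have the simulated single top contribution subtracted.

```python
def w_pretag(d_plus, d_minus, r_mc):
    """Eq. (1): total W+jets yield before b-tagging from the charge asymmetry."""
    if r_mc <= 1.0:
        raise ValueError("r_mc must be larger than 1 for the method to be usable")
    return (r_mc + 1.0) / (r_mc - 1.0) * (d_plus - d_minus)

def w_tagged(w_pre, f2_tagged, k_2_to_4):
    """Eq. (2): W+jets yield with at least one b-tagged jet."""
    return w_pre * f2_tagged * k_2_to_4

# Hypothetical closure check: 600 W+ and 400 W- events (ratio 1.5) on top of
# a charge-symmetric background of 500 events per lepton charge.
d_plus, d_minus = 600 + 500, 400 + 500
pretag = w_pretag(d_plus, d_minus, r_mc=1.5)
tagged = w_tagged(pretag, f2_tagged=0.068, k_2_to_4=2.35)
```

The closure check recovers the 1000 W+jets events that were put in, because the charge-symmetric background cancels in the difference of the two counts. The last line then applies the μ+jets values of \(f_{2,\mathrm{tagged}}\) and \(k_{2\to\ge4}\) quoted above to this invented yield.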

4.3 Other backgrounds

The numbers of background events from single top production, Z+jets and diboson events are evaluated using Monte Carlo simulation. The predictions for Z+jets events are normalized to the approximate NNLO cross-sections as determined by the FEWZ program [49], using the MSTW2008NLO PDFs [46, 50]. The prediction for diboson events is normalized to the NLO cross-section as determined by the MCFM program [51] using the MSTW2008NLO PDFs. The approximate NNLO cross-section results from Refs. [52–54] are used to normalize the predictions for single top events.

5 Reconstruction

Measurements of differential cross-sections in top quark pair events require full kinematic reconstruction of the \(t\bar{t}\) system. The reconstruction is performed using a likelihood fit of the measured objects to a theoretical leading-order representation of the \(t\bar {t}\) decay. The same reconstruction method as in Ref. [39] is used. The likelihood is the product of three factors. The first factor is the product of Breit–Wigner distributions for the production of W bosons and top quarks, given the four-momenta of the true \(t\bar{t}\) decay products. The second factor is the product of transfer functions representing the probabilities for the given true energies of the \(t\bar{t}\) decay products to be observed as the energies of reconstructed jets, leptons and as missing transverse energy. The third factor is the probability to b-tag a certain jet, given the parton it is associated with. The pole masses of the W bosons and the top quarks in the Breit–Wigner distributions are set to 80.4 GeV and 172.5 GeV, respectively.

The likelihood is maximized by varying the energies of the partons, the energy of the charged lepton, and the components of the neutrino three-momentum. The maximization is performed over all possible assignments of jets to partons, and the assignment with the largest likelihood is used for all further studies. The distributions of the jet multiplicity are shown in Figs. 1 (a–b) after all selection requirements, with the exception of the requirements on the likelihood and on the jet multiplicity. The four-momenta of the top quarks are then obtained by summing the four-momenta of the decay products resulting from the kinematic fit. The unconstrained z component of the neutrino momentum is a free parameter in the fit.
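Once the four-momenta of the two top quarks are available, the three observables studied in this paper follow from their sum. The sketch below uses the standard definitions of invariant mass, transverse momentum and rapidity; the function name and the example four-momenta are hypothetical.

```python
import math

def ttbar_observables(top, antitop):
    """Return (mass, pT, rapidity) of the pair; inputs are (E, px, py, pz) in GeV."""
    e, px, py, pz = (a + b for a, b in zip(top, antitop))
    # Guard against tiny negative values of m^2 caused by rounding.
    mass = math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))
    pt = math.hypot(px, py)
    rapidity = 0.5 * math.log((e + pz) / (e - pz))
    return mass, pt, rapidity

# Hypothetical four-momenta (illustrative values only).
m_tt, pt_tt, y_tt = ttbar_observables((250.0, 60.0, 0.0, 100.0),
                                      (250.0, 0.0, 0.0, 100.0))
```

Two top quarks produced back to back in the transverse plane with no longitudinal momentum give a pair with zero transverse momentum and zero rapidity, which is a convenient sanity check.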

Fig. 1 Distributions of (a–b) jet multiplicity, (c–d) negative logarithm of the likelihood obtained from the kinematic fit described in the text and (e–f) invariant mass of the three reconstructed objects assigned to the hadronic top quark decay, obtained from the kinematic fit by relaxing the requirement on the value of the top quark mass (here named Hadronic top mass). In (c–d) the bin corresponding to the largest −ln(likelihood) value includes events with −ln(likelihood)>70 and the associated prediction. In (e–f) the bin corresponding to the largest Hadronic top mass value includes events with Hadronic top mass >346 GeV and the associated prediction. In (e–f) the top quark mass pole value is set to be the same in the Breit–Wigners describing the masses of the leptonic and hadronic top quarks, but it is not fixed to the value of 172.5 GeV. Data are compared to expectation from Monte Carlo simulation and data-driven expectation. All selection criteria are applied, except for (a–b) for which only the likelihood requirement and the requirement on jet multiplicity are not applied and for (c–d) for which only the likelihood requirement is not applied. The band represents the 68 % confidence level interval of total uncertainty on the prediction

Simulation studies aimed at enhancing the fraction of reconstructed \(t\bar{t}\) events that are consistent with the \(t\bar{t}\) decay assignment hypothesis are used to determine a requirement on the likelihood of the kinematic fit. The likelihood distribution for the events after selection, except for the likelihood requirement ln(L)>−52, is shown in Figs. 1 (c–d). The likelihood quantifies how well each event matches the \(t\bar{t}\) decay hypothesis, so this distribution serves as a check of the agreement between data and simulation. Figures 1 (e–f) show the distributions of the invariant mass of the three reconstructed objects assigned to the hadronic top quark decay, obtained from the kinematic fit by relaxing the requirement on the value of the top quark mass, after all selection requirements. In these distributions the top quark mass pole value is set to be the same in the Breit–Wigners describing the masses of the leptonic and hadronic top quarks, but it is not fixed to the value of 172.5 GeV. Further studies on the performance of the kinematic fit can be found in Ref. [55]. Distributions of the reconstructed invariant mass, transverse momentum and rapidity of the reconstructed top–antitop pair, after all selection requirements, are shown in Fig. 2.

Fig. 2 Distributions of the reconstructed (a–b) \(t\bar{t}\) mass, \(m_{t\bar{t}}\), (c–d) the \(t\bar{t}\) transverse momentum, \(p_{\mathrm{T}, t\bar{t}}\), and (e–f) the \(t\bar{t}\) rapidity, \(y_{ t\bar{t}}\), before background subtraction and unfolding. In (a–b) and (c–d) the bin corresponding to the largest \(m_{t\bar{t}}\) (\(p_{\mathrm{T}, t\bar{t}}\)) value includes events with \(m_{t\bar{t}}\) (\(p_{\mathrm{T}, t\bar{t}}\)) larger than 2700 GeV (700 GeV). The largest reconstructed \(m_{t\bar{t}}\) in the μ+jets channel is 2603 GeV. Data are compared to the expectation derived from simulation and data-driven estimates. All selection criteria are applied for the (a, c, e) e+jets and (b, d, f) μ+jets channels. The uncertainty bands include all contributions given in Sect. 6 except those from PDF and theory normalization

The numbers of expected and observed data events in each channel after the pre-tag, tagged and full event selections are listed in Table 1. The data agree with the expectation within the systematic uncertainties.

Table 1 Numbers of predicted and observed events. The selection is shown after applying pre-tag, tagged, and the full selection criteria including the likelihood requirement. The quoted uncertainties include all uncertainties given in Sect. 6 except those from PDF and theory normalization. The numbers correspond to an integrated luminosity of 2.05 fb\(^{-1}\) in both e+jets and μ+jets samples

6 Systematic uncertainties

For each systematic effect the analysis is re-run with the variation corresponding to the one standard deviation change in each bin. The varied distributions are obtained for the upward and downward shift for each effect, and for each channel separately. If the direction of the variation is not defined (as in the case of the estimate resulting from the difference of two models), the estimated variation is assumed to be the same size in the upward and the downward direction and is symmetrized. The baseline distribution and the shifted distributions are the input to the pseudo-experiment calculation (see Sect. 8) that performs unfolding, efficiency correction, and enables combination of the e+jets and μ+jets channels.

The sources of systematic uncertainties are arranged in approximately independent groups. They are further categorized into detector modelling, and modelling of signal and background processes. The estimation of the variations resulting from the systematic uncertainty sources is the same as in Ref. [39].

6.1 Detector modelling

Muon and electron trigger, reconstruction and selection efficiencies are measured in data using Z and W decays and incorporated into the simulation using weighted events. Each simulated event is weighted with the appropriate ratio (scale factor) of the measured efficiency to the simulated one. The uncertainties on the scale factors are estimated by varying the lepton and signal selections and background uncertainties. For lepton triggers the systematic uncertainties are about 1 %. The same procedure is used for lepton momentum scale and resolution scale factors resulting in uncertainties of 1–1.5 %. The corresponding scale factor uncertainties for electron (muon) reconstruction and identification efficiency are 1 % (0.5 %) and 3 % (0.5 %) respectively.

Information collected from collision data, test-beam data, and simulation is used to determine the jet energy scale; its uncertainty ranges from 2.5 % to 8 %, varying with jet \(p_{\mathrm{T}}\) and η [42]. The uncertainties include flavour composition of the sample and mis-measurements due to nearby jets. Pile-up gives an additional uncertainty of 2.5 % (5 %) in the central (forward) region. An extra uncertainty of up to 2.5 % is added to account for the fragmentation of b-quarks. The jet energy resolution and reconstruction efficiency are measured in data using the same methods as in Refs. [42, 56]. Jet energy resolution uncertainties range from 9–17 % for jet \(p_{\mathrm{T}}\simeq30~\mbox{GeV}\) to about 5–9 % for jet \(p_{\mathrm{T}}>180~\mbox{GeV}\) depending on jet η. The jet reconstruction efficiency uncertainty is 1–2 %. The uncertainties from the energy scale and resolution corrections on leptons and jets are propagated to the uncertainties on missing transverse momentum. Uncertainties on \(E_{\mathrm{T}}^{\mathrm{miss}}\) also include contributions arising from calorimeter cells not associated to jets and from soft jets (those in the range \(7~\mbox{GeV}<p_{\mathrm{T}}<20~\mbox{GeV}\)). The b-tagging efficiency scale factors have uncertainties between 6 % and 15 %, and mis-tag rate scale factor uncertainties range from 10 % to 21 %. The scale factors are derived from data and parameterized as a function of jet \(p_{\mathrm{T}}\).

A small region of the liquid argon calorimeter could not be read out in a subset of the data corresponding to 42 % of the total dataset. Corresponding data and simulated events where a jet with \(p_{\mathrm{T}}>20~\mbox{GeV}\) is close to the failing region are rejected. This requirement rejects about 6 % of the events. A systematic uncertainty is derived from variations of the \(p_{\mathrm{T}}\) threshold of the jets by 20 % resulting from studies of the response of jets close to the failing region, using dijet \(p_{\mathrm{T}}\) balance in data.

The uncertainty on the measured luminosity is 3.7 % [57, 58].

6.2 Signal and background modelling

Sources of systematic uncertainty for the signal are the choice of generator, parton shower model, hadronization and underlying event model, the choice of PDF, and the tuning of initial- and final-state radiation. Predictions from the MC@NLO and POWHEG [59, 60] generators are compared to determine the generator dependence. The parton showering is assessed by comparing POWHEG samples interfaced to HERWIG and PYTHIA, respectively. The amount of initial- and final-state radiation is varied by modifying parameters in AcerMC [61] interfaced to PYTHIA. The parameters are varied in a range comparable to those used in the Perugia Soft/Hard tune variations [62]. The present initial-state radiation variations are to be considered generous: the spread of the resulting theoretical predictions for jet activity in \(t\bar{t}\) events is often wider than the experimental uncertainties in precision measurements performed by ATLAS in LHC pp collisions at \(\sqrt{s} = 7~\mbox{TeV}\) [63]. The impact of the PDF uncertainties is studied using the procedure described in Refs. [28, 64–66].

Background processes are either estimated by simulation or using auxiliary measurements, see Sect. 4. The uncertainty on the fake-lepton background is estimated by varying the requirements on the low-\(m_{\mathrm{T}}^{W}\) and low-\(E_{\mathrm{T}}^{\mathrm{miss}}\) control regions, taking into account the statistical uncertainty and background corrections. The total uncertainty is estimated to be 100 %. The normalization of W+jets background is derived from auxiliary measurements using the asymmetric production of positively and negatively charged W bosons in W+jets events. The total uncertainties are estimated to be 21 % and 23 % in the four-jet bin, for the electron and muon channels respectively. These uncertainties are estimated by evaluating the effect on both \(r_{\mathrm{MC}}\) and \(k_{2\to\ge4}\) from the jet energy scale uncertainty and different PDF and generator choices. Systematic uncertainties on the shape of W+jets distributions are assigned based on differences in simulated events generated with different factorization and parton matching scales. Scaling factors correcting the fraction of heavy-flavour contributions in simulated W+jets samples are derived from auxiliary measurements (see Sect. 4.2). The systematic uncertainties are found by changing the normalizations of the non-W processes within their uncertainties when computing \(W^{\mathrm{data}}_{i,\text{pre-tag}}\) and \(W^{\mathrm{data}}_{i, \mathrm{tagged}}\), as well as taking into account the impact of uncertainties in b-tagging efficiencies. The uncertainties are 47 % for \(Wb\bar{b}+\mathrm{jets}\) and \(Wc\bar{c}+\mathrm{jets}\) contributions and 32 % for Wc+jets contributions. In the μ+jets channel the fractional contributions of \(Wb\bar{b}+\mathrm{jets}\), \(Wc\bar{c}+\mathrm{jets}\) and Wc+jets samples to the total W+jets prediction are estimated to be 9 %, 17 % and 12 % (36 %, 25 % and 17 %) respectively, before (after) the b-tagging requirement. In the e+jets channel the fractional contributions of \(Wb\bar{b}+\mathrm{jets}\), \(Wc\bar{c}+\mathrm{jets}\) and Wc+jets samples to the total W+jets prediction are estimated to be 9 %, 17 % and 13 % (35 %, 25 % and 17 %) respectively, before (after) the b-tagging requirement. The normalization of Z+jets events is estimated using Berends–Giele scaling [67]. The uncertainty in the normalization is 48 % in the four-jet bin and increases with the jet multiplicity. The uncertainties on the normalization of the small background contributions from diboson and single top production are estimated to be about 5 % [46, 50, 51] and 10 % [52–54], respectively.

The statistical uncertainty on the Monte Carlo prediction due to limited Monte Carlo sample size is included as a systematic uncertainty in each bin for each process.

7 Cross-section unfolding

7.1 Unfolding procedure

The underlying binned true differential cross-section distributions (\(\sigma_j\)) are obtained from the reconstructed events using an unfolding technique that corrects for detector effects. The unfolding starts from the reconstructed event distribution (\(N_i\)), where the backgrounds (\(B_i\)) have been subtracted. The unfolding uses a response matrix (\(R_{ij}\)), see Eq. (3), derived from simulated \(t\bar{t}\) events, which maps the binned generated events to the binned reconstructed events. The kinematic properties of the generated t and \(\bar {t}\) partons in simulated \(t\bar{t}\) events define the “true” properties of the \(t\bar{t}\) events.

In its simplest form the unfolding equation can be written as

$$ N_i = \sum_{j} R_{ij} \sigma_j \mathcal{L} + B_i = \sum _{j} M_{ij} A_j \sigma_j \mathcal{L} + B_i, $$
(3)

where \(\mathcal{L}\) is the integrated luminosity, \(M_{ij}\) is the bin migration matrix (see Fig. 3), and \(A_j\) is the acceptance for inclusive \(t\bar{t}\) events. The leptonic branching fractions are set according to Ref. [68].

Fig. 3 Migration matrices for (a, b) \(m_{t\bar{t}}\), (c, d) \(p_{\mathrm{T}, t\bar{t}}\), and (e, f) \(y_{ t\bar{t}}\) estimated from simulated \(t\bar{t}\) events passing all (left) e+jets and (right) μ+jets selection criteria. Each matrix element gives the probability for an event generated at a given value to be reconstructed at another value

The estimated acceptances for simulated \(t\bar{t}\) events as a function of \(m_{t\bar{t}}\), \(p_{\mathrm{T}, t\bar{t}} \) and \(y_{ t\bar{t}} \) are reported in Table 2. The overall acceptances before the requirement on the likelihood value are comparable to previous measurements [45]. The additional requirement on the likelihood value is expected to retain a large fraction of the previously selected \(t\bar{t}\) events (see Table 1). A finely binned illustration of the acceptances is shown in Fig. 4. The reduction in acceptance at high \(m_{t\bar{t}}\) and \(p_{\mathrm{T}, t\bar{t}} \) values is predominantly due to increasingly non-isolated leptons combined with lower jet multiplicity, as the \(t\bar{t}\) decay products are forced closer together by the boost at large top quark \(p_{\mathrm{T}}\). At high \(|y_{ t\bar{t}}|\) the reduction is mainly due to jets falling outside the required pseudorapidity range (see Sect. 3.2).

Fig. 4 Acceptance as a function of (a) \(t\bar{t}\) mass, \(m_{t\bar{t}}\), (b) \(t\bar{t}\) transverse momentum, \(p_{\mathrm{T}, t\bar{t}}\), and (c) \(t\bar{t}\) rapidity, \(y_{ t\bar{t}}\). The acceptance is defined according to Eq. (3) for inclusive \(t\bar{t}\) events after all selection requirements. The leptonic branching fractions are set according to Ref. [68]. The error bars show only the uncertainty due to limited Monte Carlo sample size

Table 2 Table of acceptances for \(m_{t\bar{t}}\), \(p_{\mathrm{T}, t\bar{t}}\) and \(y_{ t\bar{t}}\). The acceptance is defined according to Eq. (3) for inclusive \(t\bar{t}\) events after all selection requirements. The leptonic branching fractions are set according to Ref. [68]. In the case of \(y_{ t\bar{t}}\), acceptances in positive and negative symmetric bins are consistent within uncertainties

The cross-section \(\sigma_j\) is then extracted by solving Eq. (3):

$$ \sigma_j = \frac{\sum_{i} (M^{-1})_{ji} (N_i - B_i)}{A_j \mathcal{L}}. $$
(4)
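
As a minimal numerical sketch of Eq. (4), the folding of Eq. (3) and its inversion can be written with NumPy as follows. The three-bin migration matrix, acceptances, backgrounds and cross-section values are hypothetical placeholders chosen for illustration, not numbers from this measurement, and the function name `unfold` is likewise an assumption. The matrix convention (columns normalised to unity over reconstructed bins) is assumed from the index order in Eq. (3).

```python
import numpy as np

def unfold(n_reco, bkg, migration, acceptance, lumi):
    """Plain matrix-inversion unfolding following Eq. (4).

    migration[i, j] is the probability that a selected event generated in
    true bin j is reconstructed in bin i (assumed: each column sums to one).
    """
    # Solve M x = (N - B) instead of forming the inverse explicitly.
    true_selected = np.linalg.solve(migration, n_reco - bkg)
    return true_selected / (acceptance * lumi)

# Hypothetical inputs, for illustration only.
M = np.array([[0.80, 0.15, 0.00],
              [0.20, 0.70, 0.20],
              [0.00, 0.15, 0.80]])
A = np.array([0.030, 0.035, 0.025])         # acceptance per true bin
lumi = 2050.0                                # integrated luminosity in pb^-1
sigma_true = np.array([90.0, 50.0, 20.0])    # cross-section per bin in pb
B = np.array([400.0, 300.0, 100.0])          # background events per reco bin

# Fold with Eq. (3), then unfold with Eq. (4).
N = M @ (A * sigma_true * lumi) + B
sigma_unfolded = unfold(N, B, M, A, lumi)
```

Without statistical fluctuations the unfolded values reproduce the input cross-sections, which is the behaviour expected of an exact inversion.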

The bin size is optimized using pseudo-experiments drawn from simulated events including systematic uncertainties. The adopted optimization strategy is to choose as small a bin size as possible without substantially increasing the total uncertainty after unfolding. In practice this means keeping about 68 % of the events on the diagonal of the migration matrix, and requiring that the condition number (see footnote 3) of the migration matrix is \(\mathcal{O}(1)\). The finely binned distributions before unfolding reported in Fig. 2 show good agreement between reconstructed data and the MC and data-driven predictions.
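
The two binning criteria can be checked numerically for a candidate migration matrix. The sketch below is illustrative only: the matrix is a hypothetical placeholder, the function name is an assumption, and the 2-norm condition number returned by `numpy.linalg.cond` is assumed here, which may differ from the definition adopted in the analysis.

```python
import numpy as np

def binning_diagnostics(migration):
    """Return the diagonal fraction per true bin and the condition number."""
    migration = np.asarray(migration, dtype=float)
    # Normalise each column so that it describes the fate of one true bin.
    col_norm = migration / migration.sum(axis=0, keepdims=True)
    diag_fraction = np.diag(col_norm)
    # 2-norm condition number: ratio of largest to smallest singular value.
    cond = np.linalg.cond(col_norm)
    return diag_fraction, cond

# Hypothetical three-bin migration matrix (rows: reco bin, columns: true bin).
M = np.array([[0.70, 0.15, 0.00],
              [0.30, 0.68, 0.25],
              [0.00, 0.17, 0.75]])
diag_fraction, cond = binning_diagnostics(M)
```

A finer binning lowers the diagonal fractions and raises the condition number, which in turn amplifies statistical fluctuations in the inversion.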

To evaluate the performance of the unfolding procedure, and to estimate the systematic uncertainties, Eq. (4) has been extended to the following form to allow detailed studies using pseudo-experiments

$$ \sigma_j(d_k) = \frac{\sum_{i} (M^{-1})_{ji}(d_k)[P(N_i) - B_i(d_k)]}{ A_j(d_k)\mathcal{L}(d_k)}, $$
(5)

where \(P(N_i)\) is the Poisson distribution with mean \(N_i\), and \(d_k\) are continuous variables representing the systematic uncertainties, drawn from a Gaussian distribution with zero mean and unit standard deviation. A cross-section estimate \(\sigma_j\) is extracted for a given variable (\(m_{t\bar{t}}\), \(p_{\mathrm{T}, t\bar{t}}\), \(y_{ t\bar {t}}\)) from each pseudo-experiment. The distribution of \(\sigma_j\) resulting from the pseudo-experiments is an estimator of the probability density of all possible outcomes of the measurement. Two thousand pseudo-experiments are used to extract the cross-section values. The 68 % confidence interval provides the cross-section uncertainty. The parametric dependence on \(d_k\) in \((M^{-1})_{ji}\), and other functions, is approximated using the linear term in the Taylor expansion, treating positive and negative derivative estimates separately.
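
The pseudo-experiment procedure of Eq. (5) can be sketched in a simplified form. The example below assumes a single nuisance parameter that affects only the background, with separate linear slopes for positive and negative shifts; all numerical inputs and variable names are hypothetical, and the central 68 % interval is taken from the 16th and 84th percentiles, which is one possible choice of interval.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-bin inputs, for illustration only.
M = np.array([[0.8, 0.2],
              [0.2, 0.8]])
A = np.array([0.03, 0.02])             # acceptance per true bin
lumi = 2050.0                          # integrated luminosity in pb^-1
N_obs = np.array([6000.0, 2500.0])     # reconstructed events
B_nom = np.array([500.0, 200.0])       # nominal background
dB_up = np.array([60.0, 30.0])         # change in B for d = +1
dB_down = np.array([-40.0, -20.0])     # change in B for d = -1

def pseudo_experiment():
    """One draw of Eq. (5) with a linearised, asymmetric background shift."""
    d = rng.standard_normal()                       # nuisance parameter d_k
    shift = d * dB_up if d >= 0 else -d * dB_down   # separate slopes per sign
    n_fluct = rng.poisson(N_obs)                    # Poisson draw P(N_i)
    return np.linalg.solve(M, n_fluct - (B_nom + shift)) / (A * lumi)

sigmas = np.array([pseudo_experiment() for _ in range(2000)])
low, central, high = np.percentile(sigmas, [16, 50, 84], axis=0)
```

In a full treatment the migration matrix, acceptance and luminosity would also depend on the nuisance parameters, each with its own linearised response.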

A closure test is performed by unfolding simulated (folded) events where \(d_k = 0\). The deviation of the unfolded cross-section from the known true cross-section input, used for the detector simulation folding, is consistent with zero within 1 % uncertainty. The most important test of the unfolding is of its ability to unfold a distribution significantly different from the Monte Carlo expectation. This is done by re-weighting simulated \(t\bar{t}\) events so that the number of events in a single bin of true \(m_{t\bar{t}}\) is doubled. The observed linearity of the response to these “delta-like” pulses is within 1 %. The same test was also performed using a regularized unfolding technique based on Singular Value Decomposition [69]. The size of the “delta-like” pulses was then found to be substantially reduced (by at least 30 %) after unfolding, even under the mildest regularization conditions. Given this bias, and since this particular unfolding implementation does not allow the regularization to be reduced any further, all final results are derived using the plain matrix inversion described above. The increased statistical uncertainty of this unregularized result is tolerated given that the total uncertainty is dominated by systematic effects.
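
The “delta-like” pulse test can be illustrated with an idealised toy in which the same migration matrix is used for folding and unfolding and no statistical fluctuations are applied. The matrix, the true spectrum and the function name are hypothetical. In this idealisation the plain inversion returns the pulse exactly, whereas the 1 % level quoted above refers to the test on fully simulated events.

```python
import numpy as np

# Hypothetical migration matrix (rows: reco bin, columns: true bin).
M = np.array([[0.75, 0.15, 0.00, 0.00],
              [0.25, 0.70, 0.15, 0.00],
              [0.00, 0.15, 0.70, 0.20],
              [0.00, 0.00, 0.15, 0.80]])
truth_nominal = np.array([4000.0, 2500.0, 1000.0, 300.0])

def pulse_response(migration, truth, pulse_bin, factor=2.0):
    """Ratio of the unfolded pulsed spectrum to the nominal truth, per bin."""
    pulsed = truth.copy()
    pulsed[pulse_bin] *= factor            # re-weight a single true bin
    reco = migration @ pulsed              # fold through the migration matrix
    unfolded = np.linalg.solve(migration, reco)
    return unfolded / truth

# One pulse per true bin; a linear response gives `factor` in the pulsed bin
# and unity elsewhere.
ratios = [pulse_response(M, truth_nominal, b) for b in range(len(truth_nominal))]
```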

7.2 Combination of channels

The unfolded cross-sections from the two channels, e+jets and μ+jets, are combined using a weighted mean which includes the full covariance matrix between the channels. Since the covariance matrix is used in the weighting, the estimate is a best linear unbiased estimator of the cross-section. The covariance matrix is determined in simulated events using the same pseudo-experiment procedure outlined in the previous section and derived from Eq. (5).
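
For a single bin measured in two correlated channels, the covariance-weighted mean reduces to the standard best linear unbiased estimate. The sketch below shows that reduced case only; the input values, uncertainties and correlation are hypothetical, and the function name is an assumption.

```python
import numpy as np

def blue_combine(values, cov):
    """Best linear unbiased estimate of one quantity from correlated inputs."""
    values = np.asarray(values, dtype=float)
    ones = np.ones(len(values))
    vinv_ones = np.linalg.solve(cov, ones)     # V^{-1} 1
    norm = ones @ vinv_ones                    # 1^T V^{-1} 1
    weights = vinv_ones / norm                 # weights sum to one
    return weights @ values, np.sqrt(1.0 / norm), weights

# Hypothetical e+jets and mu+jets results for one bin.
x = np.array([1.00, 1.10])
sig = np.array([0.10, 0.12])
rho = 0.5                                      # assumed channel correlation
V = np.array([[sig[0] ** 2, rho * sig[0] * sig[1]],
              [rho * sig[0] * sig[1], sig[1] ** 2]])
combined, err, w = blue_combine(x, V)
```

The combined uncertainty is never larger than the smaller of the two input uncertainties, and the weights shift towards the more precise channel as the correlation grows.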

8 Results

To reduce systematic uncertainties only relative cross-sections (differential cross-section normalized to the measured inclusive cross-section) are reported. The full procedure for the differential measurement is also contracted down to one bin to perform the measurement of the inclusive cross-section by using Eq. (3) and Eq. (4). In this case the measurement is reduced to a standard “cut-and-count” technique (as used for the first ATLAS \(t\bar{t}\) cross-section measurement [45]) and the response matrix is reduced to the standard acceptance correction. The total inclusive cross-section, combining e+jets and μ+jets channels, is found to be \(\sigma_{ t\bar{t}} = 160 \pm 25~\mbox{pb}\). The quoted uncertainty includes both statistical and systematic contributions and it is dominated by the systematic component. The result is compatible with the expected \(t\bar{t}\) inclusive cross-section and with previous measurements [36].

The relative differential cross-section results are listed in Table 3 as a function of \(m_{t\bar{t}}\), \(p_{\mathrm{T}, t\bar{t}}\) and \(y_{ t\bar {t}}\). Both single-channel results and results from the combination are shown. The correlation coefficients between the measured bins of the combined result are estimated using five thousand pseudo-experiments (see Table 4). The covariance matrices are derived by combining the correlation coefficients with the uncertainties for the respective measurements reported in Table 3 for \(m_{t\bar{t}}\), \(p_{\mathrm{T}, t\bar{t}}\) and \(y_{ t\bar{t}}\) respectively. A graphical representation for the combined results is shown in Fig. 5. The measurements are reported with their full uncertainty, combining statistical and systematic effects, and they are compared to NLO predictions from MCFM [8] for all variables; NLO+NNLL predictions from Ref. [7] are included for 1/σ \(d\sigma/d m_{t\bar{t}}\). Theory uncertainty bands include uncertainties on parton distribution functions, the strong coupling constant \(\alpha_{\mathrm{S}}\) and on factorization and renormalization scales. For the NLO predictions, the uncertainty from PDFs and \(\alpha_{\mathrm{S}}\) is set to the maximal spread of the predictions from three different NLO PDF sets (CTEQ6.6, MSTW2008NLO and NNPDF2.0) according to the PDF-specific recipe in Refs. [28, 64–66]. Renormalization and factorization scales are set to the top quark mass value of 172.5 GeV and associated uncertainties are derived from an upward and downward scale variation of a factor of two. The overall NLO uncertainty is obtained by adding in quadrature the contributions from PDFs and \(\alpha_{\mathrm{S}}\) and the contributions from scales, for variations in the same direction. For the NLO+NNLL estimates the uncertainties are derived according to the approach of Ref. [7]. The uncertainty on the MSTW2008NNLO PDFs and \(\alpha_{\mathrm{S}}\) at the 68 % confidence level is combined in quadrature with the uncertainties derived from the variations of the factorization and renormalization scales. For 1/σ \(d\sigma/d m_{t\bar{t}}\) the scale uncertainties are dominant. Predictions from MC@NLO and ALPGEN are shown for fixed settings of the generators’ parameters. The settings for MC@NLO are given in Sect. 2. ALPGEN is version 2.13 using the CTEQ6L1 PDF with the top quark mass set to 172.5 GeV. Renormalization and factorization scales are set to the same value: the square root of the sum of the squared transverse energies of the final state partons. The matching parameters [70] for up to five extra partons are set to \(E_{T}^{\mathrm{clus}} = 20~\mbox{GeV}\) and \(R_{\mathrm{match}} = 0.7\). Parton showering and underlying event are simulated by HERWIG and JIMMY respectively, using the generator tune AUET1 [31].
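
The construction of a covariance matrix from correlation coefficients and per-bin uncertainties follows \(C_{ij} = \rho_{ij}\,\delta_i\,\delta_j\). A short sketch is given below; the numbers are placeholders rather than entries of Tables 3 and 4, and the function name is an assumption.

```python
import numpy as np

def covariance_from_correlation(corr, errors):
    """Build C_ij = rho_ij * delta_i * delta_j from correlations and errors."""
    corr = np.asarray(corr, dtype=float)
    errors = np.asarray(errors, dtype=float)
    return corr * np.outer(errors, errors)

# Placeholder correlation coefficients and total uncertainties per bin.
rho = np.array([[ 1.0, -0.4,  0.1],
                [-0.4,  1.0, -0.3],
                [ 0.1, -0.3,  1.0]])
err = np.array([0.20, 0.05, 0.01])
cov = covariance_from_correlation(rho, err)
```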

Fig. 5 Relative differential cross-section versus (a, b) \(m_{t\bar{t}}\), (c) \(p_{\mathrm{T}, t\bar{t}}\) and (d) \(y_{ t\bar{t}}\). Note that the histograms are a graphical representation of Table 3. This means that only the bin ranges along the x-axis (and not the position of the vertical error bar) can be associated to the relative differential cross-section values on the y-axis. The relative cross-section in each bin shown in Table 3 is compared to the NLO prediction from MCFM [8]. For \(m_{t\bar{t}}\) the results are also compared with the NLO+NNLL prediction from Ref. [7]. The measured uncertainty represents 68 % confidence level including both statistical and systematic uncertainties. The bands represent theory uncertainties (see Sect. 8 for details). Predictions from MC@NLO and ALPGEN are shown for fixed settings of the generators’ parameters (details are found in Sect. 8)

Table 3 Relative differential cross-section (top) 1/σ \(d\sigma/d m_{t\bar{t}}\), (middle) 1/σ \(d\sigma/d p_{\mathrm{T}, t\bar {t}}\) and (bottom) 1/σ \(d\sigma/d y_{ t\bar{t}}\) measured in the e+jets, μ+jets and the combined ℓ+jets channel
Table 4 Correlation coefficients between bins of the relative differential cross-section (top) 1/σ \(d\sigma/d m_{t\bar{t}}\), (middle) 1/σ \(d\sigma/d p_{\mathrm{T}, t\bar{t}}\) and (bottom) 1/σ \(d\sigma/d y_{ t\bar{t}}\) in the combined ℓ+jets channel

The impact of the different uncertainty sources on the final results is estimated and shown in Table 5. For 1/σ \(d\sigma/d m_{t\bar{t}}\) the relative statistical uncertainty varies from about 2 % at low \(m_{t\bar{t}}\) to about 20 % at the highest \(m_{t\bar{t}}\), while the systematic uncertainty ranges from 10 % at intermediate \(m_{t\bar{t}}\) values to about 37 % at the highest \(m_{t\bar{t}}\). For 1/σ \(d\sigma/d p_{\mathrm{T}, t\bar{t}}\) the relative statistical uncertainty ranges from about 4 % at low \(p_{\mathrm{T}, t\bar {t}}\) to about 12 % at the highest \(p_{\mathrm{T}, t\bar{t}}\) values, while the systematic uncertainty increases from about 13 % to 20 % in the same interval. For 1/σ \(d\sigma/d y_{ t\bar{t}} \) the relative statistical uncertainty increases from about 3 % at low \(y_{ t\bar {t}}\) to about 5 % at the highest \(y_{ t\bar{t}}\) values, while the systematic uncertainty changes from 4 % to 10 % over the same interval. Jet-related uncertainties are dominant for \(m_{t\bar {t}}\) and \(p_{\mathrm{T}, t\bar{t}}\), while for \(y_{ t\bar{t}}\) the dominant contributions are from fake leptons and final-state radiation in addition to the jet uncertainties.

Table 5 Percentage uncertainties on (top) 1/σ \(d\sigma /d m_{t\bar{t}}\), (middle) 1/σ \(d\sigma/d p_{\mathrm{T}, t\bar {t}}\) and (bottom) 1/σ \(d\sigma/d y_{ t\bar{t}}\) in the combined ℓ+jets channel

No significant deviations from the SM expectations provided by the different MC generators are observed. The SM prediction for the relative cross-section distribution can be tested against the measured values by using the covariance matrix between the measured bins of the combined results.
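
One common way to carry out such a test is a \(\chi^2\) comparison, \(\chi^2 = r^{\mathrm{T}} C^{-1} r\), with \(r\) the vector of differences between measured and predicted values. The sketch below assumes this form; the inputs are placeholders and the function name is hypothetical. For a normalized distribution the normalization constraint can make the covariance matrix singular or nearly so, in which case one bin is usually dropped before the inversion; whether that is required here depends on how the covariance matrix is constructed.

```python
import numpy as np

def chi2(measured, predicted, cov):
    """Compute chi^2 = r^T C^{-1} r with r = measured - predicted."""
    r = np.asarray(measured, dtype=float) - np.asarray(predicted, dtype=float)
    # Solve C z = r rather than inverting the covariance matrix explicitly.
    return float(r @ np.linalg.solve(np.asarray(cov, dtype=float), r))

# Placeholder measurement, prediction and covariance for three bins.
measured = np.array([2.50, 2.70, 0.50])
predicted = np.array([2.40, 2.80, 0.55])
cov = np.array([[ 0.0400, -0.0060,  0.0010],
                [-0.0060,  0.0225, -0.0020],
                [ 0.0010, -0.0020,  0.0100]])
value = chi2(measured, predicted, cov)
```

The resulting value can be compared with a \(\chi^2\) distribution whose number of degrees of freedom equals the number of bins entering the sum.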

9 Conclusions

Using a dataset of \(2.05~\mbox{fb}^{-1}\), the relative differential cross-section for \(t\bar{t}\) production is measured as a function of three properties of the \(t\bar{t}\) system: mass (\(m_{t\bar {t}}\)), transverse momentum (\(p_{\mathrm{T}, t\bar{t}}\)) and rapidity (\(y_{ t\bar{t}}\)). The background-subtracted, detector-unfolded values of 1/σ \(d\sigma/d m_{t\bar{t}}\), 1/σ \(d\sigma/d p_{\mathrm{T}, t\bar{t}}\) and 1/σ \(d\sigma/d y_{ t\bar{t}} \) are reported together with their respective covariance matrices, and compared to theoretical calculations. The measurement uncertainties range typically between 10 % and 20 % and are generally dominated by systematic effects. No significant deviations from the SM expectations provided by the different MC generators are observed.