1 Introduction

At the Large Hadron Collider (LHC), top quarks are copiously produced in quark–antiquark pairs (\(t\bar{t}\)). This process has been extensively studied in proton–proton collisions at 7 and \(8\,\text {TeV}\), and recently at \(13\,\text {TeV}\) [1, 2] centre-of-mass energy. Measurements of the associated production of \(t\bar{t}\) with a Z boson (\(t\bar{t} Z\)) allow the extraction of information about the neutral-current coupling of the top quark. The production rate of a top-quark pair with a massive vector boson could be altered in the presence of physics beyond the Standard Model (SM), such as vector-like quarks [3, 4], strongly coupled Higgs bosons [5] or technicolour [6,7,8,9,10], and therefore the measurements of \(\sigma _{t\bar{t} Z}\) and \(\sigma _{t\bar{t} W}\) are important checks of the validity of the SM at this new energy regime. The \(t\bar{t} Z\) and \(t\bar{t} W\) processes have been established by ATLAS [11] and CMS [12] using the Run-1 dataset at \(\sqrt{s} = 8\,\text {TeV}\), with measured cross sections compatible with the SM prediction and having uncertainties of \(\sim \!\!30\%\). At \(\sqrt{s} = 13\,\text {TeV}\), the SM cross sections of the \(t\bar{t} Z\) and \(t\bar{t} W\) processes increase by factors of 3.5 and 2.4, respectively, compared to \(\sqrt{s} = 8\,\text {TeV}\). The cross sections, computed at next-to-leading-order (NLO) QCD precision, using MadGraph5_aMC@NLO (referred to in the following as MG5_aMC), are \(\sigma _{t\bar{t} Z} = 0.84\,\text {pb}\) and \(\sigma _{t\bar{t} W} = 0.60\,\text {pb}\) with an uncertainty of \(\sim \!\!12\%\) [13, 14], primarily due to higher-order corrections, estimated by varying the renormalisation and factorisation scales.

This paper presents measurements of the \(t\bar{t} Z\) and \(t\bar{t} W\) cross sections using \(3.2\,\text{ fb }^{-1}\) of proton–proton (pp) collision data at \(\sqrt{s} = 13\,\text {TeV}\) collected by the ATLAS detector in 2015. The final states of top-quark pairs produced in association with a Z or a W boson comprise up to four isolated, prompt leptons.Footnote 1 Decay modes with two same-sign (SS) charged muons, or three or four leptons are considered in this analysis. The analysis strategy follows the strategy adopted for the \(8\,\text {TeV}\) dataset [11], excluding the lower sensitivity SS dilepton channels. Table 1 lists the analysis channels and the targeted decay modes of the \(t\bar{t} Z\) and \(t\bar{t} W\) processes. Each channel is divided into multiple analysis regions in order to enhance the sensitivity to the signal. Simultaneous fits are performed to the signal regions and selected control regions in order to extract the cross sections for \(t\bar{t} Z\) and \(t\bar{t} W\) production. Additional validation regions are defined to check that the background estimate agrees with the data and are not used in the fit.

Table 1 List of \(t\bar{t} W\) and \(t\bar{t} Z\) decay modes and analysis channels targeting them

2 The ATLAS detector

The ATLAS detector [15] consists of four main subsystems: an inner tracking system, electromagnetic (EM) and hadronic calorimeters, and a muon spectrometer (MS). The inner detector (ID) consists of a high-granularity silicon pixel detector, including the newly installed Insertable B-Layer [16], which is the innermost layer of the tracking system, and a silicon microstrip tracker, together providing precision tracking in the pseudorapidityFootnote 2 range \(|\eta |<2.5\) and of a transition radiation tracker covering \(|\eta |<2.0\). All the systems are immersed in a 2T magnetic field provided by a superconducting solenoid. The EM sampling calorimeter uses lead and liquid argon (LAr) and is divided into barrel (\(|\eta |<1.475\)) and endcap (\(1.375<|\eta |<3.2\)) regions. Hadron calorimetry is provided by a steel/scintillator-tile calorimeter, segmented into three barrel structures, in the range \(|\eta |<1.7\), and by two copper/LAr hadronic endcap calorimeters that cover the region \(1.5<|\eta |<3.2\). The solid angle coverage is completed with forward copper/LAr and tungsten/LAr calorimeter modules, optimised for EM and hadronic measurements respectively, covering the region \(3.1<|\eta |<4.9\). The muon spectrometer measures the deflection of muon tracks in the range \(|\eta |<2.7\) using multiple layers of high-precision tracking chambers located in toroidal magnetic fields. The field integral of the toroids ranges between 2.0 and 6.0Tm for most of the detector. The muon spectrometer is also instrumented with separate trigger chambers covering \(|\eta |<2.4\). A two-level trigger system, using custom hardware followed by a software-based trigger level, is used to reduce the event rate to an average of around 1 kHz for offline storage.

3 Data and simulated event samples

The data were collected with the ATLAS detector during 2015 with a bunch spacing of 25 ns and a mean number of 14 pp interactions per bunch crossing (pile-up). With strict data-quality requirements, the integrated luminosity considered corresponds to \(3.2\,\text{ fb }^{-1}\) with an uncertainty of \(2.1\%\) [17].

Monte Carlo simulation samples (MC) are used to model the expected signal and background distributions in the different control, validation and signal regions described below. The heavy-flavour decays involving \(b-\) and \(c-\)quarks, particularly important to this measurement, are modelled using the EvtGen  [18] program, except for processes modelled using the Sherpa generator. In all samples the top-quark mass is set to \(172.5\,\text {GeV}\) and the Higgs boson mass is set to \(125\,\text {GeV}\). The response of the detector to stableFootnote 3 particles is emulated by a dedicated simulation [19] based either fully on Geant  [20] or on a faster parameterisation [21] for the calorimeter response and Geant for other detector systems. To account for additional pp interactions from the same and close-by bunch crossings, a set of minimum-bias interactions generated using Pythia v8.210 [22], referred to as Pythia 8 in the following, with the A2 [23] set of tuned MC parameters (A2 tune) is superimposed on the hard-scattering events. In order to reproduce the same pile-up levels present in the data, the distribution of the number of additional pp interactions in the MC samples is reweighted to match the one in the data. All samples are processed through the same reconstruction software as the data. Simulated events are corrected so that the object identification, reconstruction and trigger efficiencies, energy scales and energy resolutions match those determined from data control samples.

The associated production of a top-quark pair with one or two vector bosons is generated at leading order (LO) with MG5_aMC interfaced to Pythia 8, with up to two (\(t\bar{t} W\)), one (\(t\bar{t} Z\)) or no (\(t\bar{t} WW\)) extra partons included in the matrix elements. The \(\gamma ^{*}\) contribution and the \(Z/\gamma ^{*}\) interference are included in the \(t\bar{t} Z\) samples. The A14  [24] set of tuned MC parameters (A14 tune) is used together with the NNPDF2.3LO parton distribution function (PDF) set [25]. The samples are normalised using cross sections computed at NLO in QCD [26].

The t-channel production of a single top quark in association with a Z boson (\(tZ\)) is generated using MG5_aMC interfaced with Pythia v6.427 [27], referred to as Pythia 6 in the following, with the CTEQ6L1 PDF [28] set and the Perugia2012 [29] set of tuned MC parameters at NLO in QCD. The \(Z/\gamma ^{*}\) interference is included, and the four-flavour scheme is used in the computation.

The \(Wt\)-channel production of a single top quark together with a Z boson (\(tWZ\)) is generated with MG5_aMC and showered with Pythia 8, using the NNPDF3.0NLO PDF set [30] and the A14 tune. The generation is performed at NLO in QCD using the five-flavour scheme. Diagrams containing a top-quark pair are removed to avoid overlap with the \(t\bar{t} Z\) process.

Diboson processes with four charged leptons (\(4\ell \)), three charged leptons and one neutrino (\(\ell \ell \ell \nu \)) or two charged leptons and two neutrinos (\(\ell \ell \nu \nu \)) are simulated using the Sherpa 2.1 generator [31]. The matrix elements include all diagrams with four electroweak vertices. They are calculated for up to one (\(4\ell , \ell \ell \nu \nu \)) or no additional partons (\(\ell \ell \ell \nu \)) at NLO and up to three partons at LO using the Comix  [32] and OpenLoops  [33] matrix element generators and merged with the Sherpa parton shower using the ME+PS@NLO prescription [34]. The CT10nlo PDF set [35] is used in conjunction with a dedicated parton-shower tuning developed by the Sherpa authors. The NLO cross sections calculated by the generator are used to normalise diboson processes. Alternative diboson samples are simulated using the Powheg-Box v2 [36] generator, interfaced to the Pythia 8 parton shower model, and for which the CT10nlo PDF set is used in the matrix element, while the CTEQ6L1 PDF set is used for the parton shower along with the AZNLO [37] set of tuned MC parameters.

The production of three massive vector bosons with subsequent leptonic decays of all three bosons is modelled at LO with the Sherpa 2.1 generator and the CT10 PDF set [35]. Up to two additional partons are included in the matrix element at LO and the full NLO accuracy is used for the inclusive process.

Electroweak processes involving the vector-boson scattering (VBS) diagram and producing two same-sign leptons, two neutrinos and two partons are modelled using Sherpa 2.1 at LO accuracy and the CT10 PDF set. Processes of orders four and six in the electroweak coupling constant are considered, and up to one additional parton is included in the matrix element.

For the generation of \(t\bar{t}\) events and \(Wt\)-channel single-top-quark events the Powheg-Box v2 generator is used with the CT10 PDF set. The parton shower and the underlying event are simulated using Pythia 6 with the CTEQ6L1 PDF set and the corresponding Perugia2012 tune. The \(t\bar{t}\) samples are normalised to their next-to-next-to-leading-order (NNLO) cross-section predictions, including soft-gluon resummation to next-to-next-to-leading-log order, as calculated with the Top++2.0 program (see Ref. [38] and references therein). For more efficient sample generation, the \(t\bar{t}\) sample is produced by selecting only true dilepton events in the final state. Moreover, an additional dilepton \(t\bar{t}\) sample requiring a b-hadron not coming from top-quark decays is generated after b-jet selection. Diagram removal is employed to remove the overlap between \(t\bar{t}\) and \(Wt\)  [39].

Samples of \(t\bar{t}\) events produced in association with a Higgs boson (\(t\bar{t} H\)) are generated using NLO matrix elements in MG5_aMC with the CT10NLO PDF set and interfaced with Pythia 8 for the modelling of the parton shower. Higgs boson production via gluon–gluon fusion (ggF) and vector boson fusion (VBF) is generated using the Powheg-Box v2 generator with CT10 PDF set. The parton shower and underlying event are simulated using Pythia 8 with the CTEQ6L1 PDF set and AZNLO tune. Higgs boson production with a vector boson is generated at LO using Pythia 8 with the CTEQ6L1 PDF. All Higgs boson samples are normalised using theoretical calculations of Ref. [40].

Events containing Z or W bosons with associated jets, referred to as Z+jets and W+jets in the following, are simulated using the Sherpa 2.1 generator. Matrix elements are calculated for up to two partons at NLO and four partons at LO. The CT10 PDF set is used in conjunction with a dedicated parton-shower tuning developed by the Sherpa authors [31]. The Z / W+jets samples are normalised to the NNLO cross sections [41,42,43,44]. Alternative Z / W+jets samples are simulated using MG5_aMC at LO interfaced to the Pythia 8 parton shower model. The A14 tune is used together with the NNPDF2.3LO PDF set.

The SM production of three and four top quarks is generated at LO with MG5_aMC+Pythia 8, using the A14 tune together with the NNPDF2.3LO PDF set. The samples are normalised using cross sections computed at NLO [45, 46].

4 Object reconstruction

The final states of interest in this analysis contain electrons, muons, jets, b-jets and missing transverse momentum.

Electron candidates [47] are reconstructed from energy deposits (clusters) in the EM calorimeter that are associated with reconstructed tracks in the inner detector. The electron identification relies on a likelihood-based selection [48, 49]. Electrons are required to pass the “medium” likelihood identification requirements described in Ref. [49]. These include requirements on the shapes of the electromagnetic shower in the calorimeter as well as tracking and track-to-cluster matching quantities. The electrons are also requirement to have transverse momentum \(p_{\text {T}} > 7\,\text {GeV}\) and \(|\eta _\mathrm {cluster}| < 2.47\), where \(\eta _\mathrm {cluster}\) is the pseudorapidity of the calorimeter energy deposit associated with the electron candidate. Candidates in the EM calorimeter barrel/endcap transition region \(1.37< |\eta _\mathrm {cluster}| < 1.52\) are excluded.

Muon candidates are reconstructed from a fit to track segments in the various layers of the muon spectrometer, matched with tracks identified in the inner detector. Muons are required to have \(p_{\text {T}} > 7\,\text {GeV}\) and \(|\eta | < 2.4\) and to pass the “medium” identification requirements defined in Ref. [50]. The medium requirement includes selections on the numbers of hits in the ID and MS as well as a compatibility requirement between momentum measurements in the ID and MS. It provides a high efficiency and purity of selected muons. Electron candidates sharing a track with a muon candidate are removed.

To reduce the non-prompt lepton background from hadron decays or jets misidentified as leptons (labelled as “fake leptons” throughout this paper), electron and muon candidates are required to be isolated. The total sum of track transverse momenta in a surrounding cone of size \(\min (10\,\text {GeV}/p_{\text {T}}, r_{e,\mu })\), excluding the track of the candidate from the sum, is required to be less than \(6\%\) of the candidate \(p_{\text {T}}\), where \(r_e = 0.2\) and \(r_{\mu } = 0.3\). In addition, the sum of the cluster transverse energies in the calorimeter within a cone of size \(\Delta R_{\eta } \equiv \sqrt{(\Delta \eta )^2 + (\Delta \phi )^2} = 0.2\) of any electron candidate, excluding energy deposits of the candidate itself, is required to be less than \(6\%\) of the candidate \(p_{\text {T}}\).

For both electrons and muons, the longitudinal impact parameter of the associated track with respect to the primary vertex,Footnote 4 \(z_{0}\), is required to satisfy \(|z_0 \sin \theta |<0.5\) mm. The significance of the transverse impact parameter \(d_0\) is required to satisfy \(|d_0|/\sigma (d_0)<5\) for electrons and \(|d_0|/\sigma (d_0)<3\) for muons, where \(\sigma (d_0)\) is the uncertainty in \(d_0\).

Jets are reconstructed using the anti-\(k_t\) algorithm [51, 52] with radius parameter \(R = 0.4\), starting from topological clusters in the calorimeters [53]. The effect of pile-up on jet energies is accounted for by a jet-area-based correction [54] and the energy resolution of the jets is improved by using global sequential corrections [55]. Jets are calibrated to the hadronic energy scale using E- and \(\eta \)-dependent calibration factors based on MC simulations, with in-situ corrections based on Run-1 data [56, 57] and checked with early Run-2 data [58]. Jets are accepted if they fulfil the requirements \(p_{\text {T}} > 25\,\text {GeV}\) and \(|\eta | < 2.5\). To reduce the contribution from jets associated with pile-up, jets with \(p_{\text {T}} < 60\,\text {GeV}\) and \(|\eta | < 2.4\) are required to satisfy pile-up rejection criteria (JVT), based on a multivariate combination of track-based variables  [59].

Jets are b-tagged as likely to contain b-hadrons using the MV2c20 algorithm, a multivariate discriminant making use of the long lifetime, large decay multiplicity, hard fragmentation and high mass of b-hadrons [60]. The average efficiency to correctly tag a b-jet is approximately \(77\%\), as determined in simulated \(t\bar{t}\) events, but it varies as a function of \(p_{\text {T}}\) and \(\eta \). In simulation, the tagging algorithm gives a rejection factor of about 130 against light-quark and gluon jets, and about 4.5 against jets containing charm quarks [61]. The efficiency of b-tagging in simulation is corrected to that in data using a \(t\bar{t}\)-based calibration using Run-1 data [62] and validated with Run-2 data [63].

The missing transverse momentum \(\mathbf {p}^\mathrm {miss}_\mathrm {T}\), with magnitude \(E_{\text {T}}^{\text {miss}}\), is a measure of the transverse momentum imbalance due to particles escaping detection. It is computed [64] as the negative sum of the transverse momenta of all electrons, muons and jets and an additional soft term. The soft term is constructed from all tracks that are associated with the primary vertex but not with any physics object. In this way, the \(E_{\text {T}}^{\text {miss}}\) is adjusted for the best calibration of the jets and the other identified physics objects above, while maintaining pile-up independence in the soft term [65, 66].

To prevent double-counting of electron energy deposits as jets, the closest jet within \(\Delta R_y = 0.2\) of a reconstructed electron is removed, where \(\Delta R_y \equiv \sqrt{(\Delta y)^2 + (\Delta \phi )^2}\). If the nearest jet surviving the above selection is within \(\Delta R_y = 0.4\) of an electron, the electron is discarded to ensure that selected electrons are sufficiently separated from nearby jet activity. To reduce the background from muons originating from heavy-flavour particle decays inside jets, muons are removed if they are separated from the nearest jet by \(\Delta R_y < 0.4\). However, if this jet has fewer than three associated tracks, the muon is kept and the jet is removed instead; this avoids an inefficiency for high-energy muons undergoing significant energy loss in the calorimeter.

5 Event selection and background estimation

Only events collected using single-electron or single-muon triggers are accepted. The trigger thresholds, \(p_{\text {T}} > 24\,\text {GeV}\) for electrons and \(p_{\text {T}} > 20\,\text {GeV}\) for muons, are set to be almost fully efficient for reconstructed leptons with \(p_{\text {T}} > 25\,\text {GeV}\). Events are required to have at least one reconstructed primary vertex. In all selections considered, at least one reconstructed lepton with \(p_{\text {T}} > 25\,\text {GeV}\) is required to match (\(\Delta R_{\eta } < 0.15\)) a lepton with the same flavour reconstructed by the trigger algorithm. Three channels are defined based on the number of reconstructed leptons, which are sorted according to their transverse momentum in decreasing order.

Background events containing well-identified prompt leptons are modelled by simulation. The normalisations for the WZ and ZZ processes are taken from data control regions and included in the fit. The yields in these data control regions are extrapolated to the signal regions using simulation. Systematic uncertainties in the extrapolation are taken into account in the overall uncertainty in the background estimate.

Background sources involving one or more fake leptons are modelled using data events from control regions. For the same-sign dimuon (\(2\mu \)-SS) analysis and the trilepton analysis the fake-lepton background is estimated using the matrix method [67], where any combination of fake leptons among the selected leptons is considered. However, compared to Ref. [67], the real- and fake-lepton efficiencies used by the matrix method are estimated in a different way in this measurement. The lepton efficiencies are measured by applying the matrix method in control regions, where the lepton efficiencies are extracted in a likelihood fit as free parameters using the matrix method as model, assuming Poisson statistics, and assuming that events with two fake leptons are negligible. In this way the parameters are by construction the actual parameters of the matrix model itself, instead of relying on external lepton efficiency measurements, which are not guaranteed to be fully consistent with the matrix model. The control regions are defined in dilepton events, separately for b-tagged and b-vetoed events to take into account the different fake-lepton efficiencies depending on whether the source is a light-flavour jet or a heavy-flavour jet. The real-lepton efficiencies are measured in inclusive opposite-sign events, and fake-lepton efficiencies in events with same-sign leptons and \(E_{\text {T}}^{\text {miss}} >40\,\text {GeV}\) (for b-tagged events \(E_{\text {T}}^{\text {miss}} >20\,\text {GeV}\)), after subtracting the estimated contribution from events with misidentification of the charge of a lepton (referred to as “charge-flip” in the following), and excluding the same-sign dimuon signal region. The charge-flip events are subtracted using simulation. The extracted fake-lepton efficiencies are found to be compatible with fake-lepton efficiencies from a fully data-driven procedure where the charge-flip events are estimated from data. For the tetralepton channel, the contribution from backgrounds containing fake leptons is estimated from simulation and corrected with scale factors determined in control regions.

The full selection requirements and the background evaluation strategies in the different channels are described below.

5.1 Same-sign dimuon analysis

The same-sign dimuon signal region targets the \(t\bar{t} W\) process and has the highest sensitivity among all same-sign dilepton regions [11]. The main reason for this is that electrons have a much larger charge misidentification probability, inducing a significant background from top-quark pairs. Events are required to have two muon candidates with the same charge and \(p_{\text {T}} >25\,\text {GeV}\), \(E_{\text {T}}^{\text {miss}} > 40\,\text {GeV}\), the scalar sum of the \(p_{\text {T}}\) of selected leptons and jets, \(H_{\text {T}}\), above \(240\,\text {GeV}\), and at least two b-tagged jets. Events containing additional leptons (with \(p_{\text {T}} >7\,\text {GeV}\)) are vetoed.

The dominant background in the \(2\mu \)-SS region arises from events containing fake leptons, where the main source is \(t\bar{t}\) events. Backgrounds from the production of prompt leptons with correctly identified charge come primarily from WZ production, but the relative contribution of this background is small compared to the fake-lepton background. The charge-flip background is negligible in this signal region, as the probability of misidentifying the charge of a muon in the relevant \(p_{\text {T}}\) range is negligible. For the validation of the fake-lepton background estimate a region is defined based on the signal region selection but omitting the \(E_{\text {T}}^{\text {miss}}\) requirement, reducing the \(p_{\text {T}}\) threshold of the subleading lepton to \(20\,\text {GeV}\) and requiring at least one b-tagged jet. The distributions of \(E_{\text {T}}^{\text {miss}}\) and subleading lepton \(p_{\text {T}}\) in this validation region (\(2\mu \)-SS-VR) are shown in Fig. 1. The expected numbers of events in the \(2\mu \)-SS signal region are shown in Table 4. Nine events are observed in data for this signal region.

Fig. 1
figure 1

The (left) \(E_{\text {T}}^{\text {miss}}\) and (right) subleading lepton \(p_{\text {T}}\) distributions shown for the b-tagged \(2\mu \)-SS channel where the signal region requirements on subleading lepton \(p_{\text {T}}\), number of b-tags, and \(E_{\text {T}}^{\text {miss}}\) are relaxed. The shaded band represents the total uncertainty. The background denoted ‘Other’ contains other SM processes producing two same-sign prompt leptons. The last bin in each of the distributions includes the overflow

5.2 Trilepton analysis

Four signal regions with exactly three leptons are considered. The first three are sensitive to \(t\bar{t} Z\); each of these requires an opposite-sign same-flavour (OSSF) pair of leptons whose invariant mass is within \(10\,\text {GeV}\) of the Z boson mass. The signal regions are categorised by their jet and b-jet multiplicities and have different signal-to-background ratios. In the 3\(\ell \)-Z-1b4j region, at least four jets are required, exactly one of which is b-tagged. In the 3\(\ell \)-Z-2b3j region, exactly three jets with at least two b-tagged jets are required. In the 3\(\ell \)-Z-2b4j region, at least four jets are required, of which at least two are b-tagged.

In the 3\(\ell \)-noZ-2b region at least two and at most four jets are required, of which at least two are b-tagged, no OSSF lepton pair is allowed in the Z boson mass window, and the sum of the lepton charges must be ±1. This region primarily targets the \(t\bar{t} W\) process but also has a sizeable \(t\bar{t} Z\) contribution.

The signal region definitions for the trilepton channel are summarised in Table 2, while the expected numbers of events in the signal regions are shown in Table 4. The dominant backgrounds in the 3\(\ell \)-Z-1b4j, 3\(\ell \)-Z-2b3j and 3\(\ell \)-Z-2b4j signal regions arise from Z+jets production with a fake lepton, diboson production and the production of a single top quark in association with a Z boson.

Table 2 Summary of event selections in the trilepton signal regions

A control region is used to constrain the normalisation of the WZ background in data. Exactly three leptons are required, at least one pair of which must be an OSSF pair with an invariant mass within \(10\,\text {GeV}\) of the Z boson mass. There must be exactly three jets, none of which pass the b-tagging requirement. With these requirements, the expected \(t\bar{t} Z\) signal contribution is roughly 1% of the total number of events. This region is referred to as 3\(\ell \)-WZ-CR and it is included in the fit. Distributions comparing data and SM prediction are shown in Fig. 2.

Fig. 2
figure 2

Distributions of (left) the number of electrons and (right) the third-lepton \(p_{\text {T}}\) in the 3\(\ell \)-WZ-CR control region before the fit. The background denoted ‘Other’ contains other SM processes producing three prompt leptons. The shaded band represents the total uncertainty. The last bin of the distribution shown in the right panel includes the overflow

Two background validation regions are defined for the trilepton channel. In the first region, \(3\ell \)-Z-VR, the presence of two OSSF leptons with an invariant mass within \(10\,\text {GeV}\) of the mass of the Z boson is required. The region requires the events to have at most three jets where exactly one is b-tagged, or exactly two jets where both jets are b-tagged. The main backgrounds are WZ production and Z+jets events with fake leptons. In the second region, \(3\ell \)-noZ-VR, events with such a pair of leptons are vetoed. This region requires the events to have at most three jets where exactly one is b-tagged, and it is dominated by the fake-lepton background from top-quark pair production. Neither validation region is used in the fit. The distributions of the number of electrons in each of the two validation regions are shown in Fig. 3, demonstrating that data and background modelling are in good agreement within statistical uncertainties.

Fig. 3
figure 3

Distributions of the number of electrons in the (left) \(3\ell \)-Z-VR and (right) \(3\ell \)-noZ-VR validation regions, shown before the fit. The background denoted ‘Other’ contains other SM processes producing three prompt leptons. The shaded band represents the total uncertainty

In total, 29 events are observed in the four signal regions. Distributions of the number of jets, number of b-tagged jets, missing transverse momentum and transverse momentum of the third lepton are shown in Fig. 4.

Fig. 4
figure 4

Distributions of (top left) the number of jets, (top right) the number of b-tagged jets, (bottom left) the missing transverse momentum and (bottom right) the third-lepton \(p_{\text {T}}\), for events contained in any of the three signal regions 3\(\ell \)-Z-1b4j, 3\(\ell \)-Z-2b3j or 3\(\ell \)-Z-2b4j. The distributions are shown before the fit. The background denoted ‘Other’ contains other SM processes producing three prompt leptons. The shaded band represents the total uncertainty. The last bin in each of the distributions shown in the bottom panels includes the overflow

5.3 Tetralepton analysis

The tetralepton channel targets the \(t\bar{t} Z\) process for the case where both W bosons resulting from top-quark decays and the Z boson decay leptonically. Events with two pairs of opposite-sign leptons are selected, and at least one pair must be of same flavour. The OSSF lepton pair with reconstructed invariant mass closest to \(m_Z\) is attributed to the \(Z\) boson decay and denoted in the following by \(Z_1\). The two remaining leptons are used to define \(Z_2\). Four signal regions are defined according to the relative flavour of the two \(Z_2\) leptons, different flavour (DF) or same flavour (SF), and the number of b-tagged jets: one, or at least two (1b, 2b). The signal regions are thus 4\(\ell \)-DF-1b, 4\(\ell \)-DF-2b, 4\(\ell \)-SF-1b and 4\(\ell \)-SF-2b.

To suppress events with fake leptons in the 1-b-tag multiplicity regions, additional requirements on the scalar sum of the transverse momenta of the third and fourth leptons (\(p_{\text {T34}} \)) are imposed. In the 4\(\ell \)-SF-1b and 4\(\ell \)-DF-1b regions, events are required to satisfy \(p_{\text {T34}} > {25}\,{\text {GeV}}\) and \(p_{\text {T34}} > {35}\,{\text {GeV}}\), respectively. In all regions, the invariant mass of any two reconstructed OS leptons is required to be larger than 10 \(\text {GeV}\). The signal region definitions for the tetralepton channel are summarised in Table 3.

Table 3 Definitions of the four signal regions in the tetralepton channel. All leptons are required to satisfy \(p_{\text {T}} > 7\,\text {GeV}\) and at least one lepton with \(p_{\text {T}} > 25\,\text {GeV}\) is required to be trigger matched. The invariant mass of any two reconstructed OS leptons is required to be larger than 10 \(\text {GeV}\)

A control region used to constrain the ZZ normalisation, referred to as 4\(\ell \)-ZZ-CR, is included in the fit and is defined to have exactly four reconstructed leptons, a \(Z_2\) pair with OSSF leptons, the value of both \(m_{Z_1}\) and \(m_{Z_2}\) within \(10\,\text {GeV}\) of the mass of the Z boson, and \(E_{\text {T}}^{\text {miss}} <40\,\text {GeV}\). The leading lepton \(p_{\text {T}}\), the invariant mass of the \(Z_2\) lepton pair, the missing transverse momentum and the jet multiplicity in this control region are shown in Fig. 5, and good agreement is seen between data and prediction.

Fig. 5
figure 5

(Top left) Leading lepton \(p_{\text {T}}\), (top right) \(m_{Z_2}\), (bottom left) missing transverse momentum and (bottom right) jet multiplicity distributions in the 4\(\ell \)-ZZ-CR control region. The distributions are shown before the fit. The shaded band represents the total uncertainty. The last bin of the distribution shown in the top left panel includes the overflow

The contribution from backgrounds containing fake leptons is estimated from simulation and corrected with scale factors determined in two control regions: one region enriched in \(t\bar{t}\) events and thus in heavy-flavour jets, and one region enriched in \(Z+\)jets events, and thus in light-flavour jets. The scale factors are calibrated separately for electron and muon fake-lepton candidates. The scale factors are applied to all MC simulation events with fewer than four prompt leptons according to the number and the flavour of the fake leptons. The \(t\bar{t}\) scale factors are applied to MC processes with real top quarks, while for all other processes the \(Z+\)jets scale factors are applied. Different generators are used when determining the scale factors and when applying them. It is verified that the uncertainties in the scale factors include the differences between these generators.

The expected yields in the signal and control regions in the tetralepton channel are shown in Table 4. Five events are observed in the four signal regions. Figure 6 shows the data superimposed to the expected distributions for all four signal regions combined. Overall the acceptance times efficiency for the \(t\bar{t} Z\) and \(t\bar{t} W\) processes is 6‰ and 2‰, respectively.

Fig. 6
figure 6

Distributions (left) of the invariant mass of the OSSF lepton pair closest to the Z boson mass, \(m_{Z_1}\), and (right) of the number of b-tagged jets, for events in the tetralepton signal regions. The distributions are shown before the fit. The background denoted ‘Other’ contains other SM processes producing four prompt leptons. The shaded band represents the total uncertainty. The first and last bin of the distribution shown in the left panel include the underflow and overflow, respectively

Table 4 Expected event yields for signal and backgrounds, and the observed data in all control and signal regions used in the fit to extract the \(t\bar{t} Z\) and \(t\bar{t} W\) cross sections. The quoted uncertainties in expected event yields represent systematic uncertainties including MC statistical uncertainties. The \(tZ\), \(tWZ\), \(t\bar{t} H\), three- and four-top-quark processes are denoted \(t+X\). The WZ, ZZ, \(H \rightarrow ZZ\) (ggF and VBF), HW and HZ and VBS processes are denoted ‘Bosons’

6 Systematic uncertainties

The normalisation of signal and background in each channel can be affected by several sources of systematic uncertainty. These are described in the following subsections.

6.1 Luminosity

The uncertainty in the integrated luminosity in the 2015 dataset is 2.1%. It is derived, following a methodology similar to that detailed in Ref. [68], from a calibration of the luminosity scale using xy beam-separation scans performed in August 2015. This systematic uncertainty is applied to all processes modelled using Monte Carlo simulations.

6.2 Uncertainties associated with reconstructed objects

Uncertainties associated with the lepton selection arise from imperfect knowledge of the trigger, reconstruction, identification and isolation efficiencies, and lepton momentum scale and resolution [47,48,49,50, 69]. The uncertainty in the electron identification efficiency is the largest systematic uncertainty in the trilepton channel and among the most important ones in the tetralepton channel.

Uncertainties associated with the jet selection arise from the jet energy scale (JES), the JVT requirement and the jet energy resolution (JER). Their estimations are based on Run-1 data and checked with early Run-2 data. The JES and its uncertainty are derived by combining information from test-beam data, collision data and simulation [70]. JES uncertainty components arising from the in-situ calibration and the jet flavour composition are among the dominant uncertainties in the \(2\mu \)-SS and trilepton channels. The uncertainties in the JER and JVT have a significant effect at low jet \(p_{\text {T}}\). The JER uncertainty results in the second largest uncertainty in the trilepton channel.

The efficiency of the flavour-tagging algorithm is measured for each jet flavour using control samples in data and in simulation. From these measurements, correction factors are defined to correct the tagging rates in the simulation. In the case of b-jets, correction factors and their uncertainties are estimated based on observed and simulated b-tagging rates in \(t\bar{t}\) dilepton events [62]. In the case of c-jets, they are derived based on jets with identified \(D^{*}\) mesons [71]. In the case of light-flavour jets, correction factors are derived using dijet events [71]. Sources of uncertainty affecting the b- and c-tagging efficiencies are considered as a function of jet \(p_{\text {T}}\), including bin-to-bin correlations [62]. An additional uncertainty is assigned to account for the extrapolation of the b-tagging efficiency measurement from the \(p_{\text {T}}\) region used to determine the scale factors to regions with higher \(p_{\text {T}}\). For the efficiency to tag light-flavour jets, the dependence of the uncertainty on the jet \(p_{\text {T}}\) and \(\eta \) is considered. These systematic uncertainties are taken as uncorrelated between b-jets, c-jets, and light-flavour jets.

The treatment of the uncertainties associated with reconstructed objects is common to all three channels, and thus these are considered as correlated among different regions.

6.3 Uncertainties in signal modelling

From the nominal MG5_aMC+Pythia  8 (A14 tune) configuration, two parameters are varied to investigate uncertainties from the modelling of the \(t\bar{t} Z\) and \(t\bar{t} W\) processes: the renormalisation (\(\mu _\mathrm{R}\)) and factorisation (\(\mu _\mathrm{F}\)) scales. A simultaneous variation of \(\mu _\mathrm{R} = \mu _\mathrm{F}\) by factors 2.0 and 0.5 is performed. In addition, the effects of a set of variations in the tune parameters (A14 eigentune variations), sensitive to initial- and final-state radiation, multiple parton interactions and colour reconnection, are evaluated. Studies performed at particle level show that the largest impact comes from variations in initial-state radiation [26]. The systematic uncertainty due to the choice of generator for the \(t\bar{t} Z\) and \(t\bar{t} W\) signals is estimated by comparing the nominal sample with one generated with Sherpa v2.2. The Sherpa sample uses the LO matrix element with up to one (two) additional parton(s) included in the matrix element calculation for \(t\bar{t} Z\) (\(t\bar{t} W\)) and merged with the Sherpa parton shower [72] using the ME+PS@LO prescription. The NNPDF3.0NLO PDF set is used in conjunction with a dedicated parton shower tune developed by the Sherpa authors. Signal modelling uncertainties are treated as correlated among channels.

6.4 Uncertainties in background modelling

In the trilepton and \(2\mu \)-SS channels, the diboson background is dominated by WZ production, while ZZ production is dominant in the tetralepton channel. While the inclusive cross sections for these processes are known to better than 10%, they contribute to the background in these channels if additional b-jets and other jets are produced and thus have a significantly larger uncertainty.

In the trilepton and \(2\mu \)-SS channels, the normalisation of the WZ background is treated as a free parameter in the fit used to extract the \(t\bar{t} Z\) and \(t\bar{t} W\) signals. The uncertainty in the extrapolation of the WZ background estimate from the control region to signal regions with specific jet and b-tag multiplicities is evaluated by comparing predictions obtained by varying the renormalisation, factorisation and resummation scales used in MC generation. The uncertainties vary across the different regions and an overall uncertainty of \(-50\%\) and \(+100\%\) is used.

The normalisation of the ZZ background is treated as a free parameter in the fit used to extract the \(t\bar{t} Z\) and \(t\bar{t} W\) signals. In the tetralepton channel, several uncertainties in the ZZ background estimate are considered. They arise from the extrapolation from the 4\(\ell \)-ZZ-CR control region (corresponding to on-shell ZZ production) to the signal region (with off-shell ZZ background) and from the extrapolation from the control region without jets to the signal region with at least one jet. They are found to be 30 and 20%, respectively. An additional uncertainty of 10–30% is assigned to the normalisation of the heavy-flavour content of the ZZ background, based on a data-to-simulation comparison of events with one Z boson and additional jets and cross-checked with a comparison between different ZZ simulations [11].

The uncertainty in the \(t\bar{t} H\) background is evaluated by varying the factorisation and renormalisation scales up and down by a factor of two with respect to the nominal value, \(H_\mathrm{T}/2\), where \(H_\mathrm{T}\) is defined as the scalar sum of the transverse masses \(\sqrt{p_{\text {T}} ^2+m^2}\) of all final state particles.

For the \(tZ\) background, an overall normalisation uncertainty of 50% is assumed. An additional uncertainty affecting the distribution of this background as a function of jet and b-jet multiplicity is evaluated by varying the factorisation and renormalisation scales, as well as the amount of radiation in the Perugia2012 parton-shower tune.

An uncertainty of \(+10\%\) and \(-22\%\) is assigned to the \(tWZ\) background cross section. The uncertainty is asymmetric due to an alternative estimate of the interference effect between this process and the \(t\bar{t} Z\) production. The shape uncertainty is evaluated by varying the factorisation and renormalisation scales up and down by a factor of two with respect to the nominal value \(H_\mathrm{T}/2.\)

For other prompt-lepton backgrounds, uncertainties of 20% are assigned to the normalisations of the WH and ZH processes, based on calculations from Ref. [73]. An uncertainty of 50% is considered for triboson and same-sign WW processes.

The fake-lepton background uncertainty is evaluated as follows. The uncertainty due to the matrix method is estimated by propagating the statistical uncertainty on the measurement of the fake-lepton efficiencies. Additionally, a 20% uncertainty is added to the subtracted charge-flip yields estimated as the difference between data-driven charge-flips and simulation, and the \(E_{\text {T}}^{\text {miss}}\) requirement used to enhance the single-fake-lepton fraction is varied by \(20\,\text {GeV}\). The main sources of fake muons are decays of light-flavour or heavy-flavour hadrons inside jets. For the \(2\mu \)-SS region, the flavour composition of the jets faking leptons is assumed to be unknown. To cover this uncertainty, the central values of the fake-lepton efficiencies extracted from the b-veto and the b-tag control regions are used, with the efficiency difference assigned as an extra uncertainty. For the tetralepton channel, fake-lepton systematic uncertainties are covered by the scale-factor uncertainties used to calibrate the simulated fake-lepton yield in the control regions. Within a fake-lepton estimation method, all systematic uncertainties are considered to be correlated among analysis channels and regions. Thus \(2\mu \)-SS and trilepton fake-lepton systematic uncertainties that use the matrix method are not correlated with the tetralepton systematic uncertainties. The expected uncertainties in the fake-lepton backgrounds relative to the total backgrounds vary in each channel and signal region: 50% for the \(2\mu \)-SS region, 25–50% for the trilepton channel and 5–10% for the tetralepton channel.

7 Results

In order to extract the \(t\bar{t} Z\) and \(t\bar{t} W\) cross sections, nine signal regions (\(2\mu \)-SS, 3\(\ell \)-Z-1b4j, 3\(\ell \)-Z-2b3j, 3\(\ell \)-Z-2b4j, 3\(\ell \)-noZ-2b, 4\(\ell \)-DF-1b, 4\(\ell \)-DF-2b, 4\(\ell \)-SF-1b, 4\(\ell \)-SF-2b) and two control regions (3\(\ell \)-WZ-CR, 4\(\ell \)-ZZ-CR) are simultaneously fitted. The \(2\mu \)-SS signal region is particularly sensitive to \(t\bar{t} W\), the 3\(\ell \)-noZ-2b signal region is sensitive to both, \(t\bar{t} W\) and \(t\bar{t} Z\), while all other signal regions aim at the determination of the \(t\bar{t} Z\) cross section. The cross sections \(\sigma _{t\bar{t} Z}\) and \(\sigma _{t\bar{t} W}\) are determined using a binned maximum-likelihood fit to the numbers of events in these regions. The fit is based on the profile-likelihood technique, where systematic uncertainties are allowed to vary as nuisance parameters and take on their best-fit values. None of the uncertainties are found to be significantly constrained or pulled from their initial values. The calculation of confidence intervals and hypothesis testing is performed using a modified frequentist method as implemented in RooStats [74, 75].

A summary of the fit to all regions used to measure the \(t\bar{t} Z\) and \(t\bar{t} W\) production cross sections are shown in Fig. 7. The normalisation corrections for the WZ and ZZ backgrounds with respect to the Standard Model predictions are obtained from the fits as described in Sect. 5 and found to be compatible with unity: \(1.11 \pm 0.30\) for the WZ background and \(0.94 \pm 0.17\) for the ZZ background.

Fig. 7
figure 7

Expected yields after the fit compared to data for the fit to extract \(\sigma _{t\bar{t} Z}\) and \(\sigma _{t\bar{t} W}\) in the signal regions and in the control regions used to constrain the WZ and ZZ backgrounds. The ‘Other’ background summarises all other backgrounds described in Sect. 3. The shaded band represents the total uncertainty

The results of the fit are \(\sigma _{t\bar{t} Z} = 0.92 \pm 0.29 \;\text{(stat.) } \pm 0.10 \;\text{(syst.) } \) pb and \(\sigma _{t\bar{t} W} = 1.50 \pm 0.72 \;\text{(stat.) } \pm 0.33 \;\text{(syst.) } \) pb with a correlation of \(-0.13\) and are shown in Fig. 8. The fit yields significances of \(3.9\sigma \) and \(2.2\sigma \) over the background-only hypothesis for the \(t\bar{t} Z\) and \(t\bar{t} W\) processes, respectively. The expected significances are \(3.4\sigma \) for \(t\bar{t} Z\) and \(1.0\sigma \) for \(t\bar{t} W\) production. The significance values are computed using the asymptotic approximation described in Ref. [76]. In the two channels most sensitive to the \(t\bar{t} W\) signal the observed relative number of events with two positively or two negatively charged leptons is compatible with expectation. In the 3\(\ell \)-noZ-2b channel the observed distribution of the number of events with a given amount of electrons and muons match expectation, as well.

Fig. 8
figure 8

The result of the simultaneous fit to the \(t\bar{t} Z\) and \(t\bar{t} W\) cross sections along with the 68 and 95% confidence level (CL) contours. The shaded areas correspond to the theoretical uncertainties in the Standard Model predictions, and include renormalisation and factorisation scale uncertainties as well as PDF uncertainties including \(\alpha _{\text {S}} \) variations

Table 5 shows the leading and total uncertainties in the measured \(t\bar{t} Z\) and \(t\bar{t} W\) cross sections. In estimating the uncertainties for \(t\bar{t} Z\) (\(t\bar{t} W\)), the cross section for \(t\bar{t} W\) (\(t\bar{t} Z\)) is fixed to its Standard Model value. For both processes, the precision of the measurement is dominated by statistical uncertainties. For the \(t\bar{t} Z\) determination, the different sources contribute with similar size to the total systematic uncertainty. For the \(t\bar{t} W\) determination, the dominant systematic uncertainty source is the limited amount of data available for the estimation of the fake leptons.

Table 5 List of dominant and total uncertainties in the measured cross sections of the \(t\bar{t} Z\) and \(t\bar{t} W\) processes from the fit. All uncertainties are symmetrised

8 Conclusion

Measurements of the production cross sections of a top-quark pair in association with a Z or W boson using \(3.2\,\text{ fb }^{-1}\) of data collected by the ATLAS detector in \(\sqrt{s} = 13\,\text {TeV}\) pp collisions at the LHC are presented. Final states with either two same-charge muons, or three or four leptons are analysed. From a simultaneous fit to nine signal regions and two control regions, the \(t\bar{t} Z\) and \(t\bar{t} W\) production cross sections are determined to be \(\sigma _{t\bar{t} Z} = 0.9 \pm 0.3\) pb and \(\sigma _{t\bar{t} W} = 1.5 \pm 0.8\) pb. Both measurements are consistent with the NLO QCD theoretical calculations, \(\sigma _{t\bar{t} Z} = 0.84 \pm 0.09\,\text {pb}\) and \(\sigma _{t\bar{t} W} = 0.60 \pm 0.08\,\text {pb}\).