1 Introduction

Precise measurements of the properties of the top quark, the heaviest known elementary particle, are an important check of the internal consistency of the Standard Model (SM) of particle physics and could provide hints of possible new physics beyond the SM (BSM). The production cross sections of top-quark–antiquark pairs (\(t\bar{t}\)), single top quarks, as well as the top-quark mass, have been measured with a great level of precision [1,2,3,4]. The large centre-of-mass energy and luminosity of the Large Hadron Collider (LHC) enable precise and differential cross-section measurements for SM processes with small production rates, such as the associated production of a \(t\bar{t}\) pair and a \(Z\) boson (\(t\bar{t}Z\)).

The \(t\bar{t}Z\) production process is particularly interesting, as it provides direct access to the neutral coupling of the top quark to the electroweak (EW) gauge bosons [5, 6]. Deviations of the coupling strength of the top quark to the \(Z\) boson (t\(Z\) coupling) from its SM value might imply the existence of new effects in the EW symmetry breaking mechanism which could be probed in the context of effective field theory (EFT) [7]. Various BSM models predict large deviations of the top quark’s EW couplings from the SM value, which were probed by the previous generation of lepton colliders [8, 9]. Precise measurements of the inclusive and differential cross sections of the \(t\bar{t}Z\) process are, thus, of particular interest. Differential cross-section measurements can also offer sensitivity to differences among the predictions from various Monte Carlo (MC) generators and can, therefore, serve as an important input to the tuning of MC parameter values (MC tunes). Furthermore, the \(t\bar{t}Z\) process is an irreducible background in several searches for BSM phenomena [10, 11], as well as in measurements of important SM processes, such as \(t\bar{t}\) production in association with a Higgs boson [12] or single top-quark production in association with a \(Z\) boson [13]. The ATLAS Collaboration measured the inclusive \(t\bar{t}Z\) cross section using a subset of the LHC Run 2 data, collected in 2015 and 2016 [14] and a first differential measurement of the \(t\bar{t}Z\) process was carried out by the CMS Collaboration using the 2016 and 2017 data sets [15].

Theoretical predictions of the \(t\bar{t}Z\) cross section exist at next-to-leading order (NLO) with the resummation of soft gluon corrections computed at next-to-next-to-leading-logarithm (NNLL) precision [16, 17] in perturbative quantum chromodynamics (QCD) with added EW corrections. Recently they have been matched to the complete set of NLO corrections of both QCD and EW origin [18, 19] using the MadGraph5_aMC@NLO (MG5_aMC@NLO) framework [20].

This paper presents measurements of the inclusive and differential \(t\bar{t}Z\) production cross section in final states with three or four isolated leptons (electrons or muons) with the ATLAS detector [21] at the LHC. The measurements were performed with \(\sqrt{s} = {13}\,\hbox {TeV}\) proton–proton (\(pp\)) collision data collected during Run 2 of the LHC (2015–2018) and corresponding to an integrated luminosity of \({139}\,\hbox {fb}^{-1}\). The \(Z\) boson is identified by targeting events featuring an oppositely charged electron (e) or muon (\(\mu \)) pair. The detector signatures resulting from the hadronisation of final-state quarks from the decay of the \(t\bar{t}\) system, in particular those from bottom (anti)quarks, are exploited by constructing target regions with different jet and b-jet multiplicities. The inclusive measurement follows an analysis strategy similar to the previous ATLAS \(t\bar{t}Z\) measurement [14]. The production cross section is extracted by performing a simultaneous maximum-likelihood fit in the targeted analysis regions with the signal normalisation as the parameter of interest. In addition, a set of normalised and absolute differential measurements are presented as a function of different variables which probe the SM predictions for the kinematics of the \(t\bar{t}Z\) system. Some of these variables are found to be sensitive to potential EFT signals [7], while others are more interesting in the context of MC tuning [22, 23]. The differential measurements were performed at both particle and parton level in different fiducial volumes in order to correct for various acceptance effects. For the first time, the \(t\bar{t}Z\) cross section is measured with the full Run 2 data and includes differential measurements performed as functions of variables related to the \(t\bar{t}\) system.

This paper is structured as follows. Section 2 provides a brief description of the ATLAS detector. In Sect. 3, the simulation of signal and background processes is discussed, followed by the definitions of the final-state objects at reconstruction, particle and parton levels in Sect. 4. The event selection, both online during data taking and offline after data taking, and the background estimation are discussed in Sects. 5 and 6, respectively. The sources of systematic uncertainties that affect the measurements are discussed in Sect. 7. The result of the inclusive cross-section measurement is presented in Sect. 8. Section 9 describes the differential variables and the unfolding procedure, followed by the results of the differential measurements. Conclusions are drawn in Sect. 10.

2 ATLAS detector

The ATLAS detector [21] at the LHC covers nearly the entire solid angle around the collision point.Footnote 1 It consists of an inner tracking detector surrounded by a thin superconducting solenoid, electromagnetic and hadronic calorimeters, and a muon spectrometer incorporating three sets of large superconducting toroidal magnets, each consisting of eight separate coils. The inner-detector system (ID) is immersed in a \({2}\,\hbox {T}\) axial magnetic field and provides charged-particle tracking in the range \(|\eta | < 2.5\).

The high-granularity silicon pixel detector covers the vertex region and typically provides four measurements per track, with the first hit typically being detected in the insertable B-layer (IBL) installed before Run 2 [24, 25]. It is followed by the silicon microstrip tracker (SCT), which usually provides eight measurements per track. These silicon detectors are complemented by the transition radiation tracker (TRT), which enables radially extended track reconstruction up to \(|\eta | = 2.5\). The TRT also provides electron identification information based on the fraction of hits above a higher energy-deposit threshold corresponding to transition radiation. Typically, around 30 TRT hits are measured in total.

The calorimeter system covers the pseudorapidity range \(|\eta | < 4.9\). Within the region \(|\eta | < 3.2\), electromagnetic calorimetry is provided by barrel and endcap high-granularity lead/liquid-argon (LAr) calorimeters, with an additional thin LAr presampler covering \(|\eta | < 1.8\) to correct for energy loss in material upstream of the calorimeters. Hadronic calorimetry is provided by the steel/scintillator-tile calorimeter, segmented into three barrel structures within \(|\eta | < 1.7\), and two copper/LAr hadronic endcap calorimeters. The solid angle coverage is extended with forward copper/LAr and tungsten/LAr calorimeter modules optimised for electromagnetic and hadronic measurements respectively.

The muon spectrometer (MS) comprises separate trigger and high-precision tracking chambers measuring the deflection of muons in a magnetic field generated by superconducting air-core toroids. The field integral of the toroids ranges between 2.0 and 6.0 T \(\cdot \) m across most of the detector. A set of precision chambers covers the region \(|\eta | < 2.7\) with three layers of monitored drift tubes, complemented by cathode-strip chambers in the forward region, where the background rates are highest. The muon trigger system covers the range \(|\eta | < 2.4\) with resistive-plate chambers in the barrel, and thin-gap chambers in the endcap regions.

Interesting events are selected to be recorded by the first-level trigger system implemented in custom hardware, followed by selections made by algorithms implemented in software in the high-level trigger [26]. The first-level trigger accepts events from the \({40}\,\hbox {MHz}\) bunch crossings at a rate below \({100}\,\hbox {kHz}\), which the high-level trigger reduces in order to record events to disk at about \({1}\,\hbox {kHz}\).

3 Data and simulated event samples

The analysis is performed on data from \(pp\) collisions at \(\sqrt{s} = {13}\,\hbox {TeV}\) delivered by the LHC and recorded by the ATLAS detector between 2015 and 2018. The bunch spacing for this data-taking period was 25 ns with an average number of \(pp\) interactions per bunch crossing (‘pile-up’) which varies by year and LHC beam conditions and was in the range from 10 to 70 for almost all events. After requirements on the stability of the beams, the operational status of all ATLAS detector components, and the quality of the recorded data, the total integrated luminosity of the data set corresponds to \({139}\,\hbox {fb}^{-1}\). This value is derived from the calibration of the luminosity scale using xy beam-separation scans, following a methodology similar to that detailed in Ref. [27], and using the LUCID-2 detector [28] for the baseline luminosity measurements.

The data were collected using a combination of single-electron and single-muon triggers, with requirements on the identification, isolation, and \(p_{\text {T}}\) of the leptons to maintain efficiency across the full momentum range while controlling the trigger rates [26]. For electrons the trigger thresholds were \(p_{\text {T}} = 26\), 60 and \({140}\,\hbox {GeV}\), whereas for muons, the thresholds were \(p_{\text {T}} = 26\) and \({50}\,\,\hbox {GeV}\).Footnote 2 Identification and isolation requirements were applied to the triggers with the lower \(p_{\text {T}}\) thresholds [29,30,31].

Signal and background processes considered in this analysis were modelled using simulated MC samples. The effect of pile-up interactions was modelled by overlaying the hard-scattering event with simulated minimum-bias events generated with Pythia 8.186 [32] using the NNPDF 2.3lo set of parton distribution functions (PDFs) [33] and the A3 set of tuned MC parameters [34]. The simulated events were reweighted to match the pile-up conditions observed in the measured data. For processes featuring \(W\) boson, \(Z\) boson or top-quark production, the \(W\), \(Z\) and top-quark masses were set to \({80.4}\,\hbox {GeV}\), \({91.2}\,\hbox {GeV}\) [35] and \({172.5}\,\hbox {GeV}\), respectively. The decays of bottom and charm hadrons were simulated using the EvtGen  program [36]. The MC samples were either processed through a full simulation of the ATLAS detector based on Geant4 [37, 38] or a fast simulation (AtlFast2) relying on parameterised showers in the calorimeter [39, 40].

The \(t\bar{t}Z\) signal process was modelled using the MG5_aMC@NLO 2.3.3 [41] generator together with EvtGen  1.2.0, which provided a matrix element (ME) calculation at NLO in the strong coupling constant (\(\alpha _{\text {s}}\)) with the NNPDF 3.0nlo [42] PDF set. The functional form of the renormalisation and factorisation scales (\(\mu _\mathrm {r}\) and \(\mu _\mathrm {f}\)) was set to \(\mu _{\mathrm {r,f}} = 0.5 \times \sum _i \sqrt{m^2_{\mathrm {i}}+p^2_{\mathrm {T,i}}}\), where i runs over all final-state particles generated from the ME calculation. The \(t\bar{t} \gamma ^{*}\) contribution and the \(Z/\gamma ^{*}\) interference were included with dilepton invariant masses (\(m_{\ell \ell }\)) down to \({5}\,\hbox {GeV}\). Top-quark decays were simulated at leading order (LO) using MadSpin  [43, 44] to preserve all spin correlations. The events were interfaced with Pythia 8.210  [45] for the parton shower and hadronisation, using the A14 set of tuned parameters [46] and the NNPDF 2.3lo PDF set.

The SM theoretical prediction of the production cross section for the \(t\bar{t}Z\) process, including all \(Z\) boson decay modes and taking into account the \(t\bar{t} \gamma ^{*}\) contribution and the \(Z/\gamma ^{*}\) interference, is \(\sigma _{t\bar{t}Z} = 0.88^{+0.09}_{-0.10}\,\)pb and includes NLO QCD+EW corrections [47]. This value is an off-shell extension of a cross-section calculation of \(\sigma _{t\bar{t}Z} = 0.84^{+0.09}_{-0.10}\,\)pb, which was reported in Ref. [48] (based on Ref. [49]). The uncertainties are due to the QCD scales, the proton PDFs, and \(\alpha _{\text {s}}\).

The measured differential cross sections are compared with theoretical expectations obtained with different generators. Alternative \(t\bar{t}Z\) samples were simulated with Sherpa 2.2.1 [50] generator at NLO QCD accuracy, using both inclusive and multi-leg set-ups. In both cases, dynamic \(\mu _\mathrm {r}\) and \(\mu _\mathrm {f}\) scales were used as in the nominal samples. The default Sherpa 2.2.1 parton shower was used together with the NNPDF 3.0nnlo PDF set [42]. The multi-leg sample was simulated using the MEPS@NLO prescription [51,52,53,54] with up to one additional parton at NLO and with a merging scale of \({30}\,\hbox {GeV}\). Another sample was generated with the same MG5_aMC@NLO and EvtGen versions as the nominal sample but using a different MC program for the modelling of the parton-shower and hadronisation: Herwig 7 [55, 56] instead of Pythia 8. In addition, two alternative samples with the same settings as the nominal sample, but using a set of variations of the A14 tune’s parameters (A14 eigentune variation Var3c [46]), were employed to evaluate the uncertainty associated with the amount of initial-state radiation (ISR).

The production of a single top quark or antiquark in association with a \(Z\) boson and one extra parton (\(tZq\)) was simulated using the MG5_aMC@NLO 2.3.3 generator at NLO QCD with the NNPDF 3.0nnlo PDF set. The events were interfaced with Pythia 8.230 using the A14 tune and the NNPDF 2.3lo PDF set. The \(tZq\) sample also includes off-shell \(Z\) decays to dilepton pairs with invariant masses in the range \(m_{\ell \ell } > {30}\,\hbox {GeV}\). Single top-quark or top-antiquark production in association with both a \(W\) and a \(Z\) boson (\(tWZ\)) was simulated at NLO with MG5_aMC@NLO 2.2.2 and the NNPDF 3.0nnlo PDF set, using Pythia 8.235 for the parton-shower simulation. The interference between \(t\bar{t}Z\) and \(tWZ\) was removed following a diagram removal approach referred to as the DR1 scheme [57].

Events featuring the production of a \(t\bar{t}\) pair in association with a \(W\) or Higgs boson (\(t\bar{t}W\) and \(t\bar{t}H\)) were generated using NLO QCD MEs in MG5_aMC@NLO 2.3.3  (for \(t\bar{t}W\)) or 2.6.0 (for \(t\bar{t}H\)) with the NNPDF 3.0nlo PDF set and showered with Pythia 8.210 or 8.230 using the A14 tune. MC samples featuring Higgs production in association with a \(W\) or \(Z\) boson (\(H\mathrm {+}W/Z\)) were generated at LO with Pythia 8.186 using the A14 tune and the NNPDF 2.3lo PDF set.

Diboson processes featuring the production of three charged leptons and one neutrino or four charged leptons (\(WZ\,+\,\text {jets}\) or \(ZZ\,+\,\text {jets}\), respectively) were simulated using the Sherpa 2.2.2 generator. In this set-up, multiple MEs were matched and merged with the Sherpa parton shower based on the Catani–Seymour dipole factorisation scheme [58, 59] using the MEPS@NLO prescription [51,52,53,54]. The virtual QCD corrections for MEs at NLO accuracy were provided by the OpenLoops  library [60, 61]. Samples were generated using the NNPDF 3.0nnlo PDF set, along with the dedicated set of tuned parton-shower parameters developed by the Sherpa  authors. The \(WZ/ZZ\,+\,\text {jets}\) events with up to one additional parton were simulated at NLO, whereas events with two or three partons were simulated at LO precision.

The production of three or four top quarks (\(t\bar{t}t\) and \(t\bar{t}t\bar{t}\)) and the production of a \(t\bar{t}\) pair with two \(W\) bosons (\(t\bar{t}WW\)) were simulated at LO using MG5_aMC@NLO 2.2.2 interfaced to Pythia 8.186 with the A14 tune and the NNPDF 2.3lo PDF set. Fully leptonic triboson processes (WWW, WWZ, WZZ and ZZZ) with up to six leptons in the final states were simulated with Sherpa 2.2.2 and the NNPDF 3.0nnlo PDF set. Final states with no additional partons were calculated at NLO, whereas final states with one, two or three additional partons, were calculated at leading order.

4 Object reconstruction

The following subsections describe the definitions of final-state objects at reconstruction (detector), particle, and parton levels, which are used to characterise the final-state event topologies and to define the phase-space regions for the cross-section measurements.

4.1 Reconstruction of detector-level objects

Electron candidates are reconstructed from clusters of energy deposits in the electromagnetic calorimeter that are matched to a track in the ID. They are required to satisfy \(p_{\text {T}} > {7}\,\hbox {GeV}\), \(|\eta | < 2.47\) and a ‘Medium’ likelihood-based identification requirement [62, 63]. Electron candidates are excluded if their calorimeter clusters lie within the transition region between the barrel and the endcap of the electromagnetic calorimeter, \(1.37< |\eta | < 1.52\). The track associated with the electron must pass the requirements \(|z_{0} \sin (\theta ) | < {0.5}\,\hbox {mm}\) and \(|d_{0}| / \sigma (d_{0}) < 5\), where \(z_{0}\) describes the longitudinal impact parameter relative to the reconstructed primary vertex,Footnote 3\(d_{0}\) is the transverse impact parameter relative to the beam axis, and \(\sigma (d_{0})\) is the uncertainty on \(d_{0}\).

Muon candidates are reconstructed from MS tracks matched to ID tracks in the pseudorapidity range of \(|\eta | < 2.5\). They must satisfy \(p_{\text {T}} > {7}\,\hbox {GeV}\) along with the ‘Medium’ identification requirements defined in Refs. [64, 65]. This criterion defines requirements on the number of hits in the different ID and MS subsystems and on the significance of the charge-to-momentum ratio q/p. In addition, the track associated with the muon candidate must have \(|z_{0} \sin (\theta ) | < {0.5}\,\hbox {mm}\) and \(|d_{0}| / \sigma (d_{0}) < 3\).

Isolation criteria are applied to the selected electrons and muons. For electrons, the scalar sum of the \(p_{\text {T}}\) of tracks within a variable-size cone around the electron, excluding tracks originating from the electron itself, must be less than 6% of the electron \(p_{\text {T}}\). The track isolation cone radius \(\Delta R = \sqrt{(\Delta \eta )^2+(\Delta \phi )^2}\) is given by the smaller of \(\Delta R = {10}\,\hbox {GeV}/p_{\text {T}} \) and \(\Delta R = 0.2\). In addition, the sum of the transverse energy of the calorimeter topo-clustersFootnote 4 in a cone of \(\Delta R = 0.2\) around the electron is required to be less than 6% of the electron \(p_{\text {T}}\), excluding clusters originating from the electron itself. For muons, the scalar sum of the \(p_{\text {T}}\) of tracks within a variable-size cone around the muon, excluding its own track, must be less than 6% of the muon \(p_{\text {T}}\), with the track isolation cone radius being given by the minimum of \(\Delta R = {10}\,\hbox {GeV}/p_{\text {T}} \) and \(\Delta R = 0.3\).

Jets are reconstructed from topo-clusters, using the anti-\(k_{t}\) jet clustering algorithm [67] as implemented in the FastJet package [68], with a radius parameter of \(R = 0.4\). They are calibrated through the application of a jet energy scale derived from \({13}\,\hbox {TeV}\) data and simulation [69]. Only jet candidates with \(p_{\text {T}} > {25}\,\hbox {GeV}\) and \(|\eta | < 2.5\) are considered in this analysis. To mitigate the impact of jets arising from additional \(pp\) collisions in a given bunch crossing, an additional selection criterion using a likelihood-based ‘jet-vertex-tagging’ (JVT) discriminant is applied to jets with \(p_{\text {T}} < {120}\,\hbox {GeV}\) and \(|\eta | < 2.5\) [70].

Jets containing b-hadrons (‘b-jets’) are identified (tagged) by the MV2c10 b-tagging algorithm [71]. The algorithm uses a multivariate discriminant with quantities such as the impact parameters of associated tracks, and well-reconstructed secondary vertices. For the differential measurements, a selection that provides an 85% efficiency for identifying b-jets in simulated \(t\bar{t}\) events, with rejection factors against light-flavour jets and c-jets of 28 and 2, respectively is used. Different calibrated b-tagging working points (WPs), corresponding to different b-jet selection efficiencies are used for the inclusive cross-section measurement. A method where exclusive bins in the b-tagging discriminant corresponding to different identification efficiencies is employed. In the following, this approach is referred to as pseudo-continuous b-tagging (PCBT).

Scale factors are applied as weights to MC events to correct for the mismodelling of efficiencies associated with the reconstruction, identification and trigger selection of electrons and muons, as well as the JVT and b-tagging requirements for jets. The b-tagging scale factors are derived from a pseudo-continuous calibration as outlined above.

The missing transverse momentum is defined as the negative vector sum of the transverse momenta of all selected and calibrated physics objects (electrons, photons, muons and jets). Low-momentum tracks from the primary vertex that are not associated with any of the reconstructed physics objects described previously are also included as a ‘soft term’ in the calculation [72]. The magnitude of the missing transverse momentum vector is denoted as \(E_{\text {T}}^{\text {miss}}\).

Ambiguities can arise from the independent reconstruction of electron, muon and jet candidates in the detector. A sequential procedure (overlap removal) is applied to resolve these ambiguities and, thus, avoids a double counting of physics objects.Footnote 5 It is applied as follows. If an electron candidate and a and muon candidate share a track, the electron candidate is removed. Jet candidates within a distance of \(\Delta R _{y,\phi } =\sqrt{(\Delta y)^2+(\Delta \phi )^2} = 0.2\) from a remaining electron candidate are discarded. If multiple jets are found in this area, only the closest jet is removed. If the electron-jet distance is between 0.2 and 0.4, the electron candidate is removed. If the \(\Delta R _{y,\phi }\) between any remaining jet and a muon candidate is less than 0.4, the muon candidate is removed if the jet has more than two associated tracks, otherwise the jet is discarded.

4.2 Particle- and parton-level objects and definitions of fiducial regions

In the measurements of differential \(t\bar{t}Z\) cross sections, the measured spectra are corrected for detector effects to so-called particle and parton levels using an unfolding procedure.

Parton-level objects were obtained from the MC record of the \(t\bar{t}Z\) event. The top quarks (antiquarks) and \(Z\) bosons were selected after final-state radiation and just before their corresponding decay, \(t \rightarrow Wb\) or \(Z \rightarrow \ell \ell \), respectively. The leptons originating from \(W\) and \(Z\) bosons were selected directly from the decay vertex of the parent bosons.

The parton-level fiducial volumes for final states with three or four leptons were defined as follows: the \(Z\) boson was required to decay into leptons, whereas the \(t\bar{t}\) pair was required to decay via \(t\bar{t} \rightarrow W ^{+}b W ^{-}{\bar{b}}\), with either one or both \(W\) bosons subsequently decaying leptonically.

The particular decay chains of interest are therefore:

$$\begin{aligned}&Z \rightarrow \ell ^{+}\ell ^{-}, \ \ t \rightarrow W^{+} ( \rightarrow \ell ^{+}\nu _{\ell })+b, \ \ {\bar{t}}\rightarrow W^{-} (\rightarrow q{\bar{q}})+{\bar{b}} \\ \text {or}&Z \rightarrow \ell ^{+}\ell ^{-}, \ \ t\rightarrow W^{+} (\rightarrow q{\bar{q}})+b, \ \ \ \ \ {\bar{t}}\rightarrow W^{-} (\rightarrow \ell ^{-}{\bar{\nu }}_{\ell })+{\bar{b}} \\ \end{aligned}$$

for a three-lepton final state, and

$$\begin{aligned}&Z \rightarrow \ell ^{+}\ell ^{-}, \ \ t\rightarrow W^{+} (\rightarrow \ell ^{+}\nu _{\ell })+b, \ \ {\bar{t}}\rightarrow W^{-} (\rightarrow \ell ^{-}{\bar{\nu }}_{\ell })+{\bar{b}} \\ \end{aligned}$$

for a four-lepton final state. The two decay chains for the three-lepton final state differ only in terms of which of the top quark or antiquark decayed hadronically.

The invariant mass of the lepton pair originating from the \(Z\) boson has to be within a range of \(\pm {15}\,\,\hbox {GeV}\) around the nominal \(Z\) boson mass (\({91.2}\,\,\hbox {GeV}\)) [35] to be sensitive to on-shell \(Z\) decays. Prompt \(\tau \)-leptons from \(Z\) or \(W\) boson decays were not included in the parton-level fiducial volume, regardless of their subsequent decay into leptons or hadrons. No kinematic requirements were applied to the parton-level objects in order that the unfolded differential results at parton level can be more easily compared with fixed-order predictions.

Particle-level objects in simulated events were defined using quasi-stable particles with a mean lifetime greater than 30 ps originating from \(pp\) collisions. They were selected after hadronisation but before the interaction of these particles with the detector components or consideration of pile-up effects. Electrons and muons were required to not to have originated from a hadron in the MC generator event record, whether directly or through a \(\tau \)-lepton decay. This ensures that they originated from the \(Z\) boson or the \(W\) bosons from top-quark decays, without requiring a direct match with the parent boson. The four-momenta of the bare leptons were modified (‘dressed’) by adding the four-momenta of all radiated photons within a cone of size \(\Delta R = 0.1\), excluding photons from hadron decays, to take into account final-state photon radiation. Particle-level jets were reconstructed with the anti-\(k_{t}\) algorithm with a radius parameter of \(R = 0.4\) applied to all stable particles, but excluding both the neutrinos originating from the \(Z\) boson or top quarks and the selected electrons, muons and photons used in the definition of the charged leptons. If b-hadrons with \(p_{\text {T}} > {5}\,\hbox {GeV}\) were found in the MC event record, they were clustered in the stable-particle jets with their energies set to a negligible positive value (‘ghost-matching’) [73]. Particle-level jets containing one or more of these b-hadrons were considered to originate from a b-quark. The particle-level missing transverse momentum was defined as the vector sum of the transverse momenta of all neutrinos found in the simulation history of the event, excluding those originating from hadron decay.

The particle-level fiducial volume for final states with three or four leptons was defined by applying the same \(p_{\text {T}}\) and \(|\eta |\) requirements as those summarised for the detector-level selection in Tables 1 and 2, respectively. In addition, the same requirements were placed on the number of jets and b-jets, and the same requirements were placed on the opposite-sign–same-flavour (OSSF) lepton pair, along with the same invariant mass requirement for the \(Z\)-mass window as that used in the detector-level selection described in the following section (\(|m_{\ell \ell }- m_{Z} | < {10}\,\hbox {GeV}\)).Footnote 6 For the particle-level fiducial volume for four-lepton final states, only one OSSF lepton pair was required within the \(Z\)-mass window – the remaining lepton pair was required only to be opposite-sign. Only one of the jets is required to have originated from a b-quark.

Table 1 The definitions of the trilepton signal regions: for the inclusive measurement, a combination of the regions with pseudo-continuous b-tagging \(3\ell \)-Z-1b4j-PCBT and \(3\ell \)-Z-2b3j-PCBT is used, whereas for the differential measurement only the region \(3\ell \)-Z-2b3j with a fixed b-tagging WP is employed
Table 2 The definitions of the four tetralepton signal regions. The regions are defined to target different b-jet multiplicities and flavour combinations of the non-\(Z\) leptons (\(\ell \ell ^{\,\text {non-}Z}\))

5 Event selection and signal regions

Only final states with exactly three or four isolated leptons (electrons or muons) and at least two jets, as defined in Sect. 4, are considered. All selected events are required to pass a single-electron or single-muon trigger. In addition, at least one reconstructed lepton with \(p_{\text {T}} > {27}\,\hbox {GeV}\) is required to be matched to the lepton reconstructed by the trigger algorithm and to be of the same flavour. Different signal regions are defined and optimised to achieve the best sensitivity to \(t\bar{t}Z\) production with one or both top quarks decaying via \(t \rightarrow bW \rightarrow b\ell \nu _{\ell }\). Furthermore, the regions are designed to contain a sufficient number of signal events in order to reduce the statistical uncertainties of the differential \(t\bar{t}Z\) cross-section measurements. The signal regions are referred to as ‘trilepton’ (\(3\ell \)) and ‘tetralepton’ (\(4\ell \)) signal regions, depending on the number of reconstructed leptons, and are meant to target events with one or two prompt leptons, respectively from the \(t\bar{t}\) decay.

5.1 Trilepton signal regions

A summary of the definitions of the trilepton signal regions is provided in Table 1. The requirement on the minimum transverse momentum of the leading, sub-leading and third lepton is 27, 20 and \({20}\,\hbox {GeV}\), respectively. The sum of the three lepton charges is required to be \(\pm 1\). The OSSF lepton pair with the invariant mass closest to the \(Z\) boson mass is considered to originate from the \(Z\) decay and its invariant mass (labelled as \(m_{\ell \ell }^{Z}\)) is required to be compatible with the mass of the \(Z\) boson (\(|m_{\ell \ell }^{Z}- m_{Z} | < {10}\,\hbox {GeV}\)). Furthermore, all OSSF lepton combinations are required to have \(m_{\mathrm {OSSF}} > {10}\,\hbox {GeV}\) to remove contributions arising from low-mass resonances. Additional requirements are imposed on the total number of reconstructed jets (\(N_{\mathrm {jets}}\)) and b-tagged jets (\(N_{b\text {-jets}}\)) in the event.

Different b-jet requirements are used for the inclusive and differential cross-section measurements. For the inclusive measurement, a combination of two orthogonal regions which use different b-tagging WPs (PCBT, see Sect. 4) with 60% and 70% efficiency is employed. The tighter b-tagging WPs are used to suppress the \(WZ\,+\,\text {jets}\) background and reduce its impact on the overall precision of the measurement. The two regions, referred to as \(3\ell \)-Z-1b4j-PCBT and \(3\ell \)-Z-2b3j-PCBT, are kept distinct from one another during the fitting procedure used to perform the cross-section measurement. For the differential measurements, a looser, fixed WP corresponding to an 85% efficiency is used in order to increase the data statistics.

5.2 Tetralepton signal regions

The definitions of the tetralepton signal regions are summarised in Table 2. The requirement on the transverse momentum of the four leading leptons in all regions is 27, 20, 10 and \({7}\,\hbox {GeV}\), respectively. As in the case of the trilepton signal regions, all events are required to have at least one OSSF lepton pair with an invariant mass satisfying \(|m_{\ell \ell }^{Z}- m_{Z} | < {10}\,\hbox {GeV}\). Furthermore, the remaining leptons which are not associated with the \(Z\) boson (non-\(Z\)) are required to have opposite charges, such that the sum of the four lepton charges is zero. As in the trilepton selection, a requirement that all OSSF lepton combinations satisfy \(m_{\mathrm {OSSF}} > {10}\,\hbox {GeV}\) in order to suppress background contributions from low-mass resonances is applied.

The tetralepton signal regions are separated into different-flavour (DF) and same-flavour (SF) signal regions, according to the b-jet multiplicities and the flavour composition of the non-\(Z\) lepton pair. The \(ZZ\,+\,\text {jets}\) background is suppressed by setting requirements on the jet and b-jet multiplicities, as well as by applying cuts on \(E_{\text {T}}^{\text {miss}}\) and the invariant mass of the non-\(Z\) lepton pair (\(m_{\ell \ell }^{\text {non-}Z}\)) in the case of the SF regions. In the SF regions, events with \(m_{\ell \ell }^{\text {non-}Z}\) close to the \(Z\) mass are accepted, but the \(E_{\text {T}}^{\text {miss}}\) requirement is increased to reduce the \(ZZ\,+\,\text {jets}\) background. If \(m_{\ell \ell }^{\text {non-}Z}\) is not close to the \(Z\) mass, the \(E_{\text {T}}^{\text {miss}}\) cut is relaxed. For the inclusive cross-section measurement, the four tetralepton regions are included as separate bins in the fit, whereas for the differential measurements all the events are combined. Unlike the trilepton signal regions, the b-jets are all selected using a fixed 85% b-tagging efficiency WP. The tetralepton signal region selections are identical for the inclusive and differential measurements.

Table 3 Definitions of the control regions targeting the \(WZ\,+\,\text {jets}\), \(WZ \rightarrow \ell \ell \ell \nu \) (left) and \(ZZ\,+\,\text {jets}\), \(ZZ \rightarrow \ell \ell \ell \ell \) processes (right): the control regions are used to obtain normalisations of the light-flavour components of the \(WZ/ZZ\,+\,\text {jets}\) backgrounds from data

6 Background estimation

Several processes can lead to background contaminations in the signal regions. The contributions from SM processes featuring the production of three or four prompt leptonsFootnote 7 is discussed in Sect. 6.1, whereas the estimation of processes where at least one of the reconstructed leptons is a fake lepton is explained in Sect. 6.2.

6.1 Prompt lepton background

The dominant SM background processes in the trilepton and tetralepton regions are \(WZ/ZZ\,+\,\text {jets}\) production with \(WZ \rightarrow \ell \ell \ell \nu \) and \(ZZ \rightarrow \ell \ell \ell \ell \) decays, respectively. The normalisations of these processes are obtained from data and measured in dedicated \(WZ\,+\,\text {jets}\) and \(ZZ\,+\,\text {jets}\) control regions (CRs) as defined in Table 3. The CRs are common to both the inclusive and the differential cross-section measurements. Invariant mass requirements on the OSSF lepton pairs are applied to select the \(Z\) bosons expected in both regions. A b-jet veto is applied in \(3\ell \)-WZ-CR to suppress the \(t\bar{t}Z\) contribution and to ensure orthogonality with the trilepton signal region. In \(4\ell \)-ZZ-CR, no requirements are placed on the number of jets or b-jets. The invariant mass requirements on the two OSSF lepton pairs are sufficient to yield a very high \(ZZ\,+\,\text {jets}\) purity in this region. Orthogonality with the tetralepton signal regions is ensured through the use of an \(E_{\text {T}}^{\text {miss}}\) requirement (\({20}\,\hbox {GeV}< E_{\text {T}}^{\text {miss}} < {40}\,\hbox {GeV}\)), where the lower bound is set so that the selected events are more similar kinematically to those in the signal regions. The \(WZ\,+\,\text {jets}\) purity in \(3\ell \)-WZ-CR is approximately 80%, while the \(ZZ\,+\,\text {jets}\) purity in \(4\ell \)-ZZ-CR is approximately 97%.

The event yields in these control regions are extrapolated to the signal regions in accord with simulation. As the control regions are mostly populated by WZ/ZZ plus light-flavour jet events, only the predictions from these light-flavour components in the signal regions are constrained by the observed data yields in the control regions. The \(WZ/ZZ + b\)- and c-jetFootnote 8 backgrounds are constrained to their MC predictions, but with additional normalisation uncertainties assigned (more details are provided in Sect. 7.3). Figure 1a and b show, respectively, the \(p_{\text {T}}\) and \(\eta \) distributions of the leading lepton for the \(WZ\,+\,\text {jets}\) control region. The \(p_{\text {T}}\) distribution and the number of selected jets in the \(ZZ\,+\,\text {jets}\) control region are shown in Fig. 2a and b. All distributions in the control regions are shown before the simultaneous fit to data is applied (pre-fit).

Another important background in the signal regions is \(tWZ\) production, which can lead to final states very similar to those of the \(t\bar{t}Z\) signal. A relevant background process in the trilepton regions is \(tZq\) production, which contributes more for lower jet multiplicities. Other background processes, such as \(t\bar{t}\mathrm {+}W/H\), \(t\bar{t}WW\), three/four top-quark production, \(H\mathrm {+}W/Z\) or triboson production can also contribute to the signal regions, but are significantly smaller than the other processes mentioned above.

The MC samples used to simulate these processes are described in Sect. 3. Besides the WZ/ZZ plus light-flavour jets background, for which control regions are employed to obtain the normalisation, the contributions from all SM processes leading to three or four prompt leptons are estimated entirely from MC simulation and normalised to their theoretical cross-section predictions.

Fig. 1
figure 1

Distribution of the a \(p_{\text {T}}\) and b \(\eta \) of the leading lepton in the \(WZ\,+\,\text {jets}\) control region. The shaded band corresponds to the total uncertainty (systematic and statistical) of the total SM prediction. The lower panel shows the ratio of the data to the SM prediction. The results and uncertainties are shown before the fit to data is performed. The category ‘other’ contains all processes mentioned in Sect. 3 which are not listed separately. Events with a leading lepton \(p_{\text {T}}\) above \({300}\,\hbox {GeV}\) are included in the uppermost bin of a

Fig. 2
figure 2

Distribution of a the \(p_{\text {T}}\) of the leading lepton and b the number of selected jets in the \(ZZ\,+\,\text {jets}\) control region. The shaded band corresponds to the total uncertainty (systematic and statistical) of the total SM prediction. The lower panel shows the ratio of the data to the SM prediction. The results and uncertainties are shown before the fit to data is performed. The category ‘other’ contains all processes mentioned in Sect. 3 which are not listed separately. Events with a leading lepton \(p_{\text {T}}\) above \({300}\,\hbox {GeV}\) are included in the uppermost bin of a

Table 4 The observed and expected numbers of events in the trilepton and tetralepton signal regions, as well as in the \(WZ/ZZ\,+\,\text {jets}\) control regions. The predictions are shown before the fit to data. The WZ and \(ZZ\,+\,\text {jets}\) backgrounds are listed separately for their light-flavour (l), b-jet and c-jet components. The category ‘fake leptons’ refers to the contributions from fake and non-prompt leptons. ‘Other’ includes the contributions from \(H\mathrm {+}W/Z\), \(t\bar{t}WW\), three/four top-quark production and triboson processes. Background categories with event yields shown as ‘–’ do not contribute significantly to a region. The indicated uncertainties consider statistical errors as well as all experimental and theoretical systematic uncertainties, except the normalisation uncertainties of the fitted background components

6.2 Fake lepton background

Different types of objects, which are misidentified as leptons, are referred to as ‘fake leptons’ throughout the rest of the document. In the signal regions of this analysis, this background arises mainly from dileptonic \(t\bar{t}\) decays where additional non-prompt leptons arise from heavy-flavour hadron decays.

To estimate the contribution of fake leptons in the signal regions, a fully data-driven method, called the ‘matrix method’ is employed. Descriptions of this technique can be found in Refs. [74, 75]. It relies on the prompt and fake leptons having different probabilities of passing the identification, isolation and impact parameters requirements. The method uses data events selected with the same criteria as in the signal regions, but with looser lepton selectionsFootnote 9 than the ones defined in Sect. 4.

An alternative version of the matrix method is described in Ref. [76]. It evaluates the total number of fake electrons and muons entering the signal regions via the maximisation of a likelihood function. The likelihood function is constructed from a product of Poisson probability functions that represent the numbers of leptons passing different quality criteria for the signal regions. The observed number of leptons selected with the looser criteria and the probabilities (efficiencies) for fake or prompt leptons to satisfy the nominal lepton requirements are fixed, while the expectation values of the Poisson functions – the numbers of fake leptons in the signal regions – are obtained from the likelihood maximisation. This method offers a more robust fake-lepton estimation for statistically limited regions. For the differential measurements, the estimations are performed separately for each bin of the measured variables.

The probabilities of fake leptons to satisfy the nominal lepton requirements (fake-lepton efficiencies) are obtained from data. They are measured separately for electrons and muons in events with exactly two leptons with the same charge (same-sign) and at least one b-jet identified at the 85% efficiency WP. The measurements are performed after subtracting the contributions, estimated from MC events, of charge-misidentified electrons and prompt leptons in the same-sign region. It has been checked that the dominant fake lepton source in this region are heavy-flavour hadron decays, as in the signal regions. The fake-lepton efficiencies are approximately 10% for electrons and 15% for muons, with an increase to around 20% for muons with \(p_{\text {T}} < {12}\,\hbox {GeV}\) and \(> {35}\,\hbox {GeV}\). The equivalent probabilities for prompt leptons (prompt-lepton efficiencies) are obtained from \(Z \rightarrow \ell \ell \) simulation and the respective scale factors for electrons or muons. The prompt-lepton efficiencies are in most cases higher than 90% for both the electrons and muons. They increase for larger lepton \(p_{\text {T}}\) values and reach \(> {98}\,{\%}\) for \(p_{\text {T}} > {35}\,\hbox {GeV}\). Both the fake- and prompt-lepton efficiencies are parameterised as a function of \(p_{\text {T}}\) and \(|\eta |\) of the respective lepton.

Systematic uncertainties are assigned to the fake-lepton estimates to account for differences in the relative contributions of the various fake-lepton sources between the signal regions and the regions used for the efficiency measurements. Further uncertainties arise from the subtraction of prompt and charge-misidentified leptons, as well as from the dependencies of the fake-lepton efficiencies on the number of jets/b-jets in the events. The method is also affected by statistical uncertainties arising from the limited number of events in the data sample used to evaluate the fake-lepton yields, as well as the statistical limitations of the efficiency measurements. Similarly to the nominal values, the uncertainties of the fake- and prompt-lepton efficiencies are binned in \(p_{\text {T}}\) and \(|\eta |\) of the leptons and propagated to the fake-lepton estimation in the signal regions. The overall uncertainties are approximately 50% for both electrons and muons, but they can fluctuate for the differential measurements, depending on the variable and the kinematic region.

6.3 Pre-fit event yields in signal and control regions

To validate the SM background modelling explained in the previous sections, Table 4 presents a comparison between the total expected background prediction and the observed data events in the trilepton and tetralepton signal regions, as well as in the \(WZ\,+\,\text {jets}\) and \(ZZ\,+\,\text {jets}\) control regions. The event yields and uncertainties are shown before applying the fitting procedure. The statistical and all systematic uncertainties as explained in Sect. 7 are considered except the normalisation uncertainties of the processes which are free parameters in the fit. Within the uncertainties, agreement between data and the SM predictions is observed in nearly all regions.

7 Systematic uncertainties

The signal and background predictions in all signal regions are affected by several sources of experimental and theoretical systematic uncertainty. These are considered for both inclusive and differential measurements presented in Sects. 8 and 9. The uncertainties can be classified into the different categories which are described in the following subsections.

7.1 Detector-related uncertainties

The uncertainty in the combined 2015–2018 integrated luminosity is 1.7% [77], obtained using the LUCID-2 detector [28] for the primary luminosity measurements. This systematic uncertainty affects all processes modelled using MC simulations apart from the light-flavour components of the \(WZ/ZZ\,+\,\text {jets}\) backgrounds, whose normalisations are taken from data in control regions.

The uncertainty in the reweighting of the MC pile-up distribution to match the distribution in data is evaluated by varying the pile-up correction factors used to perform the reweighting.

Uncertainties associated with the lepton selection arise from the trigger, reconstruction, identification and isolation efficiencies, and the lepton momentum scale and resolution [63,64,65]. They are below 1% for the individual sources and have a total impact of 2–2.5% on the measurements. Uncertainties associated with the jet selection arise from the jet energy scale (JES), the JVT requirement and the jet energy resolution (JER). The JES and its uncertainties are derived by combining information from test-beam data, collision data and simulation [78]. The uncertainties in the JER and JVT increase at lower jet \(p_{\text {T}}\). The overall effect of uncertainties related to jet selection and calibration is approximately 2%.

The efficiency of the flavour-tagging algorithm is measured for each jet flavour using control samples in data and in simulation. From these measurements, correction factors are derived to correct the tagging rates in the simulation. In the case of b-tagged jets, the correction factors and their uncertainties are estimated from data using dileptonic \(t\bar{t}\) events [71]. In the case of c-jets, they are derived from jets arising from \(W\) boson decays in \(t\bar{t}\) events [79]. In the case of light-flavour jets, the correction factors are derived using dijet events [80]. Sources of uncertainty affecting the b- and c-tagging efficiencies are evaluated as a function of jet \(p_{\text {T}}\), including bin-to-bin correlations. The uncertainties in the efficiency for tagging light-flavour jets depend on the jet \(p_{\text {T}}\) and on \(\eta \). An additional uncertainty is assigned to account for the extrapolation of the b-tagging efficiency measurement from the \(p_{\text {T}}\) region used to determine the correction factors to regions with higher \(p_{\text {T}}\). The impact of flavour-tagging uncertainties on the measurements depends on the signal regions and is 2–3% in total.

7.2 Signal modelling uncertainties

Different sources of systematic uncertainty in the theoretical predictions of the \(t\bar{t}Z\) process are considered. To evaluate the effect of \(\mu _\mathrm {r}\) and \(\mu _\mathrm {f}\) uncertainties, the scales used in the ME of the MG5_aMC@NLO  + Pythia 8 samples are varied simultaneously, as well as individually, by factors of 2.0 and 0.5 relative to their nominal values. The uncertainty due to the ISR is estimated using a set of variations of the A14 tune’s parameter values. Uncertainties associated with the choice of PDF set are evaluated according to the PDF4LHC  prescription [81] using eigenvector variations from multiple NLO PDF sets, the effects of which are added in quadrature.

The systematic uncertainty due to the modelling of the parton shower, the hadronisation and the underlying event – called the parton-shower uncertainty in the following – is quantified by employing an alternative \(t\bar{t}Z\) sample generated with MG5_aMC@NLO, but interfaced to Herwig 7 instead of Pythia 8.

7.3 Background modelling uncertainties

The normalisation of the \(WZ\,+\,\text {jets}\) and \(ZZ\,+\,\text {jets}\) backgrounds with light-flavour jets are obtained from data, as discussed in Sect. 6.1. The \(WZ/ZZ\,+\,\text {jets}\) components with b- or c-jets are constrained to their MC predictions and normalisation uncertainties of 50% (\(WZ/ZZ + b\)) and 30% (\(WZ/ZZ + c\)) are assigned to them. These uncertainties are evaluated from data/MC comparisons in \(Z + b/c\) events [82], but also take into account differences in the heavy-flavour jet fractions between \(Z \,+\,\text {jets}\) and \(WZ/ZZ\,+\,\text {jets}\) events. Modelling uncertainties of \(WZ/ZZ\,+\,\text {jets}\) related to the \(\mu _\mathrm {r}\) and \(\mu _\mathrm {f}\) scales and the PDF choice are obtained with the same prescription as for the signal. Uncertainties attributed to the resummation scale and CKKW matching scale [52, 54] are evaluated from alternative \(WZ/ZZ\,+\,\text {jets}\) samples with variations of these scale choices.

Uncertainties related to the \(\mu _\mathrm {r}\) and \(\mu _\mathrm {f}\) scales and the PDF of the \(tWZ\) background are evaluated in the same way as for the \(t\bar{t}Z\) and \(WZ/ZZ\,+\,\text {jets}\) samples. An additional uncertainty is assigned to the \(tWZ\) process to account for the interference between the \(t\bar{t}Z\) and \(tWZ\) processes. It is evaluated by switching to an alternative diagram removal scheme (DR1 vs DR2) [57] and obtaining an uncertainty from the differences observed in the signal regions.

Scale and PDF uncertainties of the \(tZq\) background are obtained in the same way as for the previously described samples. In addition, a normalisation uncertainty of 30% is assigned, motivated by the measurements of this process presented in Refs. [83, 84]. As for the \(t\bar{t}Z\) signal, the uncertainty due to the ISR is evaluated using alternative samples with Pythia 8 A14 Var3c eigentune variations.

Fig. 3
figure 3

The observed and expected event yields in the trilepton and tetralepton signal regions, as well as the \(WZ/ZZ\,+\,\text {jets}\) control regions, after the combined fit. The bottom panel shows the ratio of data to the total SM prediction. The size of the combined statistical and systematic uncertainty in the sum of the fitted signal and backgrounds is indicated by the blue hatched band

For the fake-lepton background, statistical as well as systematic uncertainties are considered as explained in Sect. 6. They are evaluated for each signal region independently and applied as normalisation uncertainties of the total fake-lepton background contribution in each region.

For the \(t\bar{t}H\) background, a normalisation uncertainty of approximately 10% due to the choice of QCD scales and PDF is used [48]. For processes giving smaller backgrounds, namely \(H\mathrm {+}W/Z\), \(t\bar{t}W\), \(t\bar{t}WW\), triboson and three/four top-quark production, a conservative overall normalisation uncertainty of 50% is applied.

8 Results of the inclusive cross-section measurement

The ratio of the measured value of the inclusive \(t\bar{t}Z\) production cross section to its corresponding SM prediction (\(\mu _{t\bar{t}Z}\)) is obtained from a simultaneous fit to the numbers of events in the trilepton and tetralepton signal regions (as defined in Tables 1,  2), as well as the \(WZ\,+\,\text {jets}\) and \(ZZ\,+\,\text {jets}\) control regions (defined in Table 3). For trilepton events, only the dedicated regions for the inclusive measurement are included in the fit. The fit is based on the profile-likelihood technique, with a likelihood function as a product of Poisson probability functions given by the observed event yields in the signal and control regions. The value of \(\mu _{t\bar{t}Z}\) as well as the normalisations of the light-flavour components of the \(WZ/ZZ\,+\,\text {jets}\) backgrounds are treated as free parameters in the fit. The systematic uncertainties described in Sect. 7 are included in the fit as nuisance parameters constrained by Gaussian functions. None of the uncertainty parameters is found to be significantly constrained or pulled by the fit. The calculation of confidence intervals and hypothesis testing is performed using a modified frequentist method as implemented in the RooStats framework [85,86,87].

Table 5 The observed and expected numbers of events in the trilepton and tetralepton signal regions, as well as the \(WZ/ZZ\,+\,\text {jets}\) control regions, after the combined fit. The definitions of the background categories are the same as in Table 4. Categories with event yields shown as ‘–’ do not contribute significantly to a region. The indicated uncertainties consider all experimental and theoretical systematic uncertainties as well as the statistical errors. As systematic uncertainties might be correlated between different processes, the individual uncertainties do not necessarily add up in quadrature to the uncertainty of the total SM prediction
Fig. 4
figure 4

Post-fit distributions of a \(N_{\mathrm {jets}}\) in \(3\ell \)-Z-1b4j-PCBT and b the \(p_{\text {T}}\) of the leading lepton in \(3\ell \)-Z-2b3j-PCBT. The shaded band includes all sources of statistical and systematic uncertainty after the combined fit. The lower panel shows the ratio of data to the total SM prediction. The uppermost bins include all events above the x-axis ranges. The blue triangular marker in the lower panel of a points to the position of a data point which lies slightly beyond the y-axis range shown

Within their uncertainties, the fitted normalisations of the light-flavour components of the \(WZ\,+\,\text {jets}\) and \(ZZ\,+\,\text {jets}\) backgrounds are compatible with unity, but can vary by up to 10% from their initial value. The observed and expected total event yields in the signal regions and the \(WZ/ZZ\,+\,\text {jets}\) control regions after the combined fit (post-fit) are shown in Fig. 3 and detailed in Table 5. The strong anti-correlation between the \(WZ + l\) and \(WZ + c\) backgrounds results in a smaller total uncertainty of the fitted SM background expectation in \(3\ell \)-WZ-CR compared with the uncertainties of the individual \(WZ + l\) and \(WZ + c\) components.

Comparisons between data and the post-fit SM predictions for some selected variables which offer sensitivity to the quality of the background modelling in the signal regions are also presented. The number of selected jets with \(p_{\text {T}} > {25}\,\hbox {GeV}\) in signal region \(3\ell \)-Z-1b4j-PCBT is shown in Fig. 4a. The \(p_{\text {T}}\) of the leading lepton in \(3\ell \)-Z-2b3j-PCBT is given in Fig. 4b. Figure 5a depicts the number of selected jets and Fig. 5b the \(p_{\text {T}}\) of the leading lepton in the combination of the four tetralepton regions. Figure 6a shows the \(p_{\text {T}}\) and Fig. 6b the rapidity (y) of the reconstructed \(Z\) boson in the combination of the trilepton and tetralepton regions.

Fig. 5
figure 5

Post-fit distributions of a \(N_{\mathrm {jets}}\) and b the \(p_{\text {T}}\) of the leading lepton in the combination of the tetralepton signal regions. The shaded band includes all sources of statistical and systematic uncertainty after the combined fit. The lower panel shows the ratio of data to the total SM prediction. The uppermost bins include all events above the x-axis ranges. The blue triangular markers in the lower panels point to the positions of data points which lie slightly beyond the y-axis range shown

Fig. 6
figure 6

Post-fit distributions of the a \(p_{\text {T}}\) and b rapidity of the \(Z\) boson in the combination of the trilepton and tetralepton regions. The shaded band includes all sources of statistical and systematic uncertainty after the combined fit. The lower panel shows the ratio of data to the total SM prediction. The uppermost bins include all events above the x-axis ranges. The blue triangular marker in the lower panel of a points to the position of a data point which lies slightly beyond the y-axis range shown

Table 6 summarises the measured \(\mu _{t\bar{t}Z}\) parameters obtained from the individual fits in the trilepton and tetralepton regions, as well as the value from the combined \(3\ell + 4\ell \) fit. The values obtained from the fit in the different regions are compatible within their uncertainties. The \(3\ell \)-channel events represent the dominant contribution to the combined result, and the individual \(3\ell \) result can be seen to differ only slightly from that using the combined selections. The total systematic uncertainties in the \(4\ell \) channel are smaller than those in the \(3\ell \) channel, but the overall precision is poorer in the \(4\ell \) channel due to the limited number of data events.

The measured \(\mu _{t\bar{t}Z}\) value and its uncertainty based on the fit results from the combined trilepton and tetralepton channels are converted to a cross-section measurement. The value corresponds to the phase-space region where the invariant mass of the decay products of the \(Z\) boson lies between 70 and \({110}\,\hbox {GeV}\). The cross section is measured to be

$$\begin{aligned} \sigma (pp \rightarrow t\bar{t}Z) = 0.99 \pm 0.05 \,\mathrm {(stat.)} \pm 0.08 \,\mathrm {(syst.)} \,\mathrm {pb}. \end{aligned}$$

The result agrees with the SM prediction of \(0.84^{+0.09}_{-0.10}\,\)pb at NLO QCD and EW accuracy [48, 49] and more recent calculations including NNLL corrections or the complete set (QCD and EW) of NLO corrections [16, 17].

The contributions from the relevant uncertainties of the measured cross section are summarised in Table 7. For this table, the uncertainties are grouped into several type-related categories and are shown together with the total uncertainty. As none of the uncertainties show significant asymmetries, they are symmetrised. The dominant uncertainty sources can be attributed to the \(t\bar{t}Z\) parton shower, the modelling of the \(tWZ\) background, and jet flavour-tagging. It should be noted that the uncertainty in the cross section due to the systematic uncertainty on the luminosity is larger than the 1.7% mentioned in Sect. 7.1, as the luminosity affects both signal and background normalisation.

Table 6 Measured \(\mu _{t\bar{t}Z}\) parameters obtained from the fits in the different lepton channels. The uncertainties include statistical and systematic sources. The uncertainty of the theoretical prediction of the \(t\bar{t}Z\) cross section (see Sect. 3) is not considered for the \(\mu _{t\bar{t}Z}\) values
Table 7 List of relative uncertainties of the measured inclusive \(t\bar{t}Z\) cross section from the combined fit. The uncertainties are symmetrised for presentation and grouped into the categories described in the text. The quadrature sum of the individual uncertainties is not equal to the total uncertainty due to correlations introduced by the fit

9 Differential cross-section measurements

9.1 Description of the observables and reconstruction of the \(t{\bar{t}}\) system

A set of ten observables were selected for the differential cross-section measurements which probe the kinematics of the \(t\bar{t}Z\) system. The definitions of these variables are summarised in Table 8. With the exception of the number of reconstructed jets (\(N_{\mathrm {jets}}\)), which is unfolded to particle level only, all distributions are unfolded to both particle and parton level. Two of the variables, namely the transverse momentum and the absolute value of the rapidity of the \(Z\) boson (\(p_{\text {T}} ^{Z}\) and \(|y^{Z}|\)), which are sensitive to \(t\bar{t}Z\) generator modelling and various BSM effects, are defined identically for the trilepton and tetralepton selections. The differential measurements for these variables are therefore performed using an inclusive selection denoted by \(3\ell + 4\ell \).

The jet multiplicity is a natural variable to use to probe the modelling of QCD radiation and hadronisation in MC generators. It is measured separately for the trilepton and tetralepton selections due to the different number of final-state quarks from the decay of the \(t\bar{t}\) system in the two channels. The transverse momentum of the lepton which is not associated with the \(Z\) boson (\(p_{\text {T}} ^{\ell ,{\text {non-}}Z}\)) in the trilepton signal regions provides a good test of the modelling of the \(p_{\text {T}}\) of the top quark (antiquark) and its decay products in the MC generator.

The absolute azimuthal separation and rapidity difference between the \(Z\) boson and the leptonic top quark (\(|\Delta \phi (Z,t_{\mathrm {lep}})|\) and \(|\Delta y(Z,t_{\mathrm {lep}})|\)) in the trilepton signal regions, as well as the absolute azimuthal separation between the \(Z\) boson and the \(t\bar{t}\) system (\(|\Delta \phi (t\bar{t},Z)|\)) in the tetralepton regions, provide direct probes of the \(t\bar{t}Z\) vertex. These variables therefore offer sensitivity to a number of BSM effects which could modify the coupling between the \(Z\) boson and the top quark.

Table 8 Summary of the variables used for the differential measurements. Some variables are considered for the trilepton or tetralepton signal regions only, as indicated. The jet multiplicity is measured for the two topologies separately, whereas for the variables related only to the kinematics of the \(Z\) boson (\(p_{\text {T}} ^{Z}\) and \(|y^{Z}|\)), the trilepton and tetralepton regions are combined

The absolute azimuthal separation between the two leptons associated with the top quarks (\(|\Delta \phi (\ell ^{+}_{t},\ell ^{-}_{{\bar{t}}})|\)) in tetralepton events provides sensitivity to BSM effects modifying the spin correlations between the two top quarks. The transverse momentum of the \(t\bar{t}\) system (\(p_{\text {T}} ^{t\bar{t}}\)) is sensitive to the MC modelling of the hard-scattering process as well as the modelling of the QCD radiation in the parton shower.

In order to construct the \(|\Delta \phi (Z,t_{\mathrm {lep}})|\) and \(|\Delta y(Z,t_{\mathrm {lep}})|\) variables in the trilepton regions, the full four-vector of the leptonic top quark from the \(t\bar{t}\) system (\(t_{\mathrm {lep}}\)) is required.Footnote 10 For both detector- and particle-level quantities the reconstructed \(E_{\text {T}}^{\text {miss}}\)  (both its magnitude and azimuthal angle), is first attributed to the neutrino from the associated \(W\) boson decay. The SM value of the \(W\) boson mass [35] is then used to determine the z-component of the neutrino momentum by analytically solving the corresponding quadratic equation. In many cases the solution is ambiguous. For those, both real solutions are considered. For cases in which the discriminant of the quadratic equation is negative, the \(p_{\text {T}}\) of the neutrino is set to the particular value which yields a single solution. In order to form the final top-quark candidate, the reconstructed leptonically decaying \(W\) boson candidate – or candidates in the case of two neutrino solutions – is added, via a four-vector sum, to the closer (in \(\Delta R\)) of the two reconstructed jets in the event with the highest output from the b-tagging algorithm (MV2c10). At particle level, the two jets which are ghost-matched to a b-hadron (as described in Sect. 4.2) are considered. In the case of only a single such ghost-matched jet, that jet is selected to form the top-quark candidate. Events with two distinct neutrino solutions will have two possible top-quark candidates, so the one with an invariant mass of the \(W\)-b system more consistent with a top-quark decay is chosen.

In the tetralepton channel the \(t\bar{t}\) system is reconstructed in the transverse plane only. The underlying assumption is that the two neutrinos from the \(t\bar{t}\) decay represent the dominant source of missing transverse momentum in the event; the value of the reconstructed \(E_{\text {T}}^{\text {miss}}\) can, therefore, be taken to be a reasonable proxy for the vector sum of the neutrino momenta in the transverse plane. Such a partial reconstruction avoids the need to determine the full kinematics of both neutrinos separately, while still allowing the reconstruction of the \(p_{\text {T}} ^{t\bar{t}}\) and \(|\Delta \phi (t\bar{t},Z)|\) variables for the differential measurements. The selection of the two b-tagged jets is performed analogously to the trilepton case. At detector level, the two reconstructed jets with the highest b-tagging score are selected. At particle level, the two jets ghost-matched to a b-hadron are selected; in the case of only one ghost-matched jet, the jet with the highest \(p_{\text {T}}\) of those remaining is selected as the second b-jet.

9.2 Unfolding procedure

To measure the differential cross-section distributions at particle and parton levels in the specific fiducial phase-spaces defined in Sect. 4.2, an iterative Bayesian unfolding procedure is used [88]. It relies on the Bayesian probability formula starting from a given prior of the particle- or parton-level distribution and iteratively updating it with the posterior distribution. The unfolding is performed using the RooUnfold package [89]. The differential \(t\bar{t}Z\) cross sections are calculated using the following equation:

$$\begin{aligned} \dfrac{\mathrm {d}\sigma _{t\bar{t}Z}}{\mathrm {d}X^i}= & {} \dfrac{1}{{\mathcal {L}}\cdot {\mathcal {B}}\cdot \Delta X^i\cdot \epsilon _\mathrm {\,eff}^i} \cdot \sum _j[{\mathcal {M}}^{-1}]_{ij}\\&\cdot f_\mathrm {acc}^j\cdot \left( N_\mathrm {obs}^j-N_\mathrm {bkg}^j\right) , \end{aligned}$$

where X denotes the variable used for the differential measurement (with the bin-width \(\Delta X\)), the index i indicates the bin at particle (or parton) level and j the detector-level bin.

Fig. 7
figure 7

The migration matrices for the \(p_{\text {T}}\) of the \(Z\) boson in the combination of the trilepton and tetralepton regions. The matrices quantify the migrations from a particle or b parton level to detector level. The quoted values are expressed as a percentage. The matrices are normalised such that the sum of any given row is 100%, although small differences may be present due to rounding

Fig. 8
figure 8

The efficiency (\(\epsilon _{\,\mathrm {eff}}\)) and acceptance (\(f_{\mathrm {acc}}\)) correction factors for the a particle-level and b parton-level measurements as a function of the \(p_{\text {T}}\) of the \(Z\) boson in the combination of the trilepton and tetralepton regions. The error bars represent the MC statistical uncertainties per bin based on the nominal signal sample

The migration matrix \({\mathcal {M}}\) quantifies the detector response and can be derived from the bin-to-bin migrations of events from particle or parton level to detector level in the nominal \(t\bar{t}Z\) simulation for each of the considered differential variables. Its inverse, \({\mathcal {M}}^{-1}\), is determined through the iterative unfolding procedure. For each j, \(N_{\mathrm {obs}}^{j}\) denotes the number of observed data events, and \(N_{\mathrm {bkg}}^{j}\) is the expected background contribution. The various background contributions are estimated in the same way as for the inclusive measurement (see Sect. 6). In this case, the \(WZ/ZZ + l\) backgrounds are corrected by normalisation parameters obtained from an inclusive fit based on the combined \(3\ell + 4\ell \) channels. A statistics-only version of the fit was performed solely for the extraction of the normalisation parameters in this case. The acceptance corrections, \(f_{\mathrm {acc}}^{j}\), account for events that are generated outside the fiducial phase-space but pass the detector-level selection, whereas the efficiency correction terms, \(\epsilon _{\,\mathrm {eff}}^{i}\), correct for events that are in the fiducial phase-space but are not reconstructed in the detector. In either case, the term ‘fiducial’ refers to the corresponding type of unfolding being performed – either to parton or particle level. The integrated luminosity is denoted by \({\mathcal {L}}\). The branching ratio \({\mathcal {B}}\) is that of the \(t\bar{t}Z\) system to final states with three or four charged electrons or muons, originating directly from either the \(Z\) boson decay or the decay of the \(W\) bosons from the \(t\bar{t}\) system, and is used to extrapolate the measurements to cover all \(t\bar{t}\) and \(Z\) decays. The branching ratio correction is only applied for the parton-level measurements and corresponds to the decay channels applicable for the fiducial region based on the particular variable involved (see Table 8). The values for \({\mathcal {B}}\), calculated using inputs taken from Ref. [35], are 0.0193 \((3\ell )\), 0.0030 \((4\ell )\), and 0.0223 \((3\ell + 4\ell )\).

Fig. 9
figure 9

Absolute differential \(t\bar{t}Z\) cross sections measured at a particle level and b parton level as a function of the transverse momentum of the reconstructed \(Z\) boson. c, d Show the relative contributions from different categories of systematic uncertainties per bin. The large difference between the y-axis scales in a and b is a result of different efficiency and acceptance corrections between the particle- and parton-level measurements, together with the branching ratio correction of \({\mathcal {B}} = 0.0223\) for the combined \(3\ell + 4\ell \) channels, which is applied only for the parton-level result

Fig. 10
figure 10

Normalised differential \(t\bar{t}Z\) cross sections measured at a particle level and b parton level as a function of the transverse momentum of the reconstructed \(Z\) boson. c, d Show the relative contributions from different categories of systematic uncertainties per bin

Figure 7 shows the particle- and parton-level migration matrices that are used for the differential cross-section measurements depending on the \(p_{\text {T}}\) of the \(Z\) boson. The matrices are normalised such that the sum of entries in each row is equal to one. The entries in the matrices represent the fraction of events at either particle or parton level in a y-axis bin that are reconstructed at detector level in an x-axis bin. Thus, the fraction of events in the diagonal elements shows the quality of the resolution for a specific variable. In the case of \(p_{\text {T}} ^{Z}\), these fractions lie between 90 and 96% for both particle and parton level. For some of the other variables which do not depend only on the \(Z\) boson reconstruction (e.g. \(N_{\mathrm {jets}}\), \(p_{\text {T}} ^{t\bar{t}}\)), the migrations between bins can be significantly larger and reach a level of 20–25%. Figure 8 depicts the corresponding correction factors as a function of \(p_{\text {T}} ^{Z}\): \(\epsilon _{\,\mathrm {eff}}\) increases for larger \(p_{\text {T}} ^{Z}\) due to higher lepton reconstruction efficiencies for increasing transverse momenta. It lies between 33 and 43% at particle level and between 10 and 22% at parton level. The values of \(f_{\mathrm {acc}}\) are in all bins higher than 80% for both particle and parton level and show no notable dependence on \(p_{\text {T}} ^{Z}\).

The choice of binning is determined separately for each variable by performing a multi-dimensional scan in order to strike a reasonable balance between three partially competing aspects: retaining a large number of bins; limiting the relative impact from statistical uncertainties of the measured data; and ensuring large values (\(> {50}{\%}\)) for the diagonal elements of the matrices associated with the bin migrations between particle/parton and detector level. As a result, the binning for the differential measurements differs from that shown in Figs. 4, 5 and 6. The stability of the unfolding procedure is determined by constructing pseudo-data sets by randomly sampling events from the nominal \(t\bar{t}Z\) MC sample, such that the pseudo-data sets contain approximately the same number of events as in the measured data. So-called ‘pull tests’ are performed as part of the binning optimisation to verify that the unfolding is stable for the selected number and range of bins. In addition, linear re-weightings are applied to the pseudo-data to test the ability of the unfolding procedure to correct the pseudo-data back to their underlying true spectra, obtained from the MC event record. The number of iterations used in the iterative Bayesian unfolding is also optimised with pseudo-experiments: for each iteration, a \(\chi ^{2}\) per degree-of-freedom (ndf) is calculated by comparing the bin contents of the unfolded pseudo-data with those from the previous iteration. In the case of the first iteration, the unfolded pseudo-data are instead compared with the corresponding generator-level distribution. Iterations are performed until the \(\chi ^{2}/\)ndf value of a given distribution stabilises at a constant value while the statistical uncertainty returned from the unfolding procedure is kept as low as possible. For all variables, the number of iterations used lies between two and five. Systematic uncertainties are propagated to the unfolded distributions by varying the detector-level distributions within the uncertainties and repeating the unfolding procedure.

The normalised differential cross sections are obtained by dividing the distributions by the integrated fiducial cross sections, which are computed by adding up the contributions from all bins. The evaluation of systematic uncertainties is performed after the normalisation is done and it is on the same prescriptions employed for the absolute differential measurements.

9.3 Results of the differential measurements

The measured differential \(t\bar{t}Z\) cross sections unfolded to particle and parton levels for the \(p_{\text {T}}\) of the reconstructed \(Z\) boson are presented in Fig. 9.

The results are displayed in the seven \(p_{\text {T}}\) bin ranges used when performing the unfolding, with any additional contributions beyond \({400}\,\,\hbox {GeV}\) included in the uppermost bin. The relative contributions from statistical and systematic uncertainties in each bin are shown in the theory-to-data ratio panels of the upper figures, where the net effect corresponds to a sum in quadrature of the two. In the lower figures, the same relative contributions are shown as well as a decomposition of the systematic uncertainties into various categories.Footnote 11 The black data points in the upper figures correspond to the measured unfolded data and error bars representing the sum in quadrature of statistical and systematic uncertainties. The total uncertainty of this measurement is between 20 and 40%, depending on the bin, with the dominant uncertainty arising from the limited number of data events. Other significant sources of systematic uncertainty are associated with \(t\bar{t}Z\) modelling and b-tagging. Figure 10 shows the same set of results for the normalised distributions for this variable. The uncertainties on the normalised cross sections are notably smaller (15–35%) than those of the absolute cross sections because several systematic uncertainties cancel out. The differential cross sections measured in data are compared with the NLO QCD predictions from different \(t\bar{t}Z\) generators, as described in Sect. 3. The predictions are shown for MG5_aMC@NLO interfaced to Pythia 8 (red) or Herwig 7 (magenta), as well as for Sherpa 2.2.1 inclusive (blue) and multi-leg (green). For the \(p_{\text {T}} ^{Z}\) measurement, the different generators provide very similar predictions.

Fig. 11
figure 11

a Absolute differential \(t\bar{t}Z\) cross sections and b the relative contributions from different categories of systematic uncertainties per bin measured at parton level as a function of the absolute rapidity of the \(Z\) boson

Fig. 12
figure 12

Absolute differential \(t\bar{t}Z\) cross section measured at particle level as a function of \(N_{\mathrm {jets}}\) with \(p_{\text {T}} > {25}\,\,\hbox {GeV}\) for a the trilepton and b the tetralepton event selection. c, d Show the relative contributions from different categories of systematic uncertainties per bin

Fig. 13
figure 13

Absolute differential \(t\bar{t}Z\) cross sections measured at parton level as a function of a \(p_{\text {T}} ^{\ell ,{\text {non-}}Z}\) and b the azimuthal separation (\(\Delta \phi \)) between the \(Z\) boson and the reconstructed top quark. c, d Show the relative contributions from different categories of systematic uncertainties per bin

Fig. 14
figure 14

Absolute differential \(t\bar{t}Z\) cross sections measured at parton level as a function of a the rapidity difference (\(|\Delta y|\)) between the \(Z\) boson and the reconstructed top quark and b the azimuthal separation (\(\Delta \phi \)) between the two leptons associated with the top-quark pair. c, d Show the relative contributions from different categories of systematic uncertainties per bin

Results for the other observables described in Table 8 are presented in Figures 11, 12, 13, 14 and 15. For these variables, only the absolute parton-level differential measurements are shown, with the exception of \(N_{\mathrm {jets}}\), which is unfolded only to particle level. Additional differential \(t\bar{t}Z\) predictions at NLO, NLO + NNLL or approximate next-to-next-to-leading-order (nNLO) precision – including EW corrections – are shown in grey for the parton-level results for most of the observables. The calculations were carried out in similar fashion to that described in Ref. [18], but specifically performed in the context of this analysis in order to provide predictions for the measured observables and to match the number and ranges of bins for the different variables.Footnote 12 These additional parton-level predictions are not provided for two of the observables, namely \(p_{\text {T}} ^{\ell ,{\text {non-}}Z}\) and \(|\Delta \phi (\ell _{t}^{+},\ell _{{\bar{t}}}^{-})|\), since the decays of the \(t\bar{t}\) pair and the \(Z\) boson were not included in the theoretical calculations.

Fig. 15
figure 15

Absolute differential \(t\bar{t}Z\) cross sections measured at parton level as a function of a the azimuthal separation (\(\Delta \phi \)) between the \(Z\) boson and the reconstructed \(t\bar{t}\) system and b the \(p_{\text {T}}\) of the \(t\bar{t}\) system. c, d Show the relative contributions from different categories of systematic uncertainties per bin

In order to test the overall compatibility between the unfolded measurements and the various predictions, a \(\chi ^{2}/\)ndf and corresponding p value is evaluated for each of the differential measurements, separately for the parton level and particle level as well as for the absolute and normalised cases.

The \(\chi ^{2}\) value is defined as:

$$\begin{aligned} \chi ^2 = \sum _{i=1}^{N_{\text {bins}}}\sum _{j=1}^{N_{\text {bins}}}\left( n_{i}-\mu _{i}\right) \left( n_{j}-\mu _{j}\right) [C^{-1}]_{ij}, \end{aligned}$$

where \(n_{i}\) and \(\mu _{i}\) correspond to the content in bin i of the distributions from the unfolded data and the given prediction, respectively, and \(\left[ C^{-1}\right] _{ij}\) to the element in row i and column j of the inverse of the covariance matrix for the particular variable. The values, \(n_{i}\) and \(\mu _{i}\), are, therefore, notational shorthands for \(\mathrm {d}\sigma _{t\bar{t}Z} / \mathrm {d} X^{i}\), or \(1/\sigma \cdot \mathrm {d} \sigma _{t\bar{t}Z} / \mathrm {d} X^{i}\) in the normalised case.

A given p value can be interpreted as the probability of obtaining a value of \(\chi ^2\) greater or equal to the quoted value for a particular number of degrees of freedom, where the latter is equal to the number of bins (\(N_{\text {bins}}\)) in the case of the absolute measurements and \(N_{\text {bins}} - 1\) for the normalised measurements.

The construction of the covariance matrix is based on the approach described in Ref. [90], and it includes both the statistical and systematic uncertainties. The latter include detector-related uncertainties as well as those related to the modelling of the signal and various background processes. While all sources of uncertainty related to the measurements are incorporated in the covariance matrix elements, uncertainties in the theoretical predictions, themselves, are omitted, and their impact is, therefore, not reflected in the quoted \(\chi ^2\) and corresponding p values.

For a given variable, the elements of the covariance matrix, \(C_{ij}\), are evaluated using a bootstrap technique, whereby 150,000 Poisson-fluctuated distributions are produced, each corresponding to a pre-unfolded distribution for a given pseudo-experiment. For the detector-related uncertainties, Gaussian-distributed shifts are added coherently to each of the Poisson-fluctuated bin contents, with each shift corresponding to a particular uncertainty source. The shifts are applied as a multiplicative scale relative to the particular bin content, and with the amount and direction of each shift dictated by the corresponding uncertainty source.

Each of the varied distributions is subsequently unfolded using the nominal acceptance and efficiency corrections, as well as the nominal migration matrix – those derived from the nominal MG5_aMC@NLO  + Pythia 8 signal sample. Gaussian-distributed shifts are then added coherently to the post-unfolding distributions for each of the signal- and background-modelling uncertainty sources. These shifts are also determined and applied as relative variations for each particular source. The relative variations in this case are defined according to the difference between the generated and the unfolded cross section of a given alternative signal or background model, using nominal corrections in the unfolding.

The resulting changes to the unfolded distributions are, then, used to determine the elements of the covariance matrix for that particular variable. The covariance matrices for the normalised measurements are constructed in an analogous fashion. In this case, the distributions are normalised to unity after all effects are included. In order to avoid performing the unfolding on distributions with negative bin contents, that can arise due to the effects of systematic uncertainties and the subtraction of backgrounds, any such bin contents are set to zero prior to the unfolding.

The uncertainties reflected in the elements, \(C_{ij}\), initially evaluated as relative values, are then multiplied by the differential cross-section values from the measured data in each particular bin in order to yield absolute uncertainties.

In general, each of the individual sources contributing to the full covariance matrix will contribute to the off-diagonal terms, including even those from the limited data sample size where non-diagonal contributions arise from correlations between the bins introduced during the unfolding process.

The correlations between the bins in the absolute differential measurements for the trilepton and combined channels are sizeable – in many cases in the 20–55% range – and positive. In the case of the tetralepton channel, where statistical uncertainties are more dominant, the correlations are generally below 20% in absolute value. The correlations in the case of the normalised measurements are generally negative and reach absolute values even larger than in the absolute cross-section measurements, with the most extreme case being a value of \(\rho = -0.76\) between the first and second bin of the \(|\Delta y(Z,t_{\text {lep}})|\) variable in the trilepton channel. For the \(Z\)-related variables in the combined channel, the effect of the larger data sample is partially balanced by the increase in the number of bins, such that the correlations in the absolute measurements for \(p_{\text {T}} ^{Z}\) and \(|y^{Z}|\) are also positive, but lie in the 15–45% range. In the normalised measurements for these two variables, the correlations are also mostly negative but are smaller in magnitude than for other variables (strictly \(|\rho | < {40}\,{\%}\), but in most cases \(|\rho | < {20}\,{\%}\)).

Table 9 summarises the evaluated \(\chi ^2/\)ndf and p values used to quantify the compatibility between the measured unfolded data and the various predictions. For the parton-level measurements, the values for the additional theory predictions at NLO, NLO+NNLL or nNLO are also shown for those variables for which predictions are available [18].

Overall, the unfolded spectra from the measured data are compatible with the various predictions for most of the variables considered. For the \(p_{\text {T}} ^{Z}\) variable in the combined \(3\ell + 4\ell \) channel, as well as for \(p_{\text {T}} ^{\ell ,{\text {non-}}Z}\) and \(|\Delta \phi (Z,t_{\text {lep}})|\) in the trilepton channel, slightly lower p values are obtained for several predictions, but in all cases they are found to be greater than 0.05. For the \(p_{\text {T}} ^{Z}\) variable in the combined channel, the slightly poorer agreement is driven in large part by the sixth bin (\({220}\,\hbox {GeV} \le p_{\text {T}} ^{Z} < {290}\,\hbox {GeV}\)). For this variable, however, the p value is larger (0.17) for the additional NLO+NNLL prediction in the absolute differential measurement. For the \(|\Delta \phi (t\bar{t},Z)|\) variable in the tetralepton channel, for which the data exhibit a greater relative fraction of events with larger azimuthal separation between the \(Z\) boson and the \(t\bar{t}\) system, a slightly better level of agreement is observed for the Sherpa  predictions compared with those from MG5_aMC@NLO. As the statistical uncertainties of the measured data are almost always significantly larger than the differences between the predictions, no definite conclusion about the overall compatibility for these observables can be made. The effect of the uncertainties in the fixed-order parton-level theoretical predictions in the rightmost column of Table 9 was evaluated. This was done by adding terms of the form \(\delta \mu _i \delta \mu _j\) to the covariance matrix, where \(\delta \mu _{i(j)}\) is the sum in quadrature of the uncertainties associated with the scale and PDF choice for bin i(j). The bin contents of the theoretical predictions were, therefore, considered to be 100% correlated. The inclusion of these uncertainties leads to a relative increase of 20–50% in the p values for the variables \(|\Delta \phi (Z,t_{\text {lep}})|\), \(|\Delta \phi (t\bar{t},Z)|\), and \(p_{\text {T}} ^{t\bar{t}}\) relative to those quoted in Table 9. For the variables \(|\Delta y(Z,t_{\text {lep}})|\), \(p_{\text {T}} ^{Z}\), and \(|y^{Z}|\), the impact is negligible.

Table 9 Summary of the tests of compatibility between the unfolded differential measurements and the various predictions. Quoted are the \(\chi ^2/\)ndf and corresponding p values incorporating all bins for the given variable and based on the assumption that all sources of uncertainty are Gaussian-distributed. The values associated with the additional theory predictions (last column) are included where applicable. These additional predictions are obtained as described in Ref. [18], with their precisions depending on the particular variable: NLO for \(|\Delta \phi (t\bar{t},Z)|\) and \(p_{\text {T}} ^{t\bar{t}}\); NLO+NNLL for \(|\Delta \phi (Z,t_{\text {lep}})|\), \(|\Delta y(Z,t_{\text {lep}})|\) and \(p_{\text {T}} ^{Z}\); and nNLO for \(|y^{Z}|\)

A difference between the measured inclusive cross section quoted in Sect. 8 and the cross section based on the integrated absolute parton-level spectra in the combined \(3\ell +4\ell \) channel is observed. The two measurements differ both in terms of the method used and in their selection due to the use of different b-tagging WPs (refer to Table 1). Approximately 67% of selected data events are common to both measurements. The compatibility between the two cross-section measurements is evaluated using pseudo-experiments taking into account the correlation between uncertainties, including all sources of statistical and systematic effects, and it is found to be at the level of two standard deviations.

10 Conclusions

Inclusive and differential measurements of the production cross section of a \(t\bar{t}\) pair in association with a \(Z\) boson are presented. The full \(\sqrt{s} = {13}\,\hbox {TeV}\) \(pp\) collision data set collected by the ATLAS detector during Run 2 of the LHC between 2015 and 2018, corresponding to \({139}\,\hbox {fb}^{-1}\), was used for this analysis. Only final states with three or four isolated charged leptons (electrons or muons) were considered for the measurements. The measured inclusive cross section of the \(t\bar{t}Z\) process is \(\sigma _{t\bar{t}Z} = 0.99 \pm 0.05\) (stat.) \(\pm \, 0.08\) (syst.) pb, in agreement with the SM prediction. The dominant sources of systematic uncertainty in this measurement are associated with the \(t\bar{t}Z\) parton-shower modelling, b-tagging, and modelling of the \(tWZ\) background.

Absolute and normalised differential cross sections were measured as functions of nine different observables sensitive to the MC modelling of the \(t\bar{t}Z\) process and to potential BSM effects. The differential cross-section measurements were performed at particle and parton levels in specific fiducial volumes. The unfolded spectra from the measured data are compared with the predictions of different NLO QCD \(t\bar{t}Z\) MC generators and theoretical predictions at NLO, NLO + NNLL and nNLO precision, based on a \(\chi ^{2}/\)ndf and p value compatibility test. For most of the considered observables, good agreement between data and the predictions is observed. The differences between the various predictions are determined to be smaller than the uncertainties of the unfolded data. For the variables \(p_{\text {T}} ^{Z}\), \(p_{\text {T}} ^{\ell ,\text {non-}Z}\), \(|\Delta \phi (Z,t_{\mathrm {lep}})|\) and \(|\Delta \phi (t\bar{t},Z)|\), the observed and predicted differential results show slightly poorer agreement, but p values \(> 0.05\) are obtained in all cases.