1 Introduction

The Large Hadron Collider (LHC) centre-of-mass energy of 13 \(\text {TeV}\) greatly extends the sensitivity of the ATLAS experiment [1] to heavy new particles. In several new physics scenarios [2,3,4], these heavy new particles may have decay chains including the Higgs boson [5, 6]. The large mass-splitting between these resonances and their decay products results in a high-momentum Higgs boson, causing its decay products to be collimated. The decay of the Higgs boson into a \(b\bar{b}\) pair has the largest branching fraction within the Standard Model (SM), and thus is a major decay mode to use when searching for resonances involving high-momentum Higgs bosons (see e.g. Ref. [7]), as well as for measuring the SM Higgs boson properties. The signature of a boosted Higgs boson decaying into a \(b\bar{b}\) pair is a collimated flow of particles, in this document called a ‘Higgs-jet’, having an energy and angular distribution of the jet constituents consistent with a two-body decay and containing two b-hadrons. The techniques described in this paper to identify Higgs bosons decaying into bottom-quark pairs have been used successfully in several analyses [8,9,10] of 13 \(\text {TeV}\) proton–proton collision data recorded by ATLAS.

In order to identify, or tag, boosted Higgs bosons it is paramount to understand the details of b-hadron identification and the internal structure of jets, or jet substructure, in such an environment [11]. The approach to tagging presented in this paper is built on studies from LHC runs at \(\sqrt{s} = 7\) and 8 \(\text {TeV}\), including extensive studies of jet reconstruction and grooming algorithms [12], detailed investigations of track-jet-based b-tagging in boosted topologies [13], and the combination of substructure and b-tagging techniques applied in the Higgs boson pair search in the four-b-quark final state [14] and for discrimination of Z bosons from W bosons [15]. Gluon splitting into b-quark pairs at small opening angles has been studied at \(\sqrt{s} = 13\) \(\text {TeV}\) by ATLAS [16]. The identification of Higgs bosons at high transverse momenta through the use of jet substructure has also been studied by the CMS Collaboration and their techniques are described in Refs. [17, 18].

The Higgs boson tagging efficiency and background rejection for the two most common background processes, the multijet and hadronic top-quark backgrounds, are evaluated using Monte Carlo simulation. In addition, two processes with a topology similar to the signal, \(Z \rightarrow b \bar{b}\) decays and \(g\rightarrow b\bar{b}\) splitting, are used to validate Higgs-jet tagging techniques in data at \(\sqrt{s}=13\) \(\text {TeV}\). In particular the modelling of relevant Higgs-jet properties in Monte Carlo simulation is compared with data. The \(g\rightarrow b\bar{b}\) process allows the modelling of one of the main backgrounds to be validated. The \(Z \rightarrow b \bar{b}\) process is a colour-singlet resonance with a mass close to the Higgs boson mass and thus very similar to the \(H \rightarrow b\bar{b}\) signal.

After a brief description of the ATLAS detector in Section 2 and of the data and simulated samples in Section 3, the object reconstruction, selection and labelling are discussed in Section 4. Section 5 describes relevant systematic uncertainties. The Higgs-jet tagging algorithm and its performance are presented in Section 6. Sections 7 and 8 discuss a comparison between relevant distributions in data control samples dominated by \(g\rightarrow b\bar{b}\) and \(Z(\rightarrow b\bar{b})\gamma \) and the corresponding simulated events, respectively. Finally, conclusions are presented in Section 9.

2 ATLAS detector

The ATLAS detector [1] at the LHC covers nearly the entire solid angle around the collision point. It consists of an inner tracking detector surrounded by a thin superconducting solenoid, electromagnetic and hadronic calorimeters, and a muon spectrometer incorporating three large superconducting toroid magnets. The inner-detector system (ID) is immersed in a 2 T axial magnetic field and provides charged-particle tracking in the range \(|\eta | < 2.5\).

Preceding data-taking at a centre-of-mass energy of 13 \(\text {TeV}\), the high-granularity silicon pixel detector was equipped with a new barrel layer, located at a smaller radius (of about 34 mm) than the other layers [19, 20]. The upgraded pixel detector covers the vertex region and typically provides four measurements for tracks originating from the luminous region. It is followed by a silicon microstrip tracker, which usually provides four space points per track. These silicon detectors are complemented by a transition radiation tracker, which enables radially extended track reconstruction up to \(|\eta | = 2.0\). The transition radiation tracker also provides electron identification information based on the fraction of hits above a certain energy deposit threshold corresponding to transition radiation.

The calorimeter system covers the pseudorapidity range \(|\eta | < 4.9\). Within the region \(|\eta |< 3.2\), electromagnetic calorimetry is provided by barrel and endcap high-granularity lead/liquid-argon (LAr) calorimeters, with an additional thin LAr presampler covering \(|\eta | < 1.8\) to correct for energy loss in material upstream of the calorimeters. Hadronic calorimetry within \(|\eta | < 1.7\) is provided by a steel/scintillating-tile calorimeter, segmented into three barrel structures, and two copper/LAr hadronic endcap calorimeters covering \(1.5<|\eta | < 3.2\). The solid angle coverage is completed with forward copper/LAr and tungsten/LAr calorimeter modules optimised for electromagnetic and hadronic measurements respectively.

The muon spectrometer (MS) comprises separate triggering and high-precision tracking chambers measuring the deflection of muons in a magnetic field generated by superconducting air-core toroids. The precision chamber system covers the region \(|\eta | < 2.7\) with three layers of monitored drift tubes, complemented by cathode strip chambers in the forward region, where the background is highest. The muon trigger system covers the range \(|\eta | < 2.4\) with resistive plate chambers in the barrel, and thin gap chambers in the endcap regions.

A two-level trigger system is used to select interesting events [21]. The level-1 trigger is implemented in hardware and uses a subset of detector information to reduce the event rate to a design value of at most 100 kHz. This is followed by a software-based high-level trigger, which reduces the event rate further to an average of 1 kHz.

3 Data and simulated event samples

The data used in this paper were recorded with the ATLAS detector during the 2015 and 2016 LHC proton–proton (pp) collision runs, and correspond to a total integrated luminosity of 36.1 fb\(^{-1}\) at \(\sqrt{s} = 13\) \(\text {TeV}\). This integrated luminosity is calculated after the imposition of data quality requirements, which ensure that the ATLAS detector was in good operating condition.

Several Monte Carlo (MC) simulated event samples were used for the optimisation of the Higgs boson tagger, estimation of its performance, and the comparisons between data and simulation.

Simulated events with a broad transverse momentum (\(p_{\text {T}}\)) spectrum of Higgs bosons were generated as decay products of Randall–Sundrum gravitons \(G^{*}\) in a benchmark model with a warped extra dimension [2], \(G^{*} \rightarrow HH \rightarrow b\bar{b}b\bar{b}\), over a range of graviton masses between 300 and 6000 \(\text {GeV}\). The events were simulated using the MadGraph5_aMC@NLO generator [22]. Parton showering, hadronisation and the underlying event were simulated with Pythia8 [23] using the leading-order (LO) NNPDF2.3 parton distribution function (PDF) set [24] and the ATLAS A14 [25] set of tuned parameters.

Events containing the \(Z(\rightarrow b\bar{b})\gamma \) and \(\gamma \) + jets processes were simulated with the Sherpa v2.1.1 [26,27,28,29] LO generator. The matrix elements were configured to allow up to three partons in the final state in addition to the Z boson or the photon. The Z boson was produced on-shell and required to decay hadronically. The CT10 next-to-leading-order (NLO) PDF set [30, 31] was used. The \(t\bar{t}\gamma \) MC events were modelled by MadGraph interfaced with Pythia8 for showering, hadronisation and the underlying event with the LO NNPDF2.3 PDF set and the A14 underlying-event tune. Simulated events of hadronically decaying \(W\gamma \) were generated using Sherpa v2.1.1, with the same configuration as the one used for the \(Z\gamma \) sample.

To cover a large range of top-quark transverse momenta, hadronically decaying top quarks were generated using \(Z'\) bosons decaying into \(t\bar{t}\) pairs over a range of \(Z'\) boson masses between 400 and 5000 \(\text {GeV}\). These samples were simulated using Pythia8 with the LO NNPDF2.3 PDF set and the A14 underlying-event tune.

Finally, inclusive multijet events were generated using Pythia8, with the LO NNPDF2.3 PDF set and the A14 underlying-event tune; and with Herwig++ [32], with the CTEQ [33] PDF set and the UEEE [34] underlying-event tune. To increase the number of simulated events with semimuonically decaying hadrons for the \(g\rightarrow b\bar{b}\) analysis, samples of multijet events filtered to have at least one muon with \(p_{\text {T}}\) above 3 \(\text {GeV}\) and \(|\eta | < 2.8\) were produced with Pythia8 and Herwig++ using the same PDF set and underlying-event tunes as the unfiltered multijet samples.

In all cases except events generated using Sherpa, EvtGen [35] was used to model the decays of b- and c-hadrons. All simulated event samples included the effect of multiple pp interactions in the same and neighbouring bunch crossings (‘pile-up’) by overlaying simulated minimum-bias events on each simulated hard-scatter event. The minimum-bias events were simulated with the single-, double- and non-diffractive pp processes of Pythia8 using the A2 tune [36] and the MSTW2008 LO PDF [37,38,39]. The detector response to the generated events was simulated with Geant 4 [40, 41].

4 Object and event reconstruction

In this section the object reconstruction, associations among the objects, jet labelling, and the procedure to determine the heavy-flavour content of jets are described.

Calorimeter jets: Calorimeter-based jets are built from noise-suppressed topological clusters and are reconstructed using FastJet  [42] with the anti-\(k_{t}\) algorithm [43] with a radius parameter of \(R=1.0\) (\(\text {large-}R\) jets) or \(R=0.4\) (\(\text {small-}R\) jets). The topological clusters of the \(\text {large-}R\) jets are brought to the hadronic energy scale using the local hadronic cell weighting scheme [44]. The \(\text {large-}R\) jets are groomed using trimming [12, 45] to discard the softer components of jets that originate from initial-state radiation, pile-up interactions or the underlying event. This is done by reclustering the constituents of the initial jet, using the \(k_{t}\) algorithm [46, 47], into subjets of radius parameter \(R_{\text {sub}} =0.2\) and removing any subjet that has a \(p_{\text {T}}\) less than 5% of the parent jet \(p_{\text {T}}\). The simulation-based calibration of the trimmed jet \(p_{\text {T}}\) and mass is described in Ref. [48]. \(\text {Large-}R\) jets are required to have \(p_{\text {T}} > 250\) \(\text {GeV}\) and \(|\eta | < 2.0\). \(\text {Small-}R\) jets are calibrated with a series of simulation-based corrections and in situ techniques, including corrections to account for pile-up energy entering the jet area, as described in Ref. [49]. They are required to have \(p_{\text {T}} > 20\) \(\text {GeV}\) and \(|\eta | < 2.5\). To reduce the number of \(\text {small-}R\) jets originating from pile-up interactions, these jets are required to pass the jet vertex tagger (JVT) [50] requirement if the jets are in the range \(p_{\text {T}} < 60\) \(\text {GeV}\) and \(|\eta | < 2.4\). The JVT requirement has an inclusive hard-scatter efficiency of about 97% in that kinematic region.
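The trimming criterion described above can be illustrated with a minimal sketch. It assumes that the \(k_{t}\) reclustering into \(R_{\text {sub}} = 0.2\) subjets has already been performed (for instance with FastJet) and applies only the 5% \(p_{\text {T}}\)-fraction requirement. The function name `trim_subjets` and the plain list-of-\(p_{\text {T}}\) input are hypothetical and are not taken from the ATLAS software.

```python
def trim_subjets(subjet_pts, parent_pt, f_cut=0.05):
    """Keep subjets whose pT is at least f_cut times the parent jet pT.

    subjet_pts: pT values (GeV) of the R_sub = 0.2 subjets, assumed to come
                from a kt reclustering that is NOT performed here.
    parent_pt:  pT (GeV) of the ungroomed large-R jet.
    """
    threshold = f_cut * parent_pt
    # Subjets below the threshold are removed, as in the trimming step.
    return [pt for pt in subjet_pts if pt >= threshold]


# Hypothetical 400 GeV jet: the threshold is 20 GeV, so the two soft
# subjets (12 and 8 GeV) are discarded.
kept = trim_subjets([230.0, 150.0, 12.0, 8.0], parent_pt=400.0)
```

In a full implementation the trimmed jet four-momentum would be rebuilt from the constituents of the surviving subjets; that step is omitted here.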

Truth jets: Truth jets are built in simulated events by using ‘truth’ information from the MC generator’s event record to cluster stable particles with a lifetime \(\tau _0\) in the rest frame such that \(c\tau _0 > 10\) mm. Particles such as muons and neutrinos which do not leave significant energy deposits in the calorimeter are excluded. The same jet-clustering algorithm and trimming procedure as for calorimeter jets are used to reconstruct truth jets.

Track-jets: Track-jets are built with the anti-\(k_{t}\) algorithm with a radius parameter of \(R=0.2\) [13] from at least two ID tracks with \(p_{\text {T}} >0.4\) \(\text {GeV}\) and \(|\eta | < 2.5\) that are either associated with the primary vertex or have a longitudinal impact parameter \(|z_{0}\sin (\theta )|<3\) mm. Such requirements greatly reduce the number of tracks from pile-up vertices whilst being highly efficient for tracks from the hard-scatter vertex. Once the track-jet’s axis is determined, tracks selected with looser impact parameter requirements are matched to the jet in order to collect the tracks needed to effectively run the jet flavour tagging algorithms. The tracks are matched to the jet by using the angular separation \(\Delta R\) between the track and the track-jet’s axis. The \(\Delta R\) requirement varies as a function of jet \(p_{\text {T}}\), being wide for low-\(p_{\text {T}}\) jets and narrower for high-\(p_{\text {T}}\) jets as described in Ref. [51]. Only track-jets with \(p_{\text {T}} >10\) \(\text {GeV}\) and \(|\eta |<2.5\) are used for the analysis.

Muons: Muons are reconstructed from a combination of measurements from the ID and the MS. They are required to pass identification requirements based on quality criteria applied to the ID and MS tracks. The ‘Loose’ identification working point defined in Ref. [52] is used. Muons selected for this analysis are required to have \(p_{\text {T}} >5\) \(\text {GeV}\) and \(|\eta |<2.4\).

Photons: Photons are reconstructed from clusters of energy deposits in the electromagnetic calorimeter. Clusters without matching tracks are classified as unconverted photon candidates. A photon candidate that can be matched to a reconstructed vertex or track consistent with a photon conversion is considered as a converted photon candidate [53]. The photon energy estimate is described in Ref. [54]. Requirements on the shower shape in the electromagnetic calorimeter and on the energy fraction measured in the hadronic calorimeter are used to identify photons; the ‘Tight’ identification working point is applied in the analysis [53]. In order to select prompt photons, the photons are required to fulfil the ‘Tight’ isolation criteria. The photons are required to have \(|\eta |<1.37\) or \(1.52<|\eta |<2.37\) and \(E_{\text {T}} > 175\) \(\text {GeV}\). The latter requirement is applied to ensure efficient triggering.

Track-jet ghost association: In events with a dense hadronic environment an ambiguity often exists when matching track-jets to calorimeter jets. The track-jet matching to \(\text {large-}R\) jets is performed by applying ghost association [12, 55, 56]: the \(\text {large-}R\) jet clustering process using the anti-\(k_{t}\) algorithm with \(R = 1.0\) is repeated with the addition of ‘ghost’ versions of the track-jets that have the same direction but infinitesimally small \(p_{\text {T}}\), so that they do not change the properties of the \(\text {large-}R\) calorimeter jets. A track-jet is associated with a \(\text {large-}R\) jet if its ghost version is contained in the jet after reclustering. The reclustering is applied to the untrimmed \(\text {large-}R\) jets. The reclustered jets are identical to the jets before the reclustering, with the addition of the matched track-jets retained as associated objects. This provides a robust matching procedure, and matching to jets with irregular boundaries can be achieved in a way that is less ambiguous than a simple geometric matching.

Jet labelling: The performance of the tagger is evaluated on the basis of labelled \(\text {large-}R\) jets. Higgs-jets are defined as calorimeter-based \(\text {large-}R\) jets with a Higgs boson and the corresponding two b-hadrons from the Higgs boson decay found in the MC event record within \(\Delta R = 1\) of the \(\text {large-}R\) jet. Only the Higgs boson with the highest \(p_{\text {T}}\) in the event is considered and it is required to have \(p_{\text {T}} > 250\) \(\text {GeV}\) and \(|\eta | < 2.0\). The b-hadron must have \(p_{\text {T}}\) above 5 \(\text {GeV}\) and \(|\eta | < 2.5\). Configurations where more than one Higgs boson is found within the \(\text {large-}R\) jet are excluded. Top-jets are defined as \(\text {large-}R\) jets in which exactly one top quark is found in the MC event record within \(\Delta R = 1\) of the \(\text {large-}R\) jet.

Jet flavour labelling: The labelling of the flavour of the track-jets in simulation is done by geometrically matching the jet with truth hadrons. If a weakly decaying b-hadron with \(p_{\text {T}}\) above 5 \(\text {GeV}\) is found within \(\Delta R = 0.2\) of the track-jet’s direction, the track-jet is labelled as a b-jet. In the case that the b-hadron could match more than one track-jet, only the closest track-jet is labelled as a b-jet. If no b-hadron is found, the procedure is repeated for weakly decaying c-hadrons to label c-jets. If no c-hadron is found, the procedure is repeated for \(\tau \)-leptons to label \(\tau \)-jets. A jet for which no such matching can be made is labelled as a light-flavour jet.
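The labelling hierarchy above can be sketched as follows. This is an illustration under stated assumptions, not the ATLAS implementation: the input format and the names `delta_r` and `label_track_jets` are hypothetical, the closest-jet rule (stated in the text for b-hadrons) is applied to c-hadrons and \(\tau \)-leptons as well, and no \(p_{\text {T}}\) requirement is applied to c-hadrons or \(\tau \)-leptons because the text does not quote one.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation sqrt(d_eta^2 + d_phi^2), with d_phi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def label_track_jets(jets, particles, max_dr=0.2, min_b_pt=5.0):
    """Return one flavour label per track-jet: 'b', 'c', 'tau' or 'light'.

    jets:      list of (eta, phi) track-jet axes.
    particles: list of (kind, pt, eta, phi) truth particles, with kind in
               {'b', 'c', 'tau'} (hypothetical input format).
    """
    labels = ["light"] * len(jets)
    if not jets:
        return labels
    # Priority order from the text: b-hadrons, then c-hadrons, then taus.
    for kind in ("b", "c", "tau"):
        for k, pt, eta, phi in particles:
            if k != kind or (kind == "b" and pt <= min_b_pt):
                continue
            # A particle can label only its closest track-jet within max_dr.
            dr, idx = min(
                (delta_r(eta, phi, j_eta, j_phi), i)
                for i, (j_eta, j_phi) in enumerate(jets)
            )
            # Jets already labelled by a higher-priority flavour are kept.
            if dr < max_dr and labels[idx] == "light":
                labels[idx] = kind
    return labels


example_jets = [(0.0, 0.0), (0.1, 0.15), (1.5, 2.0)]
example_particles = [
    ("b", 30.0, 0.02, 0.01),  # matches the first track-jet
    ("c", 10.0, 0.12, 0.14),  # matches the second track-jet
    ("b", 3.0, 1.5, 2.0),     # below the 5 GeV requirement, ignored
]
example_labels = label_track_jets(example_jets, example_particles)
```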

b-jet identification: Track-jets containing b-hadrons are identified using a multivariate MV2c10 algorithm [51, 57], which exploits the information about the jet kinematics, the impact parameters of tracks within jets, and the presence of displaced vertices. The training is performed on jets from \(t\bar{t}\) events with b-jets as signal, and a mix of approximately 93% light-flavour jets and 7% c-jets as background. A particular b-tagging requirement on MV2c10 results in a given efficiency, known as an efficiency working point (WP). The efficiency WP is calculated from the inclusive \(p_{\text {T}}\) and \(\eta \) spectra of jets from an inclusive \(t\bar{t}\) sample. For example a WP with 70% efficiency corresponds to a factor of 120 in the light-quark/gluon-track-jet rejection and a factor of seven in the c-track-jet rejection. Different WPs (60%, 70%, 77% and 85%) are studied in the analyses presented in this paper and jets satisfying a particular MV2c10 criterion WP are referred to as ‘b-tagged jets’.

Large-R jet mass: To overcome the limited angular resolution for the energy deposits used to reconstruct the calorimeter-based jet mass (\(m^{\text {calo}}\)), an independent jet mass estimate using tracking information is developed, the ‘track-assisted jet mass’, \(m^\mathrm {TA}\)  [48]. A weighted combination of calorimeter-based and track-assisted jet masses, \(m^{\text {comb}}\)  [48], is used in the analysis. The \(m^{\text {comb}}\) resolution is very similar to the \(m^{\text {calo}}\) resolution at Higgs-jet \(p_{\text {T}}\) below 700 \(\text {GeV}\) and improves with increasing \(p_{\text {T}}\). Muons from semileptonic b-hadron decays do not leave significant energy deposits in the calorimeter, so they are considered separately in the calculation of the \(m^{\text {comb}}\) observable. The resulting neutrinos are not taken into account because they are not measured by the detector directly. The four-momentum of the closest muon candidate within \(\Delta R=0.2\) of the b-tagged track-jet is added to the four-momentum of the \(\text {large-}R\)-jet after subtraction of the muon energy loss in the calorimeter. Only the calorimeter-based component of the \(m^{\text {comb}}\) observable is corrected [58]. The resolution of the muon-corrected Higgs-jet mass, \(m^{\text {corr}}\), is improved by about 10% at transverse momenta below 500 \(\text {GeV}\), while the improvement is not as pronounced at higher \(p_{\text {T}}\), as was shown in Ref. [59].
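The kinematic part of the muon correction amounts to a four-momentum sum followed by an invariant-mass calculation. The sketch below illustrates only that arithmetic; the function names are hypothetical, and the calorimeter energy-loss term is passed in as an optional four-vector because its actual determination (Ref. [58]) is not reproduced here.

```python
import math


def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) in GeV from (pT, eta, phi, m)."""
    px, py, pz = pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def invariant_mass(p4):
    """Invariant mass of a four-vector (E, px, py, pz)."""
    energy, px, py, pz = p4
    m2 = energy * energy - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against small negative rounding


def muon_corrected_p4(jet_p4, muon_p4, calo_loss_p4=(0.0, 0.0, 0.0, 0.0)):
    """Add the muon to the jet and subtract its (assumed known) calorimeter deposit."""
    return tuple(j + m - l for j, m, l in zip(jet_p4, muon_p4, calo_loss_p4))


# Toy check with two massless, back-to-back 50 GeV objects: the combined
# system has an invariant mass of 100 GeV.
toy_mass = invariant_mass(
    muon_corrected_p4(four_vector(50.0, 0.0, 0.0), four_vector(50.0, 0.0, math.pi))
)
```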

5 Systematic uncertainties

Large-R jets: The uncertainties in the jet energy, mass, and substructure scales are evaluated by comparing the ratio of calorimeter-based to track-based measurements in dijet data and simulation [48]. The sources of uncertainty in these measurements are treated as fully correlated among \(p_{\text {T}}\), mass, and substructure scales. The resolution uncertainty of the \(\text {large-}R\) jet observables is evaluated in measurements documented in Ref. [48] and is assessed by applying an additional smearing to these observables. The jet energy resolution uncertainty is estimated by degrading the nominal resolution by an absolute 2%. Similarly, the jet mass resolution is degraded by a relative 20% to estimate the jet mass resolution uncertainty. The parton-shower-related uncertainty for the \(g \rightarrow b\bar{b}\) analysis is estimated by comparing the nominal Pythia8 multijet sample with Herwig++ samples.
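The text does not specify how the additional smearing is implemented. One common choice, assumed here purely for illustration, is to add Gaussian noise whose width combines in quadrature with the nominal resolution to give the degraded one. The function names below are hypothetical.

```python
import math
import random


def extra_width_relative(sigma_nominal, rel_increase=0.20):
    """Width to add in quadrature so that the resolution grows by rel_increase.

    Total width after smearing: sigma_nominal * (1 + rel_increase).
    """
    return sigma_nominal * math.sqrt((1.0 + rel_increase) ** 2 - 1.0)


def extra_width_absolute(res_nominal, abs_increase=0.02):
    """Same idea for an absolute increase of a fractional resolution."""
    return math.sqrt((res_nominal + abs_increase) ** 2 - res_nominal ** 2)


def smear(value, extra_width, rng=random):
    """Apply the additional Gaussian smearing to one measured value."""
    return value + rng.gauss(0.0, extra_width)


# A 10 GeV nominal mass resolution degraded by a relative 20% needs an
# extra width that combines in quadrature with 10 GeV to give 12 GeV.
extra_mass_width = extra_width_relative(10.0)
```

For the jet energy resolution, `extra_width_absolute(0.10)` would give the extra fractional width needed to go from a 10% to a 12% resolution; the 10% starting value is an arbitrary example, not a number from the text.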

Flavour tagging: The flavour-tagging efficiency and its uncertainty for b- and c-jets are estimated in \(t\bar{t}\) events, while the light-flavour-jet misidentification rate and its uncertainty are determined using dijet events [60,61,62]. Correction factors are applied to the simulated event samples to compensate for differences between data and simulation in the b-tagging efficiency for track-jets with \(p_{\text {T}} < 250\) \(\text {GeV}\). Correction factors and uncertainties for c-jets and light-flavour jets are derived for calorimeter-based jets and extrapolated to track-jets using MC simulation. An additional term is included to extrapolate the measured uncertainties to \(p_{\text {T}}\) above 250 \(\text {GeV}\). This term is estimated from simulated events by varying the quantities affecting the flavour-tagging performance such as the impact parameter resolution, percentage of poorly measured tracks, description of the detector material, and track multiplicity per jet. The total uncertainties are 1–10%, 15–50%, and 50–100% for b-jets, c-jets, and light-flavour jets respectively.

Muon: The uncertainties in the muon momentum scale and resolution are derived from data events with dimuon decays of \(J/\psi \) and Z bosons. In total, there are three independent components: one corresponding to the uncertainty in the inner detector track \(p_{\text {T}}\) resolution, one corresponding to the uncertainty in the muon spectrometer \(p_{\text {T}}\) resolution, and one corresponding to the momentum scale uncertainty [52].

Photon: The uncertainties in the reconstruction, identification, and isolation efficiency for photons are determined from data samples of \(Z\rightarrow \ell \ell \gamma \), \(Z\rightarrow ee\), and inclusive photon events [53]. Uncertainties in the electromagnetic shower energy scale and resolution are taken into account as well [54].

Background modelling uncertainties for \(t\bar{t}\gamma \), \(\gamma +\)jets and \(W(\rightarrow q\bar{q})\gamma \): These correspond to the main backgrounds in the \(Z(\rightarrow b\bar{b})\gamma \) studies presented in Section 8. The background modelling uncertainty for the \(\gamma +\)jets sample was estimated with an alternative MC generator, Pythia8, using the LO NNPDF2.3 PDF set and the A14 underlying-event tune. The alternative sample includes LO photon plus jet events from the hard process and photon bremsstrahlung in dijet events.

In the case of the \(W(\rightarrow q\bar{q})\gamma \) background, the nominal samples were compared with samples produced using the MadGraph5_aMC@NLO generator interfaced with Pythia8. For the \(t\bar{t}\gamma \) background three different sources of modelling uncertainty were considered: uncertainty due to the parton shower and hadronisation estimated by comparing the nominal samples produced using MadGraph interfaced with Pythia8, with samples from MadGraph interfaced with Herwig7 [32, 63]; uncertainty due to different initial- and final-state radiation conditions from Pythia8 tunes with high or low QCD radiation activity; and uncertainty due to the choice of renormalisation and factorisation scales.

Uncertainties related to the photons and the \(\gamma \)+jets, \(W(\rightarrow q\bar{q})\gamma \), and \(t\bar{t}\gamma \) background modelling are applied only in the \(Z(\rightarrow b\bar{b})\gamma \) analysis.

6 Higgs-jet tagger

The Higgs-jet tagger algorithm consists of several reconstruction steps. First, the Higgs boson candidate is reconstructed as a \(\text {large-}R\) jet. Second, the b-tagging requirement is applied to track-jets associated with the \(\text {large-}R\) jet in order to select candidates corresponding to \(H \rightarrow b\bar{b}\) decays. Third, the b-tagged \(\text {large-}R\) jet mass can be required to be around the SM Higgs boson mass of 125 \(\text {GeV}\). Finally, a requirement on other \(\text {large-}R\) jet substructure variables can be applied depending on the Higgs-jet tagger working point.

The signal acceptance for the first reconstruction step, where the Higgs boson candidate is reconstructed as a \(\text {large-}R\) jet, depends strongly on its transverse momentum. The angular separation between Higgs boson decay products can be approximated as \(\Delta R \approx 2m_{H}/ p_{\text {T}} \). Therefore, in most cases the Higgs boson decay products will fall within a single \(\text {large-}R\) jet with a radius parameter of \(R=1.0\) if the Higgs boson \(p_{\text {T}}\) is at least 250 \(\text {GeV}\). The signal acceptance shown in Figure 1 is determined as the fraction of Higgs bosons in simulation which are reconstructed and labelled as a Higgs-jet following the definition in Section 4. Only Higgs bosons with \(p_{\text {T}} > 250\) \(\text {GeV}\), \(|\eta | < 2.0\), and associated b-hadrons from their decay that have \(p_{\text {T}} > 5\) \(\text {GeV}\) and \(|\eta | < 2.5\) are considered. The Higgs boson acceptance is around 50% at 250 \(\text {GeV}\), where the jet \(p_{\text {T}}\) resolution has a significant impact as well, and increases to 95% for transverse momenta above 750 \(\text {GeV}\).
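As a quick numerical illustration of the approximation \(\Delta R \approx 2m_{H}/ p_{\text {T}} \), the sketch below evaluates it for a few transverse momenta using \(m_{H} = 125\) \(\text {GeV}\). The function name is hypothetical and the formula is only the leading approximation quoted in the text.

```python
M_HIGGS = 125.0  # GeV, SM Higgs boson mass quoted in the text


def approx_opening_angle(pt, mass=M_HIGGS):
    """Approximate Delta R between the two decay products: 2 m / pT."""
    return 2.0 * mass / pt


for pt in (250.0, 500.0, 1000.0):
    print(f"pT = {pt:6.0f} GeV -> Delta R ~ {approx_opening_angle(pt):.2f}")
```

At \(p_{\text {T}} = 250\) \(\text {GeV}\) the approximation gives \(\Delta R \approx 1.0\), equal to the \(\text {large-}R\) jet radius parameter, which is why this value marks the onset of the regime in which both decay products are typically captured.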

Fig. 1

Fraction of Higgs bosons in simulation which are reconstructed and labelled as a Higgs-jet following the definition in Section 4, as a function of Higgs boson \(p_{\text {T}}\). Only Higgs bosons with \(p_{\text {T}} > 250\) \(\text {GeV}\), \(|\eta | < 2.0\) and with associated b-hadrons from their decay are considered. The same \(p_{\text {T}} \) and \(\eta \) requirements are applied to the Higgs-jets

The Higgs-jet tagging efficiency is defined as the number of Higgs-jets passing a given selection requirement divided by the total number of Higgs-jets. The background rejection is defined as the inverse of the efficiency for a background jet to pass the given selection requirement.
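These two definitions translate directly into code. The sketch below uses plain event counts for simplicity; in practice the counts would be sums of event weights. The function names and the example numbers are hypothetical.

```python
def tagging_efficiency(n_pass, n_total):
    """Fraction of Higgs-jets that pass the selection."""
    return n_pass / n_total


def background_rejection(n_bkg_pass, n_bkg_total):
    """Inverse of the background efficiency for the same selection."""
    if n_bkg_pass == 0:
        return float("inf")  # no background jet survives the selection
    return n_bkg_total / n_bkg_pass


# Hypothetical counts: 4 of 1000 background jets pass -> rejection of 250.
example_rejection = background_rejection(4, 1000)
example_efficiency = tagging_efficiency(520, 1000)
```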

6.1 Two-step sample reweighting

To construct the signal sample, all graviton samples are combined. To allow a valid comparison between the signal efficiency and the background rejection, the \(\text {large-}R\) jet \(p_{\text {T}}\) spectrum of the combined graviton sample is reweighted to the reconstructed multijet \(p_{\text {T}}\) spectrum for the Higgs boson tagger performance studies in a two-step procedure. The same two-step reweighting procedure is also applied to the \(Z'\rightarrow t\bar{t}\) background sample. The multijet spectrum is chosen as the reference because its smoothly falling \(p_{\text {T}}\) spectrum is representative of many analyses. The first reweighting step uses the highest-\(p_{\text {T}}\) truth Higgs-jet, whereas the second step uses the highest-\(p_{\text {T}}\) reconstructed Higgs-jet. The reconstructed Higgs-jet and the truth Higgs-jet must both contain the highest-\(p_{\text {T}}\) Higgs boson to mitigate effects from initial-state radiation (ISR).

In the first step, the \(p_{\text {T}}\) spectrum of the truth Higgs-jet in the combined signal sample is reweighted to the \(p_{\text {T}}\) spectrum of the reconstructed \(\text {large-}R\) jet in the multijet sample. In the second step, the reconstructed Higgs-jet \(p_{\text {T}}\) spectrum is reweighted to the reconstructed \(\text {large-}R\) jet \(p_{\text {T}}\) spectrum in the multijet sample. A one-step reweighting using the reconstructed Higgs-jet \(p_{\text {T}}\) spectrum results in large weights for jets with \(p_{\text {T}}\) much larger or smaller than half of the graviton mass. Furthermore, the reconstructed Higgs-jet \(p_{\text {T}}\) can differ from that of the Higgs boson: the jet can contain additional energy which does not stem from the Higgs boson decay, such as ISR, or can lack energy because of neutrinos, ‘out-of-cone’ effects, or trimming. The frequency of these effects depends on the Higgs boson boost, i.e. on the graviton mass, introducing a dependence on the choice of simulated graviton masses used in the combined signal sample. The second step is needed to account for a residual difference between reconstructed and truth Higgs-jet transverse momenta.
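The two steps can be sketched with binned histogram ratios. This is a simplified illustration: the binning, the per-event list format and all function names are assumptions, and the target spectrum is represented by a fixed list of bin contents standing in for the reconstructed multijet \(p_{\text {T}}\) histogram.

```python
from bisect import bisect_right


def histogram(values, weights, edges):
    """Weighted histogram; values outside [edges[0], edges[-1]) are dropped."""
    counts = [0.0] * (len(edges) - 1)
    for v, w in zip(values, weights):
        i = bisect_right(edges, v) - 1
        if 0 <= i < len(counts):
            counts[i] += w
    return counts


def ratio_weights(values, weights, target_counts, edges):
    """Scale each weight by target/source in the bin containing its value."""
    source = histogram(values, weights, edges)
    out = []
    for v, w in zip(values, weights):
        i = bisect_right(edges, v) - 1
        if 0 <= i < len(source) and source[i] > 0.0:
            out.append(w * target_counts[i] / source[i])
        else:
            out.append(0.0)  # outside the binning: event not used
    return out


def two_step_reweight(truth_pt, reco_pt, target_counts, edges):
    """Step 1 reweights in truth-jet pT, step 2 in reconstructed-jet pT."""
    unit = [1.0] * len(truth_pt)
    step1 = ratio_weights(truth_pt, unit, target_counts, edges)
    return ratio_weights(reco_pt, step1, target_counts, edges)


# Toy inputs (GeV): two pT bins and a steeply falling target spectrum.
edges = [250.0, 500.0, 1000.0]
target = [90.0, 10.0]
truth_pt = [300.0, 400.0, 600.0, 700.0, 800.0]
reco_pt = [310.0, 520.0, 580.0, 690.0, 810.0]
final_weights = two_step_reweight(truth_pt, reco_pt, target, edges)
```

After the second step the weighted reconstructed-\(p_{\text {T}}\) histogram of the toy sample matches the target by construction, while the first step has already removed most of the spread in weights that a one-step procedure would produce.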

6.2 Flavour-tagging working points

To apply b-tagging to identify \(H\rightarrow b\bar{b}\) decays, the track-jets are matched to the \(\text {large-}R\) jets by ghost association as described in Section 4. At least two track-jets must be matched to the \(\text {large-}R\) jet for the double-b-tagging benchmarks, and at least one track-jet in the case of single-b-tagging benchmarks. The track-jet is considered to be b-tagged if its MV2c10 b-tagging discriminant value is larger than a given threshold value. These threshold values are defined for several b-tagging working points: 60%, 70%, 77% and 85% b-jet tagging efficiencies.

The following b-tagging benchmarks are studied:

  • double b-tagging: the two highest-\(p_{\text {T}}\) track-jets must both pass a given b-tagging requirement;

  • asymmetric b-tagging: the track-jet which is more consistent with the interpretation of being a b-jet must pass a given fixed 60%, 70%, 77%, or 85% working point, while the b-tagging requirement on the second track-jet is varied;

  • single b-tagging: at least one of the two highest-\(p_{\text {T}}\) track-jets must pass the b-tagging requirement;

  • leading single b-tagging: the highest-\(p_{\text {T}}\) track-jet must pass the b-tagging requirement.
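The four benchmarks can be expressed as a small selection function. This sketch makes several assumptions that go beyond the text: the discriminant values are supplied as a list ordered by decreasing track-jet \(p_{\text {T}}\), the asymmetric benchmark is evaluated on the two highest-\(p_{\text {T}}\) track-jets with ‘more consistent with being a b-jet’ interpreted as the larger discriminant value, and the numerical thresholds in the example are placeholders rather than real MV2c10 working-point cuts.

```python
def passes_benchmark(scores, benchmark, cut, fixed_cut=None):
    """Decide whether a large-R jet passes a b-tagging benchmark.

    scores:    b-tagging discriminants of the associated track-jets,
               ordered by decreasing track-jet pT.
    cut:       threshold of the (varied) working point.
    fixed_cut: threshold of the fixed working point (asymmetric only).
    """
    lead = scores[:2]  # only the two highest-pT track-jets are used
    if benchmark == "double":
        return len(lead) == 2 and all(s > cut for s in lead)
    if benchmark == "asymmetric":
        # More b-like track-jet: fixed WP; the other one: varied WP.
        return len(lead) == 2 and max(lead) > fixed_cut and min(lead) > cut
    if benchmark == "single":
        return any(s > cut for s in lead)
    if benchmark == "leading_single":
        return len(lead) >= 1 and lead[0] > cut
    raise ValueError(f"unknown benchmark: {benchmark}")


# Placeholder discriminant values and thresholds, for illustration only.
is_double = passes_benchmark([0.9, 0.3], "double", cut=0.6)
is_single = passes_benchmark([0.9, 0.3], "single", cut=0.6)
```

With these inputs the jet fails the double-b-tagging requirement, because the second track-jet is below threshold, but passes single b-tagging.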

The Higgs-jet efficiencies and background rejections as a function of the jet \(p_{\text {T}}\) for the 70% double-b-tagging benchmark are shown in Figure 2. The signal efficiency varies from 52% at low \(p_{\text {T}}\) to about 5% for \(1500< p_{\text {T}} < 2500\) \(\text {GeV}\). The drop in efficiency at high transverse momenta due to the increasing collimation and eventual merging of the two b-jets can be partially recovered using single-b-tagging working points as indicated in Figure 6. The multijet (top-jet) rejection is relatively constant over the whole \(p_{\text {T}}\) range and is about 250 (60) at low \(p_{\text {T}}\) and 500 (50) at high \(p_{\text {T}}\).

Fig. 2

The Higgs-jet efficiency (top left) and rejection against multijet (top right) and top-jet backgrounds (bottom) as a function of the jet \(p_{\text {T}}\) for the 70% double-b-tagging working point. The nominal curves correspond to the requirement on the MV2c10 discriminant described in Section 6.2. The b-tagging-related uncertainties defined in Section 5 are shown

The multijet and top-quark background rejections as a function of the Higgs tagging efficiency for various b-tagging benchmarks are shown in Figure 3. Plots on the left show the performance for Higgs-jet \(p_{\text {T}}\) above 250 \(\text {GeV}\) and plots on the right show the performance for Higgs-jet \(p_{\text {T}}\) above 1000 \(\text {GeV}\). The double-b-tagging and asymmetric-b-tagging selections give the best background rejection in a large range of Higgs tagging efficiencies. At high Higgs-jet efficiencies above \(\sim \)90% (\(\sim \)55%) for Higgs-jet transverse momenta above 250 (1000) \(\text {GeV}\) the single-b-tagging benchmark shows a higher multijet and top-quark background rejection. To achieve such a high Higgs-jet efficiency, a very loose double-b-tagging or asymmetric-b-tagging requirement is needed, which results in a low light-flavour jet rejection. The double-b-tagging and asymmetric b-tagging working points do not reach an efficiency of 100% due to a requirement of at least two track-jets. In the case of asymmetric b-tagging, Higgs tagging efficiencies are below 100% because of the fixed b-tagging working point requirement on one of the track-jets. The drop in performance is pronounced at high jet transverse momenta due to the lower efficiency to reconstruct two subjets and the decrease in the MV2c10 b-tagging performance [64].

Fig. 3

The multijet (top) and the top-jet (bottom) rejection as a function of the Higgs tagging efficiency for \(\text {large-}R\) jet \(p_{\text {T}}\) above 250 \(\text {GeV}\) (left) and above 1000 \(\text {GeV}\) (right) for various b-tagging benchmarks defined in Section 6.2. The stars correspond to the 60%, 70%, 77% and 85% b-tagging WPs (from left to right). The curves for the double-b-tagging and asymmetric-b-tagging working points coincide over a large range of Higgs-jet efficiency

6.3 Mass window optimisation

The reconstructed Higgs boson mass distribution provides a powerful way to distinguish the Higgs boson signal from background processes. The muon-corrected combined mass described in Section 4 is used to impose the Higgs boson mass requirement and select \(\text {large-}R\) jets with a mass around the SM Higgs boson mass. The Higgs boson mass resolution, \(\sigma _{m}\), varies as a function of the reconstructed \(\text {large-}R\) jet \(p_{\text {T}}\), so the mass window is optimised and parameterised as a function of Higgs-jet \(p_{\text {T}}\). Two working points are defined:

  • tight mass window, containing 68% of Higgs-jets;

  • loose mass window, containing 80% of Higgs-jets.

The mass window is defined as the smallest window containing the given fraction of Higgs-jets. The out-of-cone effects, ISR and the missing neutrinos from semileptonic b-hadron decays have an impact on the mass resolution that is similar to their impact on the \(p_{\text {T}}\) response; therefore, the mass window optimisation depends on the applied Higgs-jet selection and on the Higgs-jet \(p_{\text {T}}\) spectrum.
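The smallest-window definition can be illustrated with a minimal sketch. This is not the analysis code: it assumes an unweighted list of reconstructed jet masses, the function name `smallest_window` is hypothetical, and the masses are invented for illustration.

```python
import math


def smallest_window(masses, fraction):
    """Return (low, high) of the narrowest interval holding `fraction` of the masses.

    Assumes unweighted entries; window edges are placed on observed values.
    """
    if not masses or not 0.0 < fraction <= 1.0:
        raise ValueError("need a non-empty sample and 0 < fraction <= 1")
    ordered = sorted(masses)
    # Number of entries the window must contain; the small offset guards
    # against floating-point round-up of an exact product.
    n_needed = max(1, math.ceil(fraction * len(ordered) - 1e-9))
    best = None
    for first in range(len(ordered) - n_needed + 1):
        low, high = ordered[first], ordered[first + n_needed - 1]
        if best is None or high - low < best[1] - best[0]:
            best = (low, high)
    return best


# Illustrative masses in GeV (invented numbers, not ATLAS data).
toy_masses = [50, 90, 100, 110, 115, 120, 125, 130, 140, 200]
tight = smallest_window(toy_masses, 0.68)  # 68% containment
loose = smallest_window(toy_masses, 0.80)  # 80% containment
```

Because the window is chosen by width rather than centred on the peak, asymmetric tails shift its boundaries, which is why the window depends on the selection and on the \(p_{\text {T}}\) spectrum.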

Figure 4 shows the reconstructed Higgs boson mass distribution for Higgs-jets with a \(p_{\text {T}}\) in the range 350 to 500 \(\text {GeV}\). The mass region below 50 \(\text {GeV}\) is affected by grooming and out-of-cone effects. In the case of asymmetric \(H \rightarrow b \bar{b}\) decays, where one of the b-hadrons carries a large fraction of the Higgs boson \(p_{\text {T}}\), the \(\text {large-}R\) jet’s axis is close to the direction of the higher-\(p_{\text {T}}\) b-hadron. The decay products of the lower-\(p_{\text {T}}\) b-hadron could be removed by grooming or not fully captured in the \(\text {large-}R\) jet. That leads to smaller Higgs-jet masses. The mass region above 150 \(\text {GeV}\) suffers from additional contributions from initial-state radiation. A large fraction of the ISR is suppressed by selecting the reconstructed Higgs-jet containing the highest-\(p_{\text {T}}\) Higgs boson candidate. However, the high mass tails are still substantial in high Higgs-jet \(p_{\text {T}}\) regions and affect the Higgs boson mass window definition.

Fig. 4

The Higgs-jet mass distribution for jet transverse momenta in the range 350 to 500 \(\text {GeV}\) after reweighting the \(p_{\text {T}}\) spectrum. The dotted and dash-dotted blue curves correspond to the two components of the fit function, while the solid blue curve shows the combination thereof. The vertical lines indicate the boundaries of the mass ranges for 68% (light green) and 80% (dark green) containment

In order to suppress the impact of the tails on the mass window definition, a fit of the mass distribution is performed. The fit function is chosen empirically to describe the core of the mass distribution, while mitigating the tails. The chosen function is a linear combination of a Landau function to describe the low mass part of the distribution and a Gaussian function to describe the high mass part.

The fit is performed in 12 Higgs-jet \(p_{\text {T}}\) bins across the entire range of transverse momentum from 250 to 2500 \(\text {GeV}\).

A toy MC simulation is used as input to model the mass window and to estimate the statistical uncertainty on the mass window determination. This toy MC simulation samples the fit functions mentioned above and is performed many times in each \(p_{\text {T}}\) slice. For each toy MC sample, the mass window is calculated by selecting the smallest window containing the required signal fraction. The final upper and lower boundaries for a given \(p_{\text {T}}\) slice are found by averaging over the upper and lower boundaries from the corresponding toy MC samples. The mean defines the position and the RMS the uncertainty of the window boundaries in each \(p_{\text {T}}\) slice. Using the mean and RMS from the toy MC samples as input, the mass window is parameterised as a function of the Higgs-jet \(p_{\text {T}}\) using the fit function: \(f(p_{\text {T}}) = \sqrt{ {\left( a + b / p_{\text {T}} \right) }^{2} + {\left( c \cdot p_{\text {T}} + d \right) }^2}\). The jet mass depends primarily on the energies of the jet constituents and their angular separations. Consequently, there are two competing effects: the improving precision of the calorimeter energy scale with increasing jet \(p_{\text {T}}\) and the decreasing ability of the calorimeter granularity to resolve individual energy deposits due to increasing decay collimation with increasing jet \(p_{\text {T}}\). Fit results are shown in Figure 5 for tight and loose mass window working points.
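The toy procedure and the parameterisation can be sketched as follows. This is only an illustration under stated assumptions: a single Gaussian (`gaussian_core`, hypothetical) stands in for the fitted Landau-plus-Gaussian model, 'RMS' is taken as the spread about the mean, and all function names and parameter values are invented.

```python
import math
import random


def window_parameterisation(pt, a, b, c, d):
    """f(pT) = sqrt((a + b/pT)^2 + (c*pT + d)^2), the form quoted in the text."""
    return math.sqrt((a + b / pt) ** 2 + (c * pt + d) ** 2)


def mean_and_rms(values):
    """Mean and spread about the mean (assumed meaning of 'RMS')."""
    mean = sum(values) / len(values)
    rms = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return mean, rms


def smallest_window(values, fraction):
    """Narrowest interval containing the requested fraction of entries."""
    ordered = sorted(values)
    n_needed = max(1, math.ceil(fraction * len(ordered) - 1e-9))
    spans = [(ordered[i + n_needed - 1] - ordered[i], i)
             for i in range(len(ordered) - n_needed + 1)]
    _, first = min(spans)
    return ordered[first], ordered[first + n_needed - 1]


def toy_window(sampler, n_toys, n_jets, fraction, seed=0):
    """Repeat the window determination on toy samples; return boundary mean/RMS."""
    rng = random.Random(seed)
    lows, highs = [], []
    for _ in range(n_toys):
        toy = [sampler(rng) for _ in range(n_jets)]
        low, high = smallest_window(toy, fraction)
        lows.append(low)
        highs.append(high)
    return mean_and_rms(lows), mean_and_rms(highs)


def gaussian_core(rng):
    # Hypothetical stand-in for sampling the fitted mass model (GeV).
    return rng.gauss(125.0, 10.0)


(low_mean, low_rms), (high_mean, high_rms) = toy_window(gaussian_core, 50, 200, 0.68)
```

In this sketch the boundary means and RMS values from each \(p_{\text {T}}\) slice would then serve as the points and uncertainties to which `window_parameterisation` is fitted.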

Fig. 5

The Higgs-jet mass window interval for a loose (left) and a tight (right) working point. The dashed lines show a fit to the derived intervals (blue and red markers) as a function of the Higgs-jet \(p_{\text {T}}\). The black markers show the position of the maximum of the Higgs-jet mass distribution

The Higgs boson acceptance times efficiency is presented in Figure 6. In addition to the truth-matching requirements defined for Figure 1, the double- and single-b-tagging, tight, loose and no mass window working points are applied. The double-b-tagging requirement in particular leads to a significant drop in the Higgs boson acceptance times efficiency at high Higgs boson transverse momenta, where the efficiency to reconstruct two track-jets and the double-b-tagging efficiency decrease quickly.

Fig. 6

The Higgs boson acceptance times efficiency is shown for a few working points: the double and single b-tagging with the loose mass window requirement and the double b-tagging with the tight, loose and no mass window requirements

Figure 7 shows the rejection of the multijet background as a function of the Higgs-jet \(p_{\text {T}}\). Applying a combination of loose mass window and double-b-tagging requirements improves the rejection by a factor of about four relative to the corresponding benchmark without the mass requirement shown in Figure 2. The tight mass window requirement leads to an additional improvement of about 30–50% in the background rejection. The efficiency of the mass window requirements changes by a few percent after the application of the double-b-tagging requirement, owing to the dependence of the b-tagging efficiency on the jet kinematics.

Fig. 7

Rejection of multijet background as a function of the Higgs-jet \(p_{\text {T}}\) for the loose (left) and tight (right) mass window requirements, in combination with the 70% double-b-tagging working point. The nominal curves correspond to the requirement on the MV2c10 discriminant described in Section 6.2. Systematic uncertainties defined in Section 5 as well as their sum in quadrature (total uncertainty) are shown. ‘Jet Scale’ refers to the sum in quadrature of the jet energy and mass scale uncertainties and ‘Jet Resolution’ refers to the sum in quadrature of the jet energy and mass resolution uncertainties

The corresponding rejection of the multijet background as a function of the Higgs-jet efficiency is shown in Figure 8 for different Higgs-jet \(p_{\text {T}}\) ranges, b-tagging benchmarks, and mass window requirements. Application of the mass window requirement improves the performance of the tagger substantially. For a fixed signal efficiency of 40% and \(\text {large-}R\) jet \(p_{\text {T}}\) above 250 \(\text {GeV}\), the multijet rejection rises from roughly 360 after applying the double-b-tagging requirement to about 1480 (1670) for the combination of the double-b-tagging and loose (tight) mass window requirements.
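The size of the improvement at fixed signal efficiency follows from simple ratios of the quoted rejections. The sketch below assumes that the rejection is the inverse of the background efficiency; the function names are hypothetical.

```python
def rejection(n_background_before, n_background_after):
    """Background rejection, taken here as the inverse background efficiency."""
    if n_background_after <= 0:
        raise ValueError("no background events pass the selection")
    return n_background_before / n_background_after


def relative_gain(rejection_with_mass_window, rejection_without):
    """Factor by which the mass window improves the rejection."""
    return rejection_with_mass_window / rejection_without


# Rejections quoted in the text at 40% signal efficiency, pT > 250 GeV.
gain_loose = relative_gain(1480.0, 360.0)
gain_tight = relative_gain(1670.0, 360.0)
```

With the quoted values the gains are about 4.1 for the loose and 4.6 for the tight mass window.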

Fig. 8

Rejection of multijet background as a function of the Higgs boson tagging efficiency for loose (top) and tight (bottom) mass window requirements for \(\text {large-}R\) jet \(p_{\text {T}}\) above 250 \(\text {GeV}\) (left) and above 1000 \(\text {GeV}\) (right) for various b-tagging benchmarks. The stars correspond to the 60%, 70%, 77% and 85% b-tagging WPs (from left to right). The curves for the double- and asymmetric-b-tagging working points coincide over a large range of Higgs-jet efficiency

Figure 9 shows the hadronic top-quark background rejection as a function of the Higgs-jet \(p_{\text {T}}\) for combinations of mass window and b-tagging benchmarks. The background rejection is higher for multijets than for hadronically decaying top quarks. The rejection varies between 120 (170) at low \(p_{\text {T}}\) and 1000 (1300) at high \(p_{\text {T}}\) for the loose (tight) mass window and double-b-tagging benchmark. In comparison with the benchmarks without the mass window requirement, the rejection is improved by about one order of magnitude, but the shape as a function of \(p_{\text {T}}\) is fundamentally different. At low \(p_{\text {T}}\), not all decay products of the top quark are contained in the \(\text {large-}R\) jet. Thus the reconstructed jet mass has a long tail towards low jet masses with a substantial fraction of jets within the mass window of the tagger. Hence, the rejection at low jet \(p_{\text {T}}\) is not improved as much as at high jet \(p_{\text {T}}\). The tight mass window requirement further improves the background rejection by 15–40%, depending on \(p_{\text {T}}\).

Fig. 9

Rejection of the top-jet background as a function of the Higgs-jet \(p_{\text {T}}\) for the loose (left) and tight (right) mass window requirements, in combination with the 70% double-b-tagging working point. The nominal curves correspond to the requirement on the MV2c10 discriminant described in Section 6.2. Systematic uncertainties defined in Section 5 as well as their sum in quadrature (total uncertainty) are shown. ‘Jet Scale’ refers to the sum in quadrature of the jet energy and mass scale uncertainties and ‘Jet Resolution’ refers to the sum in quadrature of the jet energy and mass resolution uncertainties

The rejection of the hadronic top-quark background as a function of the Higgs tagging efficiency is shown in Figure 10. For the loose mass window requirement, an improvement from 140 to 200 is found at a fixed Higgs-jet efficiency of 40%, whereas for the tight mass window a smaller improvement from 140 to 160 is observed relative to no mass requirement for \(\text {large-}R\) jet \(p_{\text {T}}\) above 250 \(\text {GeV}\). The rejection values are lower for double b-tagging and asymmetric b-tagging for \(\text {large-}R\) jet \(p_{\text {T}}\) above 1 \(\text {TeV}\), and at high Higgs tagging efficiency, single and single-leading b-tagging are better options.

Fig. 10

Rejection of the top-jet background as a function of the Higgs tagging efficiency for loose (top) and tight (bottom) mass window requirements for \(\text {large-}R\) jet \(p_{\text {T}}\) above 250 \(\text {GeV}\) (left) and above 1000 \(\text {GeV}\) (right) for various b-tagging benchmarks. The stars correspond to the 60%, 70%, 77% and 85% b-tagging WPs (from left to right). The curves for the double- and asymmetric-b-tagging working points coincide over a large range of Higgs-jet efficiency

6.4 Jet substructure

Sections 6.2 and 6.3 present the performance of the Higgs-jet tagger based on the b-tagging and jet mass requirements designed to distinguish \(\text {large-}R\) jets produced by Higgs boson decays from backgrounds. This section discusses the possibility of improving the background rejection with the help of other jet substructure variables and tighter selections on jet mass and b-tagging applied on top of the previously defined jet mass window and b-tagging benchmark working points. These additional selections are referred to as secondary selections.

Table 1 Overview of jet substructure variables. A short description of these substructure variables can be found in Refs. [65, 66]. \(^{(*)}\) Exclusive dipolarity forces the jet to have exactly two subjets from the \(k_{t}\) algorithm to begin with, which is different from the dipolarity, which runs \(k_{t}\) clustering and then takes all jets with \(p_{\text {T}}\) above 5 \(\text {GeV}\)

Many jet substructure variables exist that can capture features of a jet’s internal structure and can potentially give additional discrimination power against backgrounds from multijet production and top-quark decays. They are based on the jet constituents and exploit quantities such as transverse momentum and angular distance between the constituents. They give information about different jet attributes such as shape (e.g. sphericity, aplanarity) or number of axes (e.g. two-subjettiness \(\tau _2\)). Ratios are often used to avoid scale dependence of substructure variables. Table 1 lists the jet substructure variables that are investigated in this study, together with a short description and references. Secondary selections on jet mass and the flavour-tagging discriminant for the track-jets, MV2c10, are also considered relative to the previously defined mass window and b-tagging benchmark working points and their performance is compared with that achieved by the application of additional jet substructure variables to these benchmarks. Two categories of secondary selections are used for the b-tagging discriminant MV2c10, and these exploit the potential of tighter b-tagging working points where the criteria are tightened for both track-jets (double b-tagging) or for only one track-jet (single b-tagging).

Fig. 11

Multijet background rejection at 80% signal efficiency (\(\varepsilon _\text {S} = 80\%\)) for a variety of substructure variables using different benchmarks in terms of b-tagging strategy and transverse momentum range. The z-axis colour scale represents the absolute value of the linear-correlation coefficient, \(|C(m_\text {corr},v_\text {JSS})|\), between the jet mass and the jet substructure variables. The selection efficiency is determined relative to the mass window and b-tagging benchmark working points defined in Sections 6.3 and 6.2 respectively

Fig. 12

Hadronic top-quark background rejection at 80% signal efficiency (\(\varepsilon _\text {S} = 80\%\)) for a variety of substructure variables using different benchmarks in terms of b-tagging strategy and transverse momentum range. The z-axis colour scale represents the absolute value of the linear-correlation coefficient, \(|C(m_\text {corr},v_\text {JSS})|\), between the jet mass and the jet substructure variables. The selection efficiency is determined relative to the mass window and b-tagging benchmark working points defined in Sections 6.3 and 6.2 respectively

For all secondary selection variables an optimal two-sided range is chosen for each variable and each benchmark working point. Searches for new-physics resonances typically use tagging definitions with relatively high signal efficiency, around 40% (75%) for Higgs-jets with \(p_{\text {T}} = 500\) \(\text {GeV}\) for double (single) b-tagging and a mass requirement. Hence, the two-sided range for a secondary variable which contains the smallest fraction of background but at least 80% of signal events is determined. Figures 11 and 12 show the background rejection for an 80% retention of signal efficiency relative to the jet mass and b-tagging benchmark working points for multijet and hadronic top-quark backgrounds, respectively. The matrices in Figures 11 and 12 show the background rejection for substructure variables, secondary jet mass, and MV2c10 b-tagging discriminant on the y-axis for the four benchmark points of the Higgs-jet tagger on the x-axis. The z-axis colour scale represents the absolute value of the linear-correlation coefficient of the substructure variable and the jet mass for the corresponding background. For each benchmark, the five variables with the largest background rejection are selected and all selected variables for every benchmark are shown.
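The choice of the two-sided range can be sketched as a scan over windows that just contain the required signal fraction. This is a minimal illustration assuming unweighted events; the function name and the toy values are hypothetical, and the rejection is taken as the inverse of the background efficiency relative to the benchmark.

```python
import math


def best_two_sided_range(signal, background, min_signal_eff=0.8):
    """Return (low, high, rejection) for the range keeping >= min_signal_eff
    of the signal while accepting the fewest background entries."""
    sig = sorted(signal)
    n_needed = max(1, math.ceil(min_signal_eff * len(sig) - 1e-9))
    best = None
    # Any acceptable range contains a window spanning n_needed consecutive
    # sorted signal values, so scanning those windows is sufficient.
    for first in range(len(sig) - n_needed + 1):
        low, high = sig[first], sig[first + n_needed - 1]
        n_bkg = sum(1 for value in background if low <= value <= high)
        if best is None or n_bkg < best[2]:
            best = (low, high, n_bkg)
    low, high, n_bkg = best
    rejection = len(background) / n_bkg if n_bkg else math.inf
    return low, high, rejection


# Invented values of some substructure variable, for illustration only.
toy_signal = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_background = [0, 1.5, 2.5, 5, 9.5, 11, 12, 13]
low, high, rej = best_two_sided_range(toy_signal, toy_background)
```

In this toy example the range [2, 9] keeps eight of ten signal entries and two of eight background entries, giving a rejection of 4.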

In general, there are improvements across the various benchmark points. The background rejection is often higher for the multijet background than for the hadronically decaying top quarks. The secondary b-tagging discriminant is very powerful, and there are only a few areas of phase space where substructure yields larger improvements than an optimised b-tagging working point. However, substructure variables are an interesting alternative to tighter b-tagging working points for \(\text {large-}R\) jet \(p_{\text {T}}\) above 1 \(\text {TeV}\). For the multijet background (Figure 11), a tighter requirement on the double b-tagging achieves a background rejection of 3.62 (1.35) in the inclusive range \(p_{\text {T}} > 250\) \(\text {GeV}\) for the single-b-tagging (double-b-tagging) working point. In contrast, the improvement from the double-b-tagging discriminant is small for working points for \(p_{\text {T}} > 1000\) \(\text {GeV}\), achieving a background rejection of 1.29 (1.37) for the single-b-tagging (double-b-tagging) working point. At large \(p_{\text {T}}\) the background rejection for substructure variables varies between 2.12 (\(D_{2}^{\beta =1}\)) and 1.55 (Fox–Wolfram ratio \(\mathcal{F } _1 / \mathcal{F } _0\)) for a signal efficiency of 80%. In general, correlations with the jet mass greater than 10% are observed for most of the jet substructure variables. The Fox–Wolfram ratios \(\mathcal{F } _3 / \mathcal{F } _0\) and \(\mathcal{F } _1 / \mathcal{F } _0\) show the lowest correlations: less than 1% for most of the benchmarks.

The room for improvement is smaller if secondary jet substructure selections on top of the jet mass window and b-tagging benchmark working points are used in the case of the hadronic top-quark background (Figure 12). A tighter double-b-tagging working point reaches a factor of 4.81 (2.34) background rejection in the inclusive range \(p_{\text {T}} > 250\) \(\text {GeV}\) for the single-b-tagging (double-b-tagging) working point. In contrast, the improvement from the double-b-tagging discriminant is small at large \(p_{\text {T}}\), achieving a background rejection of 1.18 (1.57) for the single-b-tagging (double-b-tagging) working point. The background rejection for other variables varies between 1.84 (Fox–Wolfram ratio \(\mathcal{F } _2 / \mathcal{F } _0\) and exclusive dipolarity) and 1.24 (\(k_{t} \Delta R\)) for a signal efficiency of 80%. Compared with the multijet background the correlations between the jet mass and the jet substructure variables are smaller in the case of the top-quark background, especially for \(p_{\text {T}} > 1000\) \(\text {GeV}\). The Fox–Wolfram ratio \(\mathcal{F } _4 / \mathcal{F } _1\) shows the lowest correlation: less than 1% for most of the benchmarks.

In conclusion, the application of jet substructure variables improves the background rejection moderately, while better improvements are observed for high transverse momenta. Furthermore, it is important to take into account the correlation between the \(\text {large-}R\) jet mass and the substructure variables since requirements on the substructure variables sculpt the jet mass distribution [79, 80].
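The linear-correlation coefficient between the jet mass and a substructure variable can be computed as in the sketch below. It assumes the standard Pearson definition on unweighted jets; the function name is hypothetical.

```python
import math


def linear_correlation(xs, ys):
    """Pearson linear-correlation coefficient of two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)
```

A coefficient near zero indicates no linear dependence only; a requirement on such a variable can still sculpt the mass distribution through non-linear dependence.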

7 Modelling tests in \(g \rightarrow b\bar{b}\) data

Multijet events enriched in b-jets, which predominantly originate from gluon splitting into \(b\bar{b}\) pairs, are used to evaluate the b-tagging efficiency in data and simulation as well as the modelling of jet substructure variables. The multijet background is one of the main backgrounds for searches in fully hadronic final states, for example the Higgs boson pair search in the four-b-quark final state [81]. This background also provides a unique opportunity to validate the modelling of double-b-jets in a large data sample. Events with one \(\text {large-}R\) jet with two ghost-associated track-jets (‘\(g \rightarrow b\bar{b}\) candidate jet’) and one recoiling ISR \(\text {small-}R\) jet (‘recoil jet’, \(j_\text {recoil}\)) are used for this study.

7.1 Event selection

Events are required to have a primary vertex that has at least two tracks, each with \(p_{\text {T}} > 500\) \(\text {MeV}\) [82]. The primary vertex with the highest \(p_{\text {T}} ^2\) sum of associated tracks is selected. A single-\(\text {small-}R\)-jet trigger with an online \(E_{\text {T}}\) threshold of 380 \(\text {GeV}\) was used to collect the data. An offline \(R = 0.4\) recoil jet with \(p_{\text {T}}\) above 500 \(\text {GeV}\) is matched to the jet which fired the trigger.

Non-collision backgrounds originating from calorimeter noise, beam-halo interactions, or cosmic rays can lead to spurious calorimeter signals. This effect is suppressed by applying the criteria described in Ref. [83].

Selected events are required to have at least one \(\text {large-}R\) jet with \(p_{\text {T}} >500\) \(\text {GeV}\) and \(|\eta |<2.0\), for which the \(\text {small-}R\) jet trigger is fully efficient and unbiased. The \(\text {large-}R\) jet must have at least two ghost-associated \(R=0.2\) track-jets. To enrich the event sample in jets containing b-hadrons, it is required that at least one of the ghost-associated track-jets be matched to a muon. The highest-\(p_{\text {T}}\) track-jet matched to a muon is called the muon-tagged jet, \(j^\text {trk}_\mu \). The matching is performed using a geometric \(\Delta R < 0.2\) requirement between the track-jet’s axis and the muon. The highest-\(p_{\text {T}}\) jet among the remaining track-jets matched by ghost association to the \(\text {large-}R\) jet is called the non-muon jet, \(j^\text {trk}_{\text {non-}\mu }\). The highest-\(p_{\text {T}}\) \(\text {large-}R\) jet satisfying these criteria is selected as the gluon-jet candidate. Furthermore, the event must satisfy \(\Delta R(j_\text {recoil},j^\text {trk}_\mu )>1.5\). This requirement ensures that the triggering jet and the gluon-jet candidate are well separated.
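The geometric matching and the track-jet assignment described above can be sketched as follows. This is an illustrative sketch, not the analysis implementation: jets and muons are represented as plain dictionaries with hypothetical keys `pt`, `eta` and `phi`, and the function names are invented.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    return dphi - 2.0 * math.pi if dphi > math.pi else dphi


def delta_r(obj1, obj2):
    """Angular distance sqrt(d_eta^2 + d_phi^2) between two objects."""
    return math.hypot(obj1["eta"] - obj2["eta"],
                      delta_phi(obj1["phi"], obj2["phi"]))


def pick_track_jets(track_jets, muons, max_dr=0.2):
    """Return (muon-tagged jet, non-muon jet), or None if the selection fails."""
    if len(track_jets) < 2:
        return None
    by_pt = sorted(track_jets, key=lambda j: j["pt"], reverse=True)
    matched = [j for j in by_pt
               if any(delta_r(j, m) < max_dr for m in muons)]
    if not matched:
        return None
    muon_jet = matched[0]  # highest-pT track-jet matched to a muon
    # Highest-pT track-jet among the remaining ones.
    non_muon_jet = next(j for j in by_pt if j is not muon_jet)
    return muon_jet, non_muon_jet


def well_separated(recoil_jet, muon_jet, min_dr=1.5):
    """Separation requirement between the recoil jet and the muon-tagged jet."""
    return delta_r(recoil_jet, muon_jet) > min_dr
```

Note that the non-muon jet may have a higher \(p_{\text {T}}\) than the muon-tagged jet, since the muon match takes precedence over the \(p_{\text {T}}\) ordering.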

7.2 Flavour fraction corrections

To reduce discrepancies between data and MC simulation in the flavour composition of the \(\text {large-}R\) jet, the flavour fractions of the sample are determined from the data before applying b-tagging. Each \(\text {large-}R\) jet carries two flavours, that of \(j^\text {trk}_\mu \) and \(j^\text {trk}_{\text {non-}\mu }\), leaving nine possible flavour combinations for the \(\text {large-}R\) jet (each track-jet can be a b-jet, c-jet, or light-flavour jet; B, C and L abbreviations are used in the following). The long decay length of b- and c-hadrons makes the signed impact parameter significance, \(s_{d_{0}}\), of tracks associated with a jet a good discriminating variable for different jet flavours. The \(s_{d_{0}}\) of a track is defined as:

$$\begin{aligned} s_{d_{0}}= \frac{d_0}{\sigma (d_0)}s_j, \end{aligned}$$

where \(d_0\) is the track’s transverse impact parameter relative to the primary vertex, \(\sigma (d_0)\) is the uncertainty in the \(d_0\) measurement, and \(s_j\) is the sign of \(d_0\) relative to the track-jet’s axis, depending on whether the track crosses the track-jet’s axis in front of or behind the primary vertex. For a given track-jet, the average \(\left\langle s_{d_{0}}\right\rangle \) is built from the three highest-\(p_{\text {T}}\) tracks associated with the track-jet. The tracks from b- and c-hadron decays are expected to have higher \(p_{\text {T}}\) than tracks in light-flavour jets, because the heavy-flavour hadrons carry on average a larger fraction of the jet energy. The requirement that \(\left\langle s_{d_{0}}\right\rangle \) is built from the three highest-\(p_{\text {T}}\) tracks helps to distinguish them from light-flavour jets, which may have tracks with large \(s_{d_{0}}\) values, e.g. from \(\Lambda \) and \(K_\text {s}\) decays.
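The construction of \(\left\langle s_{d_{0}}\right\rangle \) can be illustrated with a short sketch. The track representation (dictionaries with hypothetical keys `pt`, `d0`, `sigma_d0` and `sign`) and the function names are assumptions; the treatment of track-jets with fewer than three tracks is not specified in the text, and the sketch simply averages the tracks that are available.

```python
def signed_d0_significance(d0, sigma_d0, sign):
    """s_d0 = d0 / sigma(d0) * s_j, with s_j = +1 or -1."""
    return d0 / sigma_d0 * sign


def mean_signed_significance(tracks, n_leading=3):
    """Average s_d0 over the n_leading highest-pT tracks of one track-jet."""
    if not tracks:
        raise ValueError("track-jet has no associated tracks")
    leading = sorted(tracks, key=lambda t: t["pt"], reverse=True)[:n_leading]
    values = [signed_d0_significance(t["d0"], t["sigma_d0"], t["sign"])
              for t in leading]
    return sum(values) / len(values)
```

Restricting the average to the leading tracks means that a soft track with a very large \(s_{d_{0}}\), for example from a long-lived light hadron, does not enter the average.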

The impact parameter resolution depends on the intrinsic track resolution, the traversed detector material, the detector alignment, and other effects. To determine the impact parameter resolution in data, minimum-bias, dijet, and Z+jets events are used. The impact parameter resolution is extracted in fine bins of track \(p_{\text {T}}\) and \(\eta \) with an iterative method described in Ref. [51]. The simulation is corrected to match the measured impact parameter resolution as a function of track \(p_{\text {T}}\) and \(\eta \) by using a Gaussian function to smear the impact parameter resolution in the simulation.

The \(\left\langle s_{d_{0}}\right\rangle \) values of \(j^\text {trk}_\mu \) and \(j^\text {trk}_{\text {non-}\mu }\) are found to be uncorrelated and thus the one-dimensional distributions of each jet’s \(\left\langle s_{d_{0}}\right\rangle \) are fit simultaneously. Furthermore, the flavour combinations of (\(j^\text {trk}_\mu \),\(j^\text {trk}_{\text {non-}\mu }\)) = {(B,C), (C,B), (L,C), (L,B)} are predicted to be less than 1% of the total, so they are merged with other flavour categories which have the closest shape. The shape similarity is determined using the \(\chi ^2\)-statistic. Thus a total of five flavour categories are used, (\(j^\text {trk}_\mu \),\(j^\text {trk}_{\text {non-}\mu }\)) = {(B,B), (B,L), (C,C), (C,L), (L,L)}. Figure 13 shows the templates inclusive in \(p_{\text {T}}\).
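The merging of rare flavour categories by shape similarity can be sketched as below. The exact form of the \(\chi ^2\)-statistic is not given in the text; the sketch assumes a symmetric \(\chi ^2\) distance between unit-normalised histograms, and the function names and histogram contents are hypothetical.

```python
def normalise(hist):
    """Scale bin contents to unit sum so that only shapes are compared."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive content")
    return [h / total for h in hist]


def chi2_distance(hist_a, hist_b):
    """Symmetric chi-square distance between two shapes (assumed form)."""
    a, b = normalise(hist_a), normalise(hist_b)
    return sum((x - y) ** 2 / (x + y) for x, y in zip(a, b) if x + y > 0)


def closest_template(rare_hist, main_templates):
    """Name of the main-category template whose shape is closest."""
    return min(main_templates,
               key=lambda name: chi2_distance(rare_hist, main_templates[name]))


# Invented bin contents for illustration only.
main = {"(B,B)": [3, 2, 1], "(B,L)": [2, 4, 6]}
merged_into = closest_template([1, 2, 3], main)
```

In this toy example the rare template has the same shape as the (B,L) template and would be merged with it.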

Fig. 13

Averaged impact parameter significance, \(\left\langle s_{d_{0}}\right\rangle \), distributions for the muon (left) and non-muon jets (right) inclusive in \(j^\text {trk}_\mu \) and \(j^\text {trk}_{\text {non-}\mu }\) transverse momenta. The double flavour labels denote the true flavour of the jet pair, with the \(j^\text {trk}_\mu \) given first

Since the flavour fractions vary with \(p_{\text {T}}\), the flavour fraction fits to the data are performed in bins of \(p_{\text {T}}\) of the two track-jets. For each jet \(p_{\text {T}}\) bin, individual MC templates are used. The following jet-\(p_{\text {T}}\) bins are considered: \(j^\text {trk}_\mu \) \(p_{\text {T}}\) bins = {(0–100), (100–200), >200} \(\text {GeV}\) and \(j^\text {trk}_{\text {non-}\mu }\) \(p_{\text {T}}\) bins = {(0–100), (100–200), (200–300), >300} \(\text {GeV}\). Figure 14 shows an example of the flavour fraction fit to the \(s_{d_{0}}\) distributions of \(j^\text {trk}_\mu \) and \(j^\text {trk}_{\text {non-}\mu }\) for one particular bin of the track-jet transverse momenta. The fit uncertainty includes the statistical uncertainty of the templates and is evaluated using toy MC simulations. The flavour fraction corrections relative to the simulated fractions vary between 0.7 and 1.7 in the jet \(p_{\text {T}}\) bins with a statistical uncertainty below 10%.
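The binning of the flavour-fraction fits can be expressed as a simple lookup. In the sketch below the constant and function names are hypothetical, and the assignment of a value lying exactly on a bin edge to the higher bin is an assumption, since the text does not specify it.

```python
from bisect import bisect_right

# Upper bin edges in GeV; the last bin of each axis is open-ended.
MUON_JET_EDGES = [100.0, 200.0]             # 0-100, 100-200, >200
NON_MUON_JET_EDGES = [100.0, 200.0, 300.0]  # 0-100, 100-200, 200-300, >300


def flavour_fit_bin(pt_muon_jet, pt_non_muon_jet):
    """Return the (muon-jet, non-muon-jet) pT bin indices for one jet pair."""
    return (bisect_right(MUON_JET_EDGES, pt_muon_jet),
            bisect_right(NON_MUON_JET_EDGES, pt_non_muon_jet))


# Total number of two-dimensional bins, each with its own set of templates.
N_BINS = (len(MUON_JET_EDGES) + 1) * (len(NON_MUON_JET_EDGES) + 1)
```

The two binnings together give twelve \(p_{\text {T}}\) bins in which separate templates are fitted.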

Fig. 14

Averaged impact parameter significance, \(\left\langle s_{d_{0}}\right\rangle \), distributions of the muon (left) and non-muon jet (right) in the (100–200) \(\text {GeV}\) bin of the \(j^\text {trk}_\mu \) and \(j^\text {trk}_{\text {non-}\mu }\) transverse momenta

After correcting for the observed flavour-pair fractions the level of agreement between data and MC simulation is evaluated in the selected event sample before and after b-tagging is applied to the track-jets. The \(70\%\) double-b-tagging working point is used.

7.3 b-tagging results

Since the flavour fractions are corrected in the MC simulation, differences between the data and predictions after the b-tagging can be attributed to a difference between data and MC simulation in the dependence of the b-tagging performance on the \(\text {large-}R\) jet topology, in particular on the topology with two closely spaced track-b-jets.

Figure 15 shows the flavour-fit-corrected \(p_{\text {T}}\) spectrum of the \(\text {large-}R\) jet as well as those of the \(j^\text {trk}_\mu \) and \(j^\text {trk}_{\text {non-}\mu }\) before and after b-tagging. As seen in the ratio plots, there is good agreement within uncertainties between data and MC simulation. The shape differences between data and MC simulations, especially for the \(j^\text {trk}_{\text {non-}\mu }\) transverse momentum, can be partially explained by the difference observed between Pythia8 and Herwig++ MC simulations. The double-b-tagging rate is defined as the number of selected \(\text {large-}R\) jets with at least two track-jets, two of which are b-tagged, divided by the number of all selected \(\text {large-}R\) jets with at least two track-jets. Figure 16 shows the double-b-tagging rate as a function of the \(\text {large-}R\) jet \(p_{\text {T}}\). Data and MC simulation agree within the uncertainties. The performance of the double b-tagging applied to two track-jets seems not to depend on the \(\text {large-}R\) jet topology with two closely spaced track-b-jets, and the default b-tagging calibration described in Section 5 can be applied for this analysis.
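The definition of the double-b-tagging rate can be written as a short counting sketch. The jet representation (a dictionary with a hypothetical key `track_jet_b_tags` holding one boolean per track-jet) is an assumption, as is the reading of "two of which are b-tagged" as "at least two b-tagged track-jets".

```python
def double_b_tag_rate(large_r_jets):
    """Fraction of jets with >= 2 track-jets in which >= 2 are b-tagged."""
    denominator = [j for j in large_r_jets
                   if len(j["track_jet_b_tags"]) >= 2]
    if not denominator:
        raise ValueError("no large-R jet with at least two track-jets")
    # Booleans sum as 0/1, so this counts the b-tagged track-jets per jet.
    numerator = [j for j in denominator
                 if sum(j["track_jet_b_tags"]) >= 2]
    return len(numerator) / len(denominator)
```

Comparing this rate between data and simulation in bins of \(\text {large-}R\) jet \(p_{\text {T}}\) corresponds to the comparison described for Figure 16.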

Fig. 15

Transverse momentum distributions of the \(\text {large-}R\) jet (top), \(j^\text {trk}_\mu \) (middle) and \(j^\text {trk}_{\text {non-}\mu }\) (bottom) before (left) and after (right) double b-tagging. The flavour-tagging correction factors and the flavour-fit corrections have been applied. The two largest systematic uncertainties, generator modelling and the b-tagging-related uncertainties, are shown as well. The total uncertainty includes all systematic uncertainties listed in Section 5 and the fit uncertainty summed in quadrature

Fig. 16

Comparison of data and MC simulation double-b-tagging rates as a function of the \(\text {large-}R\) jet \(p_{\text {T}}\). The flavour-tagging correction factors and the flavour-fit corrections have been applied. The two largest systematic uncertainties, generator modelling and the b-tagging-related uncertainties, are shown as well. The total uncertainty includes all systematic uncertainties listed in Section 5 and the fit uncertainty summed in quadrature. The size of the flavour-fit uncertainty is below 1%

7.4 Jet substructure results

As possible variations of Higgs taggers may make use of the \(\text {large-}R\) jet \(p_{\text {T}}\) and of substructure variables such as mass, n-subjettiness, or \(D_{2}^{\beta =1}\), it is important to ensure that these variables are well modelled by MC simulations. The distributions of kinematic and substructure variables are shown in Figure 17, for double-b-tagged jets after the flavour-fit correction. As seen in the ratio plots, there is acceptable agreement within uncertainties between data and MC simulations.

Fig. 17

Distributions of \(\text {large-}R\) jet mass (top), \(D_{2}^{\beta =1}\) (middle) and \(\tau _{21}\) (bottom) before (left) and after (right) double b-tagging. The flavour-tagging correction factors and the flavour-fit corrections have been applied. The two largest systematic uncertainties, generator modelling and the b-tagging-related uncertainties, are shown as well. The total uncertainty includes all systematic uncertainties listed in Section 5 and the fit uncertainty summed in quadrature. The \(D_{2}^{\beta =1}\) and \(\tau _{21}\) uncertainty bands include additional substructure variable uncertainties [48]

The relative impact of the systematic uncertainties on the yields of signal and background is presented in Table 2. The dominant signal uncertainty is the modelling uncertainty, followed by the b-tagging-related uncertainties. The b-tagging-related uncertainties (misidentification of light-flavour jets and c-jets as b-jets) are dominant for background. The dominant uncertainties are shown separately in Figure 17. The difference in the shapes between data and MC simulations can be partially explained by the difference observed between Pythia8 and Herwig++ MC simulations.

Table 2 Relative impact of the systematic uncertainties on the yields of signal and the main background for the \(g\rightarrow b\bar{b}\) analysis. Multiple independent components have been combined into groups of systematic uncertainties. ‘Jet scales’ refers to the sum in quadrature of the jet energy, mass and substructure scale uncertainties

8 Modelling tests in \(Z \rightarrow b\bar{b}\) data

As mentioned in the introduction, the \(Z \rightarrow b \bar{b}\) process is a colour-singlet resonance with a mass close to the Higgs boson mass, so kinematic properties of the \(Z\rightarrow b\bar{b}\) and \(H\rightarrow b\bar{b}\) events are expected to be similar. Events with one double-b-tagged \(\text {large-}R\) jet (‘\(Z \rightarrow b\bar{b}\) candidate jet’) and a photon that are back-to-back are used for this study. The photon requirement improves the signal-to-background ratio in comparison with the fully hadronic final state.

8.1 Event selection

Events are selected using a single-photon trigger with a transverse energy (\(E_{\text {T}}\)) threshold of 140 \(\text {GeV}\) and loose photon identification requirements [21]. This trigger is non-prescaled for the entire data-taking period and is fully efficient for offline photons with \(E_{\text {T}}>175\) \(\text {GeV}\). The same primary vertex and jet-cleaning requirements are applied as for the \(g \rightarrow b\bar{b}\) study, described in Section 7.

Exactly one photon and at least one \(\text {large-}R\) jet are required to be present in the event. The \(\text {large-}R\) jet is required to have \(p_{\text {T}} > 200\) \(\text {GeV}\), \(|\eta | < 2.0\), and mass greater than 30 \(\text {GeV}\). A jet–photon overlap removal procedure is applied, removing photons within \(\Delta R=1.0\) of the \(\text {large-}R\) jet. The \(\text {large-}R\) jet with the highest \(p_{\text {T}}\) is chosen as the \(Z \rightarrow b\bar{b} \) candidate. The two highest-\(p_{\text {T}}\) track-jets that are associated with the \(Z \rightarrow b\bar{b} \) candidate are required to be identified as b-jets using the 70% working point.
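The selection above can be summarised in a short sketch. The class and function names are hypothetical. The order of the overlap removal relative to the photon-multiplicity requirement is an assumption. The offline photon requirement of \(E_{\text {T}}>175\) \(\text {GeV}\) is also an assumption, taken from the trigger plateau quoted above and not from a stated analysis cut.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Photon:
    et: float   # transverse energy in GeV
    eta: float
    phi: float


@dataclass
class LargeRJet:
    pt: float    # GeV
    eta: float
    phi: float
    mass: float  # GeV
    # 70% working-point b-tag decisions of the associated track-jets,
    # ordered by decreasing track-jet pT (hypothetical representation)
    trackjet_btags: List[bool] = field(default_factory=list)


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Angular distance with the azimuthal difference wrapped to [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def select_z_candidate(photons: List[Photon],
                       jets: List[LargeRJet]) -> Optional[LargeRJet]:
    """Return the Z->bb candidate jet, or None if the event fails the selection."""
    good_jets = [j for j in jets
                 if j.pt > 200.0 and abs(j.eta) < 2.0 and j.mass > 30.0]
    if not good_jets:
        return None
    # Overlap removal: drop photons within dR = 1.0 of a selected large-R jet.
    good_photons = [p for p in photons
                    if p.et > 175.0  # assumed offline threshold (trigger plateau)
                    and all(delta_r(p.eta, p.phi, j.eta, j.phi) >= 1.0
                            for j in good_jets)]
    if len(good_photons) != 1:
        return None
    # The highest-pT large-R jet is the candidate.
    candidate = max(good_jets, key=lambda j: j.pt)
    # The two highest-pT associated track-jets must both be b-tagged.
    leading = candidate.trackjet_btags[:2]
    if len(leading) < 2 or not all(leading):
        return None
    return candidate
```

Returning `None` for any failed requirement keeps the sketch compact; a real analysis would typically record which requirement failed in a cutflow.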

8.2 Background estimate

The dominant SM background in this analysis is \(\gamma \)+jets with gluon-to-\(b\bar{b}\) splitting. The contribution from the SM \(t\bar{t}\gamma \) and \(W\gamma \) processes is smaller than that from the \(\gamma \)+jets process. Other background contributions, such as jets faking photons, electrons faking photons, and \(t\bar{t}\), are found to be negligible. To extract the \(Z \rightarrow b\bar{b} \) and \(\gamma \)+jets normalisations, templates of the \(Z \rightarrow b\bar{b} \) candidate jet mass distribution are fitted to data. Both templates are taken from the MC simulation as described in Section 3. The \(t\bar{t}+\gamma \) and \(W(q\bar{q})+\gamma \) background contributions estimated from MC simulation are subtracted before the fit to data. The jet mass variable is used in the fit because the difference between the shapes of the \(Z \rightarrow b\bar{b} \) and \(\gamma \)+jets templates is larger than for other substructure variables. The extracted normalisations are applied to all other distributions.
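A two-template normalisation fit of this kind can be illustrated with a minimal sketch. The fit method is not specified in the text, so the weighted least-squares form below is an assumption (a binned likelihood fit would be an equally plausible choice), and the function name and histogram representation are hypothetical.

```python
from typing import Sequence, Tuple


def fit_two_templates(data: Sequence[float],
                      signal: Sequence[float],
                      background: Sequence[float],
                      fixed: Sequence[float]) -> Tuple[float, float]:
    """Weighted least-squares fit of data ~ mu_s*signal + mu_b*background + fixed.

    `fixed` holds the MC-estimated contributions that are subtracted
    before the fit. Returns the two normalisations (mu_s, mu_b).
    """
    if not (len(data) == len(signal) == len(background) == len(fixed)):
        raise ValueError("all histograms must share the same binning")
    # Subtract the contributions that are held fixed at their MC prediction.
    y = [d - f for d, f in zip(data, fixed)]
    # Approximate the per-bin variance by the observed count (at least 1).
    w = [1.0 / max(d, 1.0) for d in data]
    # Elements of the 2x2 normal equations.
    sss = sum(wi * s * s for wi, s in zip(w, signal))
    sbb = sum(wi * b * b for wi, b in zip(w, background))
    ssb = sum(wi * s * b for wi, s, b in zip(w, signal, background))
    sy = sum(wi * s * yi for wi, s, yi in zip(w, signal, y))
    by = sum(wi * b * yi for wi, b, yi in zip(w, background, y))
    det = sss * sbb - ssb * ssb
    if det == 0.0:
        raise ValueError("templates are degenerate; normalisations cannot be separated")
    mu_s = (sy * sbb - by * ssb) / det
    mu_b = (by * sss - sy * ssb) / det
    return mu_s, mu_b
```

The determinant check reflects the reason given above for fitting the jet mass: the more the two template shapes differ, the better the two normalisations can be separated.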

8.3 Jet substructure results

Figure 18 shows the \(Z \rightarrow b\bar{b} \) candidate jet mass, \(p_{\text {T}}\), \(D_{2}^{\beta =1} \), and \(\tau _{21}\) distributions in data and MC simulation. The systematic uncertainties summarised in Section 5 are applied to the templates, and the fit to data is repeated for each systematic variation. The fit uncertainty and the contributions from the individual systematic uncertainties, summed in quadrature, are presented in Figure 18. The relative impact of the systematic uncertainties on the \(Z \rightarrow b\bar{b} \) and \(\gamma \)+jets yields is presented in Table 3. The observed data/MC discrepancies are covered by the systematic uncertainties.
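The combination of per-variation refits into a total uncertainty can be sketched as follows. The function name and inputs are hypothetical, and treating each systematic shift as the relative change in the fitted yield with respect to the nominal fit is an assumption about the procedure.

```python
import math
from typing import Dict, Tuple


def total_relative_uncertainty(nominal_yield: float,
                               varied_yields: Dict[str, float],
                               fit_rel_unc: float) -> Tuple[float, Dict[str, float]]:
    """Combine the fit uncertainty with per-variation shifts in quadrature.

    `varied_yields` maps each systematic variation to the yield obtained
    when the fit is repeated with the varied templates.
    """
    if nominal_yield <= 0.0:
        raise ValueError("nominal yield must be positive")
    # Relative shift of the fitted yield for each systematic variation.
    shifts = {name: (value - nominal_yield) / nominal_yield
              for name, value in varied_yields.items()}
    total = math.sqrt(fit_rel_unc ** 2 + sum(s * s for s in shifts.values()))
    return total, shifts
```

Summing in quadrature treats the variations as uncorrelated, which is the implicit assumption whenever grouped uncertainties are combined this way.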

Fig. 18 Jet mass, \(p_{\text {T}}\), \(D_{2}^{\beta =1} \) and \(\tau _{21}\) distributions. Events with two b-tagged track-jets are used. The \(\gamma \)+jets background and the \(Z \rightarrow b\bar{b} \) signal are normalised to data by applying scale factors of 1.51 and 0.98, respectively. The upward- or downward-pointing arrows indicate that the Data/Fit ratio is out of the histogram range for these bins

Table 3 Relative impact of the systematic uncertainties on the \(Z \rightarrow b\bar{b} \) and \(\gamma \)+jets yields. Multiple independent components have been combined into groups of systematic uncertainties. ‘Jet scales’ refers to the sum in quadrature of the jet energy, mass and substructure scale uncertainties

Further requirements on the jet substructure variables can improve the purity of the selection. Figure 19 shows the \(Z \rightarrow b\bar{b} \) candidate jet mass after further selections: \(\tau _{21}<0.45\) or \(D_{2}^{\beta =1} <1.3\). Figure 20 shows the \(D_{2}^{\beta =1}\) and \(\tau _{21}\) distributions after requiring the \(Z \rightarrow b\bar{b} \) candidate jet mass to be between 70 and 110 \(\text {GeV}\).
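The effect of such requirements on the purity can be illustrated with a toy sketch. All event values and weights below are invented for illustration, the names are hypothetical, and the exclusive bounds of the mass window are an assumption. The two substructure requirements are applied as alternative selections, not as a combined condition.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class JetEvent:
    mass: float      # candidate jet mass in GeV
    tau21: float
    d2: float
    weight: float    # event weight, including any normalisation scale factor
    is_signal: bool  # True for Z->bb, False for background


def purity(events: List[JetEvent], passes: Callable[[JetEvent], bool]) -> float:
    """Weighted signal fraction S / (S + B) among events passing `passes`."""
    sig = sum(e.weight for e in events if passes(e) and e.is_signal)
    bkg = sum(e.weight for e in events if passes(e) and not e.is_signal)
    total = sig + bkg
    return sig / total if total > 0.0 else 0.0


def no_cut(e: JetEvent) -> bool:
    return True


def tau21_cut(e: JetEvent) -> bool:
    return e.tau21 < 0.45


def d2_cut(e: JetEvent) -> bool:
    return e.d2 < 1.3


def mass_window(e: JetEvent) -> bool:
    return 70.0 < e.mass < 110.0


# Toy inputs, not taken from data or simulation.
toy_events = [
    JetEvent(mass=91.0, tau21=0.30, d2=1.0, weight=1.0, is_signal=True),
    JetEvent(mass=88.0, tau21=0.50, d2=1.5, weight=1.0, is_signal=True),
    JetEvent(mass=60.0, tau21=0.60, d2=2.0, weight=3.0, is_signal=False),
    JetEvent(mass=95.0, tau21=0.40, d2=1.2, weight=1.0, is_signal=False),
]

for name, cut in [("none", no_cut), ("tau21", tau21_cut),
                  ("D2", d2_cut), ("mass window", mass_window)]:
    print(f"{name}: purity = {purity(toy_events, cut):.3f}")
```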

Fig. 19 \(Z \rightarrow b\bar{b} \) candidate jet mass after applying the \(\tau _{21}<0.45\) (left) or \(D_{2}^{\beta =1} <1.3\) (right) requirement. Events with two b-tagged track-jets are used. The \(\gamma \)+jets background and the \(Z \rightarrow b\bar{b} \) signal are normalised to data by applying scale factors of 1.51 and 0.98, respectively

Fig. 20 \(Z \rightarrow b\bar{b} \) candidate \(D_{2}^{\beta =1}\) and \(\tau _{21}\) distributions after requiring the \(Z \rightarrow b\bar{b} \) candidate jet mass to be between 70 and 110 \(\text {GeV}\). Events with two b-tagged track-jets are used. The \(\gamma \)+jets background and the \(Z \rightarrow b\bar{b} \) signal are normalised to data by applying scale factors of 1.51 and 0.98, respectively. The upward- or downward-pointing arrows indicate that the Data/Fit ratio is out of the histogram range for these bins

The \(Z(\rightarrow b \bar{b})\gamma \) process provides a unique opportunity to validate the Higgs-jet tagging algorithm, given the similarity of the \(H \rightarrow b \bar{b}\) and \(Z \rightarrow b \bar{b}\) processes. For the current integrated luminosity of 36 fb\(^{-1}\), the dominant uncertainties are the statistical and systematic uncertainties of the jet scales and jet mass for the \(Z \rightarrow b\bar{b} \) process and the \(\gamma \)+jets modelling uncertainties. A larger dataset is needed to reduce these dominant uncertainties. Within the uncertainties, the studied jet substructure variables are modelled well by the signal plus background MC simulations.

9 Conclusions

Techniques to identify Higgs bosons at high transverse momenta decaying into bottom-quark pairs are described in this paper. The identification is based on the b-tagging of \(R = 0.2\) track-jets matched to the Higgs-jet and requirements placed on the Higgs-jet mass and other substructure variables. The modelling of the relevant input distributions is studied in 36 fb\(^{-1}\) of 13 \(\text {TeV}\) proton–proton collision data recorded by the ATLAS detector at the LHC in 2015 and 2016.

The choice of b-tagging working point for an analysis depends on the required background rejection rate and on the Higgs-jet \(p_{\text {T}}\) range relevant for the analysis. The double-b-tagging working points give the best background rejection for a large range of the Higgs-jet-tagging efficiency, but their efficiency decreases faster with increasing Higgs-jet \(p_{\text {T}}\) than it does for single-b-tagging working points. At high efficiencies, above \(\sim 90\%\) (\(\sim 55\%\)) for Higgs-jet \(p_{\text {T}}\) above 250 (1000) \(\text {GeV}\), the single-b-tagging selection provides better background rejection.

Application of the Higgs boson mass window requirement improves the performance of the Higgs-jet tagger substantially. The multijet background rejection improves by a factor of about five by adding a loose (corresponding to 80% signal efficiency) mass window requirement on top of the double-b-tagging requirement. The tight (corresponding to 68% signal efficiency) mass window requirement leads to an additional 30–50% improvement in the multijet background rejection. The multijet background rejection has a weak dependence on the jet \(p_{\text {T}}\) for both mass window requirements. The hadronic top-quark rejection depends strongly on the jet \(p_{\text {T}}\). The rejection varies between 60 and 230 for the loose mass window and double-b-tagging working points. The largest improvement in the top-quark rejection for the tight mass window is about 70% and corresponds to the high \(p_{\text {T}}\) and double-b-tagging working point.
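As a rough illustration of how these multijet factors compound, let \(R_{2b}\) denote the multijet rejection with the double-b-tagging requirement alone (a symbol introduced here for illustration only), and assume that the additional 30–50% improvement of the tight window is quoted relative to the loose window:

\[
R_{\text {loose}} \approx 5\,R_{2b}, \qquad R_{\text {tight}} \approx (1.3\text {--}1.5)\,R_{\text {loose}} \approx (6.5\text {--}7.5)\,R_{2b}.
\]

Under that reading, the tight window costs about 12 percentage points of signal efficiency (from 80% to 68%) in exchange for the further gain in multijet rejection.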

The performance of the additional jet substructure variables depends on the chosen Higgs-jet tagger working point. The jet mass and other substructure variables are often correlated and the double-b-tagging requirement enforces a two-prong structure. In general, the background rejection is larger for the multijet background than for hadronically decaying top quarks but still below two for the individual variables and the loose mass window working point. The b-tagging discriminant is very powerful but the jet substructure variables offer an alternative to the b-tagging working points. Especially at high Higgs-jet \(p_{\text {T}}\) the efficiency to reconstruct two track-jets and the double-b-tagging efficiency decrease quickly. A combination of several substructure variables using multivariate methods could potentially increase the gain in performance in this phase space.

The modelling of representative Higgs-jet properties is tested in ATLAS data for \(g \rightarrow b \bar{b}\) and \(Z (\rightarrow b \bar{b})\gamma \) event selections. Good modelling is observed given the size of the available data sample and the systematic uncertainties. In particular, the use of jet substructure variables is shown to improve the purity of the \(Z (\rightarrow b \bar{b})\gamma \) event selection.