1 Introduction

Several extensions of the Standard Model (SM) predict the existence of a dark sector weakly coupled to the SM [1,2,3,4]. Depending on the structure of the dark sector and its coupling to the SM, some unstable dark states may be produced at colliders, and could decay into SM particles with sizeable branching fractions. In order to avoid a new long-range force, a dark Higgs boson is introduced in such scenarios, to give mass to the dark gauge bosons. The dark Higgs boson may also lead to an exotic decay mode of the Higgs boson, via mixing between the two Higgs sectors, which is one of the favoured production modes that may be probed at the Large Hadron Collider (LHC). This is the mode explored in this search. Branching fractions of up to 10% are currently not excluded for Higgs-boson decays into exotic final states [5, 6]. This paper investigates the case where the two sectors couple via a vector portal, in which a dark photon (\(\gamma _{\mathrm {d}}\)) mixes kinetically with the SM photon and decays into SM leptons and light quarks [7,8,9]. The kinetic mixing term (\(\epsilon \)), which can vary over a wide range of values, \(\epsilon \sim 10^{-11}\)\(10^{-2}\), determines the lifetime of the dark photon. For a small kinetic mixing value, the \(\gamma _{\mathrm {d}}\) has a long lifetime, so that it decays at a macroscopic distance from its production point. This analysis focuses on small values of the kinetic mixing term, \(\epsilon <~10^{-5}\), and a dark photon mass range between twice the muon mass and twice the tau mass. Due to their small mass, the dark photons are expected to be produced with large boosts, resulting in collimated groups of leptons and light hadrons in a jet-like structure, referred to hereafter as dark-photon jets (DPJs).

The search for displaced DPJs presented in this paper uses the dataset collected by the ATLAS detector during 2015–2016 in proton–proton (pp) collisions at a centre-of-mass energy \(\sqrt{s}\ =\ 13\) TeV, corresponding to an integrated luminosity of 36.1 fb\(^{-1}\). The analysis exploits multivariate techniques for the suppression of the main multi-jet background, optimised for the different DPJ channels. This technique allows the exploitation of the fully hadronic signature for the first time in ATLAS DPJ searches, resulting in increased sensitivity compared with previous ATLAS results using the data collected in 2011 and 2012 at 7 and 8 TeV respectively [10, 11]. The results are complementary to those from related ATLAS searches for prompt DPJs using 7 and 8 TeV data [12,13,14], which probed higher values of \(\epsilon \), and for displaced dimuon vertices using 13 TeV data [15], which probed higher dark photon mass values. Related searches for dark photons were conducted by the CDF and D0 collaborations at the Tevatron [16,17,18] and by the CMS [19,20,21,22] and LHCb [23, 24] collaborations at the LHC. Additional constraints on scenarios with dark photons are extracted from, e.g., beam-dump and fixed-target experiments [25,26,27,28,29,30,31,32,33,34,35], \(e^{+}e^{-}\) colliders [36,37,38,39,40,41,42,43,44], electron and muon anomalous magnetic moment measurements [45,46,47] and astrophysical observations [48, 49]. Given the various constraints, a displaced dark photon with a kinetic mixing term \(\epsilon <~10^{-5}\) is allowed for \(\gamma _{\mathrm {d}}\) masses greater than 100 MeV.

2 The ATLAS detector

ATLAS [50] is a multipurpose detector at the LHC, consisting of an inner detector (ID) contained in a superconducting solenoid, which provides a 2 T magnetic field parallel to the beam direction, electromagnetic and hadronic calorimeters (ECAL and HCAL) and a muon spectrometer (MS) that has a system of three large air-core toroid magnets, each composed of eight coils.

The ID provides measurements of charged-particle momenta in the region of pseudorapidity \(|\eta |\le \) 2.5.Footnote 1 The highest spatial resolution is obtained around the vertex region using semiconductor pixel detectors arranged in four barrel layers [51, 52] at average radii of 3.3 cm, 5.05 cm, 8.85 cm, and 12.25 cm, and three discs on each side, covering radii between 9 and 15 cm. The pixel detector is surrounded by four layers of silicon microstrips covering radial distances from 29.9 to 56.0 cm. These silicon detectors are complemented by a transition radiation tracker (TRT) covering radial distances from 56.3 to 106.6 cm.

The ECAL and HCAL calorimeter system covers \(|\eta | \le 4.9\), and has a total depth of 9.7 interaction lengths at \(\eta ~=~0\), including 22 radiation lengths in the ECAL. The ECAL barrel starts at a radius of 1.41 m and ends at 1.96 m with a z extension of \(\pm 3.21\) m, covering the \(|\eta | \le 1.475\) interval. In the \(1.37 \le |\eta | \le 3.2\) region, the ECAL endacap starts at \(z \pm 3.70\) m and end at \(z \pm 4.25\) m. The HCAL barrel starts at a radius of 2.28 m and ends at 4.25 m with a z extension of \(\pm 4.10\) m, covering the \(|\eta | \le 1.0\) interval. In the endcaps regions up to \(|\eta | \le 4.9\), the HCAL starts at \(z \pm 4.3\) m and ends at \(z \pm 6.05\) m.

The MS provides trigger information and momentum measurements for charged particles in the pseudorapidity ranges \(|\eta |~\le ~2.4\) and \(|\eta |~\le ~2.7\) respectively. It consists of one barrel (\(|\eta |~\le ~1.05\)) and two endcaps (\(1.05~\le ~|\eta |~\le ~2.7\)), each with 16 sectors in \(\phi \), equipped with fast detectors for triggering and with chambers for reconstructing the tracks of the outgoing muons with high spatial precision. The MS detectors are arranged in three stations at increasing distances from the IP: inner, middle and outer. Three planes of MS trigger chambers are located in the middle and outer stations. The toroidal magnetic field allows precise reconstruction of charged-particle momenta independent of the ID information.

The ATLAS trigger system has two levels [53], level-1 (L1) and the high-level trigger (HLT). The L1 trigger is a hardware-based system using information from the calorimeters and MS. It defines one or more regions-of-interest (RoI), which are geometric regions of the detector identified by (\(\eta \), \(\phi \)) coordinates, containing interesting physics objects. The L1 trigger reduces the event rate from the LHC crossing frequency of 40 MHz to a design value of 100 kHz. L1 RoI information provides a seed for the reconstruction of physics objects by the HLT, a software-based system that can access information from all subdetectors. It is implemented in software running on a PC farm that processes the events and reduces the rate of recorded events to 1 kHz.

3 Benchmark model

Among the numerous models predicting dark photons, one class particularly interesting for the LHC features a hidden sector communicating with the SM through the Higgs portal for production and through vector portal for decay. The benchmark model used in this analysis is the Falkowski–Ruderman–Volansky–Zupan (FRVZ) model [8, 9], where a pair of dark fermions \(f_{\mathrm {d}_2}\) is produced via a Higgs boson (H) decay. Two different cases of this model are considered, involving the production of either two or four dark photons. In the first case, shown in Fig. 1 (left), each dark fermion decays into a \(\gamma _{\mathrm {d}}\) and a lighter dark fermion assumed to be the hidden lightest stable particle (HLSP). In the second case, shown in Fig.1 (right), each dark fermion decays into an HLSP and a dark scalar \(s_\text {d}\) that in turn decays into a pair of dark photons.

Fig. 1
figure 1

The two processes of the FRVZ model used as benchmarks in the analysis. In the first process (left), the dark fermion \(f_{\mathrm {d}_2}\) decays into a \(\gamma _{\mathrm {d}}\) and an HLSP. In the second process (right), the dark fermion \(f_{\mathrm {d}_2}\) decays into an HLSP and a dark scalar \(s_\text {d}\) that in turn decays into a pair of dark photons. The \(\gamma _{\mathrm {d}}\) decays into SM fermions, denoted by \(f^{+}\) and \(f^{-}\)

In general, dark-sector radiation [54] can produce extra dark photons. The number of radiated dark photons is proportional to the size of the dark gauge coupling \(\alpha _\text {d}\) [7]. The dark radiation is not considered in this signal model, which corresponds to an assumed dark coupling \(\alpha _\text {d} \lesssim 0.01\).

The vector portal communication of the hidden sector with the SM is through kinetic mixing of the dark photon and the standard photon

$$\begin{aligned} \mathcal {L}_{\text {gauge mixing}} = \frac{\epsilon }{2} B_{\mu \nu } b^{\mu \nu }, \end{aligned}$$

where \(B_{\mu \nu }\) and \(b_{\mu \nu }\) denote the field strengths of the electromagnetic fields for the SM and dark sector respectively, and \(\epsilon \) is the kinetic mixing parameter. A dark photon with a mass \(m_{\gamma _\text {d}}\) up to a few GeV that mixes kinetically with the SM photon will decay into leptons or light mesons, with branching fractions that depend on its mass [8, 55, 56].

The mean lifetime \(\tau \), expressed in seconds, of the \(\gamma _{\mathrm {d}}\) is related to the kinetic mixing parameter [57] by the relation

$$\begin{aligned} \tau ~ \propto ~ \Biggl (\frac{10^{-4}}{\epsilon }\Biggr )^2 \Biggl (\frac{100~{\mathrm {MeV}}}{m_{\gamma _\text {d}}}\Biggr ) ~. \end{aligned}$$
(1)

Equation (1) is an approximate expression based on the full relation in Ref. [56].

4 Data and simulation samples

The analysis presented in this paper uses \(\sqrt{s} = 13\) \(\text {Te}\text {V}\) pp collision data recorded by the ATLAS detector during the 2015–2016 data-taking periods. Only runs in which all the ATLAS subdetectors were operating normally are selected. The total integrated luminosities are 3.2 fb\(^{-1}\) and 32.9 fb\(^{-1}\) for 2015 and 2016 respectively.

Data were collected using a set of dedicated triggers that were active during collision bunch crossings as well as during empty and unpaired bunch-crossing slots. The LHC configuration for pp collisions contains 3564 bunch-crossing slots per revolution. An empty bunch-crossing is defined as a slot in which neither beam is filled with protons, and in addition is separated from filled bunches by at least five unfilled bunches on each side. Data collected during empty bunch crossings, referred to as the cosmic dataset, are used for the estimation of the cosmic-ray background. The ratio of the number of filled to empty bunch crossings, \(F_\text {CR}\) = 2.1, is used to scale the number of events in the cosmic dataset to that in the pp collision data. In unpaired bunch crossings, protons are present in only one of the two beams. Data taken during unpaired bunch crossings are used to study characteristic features of beam-induced backgrounds [58] (BIB) and are referred to as the BIB dataset.

Monte Carlo (MC) simulation samples were produced for the model considered in this paper and are summarised in Table 1.

Samples were generated for the Higgs boson mass of 125 \(\text {Ge}\text {V}\), and for a hypothetical beyond-the-SM (BSM) heavy scalar boson with a mass of 800 \(\text {Ge}\text {V}\), considering only the dominant gluon–gluon fusion (ggF) production mechanism. The ggF Higgs boson production cross section in pp collisions at \(\sqrt{s}\) = 13 TeV, estimated at next-to-next-to-leading order (NNLO) [59,60,61,62], is \(\sigma _{\text {SM}} = \) 43.87 pb for \(m_{H}=\) 125 GeV. The BSM heavy scalar with a mass of 800 GeV production cross section is conventionally assumed to be \(\sigma = 5~\text {pb}\).

The mass of the hidden fermion \(m_{f_{\mathrm {d_2}}}\) and of the hidden scalar \(m_{s_{\mathrm {d}}}\) were chosen to be low relative to the Higgs boson mass. Due to the production from a two-body decay of the Higgs boson generated at rest in the transverse plane, events with two back-to-back DPJs are expected. This is also the case leading to four dark photons where each DPJ consist of two collimated dark photons.

The dark-photon mass was chosen to be 0.4 GeV, above the pion pair mass threshold, and the \(\gamma _{\mathrm {d}}\) decay branching fractions (\(\mathcal {B}\)) are expected to be \(\mathcal {B}\)(\(\gamma _{\mathrm {d}}\) \(\rightarrow ee\)) = 45%, \(\mathcal {B}\)(\(\gamma _{\mathrm {d}}\) \(\rightarrow \mu \mu \)) = 45%, \(\mathcal {B}\)(\(\gamma _{\mathrm {d}}\) \(\rightarrow \pi \pi \)) = 10% [8]. In the generated samples, the proper decay length c\( \tau \) of the \(\gamma _{\mathrm {d}}\) was chosen such that \(\sim \) 80% of the decays occur in the volume delimited by the muon trigger chambers (i.e. up to 7 m in radius and 13 m along the z-axis). Since the analysis is sensitive to a wide range of mean proper lifetimes, a weighting method is used to extrapolate the signal efficiency to other mean proper lifetimes.

All MC samples described above were generated at leading order using MadGraph 5_aMC@NLO 2.2.3  [63] interfaced to Pythia 8.210 [64] for parton shower generation. The A14 set of tuned parameters (tune) for parton showering and hadronisation [65] was used together with the NNPDF2.3LO parton distribution function (PDF) set [66].

Table 1 Parameters used for the Monte Carlo simulations of the benchmark model

One of the main SM backgrounds in this analysis is multi-jet events. Such events were simulated to perform background studies and to evaluate systematic uncertainties. The MC samples were generated with Pythia 8.210 using the same tune and PDF as for the signal samples.

Potential sources of background also include W+jets, Z+jets, \(t\bar{t}\), single-top-quark, WW, WZ, and ZZ events. Simulation samples are used to study these backgrounds. The W+jets, Z+jets , WW, WZ, and ZZ events were generated using Sherpa 2.2.2 [67] with the NNPDF 3.0 NNLO [68] PDF set. Single-top-quark and \(t\bar{t}\) MC samples were generated using Powheg-BOX 1.2856 [69,70,71,72] and Pythia 6.428 [73] with the Perugia2012 [74] tune for parton showering and hadronisation, and CT10/CTEQ6L1 [75, 76] PDF sets.

Data and MC samples of \(J/\psi \rightarrow \mu \mu \) events are used to evaluate systematic uncertainties in muon trigger and reconstruction efficiencies. The MC sample was generated using Pythia8+Photos++ [77] with the A14 tune for parton showering and hadronisation, and the CTEQ6L1 PDF set. The \(J/\psi \rightarrow \mu \mu \) data sample was selected in 2015–2016 pp collisions using the triggers described in Ref. [78].

The generated MC events were processed through a full simulation of the ATLAS detector geometry and response [79] using the Geant4 [80] toolkit. The simulation included multiple pp interactions per bunch crossing (pile-up), as well as the detector response to interactions in bunch crossings before and after the one producing the hard interaction. To model the effect of pile-up, simulated inelastic pp events were overlaid on each generated signal and background event. The multiple interactions were simulated with Pythia 8.210 using the A2 tune [81] and the MSTW2008LO PDF set [82].

5 Definition of the dark-photon jets

5.1 Dark-photon jet classification

Displaced DPJs are reconstructed with criteria that depend on the \(\gamma _{\mathrm {d}}\) decay channel. A \(\gamma _{\mathrm {d}}\) decaying into a muon pair is searched for by looking for two closely spaced muon tracks in the MS, while a \(\gamma _{\mathrm {d}}\) decaying into an electron or pion pair, given the high boost of the \(\gamma _{\mathrm {d}}\), is searched for as an energy deposit in the calorimeters identified as a single narrow jet. MC simulations show that DPJs containing two dark photons both decaying into an electron or pion pair are reconstructed as a single jet.

Tracks that are reconstructed in the MS and are not matched to any track in the ID are used to identify displaced \(\gamma _{\mathrm {d}}\) decays into muons. Since the ID track reconstruction in ATLAS [83] requires at least one hit in one of the two innermost pixel layers, this analysis is sensitive only to displaced \(\gamma _{\mathrm {d}}\) decays occurring after the first pixel layers. The search is limited to the pseudorapidity interval \(|\eta | < ~2.5\), corresponding to the ID coverage, to ensure that selected muons are isolated from ID tracks. Muons with pseudorapidity in the range \(1.0 \le | \eta | \le 1.1\) are rejected to avoid the transition region of the MS between barrel and endcap. In order to reconstruct \(\gamma _{\mathrm {d}}\) decays that occur outside of the innermost layer of muon chambers but before the first MS trigger chamber, muons are required to have at least one hit in two of the three MS tracking station.

Jets used in this search are reconstructed from clusters [84] of energy deposits in the ECAL and HCAL using the anti-\(k_{t}\) algorithm [85, 86] with radius parameter \(R = 0.4\). The search is limited to \(\gamma _{\mathrm {d}}\) decays into electron or hadron pairs in the hadronic calorimeter. Jets produced in the HCAL are expected to be isolated from activity in the ID, with a high ratio of energy deposited in the HCAL (\(E_\mathrm {HCAL}\)) to energy deposited in the ECAL (\(E_\mathrm {ECAL}\)), and appear narrower than ordinary jets. The standard jet-cleaning requirements [87] applied in most ATLAS analyses reject jets with high values of \(E_\mathrm {HCAL}\)/\(E_\mathrm {ECAL}\). A dedicated cleaning algorithm for jets created in the HCAL is applied instead, with no requirements on the ratio \(E_\mathrm {HCAL}\)/\(E_\mathrm {ECAL}\). Jets are required to have transverse momentum \(p_{\text {T}}\) \(\ge \) 20 GeV and \(|\eta |~<~2.5\). In addition, the weighted time of the energy deposit in the calorimeter cells is required to be in the range [–4 ns, 4 ns] of the expected arrival time for particles produced at \(t~=~0\) (bunch-crossing time) and moving with the speed of light, to reduce cosmic-ray background and BIB jets.

DPJs are classified according to the number of muons and jets found within a given cone of angular size \(\Delta R\equiv \sqrt{(\Delta \phi )^2 + (\Delta \eta )^2}\) around a muon or jet candidate with the highest transverse momentum. The cone size is fixed to \(\Delta R\)  = 0.4, since the MC simulations show that this selection retains up to 90% of the dark-photon decay products in the \(H\rightarrow ~4\gamma _\text {d}+X\) decay channel with \(m_H\) = 125 GeV. The DPJ classification is summarised as follows:

  • muonic-DPJ (\({\varvec{\mu }}\)DPJ) – to select DPJs with all constituent dark photons decaying into muons, at least two muons are required and no jets are allowed in the cone.

  • hadronic-DPJ (hDPJ) – to select DPJs with all constituent dark photons decaying into electron or pion pairs in the HCAL, one jet is required and no muons are allowed to be in the cone. The electromagnetic fraction of the jet energy, defined as the ratio of the energy deposited in the ECAL to the total jet energy (\(E_\mathrm {ECAL}\)/\(E_\mathrm {total}\)), is required to be less than 0.4. This helps reduce the overwhelming background due to multi-jet production. This variable is also used later, as described in Sect. 5.3

Reconstructed DPJs with both muon and jet constituents are not considered in this analysis.

5.2 Muonic-DPJ selection

Muonic-DPJs are reconstructed using a Cambridge–Aachen clustering algorithm [88] that combines all the muons lying within a cone of fixed size in (\(\eta \), \(\phi \)) space. The algorithm starts from the highest-\(p_{\text {T}}\) muon, searching for additional muons within the \(\Delta R\)  = 0.4 cone around the muon momentum vector. If a second muon is found in the cone, the axis of the cone is rotated to the vector sum of the momenta of the two muons, and the search is repeated until no additional muons are found in the cone.

Cosmic-ray muons that cross the detector in time coincidence with a pp interaction constitute the main source of background to the muonic-DPJ. The cosmic dataset is used to study this background. A boosted decision tree (BDT) with gradient boosting, implemented in the TMVA framework [89], is trained to discriminate signal DPJs from the DPJ candidates that originate from cosmic-ray background. The BDT uses the following track variables, for each muon in the DPJ, to classify a DPJ as being from signal or background:

  • longitudinal impact parameter \(z_0\), defined as the minimum separation in the z-coordinate between the muon track and the primary vertex (PV);Footnote 2

  • arrival times measured by the trigger detectors of the MS;

  • pseudorapidity \(\eta \);

  • azimuthal angle \(\phi \).

Even if the decay is displaced, signal muons point to the primary vertex because of the high boost of the dark photon, resulting in a narrow \(z_0\) distribution peaking around zero. By contrast, cosmic-ray muons have a broad \(z_0\) distribution.

Cosmic-ray muons mainly come through the two shafts above the ATLAS detector, resulting in two well-defined peaks in the \(\eta \) and \(\phi \) distributions. Each hit in the trigger detector of the MS provides a measurement of the time for the muon track, corrected by the time of flight assuming the pp interaction point as the origin of the muon [90]. The difference in time measured by the two layers in the middle station and in the outer station is thus useful for discriminating between cosmic-ray muons and collision muons. Since cosmic-ray muons are downward going, their arrival times in the layers in the upper part of the MS (\(0~<~\phi ~<~\pi \)) are different from those of collision muons, which are upward-going in this part of the detector. In the lower part of the MS (\(\pi ~<~\phi ~<~2\pi \)), cosmic-ray muons and collision muons travel downwards, making hit timing less useful for separating between them.

The cosmic dataset and the signal MC sample \(H\rightarrow ~2\gamma _\text {d}+X\) with \(m_H\) = 125 GeV are used for the training of the BDT. The gain in signal significance obtained from dedicated BDT training with the other signal MC samples is found to be negligible. Figure 2 (left) shows the BDT output (\(\mu \)BDT) for the constituent muons of the \(\mu \)DPJs: the distribution provides a clear separation between signal and background muons from cosmic rays. The \(\mu \)BDT output is required to be \(\mu \)BDT \(>~0.21\); the value is chosen to yield the highest signal significance, \(S/\sqrt{S+B}\), where S is the number of signal events and B the number of background events.

5.3 Hadronic-DPJ selection

Signal jets are discriminated from multi-jets using a second classifier also based on a BDT (hBDT). The following variables are used as input to the hBDT:

  • jet width, defined as the \(p_{\text {T}}\)-weighted sum of the \(\Delta R\) between each energy cluster and the jet axis;

  • jet vertex tagger (JVT) output [91];

  • \(E_\mathrm {ECAL}\)/\(E_\mathrm {total}\);

  • jet mass, as defined by the jet clustering algorithm [92];

  • jet charge, defined as the momentum-weighted charge sum constructed from tracks associated with the jet; tracks are associated with jets using ghost association [93];

  • jet timing, defined as the energy-weighted average of the timing for each cell in the jet.

The JVT is designed to differentiate between pile-up jets and jets originating from the PV. The algorithm uses a multivariate combination of track variables that are sensitive to pile-up. Since jets produced in the hadronic calorimeter have a JVT output distribution similar to that of pile-up jets, the JVT output is used for selection of hadronic-DPJs. Possible pile-up jets contamination is reduced by the analysis selection to a negligible level.

The signal MC sample \(H\rightarrow ~2\gamma _\text {d}+X\) with \(m_H\) = 125 GeV and the simulated multi-jet background events are used for the BDT training. The gain in signal significance obtained from dedicated BDT training with the other signal MC samples is found to be negligible. Figure 2 (right) shows the BDT output for the hDPJs (hBDT). The peak at  \(\sim \) –0.2 in the BDT distributions corresponds to jets with a JVT output that indicates a low pile-up probability. The hBDT output is required to be hBDT \(>~0.91\); the value is chosen to yield the highest signal significance.

Fig. 2
figure 2

BDT output distributions for signal and background for \(\mu \)DPJs (left) and hDPJs  (right). For muonic-DPJs the background is the cosmic dataset and the FRVZ signal sample is the \(H\rightarrow ~2\gamma _\text {d}+X\) process with \(m_H\) = 125 GeV. For hadronic-DPJs the signal MC sample is the \(H\rightarrow ~2\gamma _\text {d}+X\) process with \(m_H\) = 125 GeV and the background is the simulated multi-jet background sample

6 Trigger and event selection

The standard ATLAS triggers are optimised to select prompt events and are thus usually very inefficient in the selection of displaced objects. This search uses events selected by the logical OR of three dedicated triggers targeting displaced objects: two muon triggers and one calorimeter trigger.

The L1 muon trigger used in this analysis requires hits in the middle stations to create a low-\(p_{\text {T}} \) \((\ge 6\) GeV) muon RoI or hits in both the middle and outer stations for a high-\(p_{\text {T}} \) \((\ge 20\) GeV) muon RoI. The muon RoIs have a \(\Delta \eta \times \Delta \phi \) spatial extent of \(0.2\times 0.2\) in the barrel and of \(0.1\times 0.1\) in the endcaps. L1 RoI information seeds the reconstruction of muon momenta by the HLT, which uses precision-chamber information to confirm or reject the L1 decision.

The first muon trigger, the tri-muon MS-only [94], requires at least three L1 muons with \(p_{\text {T}}\) \(\ge \) 6 GeV in the event, confirmed by the HLT using only MS information.

The second muon trigger, the muon narrow-scan, is specifically designed to select non-prompt collimated muons originating in the region between the first pixel layer and the first muon trigger plane. It requires an L1 muon with \(p_{\text {T}}\)  \(\ge \) 20 GeV confirmed by the HLT using only MS information. At the HLT a ‘scan’ is then performed in a cone of \(\Delta R = 0.5\) around this muon, looking for a second muon reconstructed using only MS information. During the course of the 2015–2016 data taking, in order to stay within the allocated trigger-rate limits given the increasing luminosity delivered by the LHC, the \(p_{\text {T}}\) requirement on the second muon was increased from 6 GeV to 15 GeV. In addition, both muons were required to be unmatched to any track in the ID, and isolation was required for the leading muonFootnote 3. This tends to selects events with dark photons of higher \(p_{\text {T}}\) and with more displaced decay position.

The calorimeter trigger, the CalRatio [94], is designed to select narrow jets produced in the hadronic calorimeter. At L1, the trigger requires a transverse-energy deposit of \(E_{\mathrm T}~\ge ~60\) GeV within a \(0.2\times 0.2\) (\(\Delta \eta \times \Delta \phi \)) region in the pseudorapidity range \(|\eta |~\le \) 2.4. At the HLT, jet reconstruction is then performed with the anti-\(k_{t}\) algorithm using a radius parameter of R = 0.4. Transverse energy \(E_{\mathrm T}~\ge ~30\) GeV and \(\log ( E_\mathrm {HCAL} / E_\mathrm {ECAL} )~\ge ~1.2\) are required. Jets are required to have no tracks with \(p_\mathrm {T}~\ge ~2\) GeV  within \(\Delta R\)  \(=~0.2\) of the jet axis. Finally, jets are required to pass a BIB removal algorithm that relies on calorimeter cell timing and position. Muons from BIB enter the HCAL and can radiate a bremsstrahlung photon, generating an energy deposit that may be reconstructed as a jet with characteristics similar to the hadronic-DPJ. The algorithm identifies events as containing BIB if the triggering jet has at least four HCAL cells at the same \(\phi \) and in the same layer with timing consistent with that of a BIB energy deposit.

Two DPJs satisfying the selection criteria described in Sect. 5 are required in the events selected by the triggers. If more than two DPJs are reconstructed, the one with the highest transverse momentum, labelled the leading DPJ, and the one farthest in \(\Delta \phi \) from the leading one, labelled the subleading DPJ, are used to classify the event. More than two DPJs are found in 9% of the events in the signal MC sample \(H\rightarrow ~2\gamma _\text {d}+X\) with \(m_H\) = 125 GeV. Events are classified as one of the three following channels:

  • \(\mu \)DPJ–\(\mu \)DPJ,

  • \(\mu \)DPJ–hDPJ,

  • hDPJ–hDPJ.

In the \(\mu \)DPJ–hDPJ channel, either the \(\mu \)DPJ or the hDPJ may be the leading DPJ.

7 Multi-jet background estimation

A data-driven ABCD method is used to estimate the multi-jet background in each of the three channels. The ABCD method uses two nearly uncorrelated variables defined at the event level to create a two-dimensional plane that is split into four parts: region A, where most signal events are located, and three control regions (B, C, and D) that contain mostly background. The number of background events in A can be predicted from the population of the other three regions: \(N_{\text {A}} = N_{\text {B}} \times N_{\text {D}}/N_{\text {C}}\), assuming negligible leakage of signal into regions B, C and D. For each channel, the ABCD calculation is performed in two separate regions: one background-dominated validation region (VR) to test the validity of the method, and one signal region (SR). The SRs are defined by the selection criteria described in Sects. 5 and 6. These define also the VRs except for the BDT cuts. The VRs BDT cuts for the leading and the subleading DPJs are chosen to have negligible signal contamination, which otherwise can bias the ABCD method validation. SR and VR definitions are summarised in Table 2.

The two event-level variables used to define the ABCD plane are the isolation of the DPJs relative to tracks in the inner detector and the opening angle between the two DPJs in the transverse plane (\(|\Delta \phi |\)). Displaced DPJs are expected to be highly isolated in the ID. The track isolation (\(\sum {p_{\text {T}}}\)) is defined as the scalar sum of the transverse momenta of the tracks reconstructed in the ID and matched to the PV of the event within a \(\Delta R\) = 0.4 cone around the DPJ direction. Matching to the PV helps reduce the dependence of \(\sum {p_{\text {T}}}\) on the amount of pile-up. The PV is correctly selected in the \(\sim \) 56% of the events. However, the selection efficiency does not depend significantly on whether the PV is correctly identified. The larger of the two \(\sum {p_{\text {T}}}\) values, \(\text {max}(\Sigma p_{\mathrm {T}})\), is used as the event-level variable. For signal, the opening angle \(|\Delta \phi |\) is expected to be large, due to production of the DPJs in the two-body decay of a Higgs boson generated at rest in the transverse plane.

The ABCD method relies on there being only one source of background, or multiple sources that have identical distributions in the ABCD plane. Muons from BIB originate from beam-halo interactions with the collimators upstream of the ATLAS detector, resulting in muons travelling parallel to the beam-pipe. The analysis requirements select events with two separate energy deposits in the hadronic calorimeter produced by two BIB muons. The requirement \(|\Delta \phi |~>~0.1\) in the ABCD plane removes BIB events that would otherwise contaminate the method for the hDPJ–hDPJ channel, and has no effect on the signal efficiency. After the final selection, the contribution of BIB events to the signal region is negligible.

The ABCD plane is defined for all the three channels by 0 \(\le \) \(\text {max}(\Sigma p_{\mathrm {T}})\)  \(\le \) 20 GeV and 0.1 \(\le \) \(|\Delta \phi |\) \(\le ~\pi \). The region A is defined by \(\text {max}(\Sigma p_{\mathrm {T}})\) < 4.5 GeV and \(|\Delta \phi |~>~\) 0.625. Regions B, C, and D are defined by reversing one or both of the requirements: \(\text {max}(\Sigma p_{\mathrm {T}})\) > 4.5 GeV and \(|\Delta \phi |~>~\) 0.625, \(\text {max}(\Sigma p_{\mathrm {T}})\) > 4.5 GeV and \(|\Delta \phi |~<~\) 0.625, \(\text {max}(\Sigma p_{\mathrm {T}})\) < 4.5 GeV and \(|\Delta \phi |~<~\) 0.625 respectively.

Table 2 Summary of the definitions of the signal regions (SRs) and validation regions (VRs) used in the ABCD method

In order to estimate the residual cosmic-ray background component in the ABCD plane, the event selection is applied to the cosmic dataset, and the resulting event yield is multiplied by \(F_\text {CR}\). The expected number of cosmic-ray events in the validation regions is: 4 ± 3 in region A and 2 ± 2 in region D for the \(\mu \)DPJ–\(\mu \)DPJ channel; 10 ± 5 in region D for the \(\mu \)DPJ–hDPJ channel. In the signal regions, the expected number of cosmic-ray events is: 8 ± 4 in region A and 2 ± 2 in region D for both the \(\mu \)DPJ–\(\mu \)DPJ and the \(\mu \)DPJ–hDPJ channel; 2 ± 2 in region A for the hDPJ–hDPJ channel. No events are observed in the remaining regions in the cosmic dataset. The estimated cosmic-ray event yields are subtracted from each of the ABCD regions before using the method to estimate the multi-jet background yield.

Other potential backgrounds to the signal include all the processes that lead to real prompt muons and muons plus jets in the final state, such as the SM production of W+jets, Z+jets, \(t\bar{t}\), single-top-quark, WW, WZ, and ZZ events. MC samples are used to study these processes. They give no contribution after the trigger selection and the definition of muonic-DPJ and hadronic-DPJ and do not enter in the ABCD plane.

The signal contamination in the VRs is verified to be less than 5% for all channels. The linear correlation coefficient between the \(\text {max}(\Sigma p_{\mathrm {T}})\) and \(|\Delta \phi |\) variables is verified to be less than 6% in the VR data, as well as in the SR using multi-jet MC events. The effect of this correlation on the final result is found to be negligible compared to the statistical accuracy. Table 3 shows the event counts in each of the four regions of the ABCD plane in the validation regions and the expected number of background events in the validation region A in data. Only statistical uncertainties are shown. The expected contribution from cosmic rays is included in all regions and in the background estimation. The observed number of events in the validation region A is in agreement with the number predicted by the ABCD method within the statistical uncertainties.

As additional validation of the ABCD method, control region D of the SR is divided into four subregions. The subregion with low \(\text {max}(\Sigma p_{\mathrm {T}})\) and high \(|\Delta \phi |\) is treated as a mock signal region, with the other subregions serving as control regions. Applying the method, the expected and the observed numbers of events in the mock signal region are: \(231\pm 58\) and 184 for the \(\mu \)DPJ–\(\mu \)DPJ channel, \(131\pm 41\) and 145 for the \(\mu \)DPJ–hDPJ channel, \(402\pm 77\) and 479 for the hDPJ–hDPJ channel. These are in agreement within the statistical uncertainties.

Table 3 Event count in each of the four regions of the ABCD plane in the validation regions and expected number of background events in region A. Only statistical uncertainties are shown. The expected contribution from cosmic rays is included in all regions and in the background estimation

Figure 3 shows the distribution of events in the ABCD plane of the \(\mu \)DPJ–\(\mu \)DPJ channel in the SR for the collision data and the MC signal \(H\rightarrow ~2\gamma _\text {d}+X\) with \(m_{H}\)  = 125 GeV, assuming a 10% Higgs boson decay branching fraction into \(\gamma _{\mathrm {d}}\). As a reference, the boundaries defining regions A, B, C and D are indicated in the figure by solid red lines.

Fig. 3
figure 3

Opening angle between the two DPJs, \(|\Delta \phi |\), vs inner-detector isolation, \(\text {max}(\Sigma p_{\mathrm {T}})\), in the \(\mu \)DPJ–\(\mu \)DPJ channel for data (left) and MC signal \(H\rightarrow ~2\gamma _\text {d}+X\) with \(m_{H}\) = 125 GeV (right), assuming a 10% Higgs boson decay branching fraction into \(\gamma _{\mathrm {d}}\). The red (solid) lines show the boundaries of the ABCD regions

In order to take into account the small signal contamination in regions B, C and D, a likelihood-based ABCD method is used for the background estimation in the SR. It estimates the background in region A by performing a fit to the background and signal yields in the four regions. A likelihood function is formed from the product of four Poisson functions, one for each of the A, B, C, and D regions, describing signal and background expectations. The likelihood takes the form:

$$\begin{aligned}&\mathcal {L}(n_{\text {A}},n_{\text {B}},n_{\text {C}},n_{\text {D}} | s, b, \tau _{\text {B}}, \tau _{\text {C}}) = \prod _{i={\text {A,B,C,D}}} \frac{{\text {e}}^{-N_i} N_i^{n_i}}{n_i !} \quad , \end{aligned}$$

where \(n_{\text {A}}\), \(n_{\text {B}}\), \(n_{\text {C}}\) and \(n_{\text {D}}\) are the four observables that denote the number of events observed in each region in data. The \(N_i\) are linear combinations of the signal and background expectation in each region, defined as follows:

$$\begin{aligned}&N_{\text {A}} = s + b \\&N_{\text {B}} = s \, \varepsilon _{\text {B}} + b \, \tau _{\text {B}} \\&N_{\text {C}} = s \, \varepsilon _{\text {C}} + b \, \tau _{\text {C}} \\&N_{\text {D}} = s \, \varepsilon _{\text {D}} + b \, \tau _{\text {C}} \, /\ \tau _{\text {B}} \end{aligned}$$

where s (b) is the signal (background) yield in region A, \(\varepsilon _i\) is the signal contamination derived from MC simulation, and \(\tau _{\text {B}}\) and \(\tau _{\text {C}}\) are the nuisance parameters that describe the ratio of the background expectation in the control region to the background expectation in the signal region. The s, b, \(\tau _{\text {B}}\) and \(\tau _{\text {C}}\) values are allowed to float in the fit to the four data regions.

8 Systematic uncertainties

The uncertainty in the ABCD-method background estimate is evaluated from the impact of a possible correlation between the ABCD variables in the SR. The correlation is evaluated using multi-jet MC events and validated in VR data. This effect is found to lead to a potential variation of less than 4% in the background estimate. The size of this uncertainty is therefore considered negligible when compared to the statistical one and it is not included in the fit.

The following effects are considered as possible sources of systematic uncertainty in the signal.

Luminosity

The uncertainty in the combined 2015–2016 integrated luminosity is 2.1% [95], obtained using the LUCID-2 detector [96] for the primary luminosity measurements.

Trigger

The systematic uncertainty in the narrow-scan trigger efficiency is evaluated using a tag-and-probe method applied to \(J/\psi \rightarrow \mu \mu \) events in data and simulation. The difference between the trigger efficiency in data and that in simulation is evaluated as a function of the opening angle between the two muons. The difference in the region \(\Delta R\)  \(<~0.05\), corresponding to the \(\Delta R\) expected for signal, is taken as the uncertainty and is 6\(\%\). The systematic uncertainty in the tri-muon MS-only trigger efficiency is 5.8\(\%\), taken from the analysis of 2012 data [11] since the algorithm has not undergone a major change since then. The systematic uncertainty in the CalRatio trigger efficiency is taken from Ref. [97] and is 2\(\%\).

BDT shape

The systematic uncertainty in the MC modelling of the input variables used for the BDT training is evaluated for both the \(\mu \)DPJ and the hDPJ. For the \(\mu \)DPJ, the data-to-MC ratio is computed for muon timing and \(z_0\) BDT input variables using samples of \(Z\rightarrow \mu \mu \) events. This comparison relies on the fact that, due to the high boost of low-mass dark photons and the high \(p_{\text {T}}\) of signal muons, the muon \(z_0\) and timing distributions are similar to those of prompt muons from \(Z\rightarrow \mu \mu \). Muons from Z boson decay are reconstructed using information only from the MS. The BDT is retrained using MC signal variables scaled to the data using these ratios, and the fit procedure is repeated. The resulting change in the final signal yield is taken as the systematic uncertainty, and its value is 3%. The same procedure is used for the hDPJ, where the ratios of data to simulated distributions are computed from data and MC samples of multi-jet events. The resulting uncertainty is 14%.

Muon reconstruction

The systematic uncertainty in the single-\(\gamma _{\mathrm {d}}\) reconstruction efficiency is evaluated using a tag-and-probe method applied to \(J/\psi \rightarrow \mu \mu \) events in 2015 data and simulation. \(J/\psi \rightarrow \mu \mu \) decays are selected, and the efficiency is evaluated as a function of the opening angle \(\Delta R\) between the two muons, for both the data and simulated \(J/\psi \) decays. For low \(\Delta R\) values, the efficiency decreases due to the difficulty of reconstructing two tracks with small angular separation in the MS. The difference in \(J/\phi \rightarrow \mu \mu \) reconstruction efficiency between simulation and data in the \(\Delta R\)  interval between 0 and 0.06 (where the DPJ samples are concentrated) amounts to 15\(\%\), and is taken as the uncertainty.

Jet energy scale and jet energy resolution

The jet energy scale and jet energy resolution introduce uncertainties in the signal yield of 1–8% and 1–5% respectively, depending on the signal process, where the processes with two dark photons are less affected. These uncertainties are calculated using the procedure detailed in Ref. [98]. Since the jets used in this analysis are required to have a low fraction of energy in the electromagnetic calorimeter, additional jet energy uncertainties are derived as a function of electromagnetic energy fraction as well as of pseudorapidity. These additional jet energy uncertainties are found to have an effect of up to 4% on the signal yield, and are taken in quadrature with the regular jet energy uncertainties.

Effect of pile-up on \({\varvec{\Sigma }}{{\varvec{p}}} {\mathbf {T}}\)

The presence of multiple collisions per bunch crossing affects the efficiency of the ID track isolation criterion quantified in terms of \( \Sigma p_{\mathrm {T}}\). The systematic uncertainty is evaluated by comparing \( \Sigma p_{\mathrm {T}}\) for muons from a sample of reconstructed \(Z \rightarrow \mu \mu \) events in data with that in simulation, as a function of the number of interaction vertices in the event. The systematic uncertainty is evaluated as the maximum difference at the value of the selection requirement on \(\text {max}(\Sigma p_{\mathrm {T}})\). It is found to be 5.1\(\%\).

9 Results and interpretation

The observed numbers of events in the ABCD regions and the expected number of background events in the signal region A are summarised in Table 4. The expected number of background events in region A is estimated using the likelihood-based ABCD method, assuming no signal and not including the observed data in region A. The background estimate includes both the multi-jet and cosmic-ray background, where the former is obtained as described in Sect. 7, and the latter is estimated from the cosmic dataset. Both sources are included in the expected background given in Table 4. The observed number of events in region A is in agreement with the predicted number of background events.

Table 4 Observed numbers of events in the ABCD regions and expected number of background events in region A. In the estimate, the data in region A are not considered and the signal strength is fixed to zero. Both the statistical and systematic uncertainties in the background expectations are given. The expected contribution from cosmic rays is included in all regions

Table 5 shows the expected number of signal events in region A for the FRVZ model parameters of Table 1 and the following assumptions: a value of \(\mathcal {B}\) \((H\rightarrow f_{\mathrm {d}_2}~\bar{f}_{\mathrm {d}_2}~)\) = 10% for a Higgs boson with \(m_{H}\)  = 125 \(\text {Ge}\text {V}\), which is not excluded by the current measurements [5]; a value of \(\mathcal {B}\) \((H\rightarrow f_{\mathrm {d}_2}~\bar{f}_{\mathrm {d}_2}~)\) = 100% and a production cross section of \(\sigma ~=~5\) pb for a BSM scalar boson with \(m_{H}\)  = 800 \(\text {Ge}\text {V}\).

Table 5 Expected numbers of signal events in region A. A branching fraction value of \(\mathcal {B}\) \((H\rightarrow f_{\mathrm {d}_2}~\bar{f}_{\mathrm {d}_2}~)\) = 10% is assumed for DPJ production in the decay of the \(m_{H}\)  = 125 GeV Higgs boson. For DPJ production in the decay of a \(m_{H}\)  = 800 GeV BSM scalar boson, a value of \(\mathcal {B}\) \((H\rightarrow f_{\mathrm {d}_2}~\bar{f}_{\mathrm {d}_2}~)\) = 100% and a production cross section of \(\sigma ~=~5\) pb are assumed. Only statistical uncertainties are reported

Upper limits on the production cross section times branching fraction (\(\sigma \times \mathcal {B}\)) as a function of the \(\gamma _{\mathrm {d}}\) proper decay length are derived for the FRVZ \(H\rightarrow ~2\gamma _\text {d}+X\) and \(H\rightarrow ~4\gamma _\text {d}+X\) processes using the CL\(_\text {s}\) method [99]. Since each signal sample was generated for a particular proper decay length, it is necessary to extrapolate the signal efficiency to other decay lengths to obtain limits as a function of c\( \tau \). This is achieved by applying to the i-th dark photon in the event a weight

$$\begin{aligned} w_i(t_i) = \frac{\tau _\text {ref}}{\text {e}^{-t_i/\tau _\text {ref}}} \cdot \frac{\text {e}^{-t_i/\tau _\text {new}}}{\tau _\text {new}}, \end{aligned}$$

where \(\tau _\text {ref}\) is the lifetime with which the event sample was simulated, \(\tau _\text {new}\) is the lifetime for which it is weighted, and \(t_i\) is the proper decay time of the i-th dark photon. Each event is weighted by the product of the individual dark-photon weights. The weighted sample is used to evaluate the signal efficiency for \(\tau _{\text {new}}\). Figure 4 shows the extrapolated signal efficiency for the \(H\rightarrow ~2\gamma _\text {d}+X\) and \(H\rightarrow ~4\gamma _\text {d}+X\) processes as a function of c\( \tau \) of the dark photon in the \(\mu \)DPJ–\(\mu \)DPJ, \(\mu \)DPJ–hDPJ and hDPJ–hDPJ channels. The tri-muon MS-only trigger has a lower efficiency for the \(H\rightarrow ~2\gamma _\text {d}+X\) process with a \(m_{H}\)  = 125 GeV Higgs boson than for the other processes, resulting in a lower signal efficiency in the \(\mu \)DPJ–\(\mu \)DPJ channel. The \(p_{\text {T}}\) requirements of the CalRatio trigger are not optimal for selecting jets produced by \(\gamma _{\mathrm {d}}\) decays in the processes with a \(m_{H}\)  = 125 GeV Higgs boson, resulting in a signal efficiency below 1% in the hDPJ–hDPJ channel. The muon narrow-scan trigger helps to recover some efficiency in the \(\mu \)DPJ–hDPJ channel for these processes.

The observed 95\(\%\) CL cross-section upper limits in the \(\mu \)DPJ–\(\mu \)DPJ   channel for the \(H\rightarrow ~2\gamma _\text {d}+X\)   and \(H\rightarrow ~4\gamma _\text {d}+X\)   processes are presented in Fig. 5 for \(m_{H}\)  = 125 GeV. The 95\(\%\) CL exclusion limits in the \(\mu \)DPJ–\(\mu \)DPJ   and hDPJ–hDPJ   channels for the process \(H\rightarrow ~2\gamma _\text {d}+X\)   are presented in Fig. 6 for \(m_{H}\)  = 800 GeV. The figures also show the expected limits obtained from the likelihood-based ABCD method, using the background estimate derived from the background-only fit using data in the four regions. Excluded c\( \tau \) ranges are summarised in Table 6, assuming \(\mathcal {B}\) \((H\rightarrow f_{\mathrm {d}_2}~\bar{f_{\mathrm {d}_2}~})~=~10\%\) for the Higgs boson with \(m_{H}\)  = 125 GeV and \(\mathcal {B}\) \((H\rightarrow f_{\mathrm {d}_2}~\bar{f_{\mathrm {d}_2}~})~=~100\%\) for the BSM Higgs boson, with subsequent decay of the \(f_{\mathrm {d}_2}~\) and \(\bar{f_{\mathrm {d}_2}~}\) giving rise to the production of two or four dark photons.

Fig. 4
figure 4

Extrapolated signal efficiencies as a function of proper decay length of the \(\gamma _{\mathrm {d}}\) for the \(H\rightarrow ~2\gamma _\text {d}+X\) and \(H\rightarrow ~4\gamma _\text {d}+X\) processes and for the three different channels: \(\mu \)DPJ–\(\mu \)DPJ (left), \(\mu \)DPJ–hDPJ (right) and hDPJ–hDPJ (bottom). The signal efficiency in the hDPJ–hDPJ  channel for \(m_{H}\)  \(=125\) GeV \(H\rightarrow ~4\gamma _\text {d}+X\) process is small compared with the other channels and is not shown. The vertical bars represent the statistical uncertainties

Fig. 5
figure 5

Upper limits at 95\(\%\) CL on the cross section times branching fraction for the processes \(H\rightarrow ~2\gamma _\text {d}+X\) (left) and \(H\rightarrow ~4\gamma _\text {d}+X\) (right) in the \(\mu \)DPJ–\(\mu \)DPJ final states for \(m_{H}\)  \(=125\) GeV. The horizontal lines correspond to the cross section times branching fraction for a value of the branching fraction of the Higgs boson decay into dark fermions of 10%

Fig. 6
figure 6

Upper limits at 95\(\%\) CL on the cross section times branching fraction for the process \(H\rightarrow ~2\gamma _\text {d}+X\), where H is an 800 GeV BSM Higgs boson, in the \(\mu \)DPJ–\(\mu \)DPJ (left) and hDPJ–hDPJ (right) final states. The horizontal lines correspond to a cross section times branching fraction of 5 pb

Table 6 Ranges of \(\gamma _{\mathrm {d}}\) c\( \tau \) excluded at 95\(\%\) CL for \(H\rightarrow ~2\gamma _\text {d}+X\) and \(H\rightarrow ~4\gamma _\text {d}+X\). A branching fraction value of \(\mathcal {B}\) \((H\rightarrow f_{\mathrm {d}_2}~\bar{f_{\mathrm {d}_2}~})\) = 10% is assumed for DPJ production in the decay of a \(m_{H}\)  = 125 \(\text {Ge}\text {V}\) Higgs boson. For DPJ production in the decay of a \(m_{H}\)  = 800 GeV BSM scalar boson, a value of \(\mathcal {B}\) \((H\rightarrow f_{\mathrm {d}_2}~\bar{f_{\mathrm {d}_2}~})\) = 100% and a production cross section of \(\sigma ~=~5\) pb are assumed

The results for the \(\mu \)DPJ–\(\mu \)DPJ   channel is also interpreted in terms of the kinetic mixing parameter \(\epsilon \) and \(\gamma _{\mathrm {d}}\) mass, shown in Fig. 7 as exclusion contours. These limits assume four possible values of the Higgs boson decay branching fractions into \(\gamma _{\mathrm {d}}\), ranging from 1 to 20%, and the NNLO gluon–gluon fusion Higgs boson production cross section. The \(\gamma _{\mathrm {d}}\) detection efficiency for a \(\gamma _{\mathrm {d}}\) mass of 0.4 GeV is used for the mass interval 0.25–2 GeV, as the detection efficiency is constant throughout this interval [11]. The decay branching fraction variations as a function of the \(\gamma _{\mathrm {d}}\) mass are estimated and included in the 90\(\%\) CL exclusion region evaluations [56]. The low sensitivity in the hDPJ–hDPJ channel prevents the exclusion of the mass regions where the \(\gamma _{\mathrm {d}}\) decays into hadronic resonances: \(\gamma _{\mathrm {d}}\) mass regions around 0.8 and 1.0 GeV, where the \(\gamma _{\mathrm {d}}\) decays into the \(\rho \), \(\omega \), and \(\phi \) resonances. Figure 7 also shows previous exclusions for a Higgs boson decay branching fractions into \(\gamma _{\mathrm {d}}\) of 10% from a search for displaced dark-photon jets [11] and prompt dark-photon jets [14] at ATLAS. The search of Ref [11], which explored the same region probed by this analysis, is slightly more sensitive in the region of high \(\gamma _{\mathrm {d}}\) mass and low \(\epsilon \). This is due to inclusion of dark-photon jets with both muon and hadron constituents, which are not used in the current analysis. The search of Ref. [14] excluded high \(\epsilon \) values (shorter lifetimes), a region complementary to this analysis.

Fig. 7
figure 7

The 90\(\%\) CL exclusion regions for the decay \(H\rightarrow ~2\gamma _\text {d}+X\) of the Higgs boson as a function of the \(\gamma _{\mathrm {d}}\) mass and of the kinetic mixing parameter \(\epsilon \). These limits are obtained assuming the FRVZ model with decay branching fractions of the Higgs boson into \(\gamma _{\mathrm {d}}\) between 1 and 20%, and the NNLO Higgs boson production cross sections via gluon–gluon fusion. The figure also shows excluded regions with decay branching fraction of the Higgs boson into \(\gamma _{\mathrm {d}}\) of 10% from the run-1 ATLAS displaced  [11] (black line) and prompt [14] (red line) dark-photon jets searches

10 Conclusions

The ATLAS detector at the LHC is used to search for the production of displaced dark-photon jets in a sample of pp collisions at \(\sqrt{s} =13\) TeV corresponding to an integrated luminosity of 36.1 fb\(^{-1}\). No significant excess of events compared with the background expectation is observed, and 95% confidence-level upper limits are set on the production cross section times branching fraction of scalar bosons that decay into dark photons according to the FRVZ model. The upper limits are computed as a function of the proper decay length c\( \tau \)  of the dark photon \(\gamma _{\mathrm {d}}\). In addition to the increase in integrated luminosity and centre-of-mass energy, improvements in background suppression and the exploitation of hadronic \(\gamma _{\mathrm {d}}\) decays result in increased sensitivity compared with the ATLAS search using 8 \(\text {Te}\text {V}\) pp data. In the pure muonic channel, assuming a branching ratio \(\mathcal {B}\) \((H\rightarrow ~2(4)\gamma _\text {d}+X)\) = 10% for \(m_H~=~125\) GeV, decays of dark photons are excluded at 95% CL for \(c\tau ~\in ~[1.5, 307\)] mm and \(c \tau ~\in ~[3.7, 178]\) mm for production of two and four dark photons, respectively. For \(m_H~=~800\) GeV, assuming \(\sigma ~\times \) \(\mathcal {B}\) \((H\rightarrow ~2(4)\gamma _\text {d}+X)\) = 5 pb, the excluded regions are \(c \tau ~\in ~[5, 1420]\) mm and \(c\tau ~\in ~[10.5, 312]\) mm. In the pure hadronic channel, the \(m_H~=~800\) GeV excluded regions become \(c\tau ~\in ~[7.3, 1298]\) mm and \(c\tau ~\in ~[13.6, 231]\) mm.

The results for \(H\rightarrow ~2\gamma _\text {d}+X\), when H is the Higgs boson, are also interpreted as 90% confidence-level limits on the kinetic mixing parameter as a function of the dark-photon mass. These results improve upon the constraints set in previous LHC searches.