1 Introduction

The Large Hadron Collider (LHC) entered a new energy regime in 2015, at the start of Run 2, with proton–proton collisions at a centre-of-mass energy of 13 \(\text {TeV}\). Events with \(\text {TeV}\)-scale jets showering in the detectors, or \(\tau \)-leptons and b-hadrons that pass through multiple active layers of material, now occur at high enough rates to be studied in detail. These signatures also occur in potential new physics scenarios including massive new resonances decaying to highly boosted bosons or top quarks whose decay products are often reconstructed as a single jet [1]. In the cores of highly energetic hadronic jets and \(\tau \)-leptons, the average separation between highly collimated charged particles is comparable to the granularity of individual sensors of the inner detector. This can create confusion within the algorithms used to reconstruct charged-particle trajectories, or tracks. Therefore, without careful consideration, the track reconstruction efficiency in these dense environments is limited, resulting in difficulties in identifying long-lived b-hadrons and hadronic \(\tau \)-decays, or in calibrating the energy and mass of jets. To prevent losses in efficiency, to increase the possibility of discovering new phenomena and to allow more detailed measurements of the newly opened kinematic regime, a dedicated optimization for dense environments was performed and deployed in the ATLAS [2] reconstruction for the start of Run 2. This updated reconstruction provides superior physics performance, reduces the required computing resources, and is now the default used by ATLAS.

This paper first describes the ATLAS detector (Sect. 2). Then, a general overview of the track reconstruction algorithm (Sect. 3) is given, focusing on the performance of charged-particle reconstruction in dense environments at the start of Run 2. The data set utilized is described in Sect. 4. The quality of the expected performance is evaluated in dedicated single-particle and dijet simulation samples (Sect. 5), and comparisons between simulation and data are performed in events with energetic jets. Extending these mainly Monte Carlo (MC) simulation-based studies, a fully data-driven method is introduced in Sect. 6 which probes the fraction of tracks lost in reconstruction, due to the high density and collimation of charged particles in high-transverse-momentumFootnote 1 (\(p_{{\text {T}}}\)) jets. This is achieved by using the ionization energy loss (\({\text {d}{\textit{E}}/d{\textit{x}}}\)) in the pixel detector.

2 The ATLAS detector

The ATLAS experiment, a multipurpose particle detector at the LHC, covers almost the entire solid angle around the collision point, and consists of an inner detector (ID) tracking system surrounded by a thin superconducting solenoid magnet producing a 2 T axial magnetic field, electromagnetic and hadronic calorimeters, and a muon spectrometer incorporating three large toroid magnet assemblies.

The ID, shown in Fig. 1, provides position measurements for charged particles in the range \(|\eta | < 2.5\) by combining information from three subdetectors. It consists of a cylindrical barrel region (full coverage for \(|\eta | \lesssim 1.5\)) arranged around the beam pipe, and two end-caps. Disks in the end-cap region are placed perpendicular to the beam axis, covering \(1.5< |\eta | < 2.5\). Starting from the interaction point, the high-granularity silicon pixel detector segmented in r\(\phi \) and z (including the new innermost layer, the insertable B-layer (IBL) [3, 4] added for Run 2) covers the vertex region and typically provides four measurements per track. The IBL has a mean radius of 33 mm and a typical IBL pixel has a size of 50 \(\mu \)m by 250 \(\mu \)m in the transverse and longitudinal directions with a sensor thickness of 200 \(\mu \)m. For the remaining three layers of the pixel system, located at mean radii of 50.5, 88.5, and 122.5 mm respectively, a typical pixel has a size of 50 \(\mu \)m by 400 \(\mu \)m in the transverse and longitudinal directions with a thickness of 250 \(\mu \)m. The pixel layer at a radius of 50.5 mm is referred to as the B-layer in this paper. The coverage in the end-cap region is enhanced by three disks on either side of the interaction point. The pixel detectors measure the charge collected in each individual pixel using the time over threshold (ToT) [5]. ToT is the time the pulse exceeds a given threshold and is proportional to the deposited energy.

Outside the pixel volume, the barrel of the silicon microstrip detector (SCT) consists of four double strip layers at radii of 299–514 mm, complemented by nine disks in each of the end-caps. A typical strip of a barrel SCT sensor has a length of 126 mm and a pitch of 80 \(\mu \)m. On each layer, the strips are parallel to the beam direction on one side and at a stereo angle of 40 mrad on the other. The information from the two sides of each layer can be combined to provide an average of four three-dimensional measurements per track. The SCT sensors are connected to binary read-out chips, which do not provide information about the collected charge. The silicon detectors are complemented by the transition radiation tracker (TRT) [6], which extends track reconstruction radially up to a radius of 1082 mm for charged particles within \(|\eta | = 2.0\) while providing r\(\phi \) information. The raw timing information from its straw tubes is translated into calibrated drift circles that are matched to track candidates reconstructed from the silicon detectors [6].

Fig. 1
figure 1

Sketch of the barrel region of the ATLAS inner detector

The solenoid is surrounded by sampling calorimeters. Calorimetry is provided by three distinct detectors outside the ID volume. A lead/liquid-argon sampling electromagnetic calorimeter is split into barrel (\(|\eta | < 1.5\)) and end-cap (\(1.5< |\eta | < 3.2\)) sections. A steel/scintillator-tile hadronic calorimeter covers the barrel region (\(|\eta | < 1.7\)) and two end-cap copper/liquid-argon sections extend to higher pseudorapidity (\(1.5< |\eta | < 3.2\)). Finally, the forward region (\(3.1< |\eta | < 4.9\)) is covered by a liquid-argon calorimeter with a copper (tungsten) absorber in the electromagnetic (hadronic) section. In the outermost part, air-core toroids provide the magnetic field for the muon spectrometer. It consists of three layers of gaseous detectors: monitored drift tubes and cathode strip chambers for muon identification and momentum measurements for \(|\eta | < 2.7\), and resistive-plate and thin-gap chambers for online event selection up to \(|\eta | = 2.4\). A two-level trigger system, custom hardware followed by a software-based level, is used for online event selection and to reduce the event rate to about 1 kHz for offline reconstruction and storage.

3 ATLAS track reconstruction

The following provides an overview of primary-track reconstruction in the pixel and SCT detectors. After cluster creation, the primary-track reconstruction algorithm utilizes iterative track-finding seeded from combinations of silicon detector measurements, while additional methods are employed to recover non-prompt tracks. A staged pattern-recognition approach is used: a loose track candidate search, which allows a number of combinatorial track candidates, is followed by a stringent ambiguity-solver that compares and rates the individual tracks by assigning a relative track score to each track. This follows current approaches to track reconstruction first introduced in Ref. [7]. Further details, including a description of TRT track extensions, can be found in Ref. [8].

3.1 Clusterization

Charged-particle reconstruction in the pixel and SCT detectors begins by assembling clusters from the raw measurements. A connected component analysis (CCA) [9] groups pixels and strips in a given sensor, where the deposited energy yields a charge above threshold, with a common edge or corner into clusters. From these clusters, three-dimensional measurements referred to as space-points are created. They represent the point where the charged particle traversed the active material of the ID. In the pixel detector, each cluster equates to one space-point, while in the SCT, clusters from both sides of a strip layer must be combined to obtain a three-dimensional measurement.

The charge in a pixel sensor is often collected on multiple adjacent pixels. In the data set described in Sect. 4, the average size of pixel clusters in the barrel is about two pixels in the \(r-\phi \) plane and from one to three pixels in the longitudinal direction increasing with \(\eta \). The total charge is proportional to the path length in the sensor and thus dependent on the incident angle of the particle. The particle’s intersection point with the sensor is determined from the pixels contributing to the cluster using a linear approximation refined with a charge interpolation technique [10]. In dense environments, the spatial separation between charged particles traversing the sensor is only a few pixels, and the CCA algorithm, at times, reconstructs only one cluster which includes energy deposits from multiple particles. Identifying such clusters reliably and quickly is paramount for an efficient charged-particle reconstruction in dense environments.

It is useful to introduce the several classes of clusters identified by either the “truth information”, only available in simulation and referring to information at MC generator level, or reconstructed quantities in both collision data and MC simulation. Clusters created by charge deposits from one particle are called single-particle clusters. Clusters created by charge deposits from multiple particles are called merged clusters. These definitions rely on truth information and both cases are illustrated in Fig. 2. Based on information available in the track reconstruction algorithm described below, clusters which are compatible with a merged cluster can be identified. These are labelled identified as merged. Ideally, all clusters identified as merged are, in fact, merged clusters, and all merged clusters are identified as merged. Shared clusters are those which are used in multiple reconstructed tracks but are not sufficiently compatible with the properties of a merged cluster to be identified as merged by the reconstruction. Multiply used clusters – clusters used by multiple tracks – are either identified as merged or shared but not both.

Fig. 2
figure 2

Illustration of a single-particle pixel clusters on a pixel sensor and b a merged pixel cluster due to very collimated charged particles. Different colours represent energy deposits from different charged particles traversing the sensor and the particles trajectories are shown as arrows. a Single-particle pixel clusters. b Merged pixel cluster

3.2 Iterative combinatorial track finding

Track seeds are formed from sets of three space-points. This approach maximizes the possible number of combinations while still allowing a first crude momentum estimate. The impact parameters of a track seed, with respect to the centre of the interaction region, are estimated by assuming a perfect helical trajectory in a uniform magnetic field.

The purity, or fraction of seeds that result in good-quality tracks, varies significantly depending on which subdetector(s) recorded the space-points used in the seed. Therefore, seed types are considered starting with SCT-only, then pixel-only and finally mixed-detector seeds, representing the order of purity. A number of criteria are placed on the seeds to maximize purity: first and foremost seed-type-dependent momentum and impact parameter requirements. Also, the use of space-points in multiple seeds is carefully controlled. Purity is further improved by requiring that one additional space-point is compatible with the particle’s trajectory estimated from the seed. A combinatorial Kalman filter [11] is then used to build track candidates from the chosen seeds by incorporating additional space-points from the remaining layers of the pixel and SCT detectors which are compatible with the preliminary trajectory. The filter creates multiple track candidates per seed if more than one compatible space-point extension exists on the same layer.

These criteria result in a very high efficiency for reconstructing primary particles (for example, the muon reconstruction efficiency is greater than 99% [12]) and the removal of tracks created from purely random collections of space-points. Suppressing such purely combinatorial tracks is essential in order to remain within the available CPU budget for event reconstruction. From approximately 13 space-point combinations created for an isolated charged particle traversing the entire ID, the time-intensive combinatorial Kalman filter is, on average, called in its entirety 1.1 times. As all realistic combinations of space-points have been made, there are a number of track candidates where space-points overlap, or have been incorrectly assigned. This necessitates an ambiguity-solving stage.

3.3 Track candidates and ambiguity solving

In the ambiguity solver, track candidates considered to create the reconstructed track collection are processed individually in descending order of a track score, favouring tracks with a higher score. This design relies on having an appropriate track score definition that puts tracks into an order that scores more highly the candidates likely to correctly represent the trajectory of a charged primary particle.

The method used to determine the track score, discussed in the following, applies a robust approach based largely on simple measures of the track quality. Clusters assigned to a track increase the track score according to configurable weight fractions reflecting the intrinsic resolutions and expected cluster multiplicities in the different subdetectors. HolesFootnote 2 reduce the score. The \(\chi ^{2}\) of the track fit is also considered to penalize candidates with a poor fit. Finally, the logarithm of the track momentum is considered to promote energetic tracks and suppress the larger number of tracks with incorrectly assigned clusters, which typically have a low \(p_{{\text {T}}}\).

After the track scores have been calculated, the ambiguity solver deals with clusters assigned to multiple track candidates. Clusters compatible with multiple track candidates are a natural consequence of having merged clusters in dense environments. High reconstruction efficiency is facilitated by the identification of merged clusters, as explained in Sect. 3.4. However, shared clusters, clusters used in multiple track candidates which are not identified as merged, must be limited as they are a strong indicator of incorrect assignments.

To count shared clusters, a track candidate is only compared to those tracks previously accepted by the ambiguity solver. Clusters can be shared by no more than two tracks, giving preference to tracks processed first in the ambiguity solver. Also, a track can have no more than two shared clusters. A cluster is removed from a track candidate if it causes either the candidate or an accepted track to not meet the shared-cluster criterion. The track candidate is then scored again and returned to the ordered list of remaining candidates. Track candidates are rejected by the ambiguity solver if they fail to meet any of the following basic quality criteria:

  • \(p_{{\text {T}}} > 400\) \(\text {MeV}\),

  • \(|\eta | < 2.5\),

  • Minimum of 7 pixel and SCT clusters (12 are expected),

  • Maximum of either one shared pixel cluster or two shared SCT clusters on the same layer,

  • Not more than two holes in the combined pixel and SCT detectors,

  • Not more than one hole in the pixel detector,

  • \(|d_0^{{\text {BL}}}|<\) 2.0 mm,

  • \(|z_0^{{\text {BL}}}\sin \theta |<\) 3.0 mm,

where \(d_0^{{\text {BL}}}\) is the transverse impact parameter calculated with respect to the measured beam-line position, \(z_0^{{\text {BL}}}\) is the longitudinal difference along the beam line between the point where \(d_0^{{\text {BL}}}\) is measured and the primary vertex,Footnote 3 and \(\theta \) is the polar angle of the track. In the remainder of the paper, all studied tracks fulfil these requirement. A simplified flow of track candidates through the ambiguity solver is shown in Fig. 3.

Fig. 3
figure 3

Sketch of the flow of tracks through the ambiguity solver

3.4 Neural–network pixel clustering

To aid the ambiguity solver and minimize the loss of efficiency due to limitations on the number of shared clusters per track, an artificial neural network (NN) trained to identify merged clusters is used. The measured charge, which is proportional to the deposited energy, and relative position of pixels in the cluster can be used to identify merged clusters. Additional information about the particle’s incident angle, provided from the track candidate, significantly improves the NN’s performance [14]. For merged clusters created by two charged particles, the NN identification efficiency of this cluster as being created by two particles is about 90%. Merged clusters created by three charged particles are identified as such with an efficiency of 85%. Only a few percent of single particle clusters are incorrectly identified as a two-particle merged cluster and a negligible amount are identified as three-particle merged clusters. The NN is not able to distinguish clusters from exactly three and more than three charged particles. It is not possible for the NN to separate the energy deposits of each charged particle in an identified merged cluster and subsequently divide it into multiple clusters. Unlike the Run-1 reconstruction algorithm [8], the NN is consulted only when a cluster is used in multiple track candidates largely mitigating the impact of misidentification of merged clusters by the NN.

The inherent randomness of charged-particle interactions with thin silicon layers prevents the NN from performing perfectly. For example, the emission of \(\delta \)-rays causes difficulties as they can lead to bigger clusters and larger energy deposits than expected from a single particle. These inefficiencies can be mitigated by correlating information from consecutive layers of the pixel detector. In general, the separation between collimated charged particles increases as they travel outward through the ID. Therefore, if a pair of tracks uses a merged cluster on a given layer, then the inner layer is likely to contain a merged cluster as well. Furthermore, both clusters should be used by the same track candidates in this logic.

In summary, a cluster can be identified as merged in two ways. Either it is used by multiple track candidates and the NN identifies it as a merged cluster, or if two track candidates compete for clusters on two consecutive layers, the cluster on the inner layer is identified as merged if the cluster on the outer layer is identified as merged. Clusters identified as merged are used by the competing track candidates without penalty. Clusters which are not identified as merged, shared clusters, can still be used in multiple tracks but with the penalty described in Sect. 3.3.

3.5 Track fit

For track candidates fulfiling the requirements listed in Sect. 3.3, a high-resolution fit is performed using all available information. Fitted tracks which pass through the ambiguity solver without modification are added to the final track collection. Delaying the track fit until this stage minimizes the number of times the fitter is called, which is advantageous as it is a relatively CPU-intensive process.

For the high-resolution track fits, the position and uncertainty of each cluster is determined by additional NNs [14]. They predict the positions where the charged particles intersected the sensor based on the same input to the NN described in Sect. 3.4. The predicted number of charged particles which created the cluster determines the number of particle intersections the additional NNs predict. This decreases the discrepancy between the reconstructed cluster position and the cluster’s fitted track position at the detector surface, especially for merged clusters, resulting in more precise track parameters.

4 Data and Monte Carlo samples

Data from proton–proton collisions at \(\sqrt{s}=13~\text {TeV}\), collected during 2015 and corresponding to an integrated luminosity of 3.2 fb\(^{-1}\), are used in this paper. Events are selected using triggers requiring a single jet above various \(p_{{\text {T}}}\) thresholds. The minimum jet trigger \(p_{{\text {T}}}\) threshold is 100 \(\text {GeV}\). The numbers of events selected by the triggers were reduced by a factor depending on the instantaneous luminosity and the jet \(p_{{\text {T}}}\) threshold. This suppresses the number of low-\(p_{{\text {T}}}\) jets while keeping all events with at least one jet with \(p_{{\text {T}}}\) \(> 450~\text {GeV}\). Standard ATLAS data-quality requirements are applied to all data sets, ensuring all detectors were operational.

The data are compared to a leading-order dijet MC sample generated with Pythia 8.186 [15] with the A14 tuned parameter set [16] and the NNPDF2.3LO parton distribution function (PDF) set [17]. MC samples generated with Herwig++2.7.1 [18], and Sherpa 2.1 [19] are also studied. For Herwig++, the UEEE5 tuned parameter set is used with the CTEQ6L1 PDF set [20] and for Sherpa, parameters corresponding to the CT10 PDF set [21] are used. The ATLAS detector response is fully simulated [22] using the Geant 4 framework [23]. The average number of proton–proton interactions per bunch crossing (pile-up) was approximately 15 during the 2015 data-taking period. The expected contribution from additional proton–proton interactions is accounted for by overlaying minimum-bias events simulated with Pythia 8. The MC samples are reweighted to match the distribution of the number of interactions per bunch crossing and then reweighted to the inclusive jet-\(p_{{\text {T}}}\) spectrum observed in collision data. In dense environments, the impact of pile-up on the track reconstruction performance is small. The change in tracking efficiency considering only one interaction per bunch crossing to an average pile-up of 40 in the dijet MC sample for jets with a \(p_{{\text {T}}}\) above 200 \(\text {GeV}\) is below 0.3%.

In order to perform detailed simulation-based studies on event topologies with highly collimated particles, four large MC samples, with a single particle decaying into a set of nearby charged particles, are employed. The initial particles have different lifetimes and decay multiplicities, and are generated with a uniform transverse momentum spectrum from 10 to 1 \(\text {TeV}\) within \(|\eta |\) of 1.0. Topologies with two highly collimated tracks are studied in a simulated \(\rho \rightarrow \pi ^{+}\pi ^{-}\) sample. Simulated decays of a single \(\tau \)-lepton to three charged hadrons (\(\tau ^{\pm } \rightarrow \pi ^{+} \pi ^{-} \pi ^{\pm } \nu _{\tau }\)) are used to study topologies with three charged particles. To study the performance in topologies with higher charged-particle multiplicities, two additional samples are created; a sample containing all decays of a \(B^{0}\) into multiple particles and a \(\tau \)-lepton decaying to a final state including five charged hadrons.

5 Track reconstruction performance in dense environments

This section first compares basic properties of tracks inside jets in data with those in simulated dijet samples (Sect. 5.2). Using truth-based quantities, Sect. 5.3 studies single-particle decays with collimated decay products. These relatively simple topologies allow the behaviour of the track reconstruction to be studied as a function of the momentum of the initial particle, and the spatial separation between the tracks. Section 5.4 presents analogous results, but derived from a dijet MC sample of high-\(p_{{\text {T}}}\) jets.

5.1 Classification

In simulation, tracks are classified using a truth-matching probability. It is the ratio of the weighted number of measurements originating, at least in part, from the same simulated particle, to the weighted number of all measurements used in a track. A subdetector-specific weight of ten for measurements in the pixel detector, five for the SCT and one for the TRT is used. These weights reflect the average number of expected measurements in each subdetector. A properly reconstructed track is required to have a truth-matching probability above 0.5. Such a requirement is imposed for all reconstruction efficiencies presented in this paper.

Fake tracks are those which have a truth-matching probability below 0.5. Due to the careful pruning of seeds, the majority of reconstructed fake tracks are from the misallocation of clusters from other particles to a track and not purely random combinations of clusters. The track reconstruction procedure described in Sect. 3 results in a negligible number of fake tracks in dense environments. For jets with a \(p_{{\text {T}}}\)  above 200 \(\text {GeV}\) in the dijet MC sample described in Sect. 4, the fraction of fake tracks is below 0.5%. From only one pp interaction per bunch crossing to an average pile-up of 40, this fraction increases by about 0.5%, still making it negligible. Consequently, fake tracks are not discussed in further detail.

Jets are reconstructed from topological clusters [24] of energy deposits in the calorimeter using the anti-\(k_t\) algorithm [25] with a radius parameter \(R=0.4\) and are selected requiring a minimum jet \(p_{{\text {T}}}\) of 200 \(\text {GeV}\) and \(|\eta ^{{\text {jet}}}|<2.5\). Jets are corrected for the effects of non-compensating response in the calorimeter and inactive material by using energy- and \(\eta \)-dependent calibration factors, based on MC simulation and pp collision data. Additional corrections are applied to reduce the dependence of the jet energy measurement on the longitudinal and transverse structure of the jets and also to correct for jets that are not fully contained in the calorimeter [26].

5.2 Data and MC simulation comparison

This section gives an overview of basic properties of tracks inside jets. Data and MC simulation comparisons establish fair agreement between the two.

The average number of tracks per unit of angular area versus the angular distance from the jet axis in data and MC events is compared in Fig. 4. The charged-particle density in jets increases linearly with the logarithm of the jet momentum, which reflects the average number of tracks inside the jet. Moreover, most tracks are located within an angular distance of 0.05 from the jet axis. Jets in data tend to have a slightly wider distribution of reconstructed charged particles than those in simulation.

Fig. 4
figure 4

The average number of primary tracks per unit of angular area as a function of the angular distance from the jet axis. Data (markers) and dijet MC (lines) samples are compared in bins of jet \(p_{{\text {T}}}\)  showing the high density in the cores of energetic jets

Due to the large number of collimated charged particles the number of multiply used clusters rises steeply at small distances to the jet axis. Figure 5 shows the number of pixel clusters that are identified as merged and the number of shared pixel clusters on the track for data and MC simulation versus the angular distance from the jet axis. The average number of shared pixel clusters remains relatively low compared to the number of clusters identified as merged, down to the smallest distances, because the reconstruction algorithm identifies merged clusters with high efficiency, and these consequently are not counted as shared. MC simulation and data show reasonable agreement in the individual bins of jet \(p_{{\text {T}}}\).

Fig. 5
figure 5

The average number of a pixel clusters identified as merged and b shared pixel clusters on primary tracks (with a production vertex before the IBL) are shown as a function of the angular distance of the track from the jet axis. Data (markers) and dijet MC (lines) samples are compared in bins of jet \(p_{{\text {T}}}\). The rise in both populations at small distances from the jet axis is expected due to the increasingly dense environment. a Pixel clusters identified as merged. b Shared pixel clusters

Inefficiencies in the identification and treatment of merged clusters affect the number of IBL clusters on tracks in dense environments. Figure 6 shows the average number of IBL clusters on the track, for data and MC simulation versus the angular distance from the jet axis. For small distances the number of IBL clusters shows a drop, explained by a residual inefficiency in assigning clusters to the appropriate track. MC simulation and data agree within expectations in each of the individual jet \(p_{{\text {T}}}\) bins. The overall lower average number of IBL clusters on track in data is due to a not fully functional IBL detector module, which is not correctly considered in MC simulation.

Fig. 6
figure 6

The average number of IBL clusters on primary tracks (with a production vertex before the IBL) shown as a function of the angular distance of the track from the jet axis. Data (markers) and dijet MC (lines) samples are compared in bins of jet \(p_{{\text {T}}}\) showing a slight drop at small distances explained by a residual cluster-to-track assignment inefficiency

Although the SCT sensors are located at much higher radii than the pixel sensors, the expected number of shared clusters is considerably larger than for the pixels as shown in Fig. 7. This is due to the coarser segmentation of the SCT strips in one dimension and the lack of charge information hindering the identification of merged SCT clusters. The average number of shared SCT clusters decreases with the angular distance from the jet axis, correlated with the decrease in charged-particle density visible in Fig. 4 for data and MC simulation. In the studied jet-\(p_{{\text {T}}}\) range, the average number of SCT clusters on tracks is approximately 7.7 with little variation with respect to angular distances from the jet axis. The MC simulation agrees within expectations with data in the individual bins of jet \(p_{{\text {T}}}\).

Fig. 7
figure 7

The average number of shared SCT clusters for primary tracks with a production vertex before the IBL is shown as a function of the angular distance of the track from the jet axis. Data (markers) and dijet MC (lines) samples are compared in bins of jet \(p_{{\text {T}}}\). Due to the lack of charge information and the coarse sensor dimensions, the clusters cannot be readily identified as merged

5.3 Performance for collimated tracks

Quantities such as cluster assignment and track reconstruction efficiencies can be studied using truth information from simulation to elucidate the track reconstruction behaviour in the presence of highly collimated charged particles. This section utilizes the single-particle samples described in Sect. 4. Figure 8 shows how the minimum separation between charged particles at the IBL sensor surfaces evolves with the initial particle’s \(p_{{\text {T}}}\). For the same \(p_{{\text {T}}}\), the density of the decay products may differ significantly: the lighter the initial particle, or the higher the multiplicity of its decay products, the smaller the distance. The degradation of the track reconstruction performance is mainly driven by the distance between charged particles and the charged-particle multiplicity in their vicinity. The results presented hereafter are therefore representative of the reconstruction performance in many physics processes, provided these parameters are known. Throughout this section, unless otherwise noted, it is required that all charged particles are created before the IBL (production radius smaller than 29 mm) in all figures shown.

Fig. 8
figure 8

A comparison of the average minimum distance between charged decay products at the IBL sensor surfaces as a function of initial particle’s \(p_{{\text {T}}}\) for single-particle samples

The average number of merged clusters is compared to the average number of clusters identified as merged in Fig. 9 for the single \(\rho \) and three-prong \(\tau \) samples. The average charged-particle separation decreases with increasing initial-particle \(p_{{\text {T}}}\) leading to more merged pixel clusters as shown in the points labelled Ideal. The average numbers of both the merged clusters and the clusters identified as merged fall to zero at the lowest initial-particle \(p_{{\text {T}}}\), confirming a low rate of false-positives. Both grow at a similar rate with increasing initial-particle \(p_{{\text {T}}}\). The residual inefficiency of the pixel NN is apparent in a lower number of clusters identified as merged compared to the ideal number of merged clusters at high initial-particle \(p_{{\text {T}}}\). The reconstruction performance correlates directly with the multiplicity and distances at a given initial-particle \(p_{{\text {T}}}\) shown in Fig. 8.

Fig. 9
figure 9

A comparison of the average number of merged pixel clusters expected for truth particles from simulation and pixel clusters identified as merged used in reconstructed tracks is shown as a function of the \(\rho \) and three-prong \(\tau \) (\(\tau \rightarrow \pi ^{+} \pi ^{-} \pi ^{\pm } \nu _{\tau }\)) transverse momentum. Ideal represents the true number of merged clusters, which would be obtained as the number of identified merged clusters in the case of perfect performance. It is required that the stable charged particles are created before the IBL. a \(\rho \rightarrow \pi ^{+}\pi ^{-}\) sample. b \(\tau ^{\pm } \rightarrow \pi ^{+} \pi ^{-} \pi ^{\pm } \nu _{\tau }\) sample

Merged clusters failing identification can result in shared clusters, which (as explained in Sect. 3.3) need to be limited. To study possible inefficiencies of the reconstruction algorithm, the cluster assignment efficiency is shown in Fig. 10 as a function of the minimum truth particle separation at the sensor’s surface for the first two layers of the pixel detector. It is defined as the fraction of clusters created by a particle that are then used on the reconstructed track of said particle. With the closest truth particle separated by 400 \(\mu \)m at the IBL, the cluster assignment efficiency at this layer is in excess of 99% for the \(\rho \) and three-prong \(\tau \) samples, and 98% for the \(B^{0}\) samples. When going to smaller separations, individual clusters start to merge and eventually only a single merged cluster remains. Since in the simpler topology \(\rho \rightarrow \pi ^{+}\pi ^{-}\) the cluster has to be assigned to a maximum of two tracks, the cluster assignment efficiency is 99% down to the smallest distances shown. In case of the \(B^{0}\) and three-prong \(\tau \) decays, several daughter particles are likely to contribute to a merged cluster. The NN described in Sect. 3.4 lacks the ability to distinguish between merged clusters from more than three particles and those from exactly three particles [14]. Also, the track reconstruction algorithm limits the number of tracks using the same cluster without penalties to three. As a result, at much smaller particle separations, the cluster assignment efficiency is limited in the \(B^{0}\) and three-prong \(\tau \) samples. The case of more than three charged particles contributing to a pixel cluster in the \(B^{0}\) decay results in an additional assignment inefficiency on the B-layer.

Fig. 10
figure 10

For the \(\rho \) (top), three-prong \(\tau \) (middle), and \(B^{0}\) (bottom) samples, the efficiency with which reconstructed clusters are properly assigned to a track is shown for the two innermost pixel layers (IBL and B-layer) as a function of the minimum truth-particle separation in local y (left) and x (right), corresponding to the pixel dimensions longitudinal and transverse to the beam axis. It is required that the stable charged particles are created before the IBL. a \(\rho \rightarrow \pi ^{+}\pi ^{-}\) sample. b \(\rho \rightarrow \pi ^{+}\pi ^{-}\) sample. c \(\tau \rightarrow \pi ^{+} \pi ^{-} \pi ^{\pm } \nu _{\tau }\) sample. d \(\tau \rightarrow \pi ^{+} \pi ^{-} \pi ^{\pm } \nu _{\tau }\) sample. e \(B^{0} \rightarrow X\) sample. f \(B^{0} \rightarrow X\) sample

Regardless of how well the ambiguity solver identifies merged pixel clusters and assigns them to tracks, a substantial inefficiency remains at high initial-particle momenta due to the necessary limitations on shared SCT clusters. Figure 11 shows the reconstructable efficiency of the \(\rho \) and three-prong \(\tau \) decays utilizing MC truth information. This is defined as the efficiency to be able to reconstruct all of the charged decay products from a given resonance having satisfied the cluster multiplicity requirements defined in Sect. 3.3. All merged pixel clusters are assumed to have been identified, so for a fixed maximum number of allowed shared SCT clusters, this represents the maximum achievable reconstruction efficiency. The loss in efficiency is exacerbated by increasing charged-particle multiplicities as in the three-prong \(\tau \) sample. This limit is fixed at two shared clusters. The efficiency improvement obtained from loosening this limit is not sufficient to justify the associated increase in the proportion of fake tracks. In simulated events with several jets, the inclusive number of fake tracks increases by 25% when loosening the limit to three shared clusters.

Fig. 11
figure 11

The reconstructable efficiency, defined as the efficiency to reconstruct all of the charged decay products of the parent particle, is shown for the \(\rho \) and three-prong \(\tau \) samples with various limits on the number of shared clusters allowed on a track candidate assuming all the merged pixel clusters have been identified as merged. It is required that the stable charged particles are created before the IBL. a \(\rho \rightarrow \pi ^{+}\pi ^{-}\) sample. b \(\tau ^{\pm } \rightarrow \pi ^{+} \pi ^{-} \pi ^{\pm } \nu _{\tau }\) sample

Finally, the per-track reconstruction efficiency is shown in Fig. 12 as a function of particle \(p_{{\text {T}}}\) and production radius. The production radius is defined as the radial distance of the decay of the parent particle from the beam axis. The efficiency degrades with increased multiplicity. The visible inefficiency in all samples at low initial-particle \(p_{{\text {T}}}\) is due to inelastic interactions, such as hadronic interactions. At higher transverse momentum of the initial particle, a decrease in efficiency is driven by the increasingly collimated nature of the decay products. A decrease in efficiency is also seen with a increasing production radius as the charged particles arrive at each active layer with less average separation. The requirement on the total number of clusters for track reconstruction leads to discrete drops in efficiency at each active layer.

Fig. 12
figure 12

Single-track reconstruction efficiency is shown as a a function of the initial particle’s \(p_{{\text {T}}}\) when it is required that the parent particle decays before the IBL for the decay products of a \(\rho \), three- and five-prong \(\tau \) and a \(B^{0}\) and, b versus the production radius for the decay products of a three- and five-prong \(\tau \) as well as a \(B^{0}\), where no requirement is imposed on the production radius of stable charged particles. a Efficiency versus initial particle’s \(p_{{\text {T}}}\). b Efficiency versus production radius

5.4 Performance for tracks in jets

In the previous sections, the performance in simple topologies is discussed. These samples are crucial for understanding the effects of charged-particle separations and multiplicities on the performance, but they are insufficient to quantify the expected performance in the dense jet environments evident in Fig. 4. As demonstrated in Sect. 5.2, samples of dijet MC events do provide a reasonable description of jets in data. The following contains studies of the track reconstruction efficiency in these samples.

Figure 13 shows the charged-primary-particle reconstruction efficiency dependence on the angular distance of a particle to the jet axis for different jet \(\eta \) and \(p_{{\text {T}}}\) ranges. All charged particles studied are required to be created before the IBL. The efficiency drops rapidly towards the centre of the jet, where the charged-particle density is maximal. A slight decrease in efficiency towards the edge of the jet is consistent with an isolated-track efficiency that rises with charged-particle \(p_{{\text {T}}}\)  [27] and a decrease in the average charged-particle \(p_{{\text {T}}}\) with distance from the jet core. The dependence of the efficiency on the jet \(p_{{\text {T}}}\) and on the production radius of the charged particle, where charged particles are not required to be created before the IBL, is shown in Fig. 14. The decrease in efficiency with production radius is from two effects. Firstly, particles created beyond the first active layers of the ID create fewer clusters. Secondly, with the shorter flight length to the next active layer, the average separation between particles is smaller compared to prompt decays, producing more merged clusters. The overall trend for all efficiencies shown is the same at all \(\eta \). However, the loss in absolute efficiency is exacerbated at high \(|\eta |\), while the degradation at small separations between a track and the jet axis is alleviated.

Fig. 13
figure 13

The efficiency to reconstruct charged primary particles in jets with a \(|\eta |<1.2\) and b \(|\eta |>1.2\) is shown as a function of the angular distance of the particle from the jet axis for various jet \(p_{{\text {T}}}\) for simulated dijet MC events

Fig. 14
figure 14

The track reconstruction efficiency is compared for charged primary particles in jets with \(|\eta |<1.2\) (\(|\eta |>1.2\)) for the entire jet-\(p_{{\text {T}}}\) range as a function of a the jet \(p_{{\text {T}}}\) and b the production radius of the charged particle for simulated dijet MC events, where charged particles are not required to be created before the IBL

6 Measurement of track reconstruction efficiency in jets from data

Previous sections discuss the performance of the track reconstruction in dense environments based mainly on MC simulation. This section introduces a novel method to probe this performance in data. A measurement of the fraction of tracks lost in reconstruction due to the high density and collimation of charged particles in high-\(p_{{\text {T}}}\) jets is presented for the subset of tracks with a B-layer cluster created by two charged particles.

The \({\text {d}{\textit{E}}/d{\textit{x}}}\) of a charged particle traversing the pixel sensor is measured from the charge collected in the clusters associated with the reconstructed track. With single particles and thin layers, one expects the \({\text {d}{\textit{E}}/d{\textit{x}}}\) measurements to approximately follow a Landau distribution [28]. A typical particle reconstructed from an LHC collision is expected to be a minimum-ionizing particle (MIP). Thus, two particles contributing to the same cluster are expected to deposit twice the energy of a single MIP. In the context of this paper, \({\text {d}{\textit{E}}/d{\textit{x}}}\) is normalized to the material density, and it therefore has units of \({\text{MeV}} {\text{g}}^{-1}{\text{cm}}^{2}\).

As demonstrated in the previous sections, near the jet core the charged-particle density is high and particles can be highly collimated. The tracks of these particles are thus more likely to create merged clusters, as shown in Fig. 5. By fitting the cluster \({\text {d}{\textit{E}}/d{\textit{x}}}\) for reconstructed tracks near the core of the jet, single-particle clusters can be statistically separated from merged clusters. The fraction of lost tracks can therefore be inferred from the number of times only one reconstructed track is associated with a cluster \({\text {d}{\textit{E}}/d{\textit{x}}}\) compatible with two MIPs. At truth-level, this fraction is defined as follows: the denominator is the number of truth particles passing the analysis selections (listed in Sect. 6.1, and including a \(p_{{\text {T}}}>10\) \(\text {GeV}\) requirement), which have a B-layer cluster created by exactly two charged particles; the numerator is the subset of these particles which failed to be reconstructed.

For the IBL, ToT is encoded in four bits. Eight bits are available in each of the remaining three pixel layers, which therefore provide an enhanced ToT resolution compared to the IBL, resulting in a superior energy resolution. For this reason, the cluster \({\text {d}{\textit{E}}/d{\textit{x}}}\) values corresponding to the B-layer are used in this study.

6.1 Track selection

To enhance the contribution of high-quality collimated tracks and suppress fake tracks to a negligible number, additional track selections beyond those outlined in Sect. 3.3 are required for all tracks used in this analysis:

  • Exactly one pixel cluster per layer,

  • \(p_{{\text {T}}}>\) 10 \(\text {GeV}\),

  • \(|\eta |<\) 1.2,

  • \(|d_0^{{\text {BL}}}|<\) 1.5 mm,

  • \(|z_0^{{\text {BL}}}\sin \theta |<\) 1.5 mm,

  • Minimum of six SCT clusters.

6.2 Fit method

A measurement distribution of cluster \({\text {d}{\textit{E}}/d{\textit{x}}}\) of tracks inside the jet core is fit using two \({\text {d}{\textit{E}}/d{\textit{x}}}\) template distributions: a single-track template containing mainly tracks reconstructed from a single-particle cluster, and a multiple-track template mainly made up of tracks reconstructed from a merged cluster. Both templates are derived directly from collision data or from simulation for the corresponding efficiency measurements.

As verified in simulation, most highly collimated tracks are expected to be within \(\Delta \textit{R}{\text {(jet,trk)}}<0.05\) which then defines the jet core for this method. Outside the jet core, the contribution of collimated tracks is negligible, and therefore all tracks are expected to be reconstructed from a single-particle cluster. The single-track template is created using tracks reconstructed from clusters which are neither identified as merged nor shared and that are well outside the jet core (\(\Delta \textit{R}{\text {(jet,trk)}}\) \(>0.1\)). The multiple-track template is taken from tracks reconstructed from either B-layer clusters identified as merged or shared B-layer clusters inside the jet core. These multiply used clusters are likely to be merged clusters.

Examples of the resulting distributions are shown in Fig. 15. The single-track template, displayed as circles in Fig. 15, contains a single peak at the \({\text {d}{\textit{E}}/d{\textit{x}}}\) value expected for a MIP traversing the B-layer of the pixel detector and a long tail to higher values compatible with a Landau distribution. Contamination of merged clusters in this template is 0.3–0.5% in the simulation. The multiple-track template, displayed as squares in the same figure, instead exhibits a peak in the \({\text {d}{\textit{E}}/d{\textit{x}}}\) range expected for two MIPs. A third, smaller peak occurs at \({\text {d}{\textit{E}}/d{\textit{x}}}\) > 3.2 \({\text{MeV}} {\text {g}}^{-1}{\text {cm}}^{2}\) for clusters created by three particles. The peak in the multiple-track template \({\text {d}{\textit{E}}/d{\textit{x}}}\) distribution at values expected for one MIP is due to the fact that multiply used clusters can also originate from shared clusters or clusters identified as merged which, in truth, are not merged clusters.

Fig. 15
figure 15

Single-track and multiple-track templates for data with a jet \(p_{{\text {T}}}\) in the range \(200~\text {GeV}<\) \(p_{{\text {T}}}^{{\text {jet}}}\) \(<400~\text {GeV}\)

The measurement distribution is created from tracks inside the jet core that are reconstructed from a cluster which is neither identified as merged nor shared. No additional requirements are made on other tracks using this cluster, including whether or not they satisfy the selections outlined in Sect. 6.1. The resulting \({\text {d}{\textit{E}}/d{\textit{x}}}\) distribution contains single-particle clusters with a peak at the energy of one MIP and a long tail to high values, as well as an enhanced contribution of merged clusters from two particles. Contributions from clusters from more than two particles are negligible. The true two-MIP clusters are created from a pair of tracks where only one track is reconstructed. Therefore, for every reconstructed track in the measurement distribution with a merged cluster, there is one particle which is not reconstructed. Using this information, the number of tracks contributing to merged B-layer clusters from two particles (\(N^{{\text {True}}}_{{\text {2}}}\)) is found from the sum of the number of reconstructed particles in the multiple-track template (\(N^{{\text {Reco}}}_{{\text {2}}}\)) and twice the number of lost particles (\(N_{\text{Lost}}\)),

$$N^{\text{True}}_{\text{2}} = N^{\text{Reco}}_{\text{2}} + 2 \cdot N_{\text{Lost}}. $$
(1)

The sample of \(\rho \) decays discussed in Sect. 5.3 is used to confirm that the multiple-track template captures merged clusters and that the second MIP peak in the measurement sample does in fact contain merged clusters where one contributing particle is not reconstructed. Therefore, to obtain the number of lost tracks (\(N_{{\text {Lost}}}\)), the measurement distribution is fit with the two templates. The fraction of merged clusters in the measurement distribution, \(F^{{\text {merged}}}\), is simply calculated from the post-fit number of tracks in the multiple-track template divided by the total number of tracks (\(N^{{\text {Reco}}}_{{\text {Data}}}\)). Finally, the fraction of lost tracks passing through the same detector element as a reconstructed track is given by:

$$\begin{aligned} F^{{\text {lost}}_{{\text {2}}}}= & {} \frac{N_{{\text {Lost}}}}{N^{{\text {True}}}_{{\text {2}}}}, \end{aligned}$$
(2)
$$\begin{aligned}\approx & {} \frac{N_{{\text {Lost}}}}{N^{{\text {Reco}}}_{{\text {2}}} + 2 \cdot N_{{\text {Lost}}}}, \end{aligned}$$
(3)

where

$$\begin{aligned} N_{{\text {Lost}}}=F^{{\text {merged}}} \cdot N^{{\text {Reco}}}_{{\text {Data}}}. \end{aligned}$$
(4)

The relation is approximate due to the assumption that the lost track of a pair of tracks has the same properties (e.g. \(p_{{\text {T}}}\) and hit content) as the reconstructed track. In simulation, this assumption can be explicitly checked by requiring the truth particle corresponding to the lost track to also pass the analysis selections. This confirms that the deviation from the approximation results in a less than 1.5% change in \(F^{{\text {lost}}_{{\text {2}}}}\).

To minimize the effect of clusters created by more than two particles, the fit was performed over the range 1.1–3.07 (1.26–3.2) \({\text{MeV}} {\text {g}}^{-1}{\text {cm}}^{2}\) for data (simulation). Contributions from clusters from more than two particles in this range are of the order of a few percent. An offset in the distributions observed in MC events compared to data requires an adjustment of the respective fit ranges. The ranges are chosen to have the same fraction of clusters inside the fit range with respect to all clusters in the distribution. An imperfect description of the leading edge of the measurement distribution by the single-track template would affect the fitted result. Since the area of interest lies at much higher \({\text {d}{\textit{E}}/d{\textit{x}}}\) values, the lower edge of the fit range was chosen to avoid as much as possible the leading edge of the single-particle \({\text {d}{\textit{E}}/d{\textit{x}}}\) peak, while retaining a large sample for the remainder of the distribution.

To study the dependence of lost tracks on jet \(p_{{\text {T}}}\), the fit is performed in seven different bins of jet \(p_{{\text {T}}}\) ranging from 200 \(\text {GeV}\) to 1600 \(\text {GeV}\) in steps of 200 \(\text {GeV}\).

The measurement is performed both on data and simulation samples. For simulation, separate templates are constructed for each jet-\(p_{{\text {T}}}\) bin. For data, the single-track and multiple-track templates are derived from the lowest jet-\(p_{{\text {T}}}\) bin, shown in Fig. 15, due to the small number of events at higher jet \(p_{{\text {T}}}\). It was verified that within the statistical uncertainty of the high-\(p_{{\text {T}}}\) bins, the templates derived from the lowest jet-\(p_{{\text {T}}}\) bin have the same shape within the fitted range.

6.3 Systematic uncertainties

The resulting \(F^{{\text {merged}}}\) exhibits a statistical uncertainty due to the finite number of entries in both the template and the measurement distributions.

Various potential sources of systematic bias were studied and are discussed below. The relative values for data are summarized in Table 1 and values for MC simulation are comparable. The measured \(F^{{\text {lost}}_{{\text {2}}}}\) varies as a function of the range in \({\text {d}{\textit{E}}/d{\textit{x}}}\) for which the distribution is fit. This is due to the different fractions of clusters with a \({\text {d}{\textit{E}}/d{\textit{x}}}\) of two and three MIPs falling in the fitted range. The effect was estimated by increasing the fit range. The fitting process was repeated for six different ranges with the upper edge increasing in 0.2 \({\text{MeV}} {\text {g}}^{-1}{\text {cm}}^{2}\) increments. A symmetric uncertainty, equal to the maximum change in \(F^{{\text {lost}}_{{\text {2}}}}\), is applied to each jet-\(p_{{\text {T}}}\) bin. The start of the fitted range was chosen such that small variations have a negligible impact on \(F^{{\text {lost}}_{{\text {2}}}}\).

A systematic uncertainty considered for data is the result of fitting all data jet-\(p_{{\text {T}}}\) bins with the templates from the lowest jet-\(p_{{\text {T}}}\) bin. This results in an overestimate of \(F^{{\text {lost}}_{{\text {2}}}}\) increasing with jet \(p_{{\text {T}}}\). To account for this bias, a \(p_{{\text {T}}}\)-dependent multiplicative correction was determined by comparing the \(F^{{\text {lost}}_{{\text {2}}}}\) values fitted in simulation with templates from the corresponding jet-\(p_{{\text {T}}}\) bin with those obtained using a template from the lowest jet-\(p_{{\text {T}}}\) bin. This correction increases from about 10 to 25% for jets with a \(p_{{\text {T}}}\) ranging from 400 to 600 \(\text {GeV}\) and from 1400 to 1600 \(\text {GeV}\), respectively. This correction term was applied to data \(F^{{\text {lost}}_{{\text {2}}}}\) values after completing the fitting procedure. In addition, the difference between the two simulation \(F^{{\text {lost}}_{{\text {2}}}}\) values compared for the correction factor was also included as a systematic uncertainty. An additional check performed with a large simulated sample showed a 3–8% bias in \(F^{{\text {lost}}_{{\text {2}}}}\) in the studied jet-\(p_{{\text {T}}}\) range due to the fraction of tracks reconstructed from \(\ge \) 3 particle clusters, relative to the two-particle contribution in the multiple-track template.

To validate the method, and provide an estimate of any residual biases, a truth-based closure test was performed using simulated samples. At low jet \(p_{{\text {T}}}\), the residual \({\text {d}{\textit{E}}/d{\textit{x}}}\) peak at values expected from one MIP in the multiple-track template contributes to a non-closure. Also, for all jet \(p_{\text {T}}\), isolated-track reconstruction efficiency, the composition of multiple-particle clusters, including particle composition and the calibration of \({\text {d}{\textit{E}}/d{\textit{x}}}\) itself are all covered in this non-closure estimate. This is already covered by the systematic uncertainty determined from changing the fit range described above, but also leads to a non-closure. In the lowest jet-\(p_{{\text {T}}}\) bin, a non-closure of approximately +18% is observed, corresponding to an absolute overestimation of the true \(F^{{\text {lost}}_{{\text {2}}}}\) of about 0.013, but then quickly decreases with increasing jet \(p_{{\text {T}}}\). This uncertainty is included for both simulation and data with the corresponding relative values in Table 1.

Other possible sources of uncertainty are contributions to \(F^{{\text {lost}}_{{\text {2}}}}\) not originating from the density of the environment. Such contributions could come from pile-up tracks creating merged clusters with tracks in the jets, as well as lost isolated tracks. Conservative estimates based on MC studies showed that such contributions are 2–6% of the total \(F^{{\text {lost}}_{{\text {2}}}}\) in the studied jet-\(p_{{\text {T}}}\) range. This effect is covered by the non-closure systematic uncertainty described above.

Uncertainties in the jet energy scale calibration and resolution have negligible impact in the analysis. Possible effects due to the binning of the \({\text {d}{\textit{E}}/d{\textit{x}}}\) distributions were studied and found also to be insignificant.

Table 1 Measured \(F^{{\text {lost}}_{{\text {2}}}}\), relative values of leading systematic uncertainties, and total systematic and statistical uncertainty in the fraction of lost tracks for data in bins of jet \(p_{{\text {T}}}\)

6.4 Results

Figure 16 shows the fit result for data in two bins of jet \(p_{{\text {T}}}\). The single-track and multiple-track \({\text {d}{\textit{E}}/d{\textit{x}}}\) templates provide a good description of the \({\text {d}{\textit{E}}/d{\textit{x}}}\) distribution as visible from the ratio in Fig. 16.

Fig. 16
figure 16

Data \({\text {d}{\textit{E}}/d{\textit{x}}}\) measurement distributions (black circles) with fit results (solid line) are shown for a \(200~\text {GeV}<p_{{\text {T}}} ^{\mathrm {jet}}<400~\text {GeV}\) and b \(1000~\text {GeV}<p_{{\text {T}}} ^{\mathrm {jet}}<1200~\text {GeV}\). The single-track template scaled by \(1 - F^{{\text {merged}}}\) is shown as the single-track contribution (dashed line) and the multiple-track template scaled by \(F^{{\text {merged}}}\) is shown as the multiple-track contribution (dotted line). The bottom panel in each plot shows the ratio of the fit to the data within the fit range (1.1–3.07 \({\text{MeV}} {\text {g}}^{-1}{\text {cm}}^{2}\))

Differences between event generators, such as different hadronization models and flavour compositions, can affect \(F^{{\text {lost}}_{{\text {2}}}}\) and the overall comparison of data and MC simulation. By comparing the fit results from simulated samples made with the Pythia 8, Sherpa and Herwig++event generators, a generator uncertainty was derived for simulation only. For each jet-\(p_{{\text {T}}}\) bin, results from Pythia are taken as the central value and the largest difference of \(F^{{\text {lost}}_{{\text {2}}}}\) between the three generators is symmetrized and taken as the generator uncertainty. The relative generator uncertainties in the fraction of lost tracks ranges from 4 to 37% in the different jet-\(p_{{\text {T}}}\) bins.

A comparison of \(F^{{\text {lost}}_{{\text {2}}}}\) as a function of jet \(p_{{\text {T}}}\) for data and simulation is shown in Fig. 17. As the jet \(p_{{\text {T}}}\) increases, so does \(F^{{\text {lost}}_{{\text {2}}}}\), with a similar trend observed in both data and simulation. This increase is caused by an increasing density of charged particles, which thereby causes higher collimation of the track pair, and is not due to confusion in correctly assigning clusters to tracks. At a certain point, the two particles are so collimated that the reconstructed tracks start to overlap completely up to the radius of the SCT detector. At that point a similar effect as shown for tracks from the \(\rho \) decay in Figs. 10 and 12 occurs. The cluster assignment efficiency for reconstructed tracks remains constant with increasing jet \(p_{{\text {T}}}\), indicating no degradation of performance due to the environmental effects besides the second track. Only because of their increasingly collimated nature, the probability of losing one of the tracks rises. This effect was confirmed in simulation for tracks selected by this analysis.

The measurements in data and MC simulation are consistent across the whole studied jet-\(p_{{\text {T}}}\) range.

Fig. 17
figure 17

The measured fraction of lost tracks, \(F^{{\text {lost}}_{{\text {2}}}}\), in the jet core (\(\Delta \textit{R}{\text {(jet,trk)}}\) \(<0.05\)) as a function of jet \(p_{{\text {T}}}\) for data (black circles) and simulation (red line). Vertical solid error bars indicate statistical uncertainty, while the total uncertainty is represented by dashed error bars for data and a shaded area for simulation

7 Conclusion

This paper presents the performance of the ATLAS track reconstruction chain with detailed studies in dedicated topologies, such as the cores of high-\(p_{{\text {T}}}\) jets and the decays of \(\tau \)-leptons, that are characterized by charged-particle separations comparable to the inner detector’s sensor granularity. The ambiguity-solver stage of the reconstruction chain is described, including the usage of a neural-network-based approach to identify pixel clusters created by multiple charged particles. The current performance is demonstrated with simulated samples of a single particle decaying to a set of collimated charged particles. In the cores of jets, the number of IBL clusters on tracks, as well as the expected track reconstruction efficiency, is robust up to the highest investigated \(p_{{\text {T}}}\) values.

A novel, fully data-driven technique, using the energy loss to identify clusters as originating from two charged particles is introduced to measure the fraction of charged particles, creating these clusters, that fail to be reconstructed. The results are presented using tracks with \(p_{{\text {T}}}\) above 10 \(\text {GeV}\) in the core of a jet from 3.2 fb\(^{-1}\) of 13 \(\text {TeV}\) proton–proton collisions at the LHC. The measured fraction of lost tracks as a function of jet transverse momentum was found to range from \(0.061 \pm 0.006{\text {(stat.)}} \pm 0.014{\text {(syst.)}}\) to \(0.093 \pm 0.017{\text {(stat.)}}\pm 0.021{\text {(syst.)}}\) as the jet \(p_{{\text {T}}}\) increases from 200 to 1600 \(\text {GeV}\). Data and simulation are compatible for the full studied jet-\(p_{{\text {T}}}\) range. This result can be used to minimize the uncertainty in the track reconstruction inefficiency in the cores of jets relevant for jet energy and mass calibrations as well as measurements of jet properties.