1 Introduction

Many theories beyond the Standard Model (SM) predict the existence of heavy neutral leptons (HNLs) to explain the smallness of neutrino masses [1,2,3]. These leptons, N, could be observed at collider experiments if their masses are at the electroweak scale. The HNLs may mix with the light SM neutrinos \(\nu _\ell \), with a strength governed by the coupling \(V_{N\ell }\). The mixing matrix is not expected to be flavour diagonal, which leads to signatures with transitions between different lepton flavours. Experimentally, direct searches for a generic heavy neutrino are performed through their mixing with each flavour of SM neutrino, typically considering decays where no flavour mixing occurs. The HNL is expected to be long-lived if the coupling is small enough. This analysis searches for the mixing of a heavy neutrino with a muon neutrino, taking advantage of the high reconstruction efficiency for muons at LHCb. The HNL mass range covered is between 5 and 50\(\, \text {GeV/}c^2\). The dominant HNL production mechanism in this mass range is via the decay of gauge bosons, \({W} ^{\pm }\rightarrow {\ell } ^{\pm }{\nu } \) and \({Z} \rightarrow {\nu } {\nu } \), where one of the SM neutrinos mixes with the heavy neutrino. For brevity, the processes \({W} ^{\pm }\rightarrow {\ell } ^{\pm }{\nu } [{\nu } \rightarrow N]\) and \({Z} \rightarrow {\nu } {\nu } [{\nu } \rightarrow N]\) will be written as \({W} ^{\pm }\rightarrow {\ell } ^{\pm }N\) and \({Z} \rightarrow {\nu } N\) throughout. Both lepton-number-violating and lepton-number-conserving decays of a heavy neutrino are considered. The heavy neutrino is assumed to have negligibly small lifetime.

The DELPHI collaboration was first to set a limit on these types of decays considering \({Z} \rightarrow \nu N\) decays in \(e ^+e ^-\) collisions at the \(Z \) resonance, where both long- and short-lived signatures were analysed [4]. The upper limit on the \({Z} \rightarrow \nu N\) branching fraction of \(1.3\times 10^{-6}\) at 95% confidence level (CL) for N masses between 3.5 and 50\(\, \text {GeV/}c^2\) leads to one of the most stringent constraints on the coupling in this mass range. At the LHC, a more promising detection approach for N with mass below the weak scale are leptonic decays of \(W \) bosons, \({W} ^\pm \rightarrow {\ell } ^{\pm } N\). Searches by the ATLAS [5,6,7,8,9] and CMS [10,11,12,13,14,15,16] collaborations at centre-of-mass energies of 7, 8 and \(13\, \text {TeV} \) typically probed larger neutrino masses, from \(40\, \text {GeV/}c^2 \) up to \(2700\, \text {GeV/}c^2 \), employing a signature of two same-sign leptons and two jets. A recent search performed by the CMS collaboration also included final states with at least one jet, extending the probed heavy-neutrino mass range down to \(20\, \text {GeV/}c^2 \) [17]. In the mass range studied in this analysis, searches of promptly decaying heavy neutrinos in leptonic final states of the \(W \) boson at centre-of-mass energy of \(13 \, \text {TeV} \) by the ATLAS [18] and CMS [19] collaborations set constraints comparable to that of the DELPHI collaboration. A long-lived signature has also been explored by the ATLAS collaboration, excluding coupling strengths down to about \(10^{-6}\) between 4.5 and \(10\, \text {GeV/}c^2 \), and hence representing the most stringent limit to date in this mass range [18].

Fig. 1
figure 1

Properties of a heavy neutrino as a function of its mass [21, 22]: (left) the branching fractions to final states with a muon and (right) the lifetime, assuming a coupling of \(10^{-4}\)

The branching fraction (\(\mathcal {B}\)) for the decay of a \(W \) boson into a muon and a heavy neutrino is suppressed with respect to the SM decay \({W} ^+\rightarrow {\mu } ^+{\nu } \) by the mixing of the active neutrino with the heavy neutrino and a phase-space factor according to Ref. [20]

$$\begin{aligned} {\mathcal {B}} ({W} \rightarrow {\mu } N)= & {} {\mathcal {B}} ({W} \rightarrow {\mu } {\nu })\left| V_{{\mu } N}\right| ^2\nonumber \\&\times \left( 1-\frac{m_N^2}{m_W^2}\right) ^{2}\left( 1+\frac{m_N^2}{2m_W^2}\right) . \end{aligned}$$
(1)

The heavy neutrino decays via neutral or charged current interactions \(N \rightarrow \nu Z^{(*)}\), \(N \rightarrow \nu H^{(*)}\) or \( N \rightarrow {\mu } ^{\pm }{W} ^{\mp (*)}\), where the \(Z \), Higgs and \(W \) bosons can be on- or off-shell. The corresponding branching fractions are computed based on Refs. [21, 22], where the Higgs contribution is neglected due to its suppression in the mass range considered. The total width is given by the sum of the partial decay widths of charged and neutral current interactions. If the neutrino is a Majorana particle, an additional lepton-number-violating decay contributes to the same final state, with the same partial decay width as the lepton-number-conserving decay. The branching fraction to any non-charge-specific final state is unaffected, but the lifetime is a factor of two smaller than if the neutrino were a Dirac particle.

The left plot of Fig. 1 shows the branching fraction for HNL decay modes with a muon in the final state as a function of the heavy-neutrino mass. The difference between the HNL decay modes to quarks is mainly due to CKM matrix elements [23, 24], with the quark masses only playing a minor role at low heavy-neutrino masses. The branching fraction of the decay \(N\rightarrow {\mu } {\mu } {\nu } \) is about one order of magnitude smaller than that of the \(N\rightarrow {\mu } {{q} {\overline{{q}}}} '\) mode, due to negative interference between charged and neutral current interactions. In the right plot of Fig. 1 the lifetime is shown as a function of the heavy-neutrino mass assuming a coupling of \(10^{-4}\). In the low-mass regime, the lifetime is of the order of a few \(\, \text {ps} \), while at higher masses the lifetime is so small that the decay can be considered prompt.

In this paper, a search is presented for a prompt HNL in the decayFootnote 1\({W ^+} \rightarrow {\mu ^+} N\) with \(N\rightarrow \mu ^{\pm }{q} {\overline{{q}}} '\), as depicted in Fig. 2. Data collected by the LHCb experiment in proton–proton collisions at centre-of-mass energies of 7\(\, \text {TeV}\) in 2011 and 8\(\, \text {TeV}\) in 2012 are used, corresponding to integrated luminosities of 1.0 and 1.9\(\, \text {fb} ^{-1}\) [25], respectively.

The experimental signature consists of two muons and one or two jets depending on the HNL mass. The muon from \(W \) decay, denoted as \(\mu _{{W}}\), carries significant transverse momentum, while the muon from N decay, denoted as \(\mu _{N}\), has lower momentum. Both same-sign and opposite-sign muons are considered, allowing for the possibility that the HNL has a Majorana nature. The signal yields for both categories and several mass hypotheses in the range \(5{-}50\, \text {GeV/}c^2 \) are extracted from the data and normalized with respect to the \({W ^+} \rightarrow {\mu ^+} {\nu } \) decay. Corresponding upper limits are then set on the product of coupling and branching ratio.

The paper is organised as follows. In Sect. 2 the detector, data and simulation samples are described, and in Sect. 3 the selection of signal and normalisation candidates is discussed. Section 4 contains the results and conclusions are drawn in Sect. 5.

Fig. 2
figure 2

Feynman diagram for the production of a heavy neutrino via mixing with a neutrino from the decay of a \(W \) boson and semileptonic decay of the heavy neutrino into a lepton and two quarks. The subscripts \(\alpha \) and \(\beta \) indicate the lepton flavour. In this analysis \(\alpha \) and \(\beta \) are both muons

2 Detector and simulation

The LHCb detector [26, 27] is a single-arm forward spectrometer covering the pseudorapidity range \(2<\eta <5\), designed for the study of particles containing \(b \) or \(c \) quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the pp interaction region [28], a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about \(4{\mathrm {\,Tm}}\), and three stations of silicon-strip detectors and straw drift tubes [29] placed downstream of the magnet. The tracking system provides a measurement of the momentum, \(p\), of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200\(\, \text {GeV/}c\). The minimum distance of a track to a primary proton–proton collision vertex (PV), the impact parameter (IP), is measured with a resolution of \((15+29/p_{\mathrm {T}})\,\upmu \text {m} \), where \(p_{\mathrm {T}}\) is the component of the momentum transverse to the beam, in \(\, \text {GeV/}c\). Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors [30]. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad (SPD) and preshower detectors, an electromagnetic and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers [31]. The online event selection is performed by a trigger [32], which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. For the events selected for this analysis, the trigger requires at least a single muon with \(p_{\mathrm {T}} {}>1.48\, (1.76)\, \text {GeV/}c \) at the hardware stage in 2011 (2012), and includes an upper threshold of 600 hits in the SPD to prevent high-particle multiplicity events from dominating the processing time. A muon with \(p_{\mathrm {T}} {}>10\, \text {GeV/}c \) is required at the software stage.

Simulated samples were generated for the signal decay with both opposite- and same-sign muons in the final state, in equal amount. Samples were generated for HNL masses of 5, 10, 15, 20, 30, and \(50\, \text {GeV/}c^2 \), using the minimal mixing scenario model [33] and accounting for angular correlations due to spin effects. The parton level process is generated with MadGraph 5 [34, 35], while Pythia 8 [36], with a specific LHCb configuration [37], is used for the generation of the underlying event, fragmentation and hadronisation. Decays of hadronic particles are described by EvtGen  [38], in which final-state radiation is generated using Photos  [39]. The interaction of the generated particles with the detector, and its response, are implemented using the Geant4 toolkit [40, 41] as described in Ref. [42]. Simulated background samples are generated using Pythia 8. The DYTurbo [43] program is used to correct the kinematic distributions of the simulated \({W ^+} \rightarrow {\mu ^+} {\nu } \) background.

3 Event selection

Signal candidates are reconstructed from a pair of charged tracks identified as muons and a single jet. First, the high-momentum muon, \(\mu _{{W}}\), is selected. Both the hardware and software trigger decisions are required to be associated to the high-momentum muon candidate. The track is required to have transverse momentum larger than \(20\, \text {GeV/}c \), to be of good quality and to have a high significance of the track curvature to remove high transverse momentum tracks with poorly determined charge. The high-momentum muon candidate is also required to have small relative energy deposition in the calorimeters to reject pions and kaons misidentified as muons. The muon selection criteria for the normalisation channel \({W ^+} \rightarrow {\mu ^+} {\nu } \) are the same as for the high-momentum muon of the signal.

The lower-momentum muon candidate, \(\mu _{N}\), is required to have transverse momentum higher than \(3\, \text {GeV/}c \). The combined invariant mass of the \(\mu _{N}\) and \(\mu _{{W}}\) candidates must be in the range 20 to 70\(\, \text {GeV/}c^2\) to suppress the background from \({Z} \rightarrow \mu \mu \) decays. Depending on the relative charge of the two muons the candidates are classified as same-sign (SS) or opposite-sign (OS).

Jets are reconstructed following a particle flow approach [44], using tracks of charged particles and calorimeter energy deposits as inputs. To prevent overlap between jets and signal muons, tracks identified as muons with a transverse momentum greater than \(2\, \text {GeV/}c \) are excluded from the jet reconstruction. The anti-\(k_T\) jet clustering algorithm is used [45], with a distance parameter \(R = \sqrt{(\Delta \phi )^2+(\Delta \eta )^2}= 0.5\), where \(\phi \) is the azimuthal angle and \(\theta \) the pseudorapidity. The jet four-momentum is calculated from the four-vectors of its constituents, and corrected for pollution from pile-up and the underlying event using the per-event particle multiplicity [44]. To enhance the jet purity the fraction of the jet energy carried by charged particles should be at least 0.1, the jet must have \(p_{\mathrm {T}} >10\, \text {GeV/}c \) and contain at least one track with \(p_{\mathrm {T}} >1.2\, \text {GeV/}c \). Only candidates with at least one jet passing these criteria are retained. Jets are combined with lower-momentum muon candidates to form \(N\rightarrow {\mu _{N}} {\, \text {jet}} \) candidates, which are required to have invariant mass smaller than 80\(\, \text {GeV/}c^2\) and a transverse momentum greater than \(10\, \text {GeV/}c \). The selected heavy-neutrino candidates are then combined with a high-momentum muon candidate to form \(W \) candidates. Since the assignment of the two muons is ambiguous if they both satisfy the high-momentum muon selection, the mass, \(m({\mu _{N}} {\, \text {jet}})\), of the \({\mu _{N}} {\, \text {jet}} \) combination is required to be smaller than that of the \({\mu _{{W}}} {\, \text {jet}} \) combination. Only the \({\mu _{{W}}} {\mu _{N}} {\, \text {jet}} \) candidates within 20\(\, \text {GeV/}c^2\) of the known \(W \) mass [46] are retained.

A scale factor is applied to the jet four-momentum, constraining the invariant mass of the \({\mu _{{W}}} {\mu _{N}} {\, \text {jet}} \) system to the known mass of the \(W \) boson. This leads to a significant improvement in the resolution of \(m({\mu _{N}} {\, \text {jet}})\) and diminishes the sensitivity of the heavy-neutrino mass distribution to the jet energy scale.

Dominant background sources are charged weak currents, in particular \(pp\rightarrow W + X \) with \({W} \rightarrow \mu \nu \) or \({W} \rightarrow \tau \nu \), neutral electroweak Drell–Yan processes \(pp \rightarrow \gamma /Z^{(*)}+X\) with \(\gamma /{Z} ^{(*)} \rightarrow \mu \mu , \tau \tau \), heavy flavour \({b} {\overline{{b}}} \rightarrow X\mu \) and \({c} {\overline{{c}}} \rightarrow X \mu \), and \(X{\mu } \) production from light QCD \(({u},{d},{s})\). In the same-sign muon channel the Drell–Yan type background contributions are highly suppressed, while in the opposite-sign muon channel the contribution from low-mass Drell–Yan processes remains a prominent irreducible background.

Most of the heavy flavour background is suppressed by requiring the IP for \(\mu _{{W}}\) and \(\mu _{N}\) to be less than \(40\,\upmu \text {m} \) and \(100\,\upmu \text {m} \), respectively. The remaining background is reduced by using three different multivariate classifiers based on a Boosted Decision Tree (BDT) algorithm [47,48,49]. The three classifiers are referred to as the \(\mu _{{W}}\) uBDT, the \(\mu _{N}\) uBDT and the kinematics uBDT: the first two classifiers are dedicated to the identification of the respective muons, while the latter exploits the event kinematics to distinguish the signal from the remaining background. All three are trained minimising the dependence of the signal efficiency on the true neutrino mass using the uBoost method [50]. The training of all classifiers uses a cross-validation technique [51]. The classifiers are trained using simulated decays of the heavy neutrino with same-sign muons in the final state as a proxy for signal. Both charged and neutral weak background contributions have a muon in the final state with similar kinematics to the signal high-momentum muon. The \(\mu _{{W}}\) classifier discriminates between the signal and heavy flavour background. It is trained using data candidates where both muons have large impact parameters (\(\text {IP}({\mu _{{W}}})>{{40}{\,\upmu \text {m}}}\), \(\text {IP}({\mu _{N}})>{{100}{\,\upmu \text {m}}}\)) as a proxy for background. For both the \(\mu _{N}\) uBDT and the kinematics uBDT, a combination of the dominant background sources from simulation is used. The input variables used for each of the muon identification classifiers are the combined particle identification information from the RICH, calorimeter and muon systems, the ratio of the energy deposited in both calorimeters to the measured track momentum, and observables describing the isolation of the tracks. Additional isolation variables of different cone sizes are included among the inputs for the \(\mu _{N}\) uBDT classifier. The input variables of the kinematics classifier comprise the angular distance R between the \(\mu _{N}\) and the jet, the angle between the two muons in the rest frame of the heavy neutrino, the transverse component of the sum of the four-momentum of all particles used as particle flow input, the dimuon mass, the combined invariant mass of the dimuon and the jet, and the jet transverse momentum. The optimal requirement on the output of each BDT classifier is selected by maximising the Punzi figure-of-merit [52] for three units of significance. This is first evaluated for the \(\mu _{{W}}\) uBDT, followed by the simultaneous optimisation of the \(\mu _{N}\) and kinematics uBDTs. The optimal requirements are found to be the same for all the simulated signal samples. The selection is optimised for the same-sign muon signal, but it is verified to be optimal for the opposite-sign category as well, since the differences in spin-dependent observables between the two channels have a negligible effect on the output distributions of the BDT classifiers.

The background sources are studied and evaluated in three control regions: one enhanced in electroweak \(W \) background components, one in heavy flavour background components and one in light QCD background components, indicated as \(W \), \(b \) \(\overline{{b}}\) and QCD regions, respectively. The requirements defining the control regions with respect to the signal region are reported in Table 1. An additional region, denoted as the \({Z} \rightarrow {\mu } {\mu } \) region, is defined by the following criteria: both muons are required to have transverse momentum greater than \(20\, \text {GeV/}c \) and IP smaller than 40\(\,\upmu \text {m}\) and the invariant mass of the muon pair must be between 60 and 120\(\, \text {GeV/}c^2\). In each control region the predicted background composition and yield are compared to the data to confirm that no other contribution has been neglected.

Table 1 Requirements on IP and BDT classifiers defining the signal and control regions

4 Fit strategy and results

The product of the branching fraction \({\mathcal {B}} (N\rightarrow {\mu } {\, \text {jet}})\) and the squared coupling \(\left| V_{{\mu } N}\right| ^2\) is proportional to the number of signal candidates, \(N_{\text {sig}}\), and can be written with respect to the number of \({W} \rightarrow {\mu } {\nu } \) candidates as

$$\begin{aligned}&{\mathcal {B}} (N\rightarrow {\mu } {\, \text {jet}}) \left| V_{{\mu } N}\right| ^2=\nonumber \\&\quad \frac{N_{\text {sig}}}{N_{\text {norm}}}\frac{{\varepsilon } _{\text {norm}}}{{\varepsilon } _{\text {sig}}}\left( 1-\frac{m_N^2}{m_W^2}\right) ^{-2}\left( 1+\frac{m_N^2}{2m_W^2}\right) ^{-1}, \end{aligned}$$
(2)

where \(N_{\text {sig}}\) and \(N_{\text {norm}}\) denote the yields of the signal and normalisation channels and \({\varepsilon } _{\text {sig}}\) and \({\varepsilon } _{\text {norm}}\) their efficiencies. The phase-space suppression factor and the coupling term arise from the heavy-neutrino production process described by Eq. 1. The \({W ^+} \rightarrow {\mu ^+} {\nu } \) branching fraction in Eq. 1 cancels with the normalisation channel.

4.1 Normalisation channel

The yield of the normalisation channel is determined using a binned maximum-likelihood fit to the muon transverse momentum distribution separately for each year of data taking and in eight bins of muon pseudorapidity. The fit is performed separately for positively and negatively charged muons to account for the difference in production rate at LHCb. The main background contributions are \(\gamma /{Z} ^{(*)}{\rightarrow } \mu \! \mu \) decays and hadron misidentification (denoted as QCD). Minor contamination from \({Z} \rightarrow \tau \tau \), \({W} \rightarrow \tau \nu \) and \({{b} {\overline{{b}}}} \) processes is also present. The templates are obtained in bins of pseudorapidity from simulation for each component, with the exception of the QCD background templates that are determined from a control sample characterised by large energy deposits in the calorimeters. The yields for the minor background contributions are fixed to their expected values from simulation. The yield for the \({Z} \rightarrow {\mu } {\mu } \) component is constrained to the value obtained from the corresponding control region extrapolated according to simulation. The distribution of the muon transverse momentum for 2012 data integrated over pseudorapidity is shown in Fig. 3 with the filled histograms resulting from the fit to the data overlaid.

Fig. 3
figure 3

The (left) positive and (right) negative muon transverse momentum spectra for the 2012 data set integrated over pseudorapidity for the normalisation channel. The filled histograms are the result of the fit to the data

Systematic uncertainties on the normalisation yield are estimated separately for the 2011 and 2012 data sets by varying the shape and normalisation of the templates. Replacing the QCD template with an exponential distribution and varying the \({W} \rightarrow {\mu } {\nu } \) templates each yield a difference with respect to the default fit of about 1%, which is assigned as a systematic uncertainty. The ratio of measured QCD yields per pseudorapidity bin between positively and negatively charged muons deviates from unity. A systematic uncertainty of 0.7% is assigned to account for the difference with respect to the default fit when the normalisation of the QCD component is fixed bin by bin to the average of the yields. Systematic uncertainties of less than 0.1% are assigned for each of the components whose yield is fixed in the fit to account for the largest variation observed with respect to the default fit when each yield is changed by one standard deviation. For the 2011 data set an additional source of uncertainty is considered to account for the difference in templates between 2011 and 2012 simulation, resulting in 0.7% assigned systematic uncertainty. The total systematic uncertainty on the normalisation yield is 1.8 and 1.6% for the 2011 and 2012 data sets, respectively.

The total yield for the normalisation channel \({W} \rightarrow {\mu } {\nu } \) is \((795\pm 1 \pm 15)\times 10^3\) for 2011 data and \((1719 \pm 2 \pm 27)\times 10^3\) for 2012 data, where the first uncertainty is statistical and the second systematic. The total yield comprises 57% \(W ^+\) decays and 43% \(W ^-\) decays. The ratio of the measured yields for positively and negatively charged muons as a function of pseudorapidity is in good agreement with the simulation and the measurement of Ref. [53].

4.2 Efficiency ratio

The efficiency and corresponding uncertainties of the selection requirements for both the normalisation and signal samples are determined separately for each year of data taking using simulation. Corrections to account for mismodelling in simulation are derived from control samples, such as \({Z} \rightarrow {\mu ^+\mu ^-} \) and \({{J /\psi }} \rightarrow {\mu ^+\mu ^-} \), and are applied to the efficiencies related to the reconstruction of the two muons, the required number of hits in the SPD and the \(\mu _{{W}}\) uBDT and \(\mu _{N}\) uBDT criteria. When sufficient data is available, the corrections are evaluated in bins of pseudorapidity and momentum or transverse momentum of the muon. The dominant source of systematic uncertainties arises from the different detector response to jets between simulation and data. The energy scale is modelled to an accuracy of about 5%, driven mainly by the response to neutral particles, while the jet energy resolution is modelled in simulation to an accuracy of about 10% [44, 54, 55]. The corresponding systematic uncertainties on the efficiency ratio are evaluated in simulation by varying the jet energy by 5% for the former and by smearing the jet transverse momentum by 10% for the latter. Both resulting uncertainties vary between 5 and 11% depending on the heavy-neutrino mass, where the fluctuation is due to the limited size of the simulated samples. The overall uncertainty due to jet identification requirements, which amounts to 1.7%, is taken from Ref. [56]. A systematic uncertainty to account for the mismatch between simulation and data of the missing transverse momentum in the event varies between 1 and 2.5% depending on the heavy-neutrino mass. The uncertainties related to the efficiency of the \(\mu _{{W}}\) selection largely cancel for the signal and normalisation modes, since their selections are identical. The relative uncertainty on the correction factors is of the order of 2%. The ratios of efficiencies between the normalisation and signal channel, for different heavy neutrino masses, are reported in Table 2.

Table 2 Efficiency ratios, for different heavy-neutrino masses, between normalisation and signal channels. The first uncertainty is statistical, the second is systematic

4.3 Neutrino mass model

The signal yield for each heavy-neutrino mass hypothesis is determined from a binned maximum-likelihood fit to the invariant mass \(m({\mu _{N}} {\, \text {jet}})\). In the fits, the normalisation channel yield, the efficiency ratio, and background yields are Gaussian-constrained to their expected values within uncertainties.

The yields for the main background components are determined in the respective control regions. The yields for \({W} \rightarrow \mu {\nu } \) and \({Z} \rightarrow {\mu } {\mu } \) background contributions are obtained from a binned maximum-likelihood fit of the invariant mass \(m({\mu _{N}} {\, \text {jet}})\) in the \(W \) region, and for the \({{b} {\overline{{b}}}} \) background in the \({{b} {\overline{{b}}}} \) region. The fits in the control regions are performed separately for positively and negatively charged \(\mu _{{W}}\) and per year of data taking with templates obtained from simulation. The expected background yields in the signal region are determined by scaling the fitted yields according to simulation. The light QCD contribution in the signal region is estimated with a different method. The efficiency of the \(\mu _{{W}}\) uBDT requirement \(\varepsilon _{\text {QCD}}\) is evaluated using the normalisation channel, assuming that it factorises from the other selection criteria that suppress the QCD background. The number of light QCD events in the QCD region is obtained by subtracting from the total number of events the expected yields for the \(W \), \(Z \) and heavy flavour background components. The result is scaled by the ratio \(\varepsilon _{\text {QCD}} / (1-\varepsilon _{\text {QCD}})\) to determine the number of light QCD events in the signal region. The estimated background yields in the signal region are collected in Table 3 for same-sign and opposite-sign muons in Run 1 (2011 and 2012 combined) data. The uncertainty is dominated by the limited size of the simulated samples. The background predictions are tested in validation regions. These are defined by inverting one by one the requirements of Table 1 defining the signal region. The results are found to be in good agreement with the data.

Table 3 Extrapolated background yields in the signal region for same-sign and opposite-sign muon channels. The uncertainty is statistical

The templates for both signal and background contributions are determined from simulation. The light QCD background is assumed to have the same shape as the \({{b} {\overline{{b}}}} \) background and therefore a single component is included in the fit for both.

Fig. 4
figure 4

Distributions of the invariant mass \(m({\mu } _N{\, \text {jet}})\) for (left) same-sign and (right) opposite-sign muons. The signal component corresponds to a \(15\, \text {GeV/}c^2 \) neutrino

Fig. 5
figure 5

Expected (dashed line) and observed (solid line) upper limit on \({\mathcal {B}} (N\rightarrow {\mu } {\, \text {jet}})\left| V_{{\mu } N}\right| ^2\) at 95% CL for (left) the same-sign muons sample and (right) the opposite-sign muons sample. The light and dark green bands show the \(1\sigma \) and \(2\sigma \) uncertainties, respectively, on the expected upper limits

4.4 Results

The number of events observed in data in the signal region amounts to 8 and 2147 for same-sign and opposite-sign muons, respectively. A single fit to the Run 1 data is performed since the 2011 and 2012 templates are found to be compatible. The distributions of the invariant mass \(m({\mu } _N{\, \text {jet}})\) for same-sign and opposite-sign muon data are shown in Fig. 4 with the fits for the \(15\, \text {GeV/}c^2 \) neutrino mass hypothesis superimposed. Upper limits at 95% confidence level on \({\mathcal {B}} (N\rightarrow {\mu } {\, \text {jet}}) \left| V_{{\mu } N}\right| ^2\) are set for each heavy-neutrino mass hypothesis using the CLs method [57] with a one-sided profile likelihood ratio [58] as test statistic. The upper limits as a function of heavy-neutrino mass are shown in Fig. 5. For the same-sign muons sample and neutrino mass in excess of \(20\, \text {GeV/}c^2 \), the measured limit is between 2 and 3.8 standard deviations above the expected limit. The worse limit obtained with respect to the expectation can be attributed to the four data candidates with \(m({\mu } {\, \text {jet}})\) between 20 and \(40\, \text {GeV/}c^2 \). The value of the muon identification BDTs for three of the candidates are very close to the requirements, defined a priori with a blinded procedure, indicating that they are background-like and probably a QCD fluctuation. Each candidate has also a relatively large value for the missing transverse momentum in the event, which is not characteristic for the signal. Consequently, the excess at high mass is likely the result of an imperfectly modelled component of the background. For the opposite-sign muons samples, the expected limit is a factor 5 to 10 worse due to the irreducible background from Drell–Yan processes, in agreement with expectations.

To set upper limits on the coupling, the results of Fig. 5 are scaled by \({\mathcal {B}} (N\rightarrow {\mu } {\, \text {jet}})=0.51\), computed as described in Sect. 1 assuming \(\left| V_{{e} N}\right| ^2 = \left| V_{{\tau } N}\right| ^2 = 0\). For the \(5\, \text {GeV/}c^2 \) heavy-neutrino mass hypothesis, at the limit set, the heavy neutrino is expected to be long-lived with a lifetime of \(3.8\, \text {ps} \) and \(1.1\, \text {ps} \) for same- and opposite-sign muons in the final states, respectively. Since this search targets prompt heavy neutrinos, the acceptance is corrected accordingly. The constraints on the coupling as a function of mass for the opposite- and same-sign muons final state, with and without the acceptance correction factor applied, are illustrated in Fig. 6.

Fig. 6
figure 6

Observed upper limit on the mixing parameter \(\left| V_{{\mu } N}\right| ^2\) between a heavy neutrino and a muon neutrino in the mass range \(5{-}50 \, \text {GeV/}c^2 \) for same-sign and opposite-sign muons in the final states with and without lifetime correction

5 Conclusion

A search for a prompt heavy neutrino in the decay \(N\rightarrow \mu {\, \text {jet}} \) is performed using data from proton–proton collisions recorded by the LHCb experiment, corresponding to a total integrated luminosity of \(3\, \text {fb} ^{-1} \). No evidence for heavy neutrinos is observed and limits of the order of \(10^{-4}\) and \(10^{-3}\) are set as a function of heavy-neutrino mass for lepton-number-conserving and lepton-number-violating decays, respectively. An upwards fluctuation is present in the lepton-number-violating case, which is likely ascribable to an imperfectly modelled component of the background. These represent the first limits on the coupling to a heavy neutrino in the mass range 5–\(50\, \text {GeV/}c^2 \) at LHCb. For the first time the signature of two muons and a low mass jet has been probed for heavy neutrinos with mass lower than \(20\, \text {GeV/}c^2 \). Furthermore, this is the first limit on lepton-number-conserving decays of a prompt heavy neutrino in the mass range of interest. The observed limits on lepton-number-violating decays are not yet competitive with the existing limits [4, 18, 19]. With an integrated luminosity of \(50\, \text {fb} ^{-1} \), a better sensitivity than the current most stringent limit could be reached for the same-sign muons channel. While this analysis targets prompt heavy-neutrino decays, better sensitivity for low heavy-neutrino masses can be achieved by including long-lived signatures.