1 Introduction

Rare decays based on the flavour-changing neutral current \(b\rightarrow s\) transition are sensitive probes of physics beyond the Standard Model (SM). In recent years, a plethora of observables, including branching ratios, CP and angular asymmetries in inclusive and exclusive B decay modes, has been measured at the B factories and at LHC experiments. This wealth of data allows to investigate the helicity structure of flavour-changing interactions as well as possible new sources of CP violation.

In 2013, the observation by LHCb of a tension with the SM in \(B\rightarrow K^*\mu ^+\mu ^-\) angular observables [1] has received considerable attention from theorists and it was shown that the tension could be softened by assuming the presence of new physics (NP) [25]. In 2014, another tension with the SM has been observed by LHCb, namely a suppression of the ratio \(R_K\) of \(B\rightarrow K\mu ^+\mu ^-\) and \(B\rightarrow Ke^+e^-\) branching ratios at low dilepton invariant mass [6]. Assuming new physics in \(B\rightarrow K\mu ^+\mu ^-\) only, a consistent description of these anomalies seems possible [710]. In addition, also branching ratio measurements of \(B\rightarrow K^*\mu ^+\mu ^-\) and \(B_s\rightarrow \phi \mu ^+\mu ^-\) decays published recently [11, 12] seem to be too low compared to the SM predictions when using state-of-the-art form factors from lattice QCD or light-cone sum rules (LCSR) [1316]. Finally, in the latest update of the LHCb \(B\rightarrow K^*\mu ^+\mu ^-\) analysis from 2015 [17], the tensions in angular observables persist.

While the ratio \(R_K\) is theoretically extremely clean, predicted to be 1 to an excellent accuracy in the SM [18], the other observables mentioned are plagued by sizable hadronic uncertainties. On the one hand, they require the knowledge of the QCD form factors; on the other hand, even if the form factors were known exactly, there would be uncertainties from contributions of the hadronic weak Hamiltonian that violate quark–hadron duality and/or break QCD factorisation. These two sources of theoretical uncertainty have been discussed intensively in the recent literature [16, 1921] (see also the earlier work [2225], as well as efforts to design observables with limited sensitivity to hadronic uncertainties in various kinematic regimes [2632]). Understanding how large these hadronic effects could be is crucial to disentangle potential new physics effects from underestimated non-perturbative QCD effects, if significant tensions from the SM expectations are observed in the data. The main aim of our present analysis is thus to perform a global analysis of all relevant experimental data to answer the following questions:

  1. 1.

    Is there a significant tension with SM expectations in the current data on \(b\rightarrow s\) transitions?

  2. 2.

    Assuming the absence of NP, which QCD effects could have been underestimated and how large would they have to be to bring the data into agreement with predictions, assuming they are wholly responsible for an apparent tension?

  3. 3.

    Assuming the QCD uncertainties to be estimated sufficiently conservatively, what do the observations imply for NP, both model independently and in specific NP models?

Our work builds on our previous global analyses of NP in \(b\rightarrow s\) transitions [3, 33, 34], but we have built up our analysis chain from scratch to incorporate a host of improvements, including in particular the following.

  • In our global \(\chi ^2\) fits, we take into account all the correlations of theoretical uncertainties between different observables and between different bins. This has become crucial to assess the global significance of any tension, as the experimental data are performed in more and more observables in finer and finer bins.

  • We assess the impact of different choices for the estimates of theoretical uncertainties on the preferred values for the Wilson coefficients.

  • We model the subleading hadronic uncertainties in exclusive semileptonic decays in a different way, motivated by discussions of these effects in the recent literature (see e.g. [16, 19, 20, 2225]); see Sect. 2 for details.

The novel features of our analysis in comparison to similar recent studies in the literature [2, 4, 5, 8, 9], are as follows:

  • We use the information on \(B\rightarrow K^*\) and \(B_s\rightarrow \phi \) form factors from the most precise LCSR calculation [13, 16], taking into account all the correlations between the uncertainties of different form factors and at different \(q^2\) values. This is particularly important to estimate the uncertainties in angular observables that involve ratios of form factors.

  • We include in our analysis the branching ratio of \(B_s\rightarrow \phi \mu ^+\mu ^-\), showing that there exists a significant tension between the recent LHCb measurements and our SM predictions.

Our paper is organised as follows. In Sect. 2, we define the effective Hamiltonian and discuss the most important experimental observables, detailing our treatment of theoretical uncertainties. In Sect. 3, we perform the numerical analysis. We start by investigating which sources of theoretical uncertainties, if underestimated, could account for the tension even within the SM. We then proceed with a model-independent analysis beyond the SM, studying the allowed regions for the NP Wilson coefficients. In Sect. 4, we discuss what the model-independent findings imply for the minimal supersymmetric standard model as well as for models with a new heavy neutral gauge boson. We summarise and conclude in Sect. 5. Several appendices contain all our SM predictions for the observables of interest, details of our treatment of form factors and plots of constraints on Wilson coefficients.

2 Observables and uncertainties

In this section, we specify the effective Hamiltonian encoding potential new physics contributions and we discuss the most important observables entering our analysis. The calculation of the observables included in our previous analyses [3, 33, 34] (see also [16, 35]) have been discussed in detail there and in references therein; here we only focus on the novel aspects of the present analyses – like the \(B_s\rightarrow \phi \mu ^+\mu ^-\) decay – and on our refined treatment of theoretical uncertainties.

2.1 Effective Hamiltonian

The effective Hamiltonian for \(b\rightarrow s\) transitions can be written as

$$\begin{aligned} \mathcal{H}_{\text {eff}} = - \frac{4\,G_F}{\sqrt{2}} V_{tb}V_{ts}^* \frac{e^2}{16\pi ^2} \sum _i (C_i O_i + C'_i O'_i) + \text {h.c.} \end{aligned}$$
(1)

and we consider NP effects in the following set of dimension-6 operators:

$$\begin{aligned}&O_7 = \frac{m_b}{e} (\bar{s} \sigma _{\mu \nu } P_{R} b) F^{\mu \nu },\quad O_7^{\prime } = \frac{m_b}{e} (\bar{s} \sigma _{\mu \nu } P_{L} b) F^{\mu \nu },\end{aligned}$$
(2)
$$\begin{aligned}&O_9 = (\bar{s} \gamma _{\mu } P_{L} b)(\bar{\ell } \gamma ^\mu \ell ),\quad O_9^{\prime } = (\bar{s} \gamma _{\mu } P_{R} b)(\bar{\ell } \gamma ^\mu \ell ), \end{aligned}$$
(3)
$$\begin{aligned}&O_{10} = (\bar{s} \gamma _{\mu } P_{L} b)( \bar{\ell } \gamma ^\mu \gamma _5 \ell ), \quad O_{10}^{\prime } = (\bar{s} \gamma _{\mu } P_{R} b)( \bar{\ell } \gamma ^\mu \gamma _5 \ell ). \end{aligned}$$
(4)

Of the complete set of dimension-6 operators invariant under the strong and electromagnetic gauge groups, this set does not include:

  • Four-quark operators (including current-current, QCD penguin, and electroweak penguin operators). These operators only contribute to the observables considered in this analysis through mixing into the operators listed above and through higher order corrections. Moreover, at low energies they are typically dominated by SM contributions. Consequently, we expect the impact of NP contributions to these operators on the observables of interested to be negligible.Footnote 1

  • Chromomagnetic dipole operators. In the radiative and semileptonic decays we consider, their Wilson coefficients enter at leading order only through mixing with the electromagnetic dipoles and thus enter in a fixed linear combination, making their discussion redundant.

  • Tensor operators. Our rationale for not considering these operators is that they do not appear in the dimension-6 operator product expansion of the Standard Model [3739]. Consequently, they are expected to receive only small NP contributions unless the scale of new physics is very close to the electroweak scale, which is in tension with the absence of new light particles at the LHC.

  • Scalar operators of the form \((\bar{s} P_{A} b)( \bar{\ell } P_{B} \ell )\). The operators with \(AB=LL\) or RR do not appear in the dimension-6 operator product expansion of the Standard Model either. While the ones with \(AB=LR\) and RL do appear at dimension 6, their effects in semileptonic decays are completely negligible once constraints from \(B_s\rightarrow \mu ^+\mu ^-\) are imposed [39]. The constraints from \(B_s\rightarrow \mu ^+\mu ^-\) can only be avoided for a new physics scale close to the electroweak scale such that scalar LL and RR operators can have non-negligible impact.

2.2 \(B\rightarrow K\mu ^+\mu ^-\)

2.2.1 Observables

The differential decay distribution of \(B\rightarrow K\mu ^+\mu ^-\) in terms of the dimuon invariant mass squared \(q^2\) and the angle between the K and \(\mu ^-\) gives access to two angular observables, the so-called flat term \(F_H\) and the forward–backward asymmetry \(A_\text {FB}\), in addition to the differential decay rate (or branching ratio). The observables \(A_\text {FB}\) and \(F_H\) only deviate significantly from zero in the presence of scalar or tensor operators [18]. Due to the argument given above, we do not consider NP contributions to these operators in semileptonic decays. While the direct CP asymmetry has been measured recently as well [40], we do not include it in our analysis since it is suppressed by small strong phases and therefore does not provide constraints on new physics at the current level of experimental accuracy. Consequently, the only observable we need to consider is the (CP-averaged) differential branching ratio of the charged B decay,

$$\begin{aligned}&\frac{\mathrm{d}\text {BR}(B^\pm \rightarrow K^\pm \mu ^+\mu ^-)}{\mathrm{d}q^2} \nonumber \\&\quad = \frac{\tau _{B^+}}{2} \left( \frac{\mathrm{d}\Gamma (B^+\rightarrow K^+\mu ^+\mu ^-)}{\mathrm{d}q^2}+\frac{\mathrm{d}\Gamma (B^-\rightarrow K^-\mu ^+\mu ^-)}{\mathrm{d}q^2}\right) , \end{aligned}$$
(5)

and analogously for the neutral B decay.

2.2.2 Theoretical uncertainties

The theoretical analysis of the \(B \rightarrow K \mu ^+ \mu ^-\) observables is complicated not only by the need to know the \(B\rightarrow K\) form factors, but also by the fact that the “naive” factorisation of the amplitude into a hadronic and a leptonic part is violated by contributions from the hadronic weak Hamiltonian, connecting to the lepton pair through a photon. Concretely, in the limit of vanishing lepton mass,Footnote 2 the decay rate can be written as

$$\begin{aligned}&\frac{\mathrm{d}\Gamma (B\rightarrow K\mu ^+\mu ^-)}{\mathrm{d}q^2}\nonumber \\&\quad = \frac{G_F^2\alpha ^2_\text {em}|V_{tb}V_{ts}^*|^2}{2^{10} \pi ^5 m_B^3} \lambda ^{3/2}(m_B^2,m_{K^*}^2,q^2) ( |F_V|^2+|F_A|^2 ), \end{aligned}$$
(6)

where

$$\begin{aligned} \lambda (a,b,c)= & {} a^2 +b^2 + c^2 - 2 (ab+ bc + ac) , \end{aligned}$$
(7)
$$\begin{aligned} F_V(q^2)= & {} (C_{9}^\text {eff}(q^2)+C_{9}') f_+(q^2) \nonumber \\&+ \frac{2m_b}{m_B+m_K} (C_{7}^\text {eff}+C_{7}') f_T(q^2) + h_K(q^2) , \end{aligned}$$
(8)
$$\begin{aligned} F_A(q^2)= & {} (C_{10}+C_{10}') f_+(q^2) . \end{aligned}$$
(9)

Here, \(f_+\) and \(f_T\) are the full QCD form factors and \(h_K\) includes the non-factorisable contributions from the weak effective Hamiltonian. An additional form factor, \(f_0\), enters terms that are suppressed by the lepton mass. We now discuss our treatment of these quantities, which represent the main source of theoretical uncertainties in the \(B\rightarrow K\mu ^+\mu ^-\) observables.

For the form factors, we perform a combined fit of the recent lattice computation by the HPQCD collaboration [41], valid at large \(q^2\), and form factor values at \(q^2=0\) obtained from LCSR [42, 43], to a simplified series expansion. Details of the fit are discussed in “Appendix A”. The results are 3-parameter (4-parameter) fit expressions for the form factors \(f_{+,T}\) (\(f_0\)) as well as the full \(10\times 10\) covariance matrix. We retain the correlations among these uncertainties throughout our numerical analysis.

Concerning \(h_K(q^2)\), we emphasise the following contributions:

  • Virtual corrections to the matrix elements of the four-quark operators \(O_1\) and \(O_2\). We include them to NNLL accuracy using the results of Ref. [44].

  • Contributions from weak annihilation and hard spectator scattering. These have been estimated in QCD factorisation to be below a percent [43] and we neglect them.

  • Soft gluon corrections to the virtual charm quark loop at low \(q^2\). This effect was computed recently in LCSR with B meson distribution amplitudes in Ref. [22] and was found to be “unimportant at least up to \(q^2\sim 5\)\(6~\text {GeV}^2\)” (see also [24]).

  • Violation of quark–hadron duality at high \(q^2\), above the open charm threshold, due to the presence of broad charmonium resonances. Employing an OPE in inverse powers of the dilepton invariant mass, this effect has been found to be under control at a few percent in Ref. [23].

Concerning the last two items, the uncertainties due to these effects have to be estimated in a consistent and conservative manner to draw robust conclusions about the compatibility of experimental measurements with the SM predictions. We do this by parametrising our ignorance of subleading corrections to \(h_K\) in the following way:

$$\begin{aligned} h_K^\text {subl.}= & {} [C_9^\text {eff}(q^2)]^\text {SM} f_+(q^2)\nonumber \\&\times \,\left\{ \begin{array}{l@{\quad }l} a_K e^{i\phi _a}+ b_K e^{i\phi _b} (q^2/6\,\text {GeV}^2) &{}\text {at low }q^2, \\ c_K e^{i\phi _c} &{}\text {at high }q^2, \end{array} \right. \end{aligned}$$
(10)

where we used the leading contribution to the amplitude \(F_V\) as an overall normalisation factor. To obtain the theory uncertainties, we vary the strong phases \(\phi _{a,b,c}\) within \((-\pi ,\pi ]\). At low \(q^2\), since the main contribution is expected to come from the soft gluon correction to the charm loop, we vary a within [0, 0.02] and b within [0, 0.05]. In this way, the central value of the effect discussed in [22, 24] is contained within our \(1\sigma \) error band. Although (10) is just a very crude parametrisation of the (unknown) \(q^2\) dependence at low \(q^2\), we believe it is sufficiently general at the current level of experimental precision. At high \(q^2\), the presence of broad charmonium resonances means that \(h_K(q^2)\) varies strongly with \(q^2\), but since we will only consider observables integrated over the whole high-\(q^2\) region, we can ignore this fact and the parameter c simply parametrises the violation of the OPE result. We estimate it by varying c within [0, 0.05], which corresponds to an uncertainty on the rate more than twice the uncertainty quoted in [23]. This large range is chosen to take into account the fact that Ref. [23] uses a toy model for the charm loop. In Sect. 3.2, we will also discuss the consequences of increasing the ranges for these parameters.

2.3 \(B\rightarrow K^*\mu ^+\mu ^-\) and \(B\rightarrow K^*\gamma \)

2.3.1 Observables

The angular decay distribution of \(\bar{B}^0\rightarrow \bar{K}^{*0}\mu ^+\mu ^-\) contains in general 12 angular coefficient functions. In the presence of CP violation, the 12 angular coefficients of the CP-conjugate decay \(B^0\rightarrow K^{*0}\mu ^+\mu ^-\) represent another 12 independent observables [35]. However, since scalar contributions are negligible in our setup and one can neglect the muon mass to a good approximation, there are only nine independent observables in each decay. Moreover, the absence of large strong phases implies that several of the observables are hardly sensitive to new physics. In practice, the observables that are sensitive to new physics are

  • the CP-averaged differential branching ratio \(\mathrm{d}\text {BR}/\mathrm{d}q^2\),

  • the CP-averaged \(K^*\) longitudinal polarisation fraction \(F_L\) and forward–backward asymmetry \(A_\text {FB}\),

  • the CP-averaged angular observables \(S_{3,4,5}\),

  • the T-odd CP asymmetries \(A_{7,8,9}\).

All of these observables can be expressed in terms of angular coefficients and are functions of \(q^2\). Alternative bases have been considered in the literature (see e.g. [2629, 32]). Choosing different normalisations can reduce the sensitivity of the observables to the hadronic form factors, at least in the heavy quark limit and for naive factorisation. In our analysis, the choice of basis is irrelevant for the impact of hadronic uncertainties, as we consistently take into account all the correlations between theoretical uncertainties.

In the case of \(B\rightarrow K^*\gamma \), we consider the following observables: the branching ratio of \(B^\pm \rightarrow K^{*\pm }\gamma \), the branching ratio of \(B^0\rightarrow K^{*0}\gamma \), the direct CP asymmetry \(A_\text {CP}\) and the mixing-induced CP asymmetry \(S_{K^*\gamma }\) in \(B^0\rightarrow K^{*0}\gamma \). Since we take all known correlations between the observables into account in our numerical analysis, including the branching ratios of the charged and neutral B decays is to a very good approximation equivalent to including one of these branching ratios and the isospin asymmetry.

2.3.2 Theoretical uncertainties

Similarly to the \(B\rightarrow K\mu ^+\mu ^-\) decay, the main challenges of \(B\rightarrow K^*\mu ^+\mu ^-\) are the form factors and the contributions of the hadronic weak Hamiltonian.

For the form factors, we use the preliminary results of a combined fit [16] to a LCSR calculation of the full set of seven form factors [13] with correlated uncertainties as well as lattice results for these form factors [14]. This leads to strongly reduced uncertainties in angular observables.

The non-factorisable contributions from the hadronic weak Hamiltonian are more involved in \(B\rightarrow K^*\mu ^+\mu ^-\) compared to \(B\rightarrow K\mu ^+\mu ^-\) for several reasons. First, it contributes to three helicity amplitudes instead of just one; second, the presence of the photon pole at \(q^2=0\) enhances several of the contributions at low \(q^2\); third, since we do not only consider branching ratios but also a host of angular observables where form factor uncertainties partly cancel, we require a higher theoretical accuracy in the \(h_\lambda \). Concretely, we include the following contributions:

  • The NNLL contributions to the matrix elements of \(O_{1,2}\) as in the case of \(B\rightarrow K\mu ^+\mu ^-\).

  • At low \(q^2\), hard spectator scattering at \(O(\alpha _s)\) from QCD factorisation [45] including the subleading doubly Cabibbo-suppressed contributions [46].

  • At low \(q^2\), weak annihilation beyond the heavy quark limit as obtained from LCSR [47].

  • At low \(q^2\), contributions from the matrix element of the chromomagnetic operator as obtained from LCSR [48].

As in \(B\rightarrow K\mu ^+\mu ^-\), there are additional, subleading contributions, such as the soft gluon corrections to the charm loop [19, 22, 24, 49]. We parametrise them at low \(q^2\) by a correction relative to the leading contribution to the helicity amplitudes proportional to \(C_7^\text {eff}\),

$$\begin{aligned}{}[C_7^\text {eff}]^\text {SM} \rightarrow [C_7^\text {eff}]^\text {SM} \left[ 1 +a_\lambda e^{i\phi _a^\lambda } + b_\lambda e^{i\phi _b^\lambda } \left( \frac{q^2}{6\,\text {GeV}^2}\right) \right] . \end{aligned}$$
(11)

The parameters \(a_\lambda \) and \(b_\lambda \) are allowed to be different for each of the three helicity amplitudes, \(\lambda =+,-,0\). We vary the \(a_\lambda \) and \(b_\lambda \) in the following ranges:

$$\begin{aligned}&a_{+,-} \in [0,0.05] , \quad b_{+,-} \in [0,0.2] ,\nonumber \\&\quad a_{0} \in [0,0.2] , \quad b_{0} \in [0,0.5] , \end{aligned}$$
(12)

Again, with this choice the effect discussed in [22, 24] is within our \(1\sigma \) uncertainty band. Although the normalisation of the correction is arbitrary and could have also been written as a relative correction to \(C_9\), we choose \(C_7\) as normalisation in \(B\rightarrow K^*\mu ^+\mu ^-\) since the leading contribution proportional to \(C_9\) vanishes at \(q^2=0\) and does not contribute to \(B\rightarrow K^*\gamma \). It is due to this choice that we need to allow for larger \(a_0\), \(b_0\) since the \(C_7^\text {eff}\) contribution is not enhanced in the \(\lambda =0\) amplitude.

At high \(q^2\), as in the case of \(B\rightarrow K\mu ^+\mu ^-\), we do not have to consider a \(q^2\)-dependent correction as we are only considering observables integrated over the full high \(q^2\) region. Analogous to \(B\rightarrow K\mu ^+\mu ^-\), we parametrise the subleading uncertainties by a relative correction to \(C_9\). To be conservative, we allow it to be up to \(7.5\,\%\) in magnitude, independently for the three helicity amplitudes, with an arbitrary strong phase.

2.3.3 Direct CP asymmetry in \(B\rightarrow K^*\gamma \)

While direct CP asymmetries in the B decays considered by us are suppressed by small strong phases and so typically do not lead to strong constraints on NP, the direct CP asymmetry in \(B\rightarrow K^*\gamma \) is a special case since the measurements by the B factories and LHCb are so precise that this suppression could be overcome. The world average readsFootnote 3

$$\begin{aligned} A_\text {CP}(B^0\rightarrow K^{*0}\gamma )_\text {HFAG} = (0.1\pm 1.3)\,\%. \end{aligned}$$
(13)

Allowing for general NP contributions in \(C_7\), we find the following central value for the asymmetry:

$$\begin{aligned} A_\text {CP}(B^0\rightarrow K^{*0}\gamma )\approx & {} \left[ 0.003-0.45 \, \text {Im}\,C_7(m_b)\right] \nonumber \\&\times \frac{\text {BR}(B^0\rightarrow K^{*0}\gamma )_\text {SM}}{\text { BR}(B^0\rightarrow K^{*0}\gamma ) }, \end{aligned}$$
(14)

where we have neglected contributions from NP in \(C_7'\) and \(C_8\). We observe that the experimental bound (13) can constrain an imaginary part of the Wilson coefficient \(C_7\) at the \(m_b\) scale at the level of 0.1, which is still allowed by all other measurements as we will see.

The problem with using this observable as a constraint on NP is that it is proportional to a strong phase that appears only at subleading order and is afflicted with a considerable uncertainty. With our error treatment described above, taking the subleading contributions from Ref. [48], we find an overall relative uncertainty of 20 % in the presence of a large imaginary \(C_7\). However, to be conservative, we will not include \(A_\text {CP}(B^0\rightarrow K^{*0}\gamma )\) in our global fits, but we will discuss the impact of including it separately in Sect. 3.3.

2.4 \(B_s\rightarrow \phi \mu ^+\mu ^-\)

The decay \(B_s\rightarrow \phi \mu ^+\mu ^-\) is very similar to the \(B\rightarrow K^*\mu ^+\mu ^-\) decay, so here we only discuss the differences in the calculation of the observables compared to \(B\rightarrow K^*\mu ^+\mu ^-\), in addition to the obvious parametric replacements throughout the calculation.

  • The form factors are of course different; we use the combined fit of lattice and LCSR results obtained in [16] including the correlated uncertainties.

  • The subleading non-factorisable corrections are parametrised as in the case of \(B\rightarrow K^*\mu ^+\mu ^-\), and the coefficients \(a_\lambda \), \(b_\lambda \) and \(c_\lambda \) are varied in the same ranges. We assume the uncertainty in these coefficients to be \(90\,\%\) correlated between \(B_s\rightarrow \phi \mu ^+\mu ^-\) and \(B\rightarrow K^*\mu ^+\mu ^-\) since we do not see a physical reason why they should be drastically different.Footnote 4

  • In contrast to \(B\rightarrow K^*\mu ^+\mu ^-\), the \(B_s\rightarrow \phi \mu ^+\mu ^-\) decay is not self-tagging. Therefore, the only observables among the ones mentioned at the beginning of Sect. 2.3.1 that are experimentally accessible in a straightforward way at a hadron collider are [50]:

    • the differential branching ratio \(\mathrm{d}\text {BR}/\mathrm{d}q^2\),

    • the CP-averaged angular observables \(F_L\) and \(S_4\),

    • the angular CP asymmetry \(A_9\).

  • An additional novelty is the impact of the sizable \({ B}_{ s}\) width difference. As shown in [16] (see also [51]), this effect is small in the SM and we have checked that it is also negligible in the presence of NP at the current level of experimental precision, unless the Wilson coefficients assume extreme values that are already excluded by other constraints. Therefore, we have neglected the effect in our numerical analysis.

3 Global numerical analysis

3.1 Fit methodology

More and more experimental data on \(b\rightarrow s\mu ^+\mu ^-\) transitions becomes available and many observables are measured with a fine binning. Therefore, in order to determine the values of the Wilson coefficients preferred by the data it becomes more and more important to include the correlation of theoretical uncertainties between different observables as well as between different bins of the same observable. One possibility to achieve this is to perform a global Bayesian analysis where all the uncertainties are parametrised by nuisance parameters that are marginalised over by sophisticated numerical tools like Markov chain Monte Carlos. This approach has been applied recently e.g. in [4]. A drawback of this approach is that it is time-consuming and the computing time increases with the number of parameters. Here, we follow a different approach. We construct a \(\chi ^2\) function that only depends on the Wilson coefficients and take into account the theoretical and experimental uncertainties in terms of covariance matrices,

$$\begin{aligned} \chi ^2(\vec {C}^\text {NP})= & {} [\vec {O}_\text {exp}-\vec {O}_\text {th}(\vec {C}^\text {NP}) ]^T [C_\text {exp}+C_\text {th}]^{-1}\nonumber \\&\times \, [\vec {O}_\text {exp}-\vec {O}_\text {th}(\vec {C}^\text {NP}) ]. \end{aligned}$$
(15)

Here, \(\vec {O}_\text {exp}\) are the experimentally measured central values of all observables of interest, \(O_\text {th}\) are the corresponding theory predictions that depend on the (NP contributions to the) Wilson coefficients, \(C_\text {exp}\) is the covariance matrix of the experimental measurements and \(C_\text {th}\) is the covariance matrix of the theory predictions that contains the theory uncertainties and their correlations. In writing (15), we have made two main approximations. First, we have assumed all the experimental and theoretical uncertainties to be Gaussian. Second, we have neglected the dependence of the theory uncertainties on the new physics contributions to the Wilson coefficients. This means that the theory uncertainties and their correlations have been evaluated for the Wilson coefficients fixed to their SM values. We believe that this assumption is well justified in view of the fact that no drastic deviations from the SM expectations have been observed so far. We checked explicitly that changes are small between the covariance matrix of the theory predictions in the SM and the one computed at the best-fit point for new physics in the Wilson coefficient \(C_9\) alone (\(C_9^\text {NP}=-1.07\); see Sect. 3.3 below). The only possible exception are observables that vanish in the SM but could receive NP contributions much larger than the current experimental bounds. As we will discuss below, the only such observable at present is the direct CP asymmetry in \(B\rightarrow K^*\gamma \).

We determine \(C_\text {th}\) by evaluating all observables of interest for a large set of the parameters parametrising the theory uncertainties, randomly distributed following normal distributions according to the uncertainties and correlations described above. In this way, we retain not only correlated uncertainties between different observables, but also between different bins of the same observable. We find these correlations to have a large impact on our numerical results. Concerning \(C_\text {exp}\), we symmetrise the experimental error bars and include the experimental error correlations provided by the latest LHCb update of the \(B \rightarrow K^* \mu ^+\mu ^-\) analysis [17]. For branching ratio measurements, where no error correlations are available, we include a rough guess of the correlations by assuming the statistical uncertainties to be uncorrelated and the systematic uncertainties to be fully correlated for measurements of the same observable by a single experiment. We have checked that this treatment has only a small impact on the overall fit at the current level of experimental and theoretical uncertainties on branching ratios.

We use the following experimental input for our global \(b \rightarrow s \mu ^+\mu ^-\) fit:

  • \(B \rightarrow K^* \mu ^+\mu ^-\) branching ratios and angular observables from LHCb [1, 11, 17, 52], CMS [53], ATLAS [54], and CDF [5557];

  • \(B \rightarrow K \mu ^+\mu ^-\) branching ratios and angular observables from LHCb [11, 58] and CDF [5557];

  • \(B_s \rightarrow \phi \mu ^+\mu ^-\) branching ratios and angular observables from LHCb [12] and CDF [55, 57];

  • branching ratios for \(B \rightarrow K^* \gamma \) and \(B \rightarrow X_s \gamma \) and the mixing-induced CP asymmetry in \(B \rightarrow K^* \gamma \) from HFAG [59];

  • the combined result of the \(B_s \rightarrow \mu ^+\mu ^-\) branching ratio from LHCb and CMS [6062];

  • the \(B \rightarrow X_s \mu ^+\mu ^-\) branching ratio measurement from BaBar [63].

We do not include the additional results on \(b \rightarrow s \ell \ell \) transitions from BaBar [64, 65] and Belle [66, 67], as they are only available as an average of \(\mu ^+\mu ^-\) and \(e^+e^-\) modes. As already mentioned in Sect. 2, in the fit we do not explicitly include isospin asymmetries, but instead use results on the charged and neutral modes separately. As we take into account all known error correlations, this approach is essentially equivalent.

We would like to stress that for none of the observables, we use low \(q^2\) bins that extend into the region above the perturbative charm threshold \(q^2 > 6\) GeV, where hadronic uncertainties cannot be estimated reliably. This applies in particular to the bin [4.3, 8.68] GeV\(^2\) that has been used in several fits in the past [2, 5, 9] as well as the bin [6, 8] GeV\(^2\) in the recent \(B\rightarrow K^*\mu ^+\mu ^-\) angular analysis by LHCb [17].

For the \(B^0 \rightarrow K^{*0} \mu ^+\mu ^-\) observables at low \(q^2\), we choose the smallest available bins satisfying this constraint, since they are most sensitive to the non-trivial \(q^2\) dependence of the angular observables. For \(B_s\rightarrow \phi \mu ^+\mu ^-\), we use the [1, 6] GeV\(^2\) bin, since the branching ratio does not vary strongly with \(q^2\) and since the statistics is limited. In the high \(q^2\) region, we always consider the largest \(q^2\) bins available that extend to values close to the kinematical end point. All the experimental measurements used in our global fits are listed in “Appendix B” along with their theory predictions. All theory predictions are based on our own work and on [16], except the \(B_s \rightarrow \mu ^+\mu ^-\), \(B \rightarrow X_s \gamma \) and \(B \rightarrow X_s \ell ^+\ell ^-\) branching ratios that we take from [6870],Footnote 5 respectively. In the case of the SM prediction for BR\((B_s \rightarrow \mu ^+\mu ^-)\) we rescale the central value and uncertainty obtained in [68], to reflect our choice of \(V_{cb}\) (see Sect. 3.2.2 below).

3.2 Compatibility of the data with the SM

Evaluating (15) with the Wilson coefficients fixed to their SM values, we obtain the total \(\chi ^2\) of the SM. Including both \(b\rightarrow s\mu ^+\mu ^-\) and \(b\rightarrow se^+e^-\) observables, we find \(\chi ^2_\text {SM}\equiv \chi ^2(\vec {0})=125.8\) for 91 independent measurements. This corresponds to a p value of \(0.9\,\%\). Including only \(b\rightarrow s\mu ^+\mu ^-\) observables, we find \(\chi ^2_\text {SM}=116.9\) for 88 independent measurements, corresponding to a p value of \(2.1\,\%\). In Table 1, we list the observables with the largest deviation from the SM expectation. The full list of observables entering the \(\chi ^2\), together with the SM predictions and experimental measurements, is given in “Appendix B”. We note that some of these observables have strongly correlated uncertainties and that for two of the observables, \(A_\text {FB}\) and \(F_L\), there is some tension between different experiments. Still, there does seem to be a systematic suppression of branching ratios in different decay modes and we will see in Sect. 3.3 that the quality of the fit can be improved substantially in the presence of new physics. An important questions is whether these tensions could be due to underestimated theory uncertainties and we will investigate this question in the following paragraphs. It should be kept in mind that none of these sources of uncertainties can account for violation of lepton-flavour universality.

Table 1 Observables where a single measurement deviates from the SM by \(1.8\sigma \) or more. The full list of observables is given in “Appendix B”. Differential branching ratios are given in units of GeV\(^{-2}\)

3.2.1 Underestimated hadronic effects?

We will see in Sect. 3.3 that the agreement of the theory predictions with the experimental data is improved considerably assuming non-standard values for the Wilson coefficient \(C_9\). Since this coefficient corresponds to a left-handed quark current and a leptonic vector current, it is conceivable that a NP effect in \(C_9\) is mimicked by a hadronic SM effect that couples to the lepton current via a virtual photon, e.g. the charm loop effects at low \(q^2\) and the resonance effects at high \(q^2\) as discussed in Sect. 2 (see e.g. [19]). In our numerical analysis, in addition to the known non-factorisable contributions taken into account as described in Sect. 2, subleading effects of this type are parametrised by the parameters \(a_i, b_i, c_i\) in (10), (11), and analogously for \(B_s\rightarrow \phi \mu ^+\mu ^-\). Since they parametrise unknown subleading uncertainties, the central values of these parameters are 0 in our SM predictions.

Any underestimation of a non-perturbative QCD effect (not related to form factors) should then manifest itself as a drastic reduction of the \(\chi ^2\) for a sizable value of one of the parameters, when treating them as completely free. To investigate this question, we have constructed a \(\chi ^2\) function analogous to (15), but writing the central values \(\vec {O}_\text {th}\) as functions of the parameters \(a_i, b_i, c_i\) instead of the Wilson coefficients.

In Fig. 1, we show the reduction of the \(\chi ^2\) compared to our SM central value under variation of pairs of these parameters, while treating two of them at a time as free parameters and fixing all the others to 0. We show the cases of varying the coefficients entering the \(B \rightarrow K \ell ^+ \ell ^-\) amplitude at low and high \(q^2\) (top); the coefficients entering the \(\lambda = -\) and \(\lambda = 0\) \(B \rightarrow K^* \ell ^+ \ell ^-\) helicity amplitudes at low \(q^2\) (bottom left) and high \(q^2\) (bottom right). Corrections to the \(\lambda = +\) helicity amplitude are expected to be suppressed [25] and we checked explicitly that they have a weak impact. On the green dashed contours, the \(\chi ^2\) is the same as for the central value, so there is no improvement of the fit. In the green shaded area, the fit is improved, with the solid contours showing \(\Delta \chi ^2\equiv \chi ^2-\chi ^2_\text {SM}=1, 4, 9\), etc. In the unshaded region to the other side of the dashed contour, the fit is worsened compared to the central value. The blue circles show our 1 and \(2\sigma \) assumptions for the uncertainties on the parameters in question, as discussed in Sect. 2. We stress that these assumptions have not been used as priors to determine the green contours. We make the following observations.

  • The \(\chi ^2\) can be reduced by up to 4 when pushing the parameter \(b_K\), parametrising subleading corrections in \(B\rightarrow K\mu ^+\mu ^-\) at low \(q^2\), to the border of our estimated uncertainty. The fit does not improve significantly when changing the parameter \(c_K\) from 0, i.e. when assuming large violations of quark–hadron duality in the global (integrated) high \(q^2\) observables in \(B\rightarrow K\mu ^+\mu ^-\), unless \(b_K\) is shifted at the same time.

    Fig. 1
    figure 1

    Change of \(\chi ^2\) compared to the SM central value in the planes of pairs of coefficients that parameterise the size of unknown subleading non-perturbative QCD effects. Coefficients entering the \(B \rightarrow K \ell ^+ \ell ^-\) amplitude at low and high \(q^2\) (top); coefficients entering the \(B \rightarrow K^* \ell ^+ \ell ^-\) amplitudes at low \(q^2\) (bottom left) and high \(q^2\) (bottom right). Along the dashed line the \(\chi ^2\) remains unchanged. In the shaded green region the \(\chi ^2\) is improved, with the solid lines indicating contours of \(\Delta \chi ^2=1,4,9\). The blue circles show our 1 and \(2\sigma \) assumptions for the uncertainties on the shown parameters

  • A simultaneous positive shift in the subleading corrections to the \(\lambda =-\) and 0 helicity amplitudes in \(B\rightarrow K^*\mu ^+\mu ^-\) can significantly reduce the \(\chi ^2\) as well. \(\Delta \chi ^2=9\) requires a shift in both parameters that is four times larger than our error estimate.

  • Corrections to quark–hadron duality in the global high \(q^2\) observables in \(B\rightarrow K^*\mu ^+\mu ^-\) do not lead to any significant reduction of the \(\chi ^2\).

We conclude that the agreement of the data with the predictions cannot be improved by assuming (unexpectedly) large violations of quark–hadron duality in integrated observables at high \(q^2\) alone, while sizable corrections to \(B\rightarrow K\mu ^+\mu ^-\) and \(B\rightarrow K^*\mu ^+\mu ^-\) at low \(q^2\) could improve the agreement with the data. We stress, however, that Fig. 1 should not be misinterpreted as a determination of the size of subleading QCD effects from the data. Indeed, the regions where the \(\chi ^2\) is significantly reduced correspond to values that are larger than any known hadronic effect.

We will see in Sect. 3.3 that a good fit to the data can be obtained assuming a large negative NP contribution to the Wilson coefficient \(C_9\). We find it instructive to consider the size of the subleading parameters that would make them “mimic” a NP effect. Experimentally, it would be difficult to distinguish between the cases (i) where \(C_9 = C_9^\text {SM} + \Delta _9\) and all \(a_i= b_i=c_i=0\) or (ii) where \(C_9 = C_9^\text {SM}\) as well as

$$\begin{aligned} a_K\approx & {} 0.25\,\Delta _9 ,\quad c_K \approx 0.25\,\Delta _9 ,\end{aligned}$$
(16)
$$\begin{aligned} b_-\approx & {} -0.6\,\Delta _9 ,\quad c_- \approx 0.25\,\Delta _9 ,\end{aligned}$$
(17)
$$\begin{aligned} a_0\approx & {} -2\,\Delta _9 ,\quad c_0 \approx 0.25\,\Delta _9 , \end{aligned}$$
(18)

and all other \(a_i, b_i,c_i\) equal to zero. This pattern of effects is indeed similar to what is seen in Fig. 1. Distinguishing such a scenario from a NP effect is straightforward if the NP effect is not lepton-flavour universal. If it is lepton-flavour universal, a correlated analysis of exclusive and inclusive observables, of the \(q^2\) dependence, and of consistency relations among observables valid in the SM (see e.g. [72]) could help to disentangle QCD and NP.

3.2.2 Underestimated parametric uncertainties?

While the angular observables in \(B\rightarrow K^*\mu ^+\mu ^-\) are almost free from parametric uncertainties,Footnote 6 the apparent systematic suppression of branching ratios could also be due to an underestimated overall parametric uncertainty. The uncertainties of the \(B_{u,d,s}\) meson lifetimes quoted by the PDG [73] are well below \(1\,\%\) and are therefore very unlikely to be responsible. The dominant parametric uncertainty is the CKM factor \(|V_{tb}V_{ts}^*|^2\) to which all branching ratios are proportional and which itself is dominated by the uncertainty of the measurement of \(|V_{cb}|\). The relative uncertainty of all \(b\rightarrow s\) branching ratios due to \(|V_{cb}|\) is twice the relative uncertainty of \(|V_{cb}|\). In our numerical analysis, we use

$$\begin{aligned} |V_{cb}| = (4.09 \pm 0.10) \times 10^{-2}, \end{aligned}$$
(19)

which leads to an uncertainty of \(4.9\,\%\) on the branching ratios. In fact there is a long standing tension between determinations of \(|V_{cb}|\) from inclusive and exclusive decays. The PDG [73] quotes

$$\begin{aligned} |V_{cb}|^\text {PDG}_\text {incl.}= & {} (4.22 \pm 0.07) \times 10^{-2},\nonumber \\ |V_{cb}|^\text {PDG}_\text {excl.}= & {} (3.95 \pm 0.08) \times 10^{-2}, \end{aligned}$$
(20)

which are at a \(2.5\sigma \) tension with each other. Choosing the inclusive value instead of (19) would increase the central values of all our branching ratios by \(6.5\,\%\) and would worsen the agreement with the data. Choosing the exclusive value instead would lead to a reduction of the branching ratios by \(6.7\,\%\).

To see whether this has an impact on the significance of the tensions, we multiply all branching ratios by a scale factor \(\eta _\text {BR}\) and fit this scale factor to the data. We find \(\eta _\text {BR}=0.79\pm 0.08\), i.e. a \(21\,\%\) reduction of the branching ratios with respect to our central values is preferred. The \(\chi ^2\) is improved by 7.0 with respect to the SM. The obtained central value for \(\eta _\text {BR}\) would correspond to \(|V_{cb}| \simeq 3.6 \times 10^{-2}\), which is in tension with both the inclusive and exclusive determinations.

We conclude that underestimated parametric uncertainties are unlikely to be responsible for the observed tensions in the branching ratio measurements. Needless to say, the angular observables and \(R_K\) would be unaffected by a shift in \(|V_{cb}|\) anyway.

3.2.3 Underestimated form factor uncertainties?

The tensions between data and SM predictions could also be due to underestimated uncertainties in the form factor predictions from LCSR, lattice, or both. A first relevant observation in this respect is that the tensions in Table 1 include observables in decays involving \(B\rightarrow K\), \(B\rightarrow K^*\) and \(B_s\rightarrow \phi \) transitions, both at low \(q^2\) (where LCSR calculations are valid) and at high \(q^2\) (where the lattice predictions are valid). Explaining all of them would imply underestimated uncertainties in several completely independent theoretical form factor determinations.

In the case of \(B\rightarrow K\mu ^+\mu ^-\) and \(B_s\rightarrow \phi \mu ^+\mu ^-\), tensions are present only in branching ratios, which seem to be systematically below the SM predictions. This could be straightforwardly explained if the form factor predictions were systematically too high. Note that the largest tensions in the \(B \rightarrow K \mu ^+\mu ^-\) branching ratios appear in the neutral mode. The branching ratio of the charged mode, \(B^+ \rightarrow K^+ \mu ^+\mu ^-\), is measured with considerably smaller statistical uncertainty and agrees better with the SM predictions (see “Appendix B”). Nevertheless, also the charged mode seems to be systematically below the SM prediction and would profit from a reduction of the form factors.

The case of \(B\rightarrow K^*\mu ^+\mu ^-\) is less trivial due to the tensions in angular observables, which cannot simply be due to an overall rescaling of the form factors. To investigate this case, we have parametrised all seven \(B\rightarrow K^*\) form factors by a two-parameter z expansionFootnote 7 and constructed a \(\chi ^2\) function analogous to (15), but writing the central values \(\vec {O}_\text {th}\) as functions of the 12 z expansion parameters instead of the Wilson coefficients. Varying the expansion parameters, we have found that the most significant shift, i.e. preference for a non-standard value, is obtained by modifying the form factorFootnote 8 \(A_{12}\). In Fig. 2, we show the improvement in the \(\chi ^2\) obtained when changing the \(A_{12}\) form factor, while fixing all the other form factors to their central values. Instead of the two z expansion coefficients, we present it in terms of the values of the form factor at the borders of the kinematical region, 0 and \(q^2_\text {max}=(m_B-m_{K^*}^2)\). The colours are analogous to Fig. 1. We observe that an improvement of \(\Delta \chi ^2\sim 4\) can be obtained if the value at \(q^2=0\) is significantly lower than what is obtained from LCSR. This improvement is quite limited compared to the improvement obtained in the presence of NP discussed below or in the presence of large non-form factor corrections discussed above.

Fig. 2
figure 2

Change of \(\chi ^2\) compared to the SM central value when changing the central value of the form factor \(A_{12}\) at minimal or maximal \(q^2\), while fixing the central values of all other form factors to their nominal values. Colours are as in Fig. 1

Finally, an important observation in the case of \(B\rightarrow K^*\mu ^+\mu ^-\) angular observables is that the tensions are only present at low \(q^2\), where the seven form factors can be expressed in terms of two independent “soft” form factors up to power corrections of naive order \(\Lambda _\text {QCD}/m_b\). It is then possible to construct angular observables that do not depend on the soft form factors, but only on the power corrections [32]. The tensions can then be seen by estimating the power corrections by dimensional analysis [20]. This shows that an explanation of the tensions by underestimated form factor uncertainties would imply that the values of the power corrections are very different from what LCSR calculations predict for them.

3.3 New physics in a single Wilson coefficient

We now investigate whether new physics could account for the tension of the data with the SM predictions. We start by discussing the preferred ranges for individual Wilson coefficients assuming our nominal size of hadronic uncertainties. We determine the \(1\sigma \) (\(2\sigma \)) ranges by computing \(\Delta \chi ^2=1~(4)\), while fixing all the other coefficients to their SM values. We also set the imaginary part of the respective coefficient to 0. In addition to the Wilson coefficients \(C_{7,9,10}^{(\prime )}\), we also consider the case where the NP contributions to \(C_9^{(\prime )}\) and \(C_{10}^{(\prime )}\) are equal up to a sign, since this pattern of effects is generated by \(\mathrm{SU}(2)_L\)-invariant four fermion operators in the dimension-6 SM effective theory.

Our results are shown in Table 2. We summarise the most important points.

  • A negative NP contribution to \(C_9\), approximately \(-25\,\%\) of \(C_9^\text {SM}\), leads to a sizable decrease in the \(\chi ^2\). The best-fit point corresponds to a p value of \(11.3\,\%\), compared to \(2.1\%\) for the SM. This was already found in fits of low-\(q^2\) angular observables only [2] and in global fits not including data released this year [35, 20], as well as in a recent fit to a subset of the available data [9]. We find that the significance of this solution has increased substantially. This is due in part to the reduced theory uncertainties, in particular the form factors, as well as due to the new measurements by LHCb.

    Table 2 Constraints on individual Wilson coefficients, assuming them to be real. The pull in the last column is defined as \(\sqrt{\chi ^2_\text {SM} - \chi ^2_\text {b.f.}}\)
  • A significant improvement is also obtained in the \(\mathrm{SU}(2)_L\) invariant direction \(C_9^\text {NP}=-C_{10}^\text {NP}\), corresponding to an operator with left-handed muons.

  • A positive NP contribution to \(C_{10}\) alone can also improve the fit, although to a lesser extent.

  • NP contributions to individual right-handed Wilson coefficients hardly lead to improvements of the fit.

While Table 2 assumed the Wilson coefficients to be real, i.e. aligned in phase with the SM, in general the NP contributions to the Wilson coefficients are complex numbers. Since measurements in semileptonic decays are currently restricted to CP-averaged observables or direct CP asymmetries that are suppressed by small strong phases,Footnote 9 the constraints on the imaginary parts are generally weaker than on the real parts, since they do not interfere with the SM contribution.

An interesting special case is the direct CP asymmetry in \(B\rightarrow K^*\gamma \). As discussed in Sect. 2.3.3, this observable is precisely measured and very sensitive to the imaginary part of \(C_7\), but we do not include it in our default \(\chi ^2\) since it is proportional to a strong phase that is afflicted with a considerable uncertainty. In Fig. 3, we show how the allowed region for the NP contribution to \(C_7\) would change by including this observable. The red (green) contours correspond to the 1 and \(2\sigma \) regions (\(\Delta \chi ^2=2.3\) and 6 while fixing all other coefficients to their SM values) allowed by the global fit including \(A_\text {CP}(B^0\rightarrow K^{*0}\gamma )\) with a relative uncertainty of 50 % (25 %), while the blue contours correspond to the fit without the CP asymmetry. We observe that the constraint on the imaginary part of \(C_7\) improves by a factor of \(\sim \)2 even with our conservative estimate for the theory error. In any case, a more detailed study of the theoretical uncertainties in this observable and a combined analysis with other observables sensitive to \(C_7\) – e.g. \(B\rightarrow K^*e^+e^-\) at very low \(q^2\) [74] or \(B_s\rightarrow \phi \gamma \) [75, 76] – would be interesting and we leave this to a future study.

Fig. 3
figure 3

Allowed region in the Re\((C_7^\text {NP})\)–Im\((C_7^\text {NP})\) plane. The blue contours correspond to the 1 and \(2\sigma \) best-fit region without including the \(A_\text {CP}(B\rightarrow K^* \gamma )\) measurement. The red (green) contours show the impact of including \(A_\text {CP}(B\rightarrow K^* \gamma )\) with a relative theoretical uncertainty of 50 % (25 %)

The global constraints in the complex planes of all Wilson coefficients are shown in Fig. 11 of “Appendix C”.

3.4 Constraints on pairs of Wilson coefficients

We proceed by analysing the constraints in scenarios where two Wilson coefficients are allowed to differ from their SM values. In this section we exemplarily allow for real NP in either \(C_9\) and \(C_9'\) or \(C_9\) and \(C_{10}\). With our nominal values for the theory uncertainties, the best-fit values for the Wilson coefficients and the corresponding \(\Delta \chi ^2\) read in the two cases

$$\begin{aligned} (C_9^\text {NP})_\text {b.f.}= & {} -1.10 , \quad (C_9')_\text {b.f.} = +0.45 ,\quad \chi ^2_\text {SM} - \chi ^2_\text {b.f.} = 15.6 , \end{aligned}$$
(21)
$$\begin{aligned} (C_9^\text {NP})_\text {b.f.}= & {} -1.06 , \quad (C_{10}^\text {NP})_\text {b.f.} = +0.16 , \quad \chi ^2_\text {SM} - \chi ^2_\text {b.f.}= 14.2 . \end{aligned}$$
(22)

The best-fit points correspond to p values of 12.4 and \(10.6\,\%\), respectively. This is comparable to the \(11.3\,\%\) obtained in Sect. 3.3 in the scenario with new physics only in \(C_9\). In Fig. 4, we show the allowed regions in the Re\((C_9^\text {NP})\)–Re\((C_9')\) and Re\((C_9^\text {NP})\)–Re\((C_{10}^\text {NP})\) planes. The blue contours correspond to the 1 and \(2\sigma \) regions (\(\Delta \chi ^2=2.3\) and 6 while fixing all other coefficients to their SM values) allowed by the global fit. In addition, we also show the \(2\sigma \) allowed regions for two scenarios with inflated theory uncertainties. For the green short-dashed contours, we have doubled all the form factor uncertainties. For the red short-dashed contours, we have doubled all the hadronic uncertainties not related to form factors, i.e. the ones that are parametrised as in (10) and (11). We observe that the negative value preferred for \(C_9^\text {NP}\) is above the \(2\sigma \) level even for these conservative assumptions. We also observe that \(C_9'\) and \(C_{10}^\text {NP}\) are preferentially positive, although they deviate from 0 less significantly than \(C_9^\text {NP}\). The corresponding plots for all interesting combinations of real Wilson coefficients are collected in Fig. 12 of “Appendix C”, together with the \(\Delta \chi ^2\) values of the corresponding best-fit points.

Fig. 4
figure 4

Allowed regions in the Re\((C_9^\text {NP})\)–Re\((C_9')\) plane (left) and the Re\((C_9^\text {NP})\)–Re\((C_{10}^\text {NP})\) plane (right). The blue contours correspond to the 1 and \(2\sigma \) best-fit regions. The green and red short-dashed contours correspond to the \(2\sigma \) regions in scenarios with doubled form factor uncertainties and doubled uncertainties from subleading non-factorisable corrections, respectively

It is also interesting to investigate which observables drive the tensions. In Fig. 5, we compare the global constraints in the Re\((C_9^\text {NP})\)–Re\((C_9')\) and Re\((C_9^\text {NP})\)–Re\((C_{10}^\text {NP})\) planes to the constraints one gets only using branching ratios (green) or only using \(B\rightarrow K^*\mu ^+\mu ^-\) angular observables (red). We observe that the angular observables strongly prefer a negative \(C_9\) but are not very sensitive to \(C_9'\) or \(C_{10}\). The branching ratio constraints have an approximately flat direction \(C_9^\text {NP}\sim -C_9'\) and show a preference for \(C_{10}^\text {NP}>0\), in particular if \(C_{9}^\text {NP} > 0\). In fact, from the branching ratios alone, one could get a good fit to the data with SM-like \(C_9\) and \(C_{10}^\text {NP}>0\).

Fig. 5
figure 5

Allowed regions in the Re\((C_9^\text {NP})\)–Re\((C_9')\) plane (left) and the Re\((C_9^\text {NP})\)–Re\((C_{10}^\text {NP})\) plane (right). The blue contours correspond to the 1 and \(2\sigma \) best-fit regions from the global fit. The green and red contours correspond to the 1 and \(2\sigma \) regions if only branching ratio data or only data on \(B \rightarrow K^* \mu ^+\mu ^-\) angular observables is taken into account

3.5 Minimal flavour violation

In models with constrained minimal flavour violation (CMFV) [77], only the Wilson coefficients \(C_7\), \(C_9\) and \(C_{10}\) receive new physics contributions and they are aligned in phase with the SM, i.e. real in our convention. Since these NP contributions interfere with the SM contributions, they are the most strongly constrained ones at present. In fact, in this simple case, it is a reasonable approximation to expand the \(\chi ^2\) to quadratic order around the best-fit point,

$$\begin{aligned}&\chi ^2_\text {CMFV}(\vec {C}^\text {NP}) \approx \chi ^2_\text {b.f., CMFV} + (\vec {C}^\text {NP}- \vec {C}^\text {NP}_\text {b.f.})^T C_\text {CMFV}^{-1} (\vec {C}^\text {NP}- \vec {C}^\text {NP}_\text {b.f.})\nonumber \\ \end{aligned}$$
(23)

where the best fit has \(\chi ^2_\text {b.f., CMFV} = 102.4\). The covariance matrix is given in terms of the variances \(\sigma _i\) and correlations \(\rho _{ij}\) as \(C_\text {CMFV}^{ij} = \sigma _i\sigma _j\rho _{ij}\) (no sum). The central values and variances of the Wilson coefficients read

$$\begin{aligned} \vec {C}^\text {NP} = \begin{pmatrix} C_7^\text {NP} \\ C_9^\text {NP} \\ C_{10}^\text {NP} \end{pmatrix} = \begin{pmatrix} -0.017 \pm 0.030 \\ -1.02 \pm 0.27 \\ 0.16 \pm 0.24 \end{pmatrix}\, \end{aligned}$$
(24)

and the correlation matrix reads

$$\begin{aligned} \begin{pmatrix} 1 &{}\,\,\, -0.28 &{}\,\,\, 0.06 \\ -0.28 &{}\,\,\, 1 &{}\,\,\, 0.06 \\ 0.06 &{}\,\,\, 0.06 &{} \,\,\,1 \\ \end{pmatrix}. \end{aligned}$$
(25)

The expression (23) can be used to easily impose the combined fit constraints in phenomenological analyses of models satisfying CMFV. For scenarios with non-standard CP violation or right-handed currents, it can be understood from Figs. 11 and 12 that at present the constraints are not stringent enough to allow a quadratic expansion of the \(\chi ^2\) and we cannot provide a comparably simple expression in general.

3.6 Testing lepton-flavour universality

So far, in our numerical analysis we have only considered the muonic \(b \rightarrow s \mu ^+\mu ^-\) modes and the lepton-flavour-independent radiative \(b \rightarrow s \gamma \) modes to probe the Wilson coefficients \(C_7^{(\prime )}\), \(C_9^{(') \mu }\) and \(C_{10}^{(\prime ) \mu }\), where the superscript \(\mu \) indicates that in the semileptonic operators (3) and (4) only muons are considered. In this section we will extend our analysis and include also semileptonic operators that contain electrons. In particular, we will allow new physics in the Wilson coefficients \(C_9^e\) and \(C_{10}^e\) and confront them with the available data on \(B \rightarrow K e^+e^-\) from LHCb [6] and \(B \rightarrow X_s e^+e^-\) from BaBar [63].

As mentioned already in the introduction, the recent measurement of the ratio \(R_K\) of \(B\rightarrow K \mu ^+\mu ^-\) and \(B \rightarrow K e^+e^-\) branching ratios in the \(q^2\) bin [1, 6] GeV\(^2\) by LHCb [6] shows a \(2.6\sigma \) tension with the SM prediction,

$$\begin{aligned}&R_K = \frac{\text {BR}(B \rightarrow K\mu ^+\mu ^-)_{[1,6]}}{\text {BR}(B \rightarrow Ke^+e^-)_{[1,6]}} = 0.745^{+0.090}_{-0.074} \pm 0.036 ,\nonumber \\&R_K^\text {SM} \simeq 1.00 . \end{aligned}$$
(26)

The theoretical error of the SM prediction is completely negligible compared to the current experimental uncertainties. The tension between the SM prediction and the experimental data is driven by the reduced \(B\rightarrow K \mu ^+\mu ^-\) branching ratio, while the measured \(B \rightarrow K e^+e^-\) branching ratio is in good agreement with the SM. In our extended global fit we do not use the \(R_K\) measurement directly but instead include the \(B \rightarrow K\mu ^+\mu ^-\) and \(B \rightarrow Ke^+e^-\) branching rations separately, taking into account the correlations of their theory uncertainties. As the theory uncertainties of BR\((B \rightarrow K\mu ^+\mu ^-)\) and BR\((B \rightarrow Ke^+e^-)\) are essentially 100 % correlated, our approach is to a good approximation equivalent to using \(R_K\).

In Fig. 6 we show the result of two fits that allow for new physics in \(C_9^\mu \) and \(C_9^e\) (left plot) and new physics along the \(\mathrm{SU}(2)_L\) invariant directions \(C_9^\mu = -C_{10}^\mu \) and \(C_9^e = -C_{10}^e\). Recall that in Sect. 3.3 we found that new physics in these scenarios gives the by far best description of the experimental \(b\rightarrow s \mu ^+\mu ^-\) data. As expected, we again find that a \(C_9^\mu \) significantly smaller than in the SM is clearly preferred by the fits. The best-fit regions for \(C_9^\mu \) and \(C_9^\mu = - C_{10}^\mu \) approximately coincide with the regions found for \(C_9\) and \(C_9 = - C_{10}\) in Sect. 3.3. The Wilson coefficients \(C_9^e\) and \(C_9^e = - C_{10}^e\) on the other hand are perfectly consistent with the SM prediction. Lepton-flavour universality, i.e. \(C_9^\mu = C_9^e\) and \(C_{10}^\mu = C_{10}^e\) as indicated by the diagonal line in the plots is clearly disfavoured by the data. Our results are consistent with similar findings in recent fits to part of the available experimental data [8, 9].

Fig. 6
figure 6

Allowed regions in the plane of new physics contributions to the Wilson coefficients \(C_9^\mu \) vs. \(C_9^e\) (left) and the plane of the \(\mathrm{SU}(2)_L\) invariant combinations of Wilson coefficients \(C_9^\mu =-C_{10}^\mu \) vs. \(C_9^e=-C_{10}^e\) (right). The blue contours correspond to the 1 and \(2\sigma \) best-fit regions. The diagonal line corresponds to lepton-flavour universality

Working under the assumption that the electron modes are indeed SM like, we can make predictions for ratios of observables that test lepton-flavour universality using the best-fit regions for the muonic Wilson coefficients from our global fit. We consider ratios of branching ratios of the exclusive \(B\rightarrow K^* \ell ^+\ell ^-\) and \(B\rightarrow K \ell ^+\ell ^-\) decays and the inclusive \(B \rightarrow X_s \ell ^+\ell ^-\) decays, both at low and high \(q^2\). Moreover, we also predict ratios of the \(B \rightarrow K^* \ell \ell \) angular observables \(F_L\), \(A_\text {FB}\) and \(S_5\) at low and high \(q^2\). The results are shown in Table 3. The four columns correspond to the following scenarios:

  • new physics only in \(C_9^\mu \);

    Table 3 Predictions for ratios of observables with muons vs. electrons for four different scenarios with NP only in one or two Wilson coefficients with muons. Ratios deviating from the SM prediction 1.00 by more than 30 % are highlighted in boldface. Differential branching ratios are given in units of GeV\(^{-2}\)
  • new physics in \(C_9^\mu \) and \(C_9^{\prime ~\mu }\);

  • new physics along the \(\mathrm{SU}(2)_L\) invariant direction \(C_9^\mu = - C_{10}^\mu \);

  • new physics independently in \(C_9^\mu \) and \(C_{10}^\mu \).

The Standard Model prediction for all the shown ratios is 1, with negligible uncertainties.Footnote 10 In all scenarios all branching ratio ratios are predicted around 0.8 both at low and high dimuon invariant mass. A similar ratio is seen for \(S_5\) at low \(q^2\). Only very small deviations from the SM are predicted for \(S_5\) and \(A_\text {FB}\) at high \(q^2\) as well as \(F_L\) at low and high \(q^2\).Footnote 11 The most interesting observable turns out to be the ratio of the forward–backward asymmetries in \(B\rightarrow K^* \mu ^+\mu ^-\) and \(B\rightarrow K^* e^+e^-\) in the \(q^2\) bin [4, 6] GeV

$$\begin{aligned} R_{A_\text {FB}} \equiv \frac{A_\text {FB}(B \rightarrow K^* \mu ^+\mu ^-)_{[4,6]}}{A_\text {FB}(B \rightarrow K^* e^+e^-)_{[4,6]}} . \end{aligned}$$
(27)

Assuming that the electron mode is SM like, \(R_{A_\text {FB}}\) is extremely sensitive to the value of \(C_9^\mu \). For the considered values of \(C_9^\mu \) it deviates drastically from the SM prediction and a precise measurement would even allow to distinguish between the considered scenarios.

4 Constraints on new physics models

The results from the model-independent fit of the Wilson coefficients in the effective Hamiltonian can be interpreted in the context of new physics models. Here we discuss implications for the minimal supersymmetric standard model (MSSM) and models that contain massive \(Z^\prime \) gauge bosons with flavour-changing couplings.

4.1 SUSY models with generic flavour violation

Recently, the \(B \rightarrow K^* \mu ^+\mu ^-\) decay has been studied in MSSM scenarios that do not contain sources of flavour violation beyond the CKM matrix [79]. We do not find sizable SUSY contributions to \(C_9\) and \(C_{10}\) in such scenarios. In the following, we will therefore allow for generic flavour violation.

Experimental data on flavour-changing neutral current processes lead to strong constraints on new sources of flavour violation that can be present in the MSSM [80, 81]. In particular, the experimental information on rare \(b \rightarrow s \mu ^+\mu ^-\) decays can be used to put constraints on flavour-violating trilinear couplings in the up-squark sector, which are only poorly constrained otherwise [8286]. In principle, the general MSSM also allows for lepton-flavour non-universality effects and we will comment to which extent the \(R_K\) measurement can be accommodated.

4.1.1 Bounds on flavour-changing trilinear couplings

In addition to the usual flavour diagonal trilinear couplings, the soft SUSY breaking Lagrangian can contain flavour-changing trilinear couplings of the left- and right-handed top and charm squarks with the up-type Higgs

$$\begin{aligned} \mathcal {L}_\text {trilinear}&\supset&A_t Y_t \tilde{t}_L^* \tilde{t}_R H_u + A_{tc} Y_t \tilde{t}_L^* \tilde{c}_R H_u\nonumber \\&+ A_{ct} Y_t \tilde{c}_L^* \tilde{t}_R H_u +\text {h.c.} \end{aligned}$$
(28)

The flavour-changing trilinears give contributions to the effective Hamiltonian in (1) at the one loop level. Contributions can arise from boxes, photon penguins, and Z penguins and example Feynman diagrams are shown in Fig. 7. A straightforward flavour spurion analysis shows the following points:

  • contributions to \(C_{7,8}^\prime \), are suppressed by \(m_s/m_b\) with respect to contributions to \(C_{7,8}\);

    Fig. 7
    figure 7

    Example Feynman diagrams that correspond to MSSM contributions to the effective Hamiltonian for \(b\rightarrow s \ell \ell \) transitions proportional to flavour-changing trilinear couplings. In the penguin diagrams, the photon, gluon and Z propagators need to be attached to the loop in all possible ways

  • contributions to \(C_{9,10}^\prime \) are suppressed by \(m_s m_b / m_t^2\) with respect to contributions to \(C_{9,10}\);

  • contributions proportional to \(A_{tc}\) are suppressed by \(m_c / m_t\) compared to contributions proportional to \(A_{ct}\).

We therefore concentrate on the Wilson coefficients \(C_7\), \(C_8\), \(C_9\) and \(C_{10}\) in the presence of a non-zero \(A_{ct}\). To illustrate the main parameter dependence, in the following we give simple approximate expressions for the Wilson coefficients that are obtained at leading order in an expansion in \(m_\text {EW}^2/m_\text {SUSY}^2\). The most important SUSY masses involved are the Wino mass \(M_2\), the Higgsino mass \(\mu \), the left-handed slepton mass \(m_{\tilde{\ell }}\), the stop masses \(m_{\tilde{t}_L}\) and \(m_{\tilde{t}_R}\), as well as the left-handed charm-squark mass \(m_{\tilde{c}_L}\). The largest effects in \(b \rightarrow s\) transitions can obviously be achieved if the SUSY spectrum is as light as possible. To keep the expressions compact, we set for simplicity \(M_2 = \mu = m_{\tilde{\ell }} \equiv M\), \(m_{\tilde{t}_L} = m_{\tilde{c}_L} \equiv m_L\), \(m_{\tilde{t}_R} \equiv m_R\). We also work in the limit \(M \ll m_R \ll m_L\), which is least constrained by collider searches and therefore allows one to maximise the new physics contributions to the Wilson coefficients. Note also that a light Higgsino and light stops are well motivated by naturalness arguments [8789]. For the dipole coefficients we find in a leading log approximation

$$\begin{aligned} C_7= & {} \frac{V_{cs}^*}{V_{ts}^*} \left( \frac{A_{ct}}{A_t} \right) \tan \beta \frac{m_W^2 m_t^2}{m_R^4} \frac{\mu M_2 |A_t|^2}{m_L^4}\nonumber \\&\times \left[ \frac{m_R^2}{M^2} - \frac{7}{3} \log \left( \frac{m_R^2}{M^2} \right) \right] \nonumber \\&- \frac{m_t^2}{m_R^2} \frac{\mu A_t}{m_L^2} \tan \beta \frac{1}{2}\log \left( \frac{m_R^2}{M^2} \right) , \end{aligned}$$
(29a)
$$\begin{aligned} C_8= & {} \frac{V_{cs}^*}{V_{ts}^*} \left( \frac{A_{ct}}{A_t} \right) \tan \beta \frac{m_W^2 m_t^2}{m_R^4} \frac{\mu M_2 |A_t|^2}{m_L^4} \log \left( \frac{m_R^2}{M^2} \right) \nonumber \\&- \frac{m_t^2}{m_R^2} \frac{\mu A_t}{m_L^2} \tan \beta \frac{1}{4} . \end{aligned}$$
(29b)

The contributions to \(C_7\) and \(C_8\) from \(A_{ct}\) arise first at the dimension-8 level, i.e. they are suppressed by \(m_\text {EW}^4/m_\text {SUSY}^4\). The last terms in (29a) and (29b) are the leading irreducible MFV contributions to \(C_7\) and \(C_8\) from Higgsino stop loops. They arise already at dimension 6 and are typically much larger than the contributions proportional to \(A_{ct}\).

For the box contributions, \(C_{9,10}^\text {box}\), and the photon penguin contribution, \(C_9^\gamma \), to the semileptonic operators we find

$$\begin{aligned} C_{10}^\text {box}= & {} \frac{1}{s_W^2} \frac{V_{cs}^*}{V_{ts}^*} \left( \frac{A_{ct}}{A_t} \right) \frac{m_W^2 m_t^2}{m_L^2 m_R^2} \left[ \frac{|A_t|^2}{4m_L^2} \log \left( \frac{m_R^2}{M^2}\right) \right. \nonumber \\&-\left. \frac{A_t}{12M^*} + \frac{M A_t}{4m_R^2} \log \left( \frac{m_R^2}{M^2}\right) \right] , \end{aligned}$$
(30a)
$$\begin{aligned} C_9^\text {box}= & {} - C_{10}^\text {box} , \end{aligned}$$
(30b)
$$\begin{aligned} C_9^\gamma= & {} \frac{V_{cs}^*}{V_{ts}^*} \left( \frac{A_{ct}}{A_t} \right) \frac{m_W^2 m_t^2}{m_L^2 m_R^2} \left[ \frac{2|A_t|^2}{3m_L^2} \log \left( \frac{m_R^2}{M^2}\right) \right. \nonumber \\&\left. - \frac{2A_t}{3M^*} + \frac{5M A_t}{3m_R^2} \log \left( \frac{m_R^2}{M^2}\right) \right] , \end{aligned}$$
(30c)

where \(s_W = \sin \theta _W\) and \(\theta _W\) is the Weinberg mixing angle. Again we find that these contributions arise at the dimension-8 level. For a TeV scale SUSY spectrum, they are completely negligible.

In the considered scenario, only the Z penguin contributions, \(C_{9,10}^Z\), arise already at the dimension-6 level. We find

$$\begin{aligned} C_{10}^Z= & {} \frac{1}{s_W^2} \frac{V_{cs}^*}{V_{ts}^*} \left( \frac{A_{ct}}{A_t} \right) \frac{m_t^2}{m_L^2} \left[ \frac{|A_t|^2}{2m_L^2} \log \left( \frac{m_L^2}{m_R^2}\right) \right. \nonumber \\&+\left. \frac{M A_t}{8m_R^2} \log \left( \frac{m_R^2}{M^2}\right) \right] , \end{aligned}$$
(31a)
$$\begin{aligned} C_9^Z= & {} (4s_W^2 -1) C_{10}^Z . \end{aligned}$$
(31b)

This suggests that there are regions of MSSM parameter space, where a contribution to \(C_{10}^Z\) of O(1) is indeed possible. MSSM contributions to \(C_9^Z\) on the other hand are suppressed by the accidentally small vector coupling of the Z boson to leptons, \((4s_W^2 -1) \sim -0.08\), and therefore negligible.

Recalling the model-independent results from Sect. 3, a positive new physics contribution to the Wilson coefficient \(C_{10}^\text {NP} \simeq O(1)\), can improve the agreement with the current experimental \(b \rightarrow s \mu ^+\mu ^-\) data significantly (albeit to a lesser extent than NP in \(C_9\)). Negative NP contributions to \(C_{10}\) on the other hand are strongly disfavoured with the current data. We use these results to probe regions of MSSM parameter space with sizable flavour-changing trilinear couplings.

Bounds on flavour-changing trilinear couplings can also be obtained from vacuum stability considerations. As is well known, sizable trilinear couplings can lead to charge and color breaking minima in the MSSM scalar potential [90, 91]. Requiring that the electroweak minimum be the deepest gives upper bounds on the trilinear couplings. Taking into account non-zero expectation values for the left- and right-handed stops, the left-handed charm squark, as well as the up-type Higgs, we find the following necessary condition to ensure absolute stability of the electroweak vacuum [92, 93]:

$$\begin{aligned}&( |A_t| + |A_{ct}| \tan \theta )^2 \lesssim (3+ \tan ^2\theta )\nonumber \\&\quad \times \, ( m_{\tilde{t}_L}^2 \cos ^2\theta + m_{\tilde{c}_L}^2 \sin ^2\theta + m_{\tilde{t}_R}^2 + m_{H_u}^2 + |\mu |^2 ). \end{aligned}$$
(32)

This inequality has to hold for all values of \(\theta \), which parametrises the angle in field space between the left-handed top and charm squarks. In the limit \(\theta = 0\) one recovers a well known bound on \(A_t\) given e.g. in [90]; for \(\theta = \pi /2\) one recovers the bound on \(A_{ct}\) found in [91].

In principle, additional constraints on \(A_{ct}\) can be obtained from the experimental bounds on electric dipole moments (EDMs). In particular, if \(A_{ct}\) and \(A_t\) contain a relative phase, a strange quark EDM and chromo EDM will be induced analogous to the new physics contributions to \(C_7\) and \(C_8\). However, predicting an experimentally accessible EDM of a hadronic system, like the neutron, given a strange quark EDM or chromo EDM involves large theoretical uncertainties [94, 95]. Due to these uncertainties, existing EDM bounds do not give appreciable constraints in our setup. Note also that bounds on the charm quark chromo EDM [96] do not constrain the parameter space of our scenario. A sizable charm quark chromo EDM would be generated in the presence of both \(A_{ct}\) and \(A_{tc}\) couplings, but here we only consider a non-zero \(A_{ct}\).

We now describe the SUSY spectrum that we chose to illustrate the bounds on the trilinear couplings from the \(b\rightarrow s \mu ^+\mu ^-\) data. The soft masses for the left-handed stop and charm squark are set to a common value \(m_{\tilde{t}_L} = m_{\tilde{c}_L} = 1\) TeV. The soft mass of the right-handed stop is set to \(m_{\tilde{t}_R} = 500\) GeV. All other squarks and sleptons as well as the gluino are assumed to be heavy, with masses of 2 TeV. Concerning the trilinear couplings, we only consider non-zero \(A_t\) and \(A_{ct}\). Due to these trilinear couplings, the lightest up-squark mass eigenstate can have a mass \(m_{\tilde{t}_1} < 500\) GeV and is potentially subject to strong bounds from direct stop searches. Higgsinos, Winos and Binos are assumed to have mass parameters \(m_{\tilde{B}} = 250\) GeV, \(m_{\tilde{W}} = 300\) GeV, \(\mu = 350\) GeV. In that way the mass of the lightest neutralino is given by \(m_{\tilde{\chi }_1^0} \simeq 225\) GeV and the mass of the lightest chargino is \(m_{\tilde{\chi }_1^\pm } \simeq 250\) GeV. Such a chargino–neutralino spectrum is heavy enough to avoid the bounds from the direct stop searches [97100]Footnote 12 as well as bounds from electro-weakino searches [103, 104]. Finally, we set \(\tan \beta = 3\) to minimise contributions to the dipole Wilson coefficients.

In Fig. 8 we show bounds on the trilinear couplings that can be derived from the \(b\rightarrow s \mu ^+\mu ^-\) data in the described scenario. We evaluate all MSSM 1-loop contributions to the Wilson coefficients \(C_{7,8,9,10}^{(\prime )}\) and compute the \(\chi ^2\) as defined in (15) as a function of the trilinear couplings. For the numerical evaluation of the Wilson coefficients in the MSSM, we use an adapted version of the SUSY_FLAVOR code [105107]. The plot on the left hand side of  Fig. 8 shows constraints in the \(A_t\)\(A_{ct}\) plane, assuming real trilinears. The plot on the right hand side shows constraints in the Re\((A_{ct})\)–Im\((A_{ct})\) plane, for a fixed \(A_t = -1.5\) TeV.Footnote 13 The red region is excluded by the \(b\rightarrow s \mu ^+\mu ^-\) data by more than \(2\sigma \) with respect to the SM (\(\chi ^2 > \chi ^2_\text {SM} + 6\)). In the blue region the agreement between the theory predictions and the experimental \(b\rightarrow s \mu ^+\mu ^-\) data is improved by more than \(1\sigma \) with respect to the SM (\(\chi ^2 < \chi ^2_\text {SM} - 2.3\)). In the best-fit point in the left plot of Fig. 8, the \(\chi ^2\) is reduced by 4.2 compared to the SM. This improvement is rather moderate compared to the results of the model-independent fits and also compared to the \(Z^\prime \) scenarios discussed below. In the black corners, the lightest up-squark mass eigenstate is lighter than the lightest neutralino. Outside the dashed contours there exist charge and color breaking minima in the MSSM scalar potential that are deeper than the electroweak minimum. Inside the contours, the NP effects in the Wilson coefficients are rather moderate. In particular, we find that in this region of parameter space the SUSY contribution to \(C_{10}\) does not exceed 0.3; the SUSY contribution to \(C_9\) is smaller by approximately one order of magnitude, as expected.

Note that the regions outside of the vacuum stability contours are not necessarily excluded. Even though a deep charge and color breaking minimum exists in these regions, the electroweak vacuum might be meta-stable with a live time longer than the age of the universe. Studies show that requiring only meta-stability, relaxes the stability bounds on the trilinear couplings slightly [108112]. A detailed analysis of vacuum meta-stability is beyond the scope of the present work.

4.1.2 Lepton-flavour non-universality in the MSSM

The Z penguin effects discussed above are lepton-flavour universal, i.e. they lead to the same effects in \(b \rightarrow s e^+e^-\) and \(b \rightarrow s \mu ^+\mu ^-\) decays. Breaking of e-\(\mu \) universality as hinted by the \(R_K\) measurement can only come from box contributions as they involve sleptons of different flavours. If there are large mass splittings between the first and second generations of sleptons, or more precisely, if the selectrons are decoupled but smuons are kept light, Wino box diagrams (and to a lesser extent also Bino box diagrams) can contribute to \(C_9^\mu \) and \(C_{10}^\mu \) but not to \(C_9^e\) and \(C_{10}^e\).

Box contributions are, however, typically rather modest in size. As discussed above, boxes that are induced by flavour-changing trilinears arise only at the dimension-8 level and are completely negligible. Non-negligible box contributions (at the dimension 6 level) are only possible in the presence of flavour violation in the squark soft masses. However, even allowing for maximal mixing of left-handed bottom and strange squarks, it was found in [3] that Winos and smuons close to the LEP bound of \(\sim \)100 GeV as well as bottom and strange squarks with masses of few hundred GeV would be required to obtain contributions to \(C_9^\mu \) and \(C_{10}^\mu \) of \(\gtrsim \)0.5, which could give \(R_K \sim 0.75\). A careful collider analysis would be required to ascertain if there are holes in the LHC searches for stops [97100], sbottoms [113115], sleptons [103, 116, 117] and electro-weakinos [103, 104] that would allow such an extremely light spectrum. We also note that a sizable splitting between the left-handed smuon and selectron masses required to break e-\(\mu \) universality is only possible if the slepton mass matrix is exactly diagonal in the same basis as the charged lepton mass matrix, since even a tiny misalignment would lead to an excessive \(\mu \rightarrow e \gamma \) decay rate.

4.2 Flavour-changing \(Z'\) bosons

A massive \(Z^\prime \) gauge boson with flavour-changing couplings to quarks is an obvious candidate that can lead to large effects in \(b \rightarrow s \ell \ell \) decays [3, 118128]. Instead of discussing a complete model that contains such a \(Z'\) boson, we will take a bottom up approach and ask which properties a \(Z'\) has to have in order to explain the discrepancies observed in the \(b \rightarrow s \ell \ell \) data. To this end we treat the mass of the \(Z'\) as well as its couplings to SM quarks and leptons as free parameters. Following the notation of [129], we parametrise the \(Z'\) couplings as

$$\begin{aligned} \mathcal L \supset \bar{f}_i \gamma ^\mu [ \Delta ^{f_if_j}_L(Z') P_L + \Delta ^{f_if_j}_R(Z') P_R ] f_j \,Z'_\mu . \end{aligned}$$
(33)

In the presence of \(\Delta _{L/R}^{bs}\) and \(\Delta _{L/R}^{\mu \mu }\) couplings, the \(Z'\) boson will contribute to the Wilson coefficients \(C_9^{(\prime )}\) and \(C_{10}^{(\prime )}\) at tree level. As the primed Wilson coefficients hardly improve the agreement of the experimental \(b \rightarrow s \mu ^+\mu ^-\) data with the theory predictions, we will not consider them here and set the right-handed bs couplings to zero, \(\Delta _R^{bs}=0\).

The \(Z'\) couplings \(\Delta _{L}^{bs}\) and \(\Delta _{L/R}^{\mu \mu }\) are subject to various constraints that bound the maximal effect a \(Z'\) prime can have in \(C_9\) and \(C_{10}\). In particular, a \(Z'\) boson with flavour-changing \(b \leftrightarrow s\) couplings will inevitably also contribute to \(B_s\)\(\bar{B}_s\) mixing at the tree level. One finds the following modification of the mixing amplitude:

$$\begin{aligned} \frac{M_{12}}{M_{12}^\text {SM}} -1 = \frac{v^2}{M_{Z'}^2} (\Delta _L^{bs})^2 \left( \frac{g_2^2}{16\pi ^2} (V_{tb}V_{ts}^*)^2 S_0 \right) ^{-1}, \end{aligned}$$
(34)

where \(v = 246\) GeV is the Higgs vev, and the SM loop function is given by \(S_0 \simeq 2.3\). We obtain the following stringent bound on the \(Z'\) mass and the flavour-changing coupling:

$$\begin{aligned} \frac{M_{Z'}}{|\Delta ^{bs}_L|}\gtrsim & {} 244~\text {TeV} \times \left( \frac{10\,\%}{|M_{12}/M_{12}^\text {SM}-1|} \right) ^{1/2} \nonumber \\\approx & {} \frac{10~\text {TeV}}{|V_{tb}V_{ts}^*|} \times \left( \frac{10\,\%}{|M_{12}/M_{12}^\text {SM}-1|} \right) ^{1/2} . \end{aligned}$$
(35)

In the following, we will allow for maximally 10 % new physics contribution to the mixing amplitude, which is approximately the size of non-standard effects that are currently probed in \(B_s\) mixing [130]. Concerning the couplings of the \(Z'\) to leptons, we will start with the least constrained case, where the \(Z'\) only couples to muons, but not to electrons and consider a coupling to left-handed muons only. Subsequently, we will discuss how our conclusions change if we assume a vector-like coupling to muons or a lepton-flavour universal coupling.

4.2.1 \(Z'\) with coupling to left-handed muons

The only non-zero coupling to charged leptons we consider here is \(\Delta _L^{\mu \mu }\). Such a \(Z^\prime \) is very poorly constrained. Over a very broad range of \(Z'\) masses, the strongest constraint on \(\Delta _L^{\mu \mu }\) comes from neutrino trident production [122, 131], i.e. the production of a muon pair in the scattering of a muon-neutrino in the Coulomb field of a heavy nucleus.Footnote 14 The relative correction of the trident cross section in the presence of the considered \(Z'\) is given by

$$\begin{aligned} \frac{\sigma }{\sigma _\text {SM}}= & {} \frac{1}{1+(1+4s_W^2)^2}\left[ \left( 1 + \frac{v^2 (\Delta _L^{\mu \mu })^2}{M_{Z^\prime }^2} \right) ^2\right. \nonumber \\&+\left. \left( 1 + 4s_W^2 + \frac{v^2 (\Delta _L^{\mu \mu })^2}{M_{Z^\prime }^2} \right) ^2 \right] . \end{aligned}$$
(36)

We use the CCFR measurement of the trident cross section, \(\sigma _\text {CCFR}/\sigma _\text {SM} = 0.82 \pm 0.28\) [132], to set bounds on the \(Z'\) mass and its coupling to muons. At the \(2\sigma \) level we find

$$\begin{aligned} \frac{M_{Z'}}{|\Delta ^{\mu \mu }_L|} \gtrsim 0.47~\text {TeV} . \end{aligned}$$
(37)

Combining this result with the bound on the flavour-changing quark coupling from \(B_s\) mixing, Eq. (35), we can derive an upper bound on the possible size of new physics contributions to the Wilson coefficients \(C_9\) and \(C_{10}\) that can be achieved in the considered setup. For the Wilson coefficients we have

$$\begin{aligned}&C_9^\text {NP} = -C_{10}^\text {NP} = -\frac{\Delta ^{bs}_L\Delta ^{\mu \mu }_L}{V_{tb}V_{ts}^*}\left[ \frac{\Lambda _v}{M_{Z'}}\right] ^2 ,\nonumber \\&\quad \text {with}\quad \Lambda _v = \left[ \frac{\pi }{\sqrt{2}G_F\alpha _\text {em}}\right] ^{1/2}\approx 4.94\,\text {TeV} . \end{aligned}$$
(38)

This implies

$$\begin{aligned} |C_9^\text {NP}| = |C_{10}^\text {NP}| <5.4 . \end{aligned}$$
(39)

The best-fit values in the \(C_9^\text {NP} = - C_{10}^\text {NP}\) scenario found in Sect. 3.3 are well within this bound.

Although the explanation of the tensions in \(b\rightarrow s\mu ^+\mu ^-\) transitions does not require a coupling of the \(Z'\) to first-generation quarks, it is interesting to investigate what happens in models where such couplings are present, which could lead to \(Z'\) signals at the LHC. Fixing the Wilson coefficients \(C_9\) and \(C_{10}\) to their best-fit values and assuming the flavour-changing coupling to have its maximal value (35) allowed by \(B_s\) mixing, we find a lower bound on the muon coupling,

$$\begin{aligned} \Delta _{L}^{\mu \mu }\gtrsim 0.3 \left[ \frac{M_{Z'}}{\text {TeV}}\right] . \end{aligned}$$
(40)

Adopting the lower end of this range, ATLAS and CMS searches for quark-lepton contact interactions [133, 134] can be used to put an upper bound on the \(Z'\) coupling to the left-handed first-generation quark doublet. Using the CMS results [134], we find

$$\begin{aligned} \frac{M_{Z'}}{|\Delta ^{qq}_L|} \gtrsim 11~\text {TeV} ~(7~\text {TeV}) \end{aligned}$$
(41)

for constructive (destructive) interference with the SM \(q_L \bar{q}_L \rightarrow \mu ^+\mu ^-\) amplitude. Comparing this to (35), we conclude that models with a rough scaling \(|\Delta ^{bs}_L|\sim |V_{tb}V_{ts}^*\Delta ^{qq}_L|\) are compatible with these bounds.

For a \(Z'\) mass between 200 GeV and 3.5 TeV, also LHC searches for resonances [135, 136] in the dimuon mass spectrum can be used to put an upper bound on the \(Z'\) coupling to first-generation quarks as a function of \(M_{Z'}\). In Fig. 9 we show the bound on \(\Delta _L^{qq}\) using the results from the ATLAS search [135] (shaded blue region). For the branching ratios of the \(Z'\) we assume \(\text {BR}(Z'\rightarrow \mu ^+\mu ^-)=\text {BR}(Z'\rightarrow \nu _\mu \bar{\nu }_\mu )=\frac{1}{2}\), which approximately holds as long as the \(\Delta _L^{\mu \mu }\) coupling is sufficiently large compared to couplings to other states. The bound from resonance searches could be weaker if the \(Z'\) has e.g. a sizable branching ratio into a dark sector. In the same plot, we also show the bound from quark-lepton contact interaction searches from CMS [134], assuming (40) (red line). Below 3.5 TeV, we show this bound as a dashed line, because for such light \(Z'\) masses the contact interaction approximation becomes invalid.

Fig. 9
figure 9

Bounds in the plane of \(Z'\) mass and the \(Z'\) coupling to the left-handed first-generation quark doublet. The blue shaded region is excluded by searches for resonances in the dimuon invariant mass spectrum [135], assuming a \(Z' \rightarrow \mu ^+\mu ^-\) branching ratio of 50 %. The region above the red curve is excluded by searches for quark-lepton contact interactions [134]. The upper plot axis shows the minimal value of the \(Z'\) coupling to left-handed muons (40), required to obtain the best-fit values for \(C_9\) and \(C_{10}\)

We conclude that, in order to lead to visible effects in \(b\rightarrow s\mu ^+\mu ^-\) transitions, a heavy \(Z'\) with \(M_{Z'} \gtrsim 3~\text {TeV}\) can have weak-interaction strength couplings to first-generation quarks without being in conflict with the bounds from contact interactions. Such a heavy \(Z'\) must have strong couplings to muons (\(\Delta _L^{\mu \mu } \gtrsim 1\)). A lighter \(Z'\) can be weakly coupled to muons but requires a suppression of the coupling to first-generation quarks by roughly two orders of magnitude to avoid the bounds from direct searches.

4.2.2 \(Z'\) with vector-like coupling to muons

If the couplings of the \(Z'\) to muons are purely vector-like we can define \(\Delta _L^{\mu \mu } = \Delta _R^{\mu \mu } \equiv \Delta _V^{\mu \mu }/2\). In this case, the correction to the neutrino trident cross section reads

$$\begin{aligned} \frac{\sigma }{\sigma _\text {SM}} = \frac{1}{1+(1+4s_W^2)^2}\left[ 1 + \left( 1 + 4s_W^2 + \frac{ v^2 (\Delta _V^{\mu \mu })^2}{2 M_{Z^\prime }^2} \right) ^2 \right] , \end{aligned}$$
(42)

and we obtain the following bound using the CCFR measurement:

$$\begin{aligned} \frac{M_{Z'}}{|\Delta ^{\mu \mu }_V|} \gtrsim 0.27~\text {TeV} . \end{aligned}$$
(43)

Now the NP contribution to the Wilson coefficient \(C_{10}\) vanishes, while for \(C_9\) one has

$$\begin{aligned} C_9^\text {NP} = -\frac{\Delta ^{bs}_L\Delta ^{\mu \mu }_V}{V_{tb}V_{ts}^*}\left[ \frac{\Lambda _v}{M_{Z'}}\right] ^2. \end{aligned}$$
(44)

Again, one finds that sizable effects are possible: adopting the maximum allowed values for the couplings (43) and (35), we find \(|C_9^\text {NP}|<9.3\). The bounds on the first-generation quark couplings from contact interaction and dimuon resonance searches are qualitatively similar to the left-handed case.

4.2.3 \(Z'\) with universal coupling to leptons

If the \(Z'\) coupling to leptons is flavour universal, stringent bounds on \(\Delta ^{\ell \ell }\) can be obtained from LEP2 searches for four lepton contact interactions [137]. Depending on whether the coupling is to left-handed leptons only or is vector-like, we find

$$\begin{aligned}&\frac{M_{Z'}}{|\Delta ^{\ell \ell }_L|} \gtrsim 3.9~\text {TeV} , \end{aligned}$$
(45)
$$\begin{aligned}&C_9^\text {NP} = -C_{10}^\text {NP} = -\frac{\Delta ^{bs}_L\Delta ^{\ell \ell }_L}{V_{tb}V_{ts}^*}\left[ \frac{\Lambda _v}{M_{Z'}}\right] ^2 \end{aligned}$$
(46)
$$\begin{aligned}&\Rightarrow |C_9^\text {NP}|=|C_{10}^\text {NP}|<0.64 , \end{aligned}$$
(47)
$$\begin{aligned}&\frac{M_{Z'}}{|\Delta ^{\ell \ell }_V|} \gtrsim 3.5~\text {TeV}, \end{aligned}$$
(48)
$$\begin{aligned}&C_9^\text {NP} = -\frac{\Delta ^{bs}_L\Delta ^{\ell \ell }_V}{V_{tb}V_{ts}^*}\left[ \frac{\Lambda _v}{M_{Z'}}\right] ^2, \end{aligned}$$
(49)
$$\begin{aligned}&\Rightarrow |C_9^\text {NP}|<0.72 , \end{aligned}$$
(50)

where, for the last step, the flavour-changing coupling has been assumed to saturate the upper bound in (35) coming from \(B_s\) mixing. We observe that the effects in \(b\rightarrow s\mu ^+\mu ^-\) transitions are now much more limited, but, in particular for left-handed couplings, can still come close to the best-fit values in Sect. 3.3 (of course, the anomaly in \(R_K\) cannot be explained in this scenario). Interestingly, this also implies that the effect in \(B_s\) mixing is necessarily close to the current experimental bounds. Future improvements of the \(B_s\) mixing constraints will then allow to test the lepton-flavour-universal scenario.

Concerning collider searches, the new feature of the lepton-universal case is that there is an absolute lower bound on the \(Z'\) mass from LEP2, \(M_{Z'}>209\) GeV. LHC bounds on the coupling to first-generation quarks, on the other hand, are qualitatively similar to the non-universal case discussed above.

5 Summary and conclusions

Several recent results on rare B decays by the LHCb collaboration show tensions with Standard Model predictions. Those include discrepancies in angular observables in the \(B \rightarrow K^* \mu ^+\mu ^-\) decay, a suppression in the branching ratios of \(B \rightarrow K^* \mu ^+\mu ^-\) and \(B_s \rightarrow \phi \mu ^+\mu ^-\), as well as a hint for the violation of lepton-flavour universality in the form of a \(B \rightarrow K \mu ^+\mu ^-\) branching ratio that is suppressed not only with respect to the SM prediction but also with respect to \(B \rightarrow K e^+e^-\). In this paper we performed global fits of the experimental data within the SM and in the context of new physics.

For our SM predictions we use state-of-the-art \(B \rightarrow K\), \(B \rightarrow K^*\) and \(B_s \rightarrow \phi \) form factors taking into account results from lattice and light-cone sum rule calculations. All relevant non-factorisable corrections to the \(B \rightarrow K \mu ^+\mu ^-\), \(B \rightarrow K^* \mu ^+\mu ^-\) and \(B_s \rightarrow \phi \mu ^+\mu ^-\) amplitudes that are known are included in our analysis. Additional unknown contributions are parametrised in a conservative manner, such that existing estimates of their size are within the \(1\sigma \) range of our parametrisation. We take into account all the correlations of theoretical uncertainties between different observables and between different bins of dilepton invariant mass. As experimental data is available for more and more observables in finer and finer bins, the theory error correlations have a strong impact on the result of the fits.Footnote 15

Making use of all relevant experimental data on radiative, leptonic and semileptonic \(b \rightarrow s\) decays we find that there is on overall tension between the SM predictions and the experimental results. Assuming the absence of new physics, we investigated to which extent non-perturbative QCD effects can be responsible for the apparent disagreement. We find that large non-factorisable corrections, a factor of 4 above our error estimate, could improve the agreement for the \(B \rightarrow K^* \mu ^+\mu ^-\) angular observables and the branching ratios considerably. Alternatively, the branching ratio predictions could also be brought into better agreement with the experimental data, if the involved form factors were all systematically below the theoretical determinations from the lattice and from LCSR. On the other hand, we find that non-standard values of the form factors could at most lead to a modest improvement of \(B \rightarrow K^* \mu ^+\mu ^-\) angular observables. In both cases, however, the hint for violation of lepton-flavour universality cannot be explained.

Assuming that in our global fits the hadronic uncertainties are estimated in a sufficiently conservative way, we discussed the implications of the experimental results on new physics. Effects from new physics at short distances can be described model independently by an effective Hamiltonian and the experimental data can be used to obtain allowed regions for the new physics contributions to the Wilson coefficients. We find that the by far largest decrease in the \(\chi ^2\) can be obtained either by a negative new physics contribution to \(C_9\) (with \(C_9^\text {NP} \sim -25\,\% \times C_9^\text {SM}\)), or by new physics in the \(\mathrm{SU}(2)_L\) invariant direction \(C_9^\text {NP}=-C_{10}^\text {NP}\), (with \(C_9^\text {NP} \sim -12\,\% \times C_9^\text {SM}\)). A positive NP contribution to \(C_{10}\) alone would also improve the fit, although to a lesser extent.

Concerning the hint for violation of lepton-flavour universality, we observe that new physics exclusively in the muonic decay modes leads to an excellent description of the data. We do not find any preference for new physics in the electron modes. We provide predictions for other lepton-flavour-universality tests. We find that the ratio \(R_{A_\text {FB}}\) of the forward–backward asymmetries in \(B \rightarrow K^* \mu ^+\mu ^-\) and \(B \rightarrow K^* e^+e^-\) at low dilepton invariant mass is a particularly sensitive probe of new physics in \(C_9^\mu \). A precise measurement of \(R_{A_\text {FB}}\) would allow to distinguish the new physics scenarios that give the best description of the current data.

Finally we also discussed the implications of the model-independent fits for the minimal supersymmetric standard model and models that contain \(Z^\prime \) gauge bosons with flavour-changing couplings. In the MSSM, large flavour-changing trilinear couplings in the up-squark sector can give sizable contributions to the Wilson coefficient \(C_{10}\) and we identified regions of MSSM parameter space that are favoured or disfavoured by the current experimental data. Heavy \(Z'\) bosons can have the required properties to explain the discrepancies observed in the \(b \rightarrow s \ell \ell \) data. If the \(Z'\) couples to muons but not to electrons (as preferred by the data), it is only weakly constrained by indirect probes. On the other hand, if the \(Z'\) couplings to leptons are flavour universal, LEP constraints on four lepton contact interactions imply that an explanation of the \(b \rightarrow s \ell \ell \) discrepancies results in new physics effects in \(B_s\) mixing of at least \(\sim \)10 %. In all scenarios, the couplings of the \(Z'\) to first-generation quarks are strongly constrained by ATLAS and CMS measurements of dilepton production.

We look forward to the updated experimental results using the full LHCb data set, which will be crucial in helping to establish or to refute the exciting possibility of new physics in \(b \rightarrow s\) transitions.