Analysis of $B^0_{(s)}\to\mu^+\mu^-$ decays at the LHC

Abstract: This article reviews the most recent measurements of $B^0_{(s)}\to\mu^+\mu^-$ decay properties at the LHC, which are the most precise to date. The measurements of the branching fraction and effective lifetime of the $B^0_s\to\mu^+\mu^-$ decay by the ATLAS, CMS and LHCb collaborations, as well as the search for $B^0\to\mu^+\mu^-$ decays, are summarised with a focus on the experimental challenges. Furthermore, prospects are given for these measurements and for new observables that become accessible with the amounts of data foreseen by the end of the LHC.


Introduction
This review summarises the most recent measurements related to the $B^0\to\mu^+\mu^-$ and $B^0_s\to\mu^+\mu^-$ decays performed with the ATLAS, CMS and LHCb experiments. The $B^0\to\mu^+\mu^-$ and $B^0_s\to\mu^+\mu^-$ decays belong to the category of Flavour Changing Neutral Current (FCNC) processes and are therefore highly suppressed in the Standard Model (SM). This makes them important tools in the search for New Physics (NP), since they can provide indirect constraints on NP processes that interfere with the SM processes and sizeably alter the rates and decay properties. They are even sensitive to particles that are beyond the kinematic reach of particle colliders, including the Large Hadron Collider (LHC). The $B^0_{(s)}\to\mu^+\mu^-$ decays are among the most sensitive FCNC processes owing to their small theoretical uncertainty and clean experimental signature [1][2][3][4][5]. In the SM, the $B^0_{(s)}\to\mu^+\mu^-$ decays are forbidden at leading order and can proceed only via loop diagrams. In addition, they are suppressed by helicity conservation and by the presence of off-diagonal CKM matrix elements, leading to very small expected time-integrated branching fractions. Additional interest in $B^0_s\to\mu^+\mu^-$ decays comes from their simple description in effective field theory [6,7]. The decays can only proceed via axial-vector (Wilson coefficient $C_{10}$), scalar ($C_S$) or pseudo-scalar ($C_P$) $b\to s\ell\ell$ currents, where the scalar and pseudo-scalar currents are forbidden in the SM. Thus, measurements of $B^0_s\to\mu^+\mu^-$ properties are crucial inputs for global fits of the parameters that govern $b\to s\ell\ell$ transitions.
The most up-to-date SM predictions for the $B^0_s\to\mu^+\mu^-$ and $B^0\to\mu^+\mu^-$ branching fractions are calculated in Ref. [5] and yield
$$\mathcal{B}(B^0_s\to\mu^+\mu^-) = (3.66 \pm 0.14)\times 10^{-9} \quad\text{and}\quad \mathcal{B}(B^0\to\mu^+\mu^-) = (1.03 \pm 0.05)\times 10^{-10}. \qquad (1)$$
The values of these branching fractions are CP-averaged and time-integrated and include final-state radiation effects, so that they can be readily compared with the experimental measurements, which do not distinguish between $B^0_{(s)}$ CP eigenstates and take final-state radiation effects into account. Next-to-leading-order electroweak corrections and next-to-next-to-leading-order QCD corrections are also included in the calculations. Recent progress in lattice quantum chromodynamics (QCD) [8][9][10][11][12], in the calculation of electroweak effects at next-to-leading order [2], and in the calculation of QCD effects at next-to-next-to-leading order [3] has significantly reduced the theoretical uncertainties on both branching fractions. Enhanced electromagnetic contributions from virtual photon exchange were also shown to produce larger corrections to the theoretical uncertainties than previously assumed [4,5]. These predictions also take into account the finite width difference measured in the $B^0_s$ system, which is relevant for the experimental measurements because the data samples are untagged (see Refs. [13,14]). Alternative predictions are also available. They are obtained using the relation between $B^0_{(s)}\to\mu^+\mu^-$ decays and $\Delta m_{d(s)}$, the mass difference of the $B^0_{(s)}$ mass eigenstates [15,16]. In addition, it has recently been pointed out [17] that the current way of calculating $\mathcal{B}(B^0_{(s)}\to\mu^+\mu^-)$ could be affected by the presence of NP effects. Therefore, a calculation based on $\Delta m_{d(s)}$ and $|\epsilon_K|$ that considers only the SM contribution has been proposed. In both cases, the resulting values for $\mathcal{B}(B^0_{(s)}\to\mu^+\mu^-)$ are slightly different from the ones shown in Eq. 1, but still compatible within the theoretical uncertainties of the calculation, which are themselves still smaller than the experimental precision. Since all Collaborations used the values reported in Eq. 1 to assess the compatibility of their measurements with the SM predictions, the values quoted in Eq. 1 are used as the reference SM predictions for $\mathcal{B}(B^0_{(s)}\to\mu^+\mu^-)$ in the remainder of this article.
While the mentioned reference does not quote a value for the ratio of the two branching fractions, this can be easily calculated as:
$$R \equiv \frac{\mathcal{B}(B^0\to\mu^+\mu^-)}{\mathcal{B}(B^0_s\to\mu^+\mu^-)} = \frac{\tau_{B^0}}{1/\Gamma^s_H}\left(\frac{f_{B^0}}{f_{B^0_s}}\right)^2\left|\frac{V_{td}}{V_{ts}}\right|^2\frac{M_{B^0}\sqrt{1-4m_\mu^2/M_{B^0}^2}}{M_{B^0_s}\sqrt{1-4m_\mu^2/M_{B^0_s}^2}}, \qquad (2)$$
where $\tau_{B^0}$ and $1/\Gamma^s_H$ are the lifetimes of the $B^0$ and of the heavy mass eigenstate of the $B^0_s$; $M_{B^0_s}$ and $M_{B^0}$ are the masses and $f_{B^0_s}$ and $f_{B^0}$ the decay constants of the $B^0_s$ and $B^0$ mesons, respectively; $V_{td}$ and $V_{ts}$ are elements of the CKM matrix and $m_\mu$ is the mass of the muon. The numerical value of Eq. 2 is obtained using the same input values as Ref. [5]. It is worth noting that the ratio has a smaller theoretical uncertainty than the individual branching fractions, because most of the factors cancel. In particular, this ratio has the same value in all theories obeying the Minimal Flavour Violation (MFV) paradigm (including the SM), and as such it is a test of the latter. It is therefore of additional interest to evaluate $R$ in upcoming measurements as well.
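As a rough cross-check, the central value of this ratio can be estimated directly from the two predictions quoted in Eq. 1. The sketch below is purely illustrative: the variable names are ours, and no uncertainty is propagated, because the two predictions share inputs and are therefore correlated.

```python
# Central values of the SM predictions quoted in Eq. 1 (Ref. [5]).
bf_bs_mumu = 3.66e-9   # B(B0s -> mu+ mu-)
bf_bd_mumu = 1.03e-10  # B(B0  -> mu+ mu-)

# Ratio of the central values only. Correlations between the two
# predictions are ignored, so no uncertainty is quoted here.
ratio = bf_bd_mumu / bf_bs_mumu
print(f"R (central values only) = {ratio:.4f}")
```

The uncertainty on the ratio cannot be obtained by naively combining the two uncertainties of Eq. 1, since the common factors largely cancel in the ratio.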
A second observable of the $B^0_s\to\mu^+\mu^-$ decay considered in the latest experimental results is its effective lifetime $\tau_{\mu\mu}$. This observable is complementary to the branching fraction because it is sensitive to potential NP effects which are flavour-dependent [13]. In fact, in the SM, only the CP-odd heavy-mass eigenstate of the $B^0_s$-$\bar{B}^0_s$ system contributes to the $B^0_s\to\mu^+\mu^-$ decay amplitude, an assumption which does not hold in every NP scenario. Therefore, the measurement of this quantity could reveal the presence of NP effects which do not affect the branching fraction measurement. $\tau_{\mu\mu}$ is defined as the mean decay time of untagged $B^0_s\to\mu^+\mu^-$ decays,
$$\tau_{\mu\mu} \equiv \frac{\int_0^\infty t\,\langle\Gamma(B^0_s(t)\to\mu^+\mu^-)\rangle\,dt}{\int_0^\infty \langle\Gamma(B^0_s(t)\to\mu^+\mu^-)\rangle\,dt} = \frac{\tau_{B^0_s}}{1-y_s^2}\left[\frac{1+2A_{\Delta\Gamma}\,y_s+y_s^2}{1+A_{\Delta\Gamma}\,y_s}\right], \qquad (3)$$
where $t$ is the proper decay time of the $B^0_s$ meson, and $y_s$ and the CP parameter $A_{\Delta\Gamma}$ are defined as
$$y_s \equiv \frac{\tau_{B^0_s}\,\Delta\Gamma_s}{2}, \qquad A_{\Delta\Gamma} \equiv \frac{R_H^{\mu\mu}-R_L^{\mu\mu}}{R_H^{\mu\mu}+R_L^{\mu\mu}}, \qquad (4)$$
where $R_H^{\mu\mu}$ and $R_L^{\mu\mu}$ denote the contributions of the heavy and light mass eigenstates of the $B^0_s$ system to the untagged $B^0_s\to\mu^+\mu^-$ decay rate. Since the $\mu^+\mu^-$ final state is CP-odd, in the SM $A_{\Delta\Gamma}=+1$ and the effective lifetime is equal to the lifetime of the heavy-mass $B^0_s$ eigenstate, $\tau^{\rm SM}_{\mu\mu}$. The CP asymmetry $A_{\Delta\Gamma}$ can receive contributions from NP effects, particularly from scalar and pseudoscalar operators, even in cases where the branching fraction $\mathcal{B}(B^0_s\to\mu^+\mu^-)$ is not modified. The most recent $\tau^{\rm SM}_{\mu\mu}$ value is $1.624 \pm 0.009$ ps [18], which can differ slightly from the value used by the various Collaborations, depending on the publication date of their most recent measurement.
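The dependence of the effective lifetime on $A_{\Delta\Gamma}$ can be illustrated with a short numerical sketch. It assumes the commonly used untagged expression $\tau_{\mu\mu} = \frac{\tau_{B^0_s}}{1-y_s^2}\,\frac{1+2A_{\Delta\Gamma}y_s+y_s^2}{1+A_{\Delta\Gamma}y_s}$. The value of `y_s` below is an illustrative assumption, not a number taken from this article, and the function name is ours.

```python
def effective_lifetime(tau_bs, y_s, a_dgamma):
    """Effective lifetime (same unit as tau_bs) for a given A_DeltaGamma."""
    return (tau_bs / (1.0 - y_s**2)
            * (1.0 + 2.0 * a_dgamma * y_s + y_s**2)
            / (1.0 + a_dgamma * y_s))

# Illustrative inputs: tau_heavy is the SM value quoted in the text,
# y_s is an assumed placeholder for the relative width difference.
tau_heavy = 1.624                   # ps, heavy-eigenstate lifetime
y_s = 0.064                         # assumption, for illustration only
tau_bs = tau_heavy * (1.0 - y_s)    # mean lifetime implied by the two inputs

tau_sm = effective_lifetime(tau_bs, y_s, +1.0)     # SM case: heavy eigenstate only
tau_light = effective_lifetime(tau_bs, y_s, -1.0)  # opposite extreme: light eigenstate only

print(f"A_DeltaGamma = +1: {tau_sm:.3f} ps")
print(f"A_DeltaGamma = -1: {tau_light:.3f} ps")
```

For $A_{\Delta\Gamma}=+1$ the expression reduces to $\tau_{B^0_s}/(1-y_s)$, the heavy-eigenstate lifetime, and for $A_{\Delta\Gamma}=-1$ to $\tau_{B^0_s}/(1+y_s)$, the light-eigenstate lifetime, which brackets the range accessible to NP scenarios.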
All experimental results described in this review assume the SM hypothesis $A_{\Delta\Gamma}=1$ in the calculation of efficiencies and acceptance for the $B^0_s\to\mu^+\mu^-$ decay, and thus for its branching fraction. However, the ATLAS, CMS and LHCb Collaborations estimated the impact of this assumption on the $B^0_s\to\mu^+\mu^-$ branching fraction, which ranges from 4% to 10% when $A_{\Delta\Gamma}$ is varied in the interval [-1, 1].
The article is structured as follows: Sections 2, 3 and 4 report the measurements by the ATLAS, CMS and LHCb Collaborations, respectively, while Section 5 presents the results of the latest official LHC combination. Section 6 provides a summary of the status of the measurements and the prospects of the three Collaborations for the HL-LHC phase.

The ATLAS $\mathcal{B}(B^0_{(s)}\to\mu^+\mu^-)$ and effective lifetime measurements
2.1. The $\mathcal{B}(B^0_{(s)}\to\mu^+\mu^-)$ measurement
The measurement of the branching fractions of the $B^0_s\to\mu^+\mu^-$ and $B^0\to\mu^+\mu^-$ decays performed by the ATLAS Collaboration is documented in Ref. [19]. The analysis uses 26.1 fb$^{-1}$ of Run-2 data collected at $\sqrt{s}=13$ TeV, and combines the result with the previously published Run-1 analysis [20], based on 4.7 fb$^{-1}$ of data at $\sqrt{s}=7$ TeV and 20.3 fb$^{-1}$ at $\sqrt{s}=8$ TeV. In order to remove the dependence on the b-quark production cross section and to minimise the systematic uncertainties, the branching fractions are measured relative to a reference channel. Because of its abundance and well-measured branching fraction, the $B^+\to J/\psi(\mu^+\mu^-)K^+$ decay channel has been chosen for this purpose. As a consequence, the procedure to extract $\mathcal{B}(B^0_{(s)}\to\mu^+\mu^-)$ takes into account the difference in the fragmentation fractions $f_{u,d,s}$ of b-quarks to form, respectively, a $B^+$, $B^0$ or $B^0_s$ meson. The different acceptances and efficiencies of the signal and reference channels are also taken into account. Hence, the branching fractions $\mathcal{B}(B^0_{(s)}\to\mu^+\mu^-)$ are expressed as:
$$\mathcal{B}(B^0_{(s)}\to\mu^+\mu^-) = \frac{N_{B^0_{(s)}}}{N_{B^+}}\,\frac{\epsilon^{B^+}_{\rm tot}}{\epsilon_{\rm tot}}\,\frac{f_u}{f_{d(s)}}\,\mathcal{B}(B^+\to J/\psi(\mu^+\mu^-)K^+), \qquad (5)$$
where $N_{B^0_{(s)}}$ ($N_{B^+}$) is the yield of $B^0_{(s)}\to\mu^+\mu^-$ ($B^+\to J/\psi K^+$) events, and $\epsilon_{\rm tot}$ ($\epsilon^{B^+}_{\rm tot}$) is the total signal ($B^+\to J/\psi K^+$) efficiency. Events from $B^0_s\to J/\psi\phi$ decays, with $J/\psi\to\mu\mu$ and $\phi\to KK$, are also used as a control sample for the signal kinematic variables exploited in the analysis.
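The normalisation procedure can be sketched in a few lines. The function below assumes the relative form $\mathcal{B}_{\rm sig} = (N_{\rm sig}/N_{\rm ref})\,(\epsilon_{\rm ref}/\epsilon_{\rm sig})\,(f_u/f_{d(s)})\,\mathcal{B}_{\rm ref}$. All yields, efficiencies and the reference branching fraction used in the example are hypothetical placeholders, not ATLAS values; only $f_s/f_d = 0.256$ is the central value quoted later in this section, used here under the assumption $f_u = f_d$.

```python
def relative_branching_fraction(n_sig, n_ref, eff_sig, eff_ref, frag_ratio, bf_ref):
    """Signal branching fraction measured relative to a reference channel.

    frag_ratio is f_u / f_d(s): the ratio of fragmentation fractions of the
    reference-channel meson to the signal meson.
    """
    return (n_sig / n_ref) * (eff_ref / eff_sig) * frag_ratio * bf_ref

# Hypothetical placeholder inputs, chosen only to exercise the formula.
n_sig, n_ref = 80.0, 300_000.0   # fitted yields (placeholders)
eff_sig, eff_ref = 0.02, 0.01    # total efficiencies (placeholders)
bf_ref = 6.0e-5                  # reference branching fraction (placeholder)
fs_over_fd = 0.256               # central value quoted in the text
frag_ratio = 1.0 / fs_over_fd    # f_u / f_s, assuming f_u = f_d

bf_sig = relative_branching_fraction(n_sig, n_ref, eff_sig, eff_ref, frag_ratio, bf_ref)
print(f"B(signal) = {bf_sig:.3e}  (placeholder inputs)")
```

Because only ratios of yields and efficiencies enter, uncertainties common to the signal and reference channels (luminosity, b-quark production cross section, part of the muon reconstruction efficiency) largely cancel.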
The signal selection starts with a hardware dimuon trigger requiring one muon with transverse momentum $p_T > 4$ GeV and the other with $p_T > 6$ GeV. In the offline analysis, both muons are required to satisfy the same $p_T$ thresholds as in the trigger selection, to have pseudorapidity $|\eta| < 2.5$, and to pass stringent track-quality requirements (Tight muons). Signal candidates are formed from two muons with opposite electric charges. Kaon candidates for the reference channel are reconstructed in the tracking system and are required to have $p_T > 1$ GeV and $|\eta| < 2.5$.
B-meson kinematic observables are reconstructed imposing quality requirements on the dimuon vertex for the signal, or on the vertex formed by the dimuon system and one track for the reference channel. The reconstructed B candidates are also required to fall within a fiducial volume defined by $p_T(B) > 8$ GeV and $|\eta(B)| < 2.5$.
The analysis mainly uses the B-candidate invariant mass to characterise the selected events. B candidates with a mass in the 4766-5966 MeV interval are considered. A blind analysis is performed, in which the dimuon invariant mass signal region between 5166 and 5526 MeV is not inspected until the analysis criteria and strategies are finalised.
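The mass windows quoted above translate into a simple region classification, sketched below. The boundaries are the ones given in the text; the treatment of candidates exactly on a boundary and the function and label names are our assumptions.

```python
FIT_RANGE_MEV = (4766.0, 5966.0)      # B-candidate mass range considered
SIGNAL_REGION_MEV = (5166.0, 5526.0)  # blinded dimuon mass window

def classify_mass(mass_mev):
    """Assign a dimuon candidate to a mass region (boundary handling assumed)."""
    lo, hi = FIT_RANGE_MEV
    sig_lo, sig_hi = SIGNAL_REGION_MEV
    if not lo <= mass_mev <= hi:
        return "rejected"
    if mass_mev < sig_lo:
        return "low sideband"
    if mass_mev <= sig_hi:
        return "signal region (blinded)"
    return "high sideband"

for mass in (4500.0, 5000.0, 5366.9, 5800.0):
    print(mass, "->", classify_mass(mass))
```

Only the two sidebands are available for background studies and classifier training while the analysis is blinded.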
The main backgrounds for this analysis can be split into three categories: continuum background, partially reconstructed B decays (PRD) and peaking background. The continuum background consists mainly of muons produced in uncorrelated hadron decays. It is the dominant background for the analysis and is several orders of magnitude larger than the signal. Therefore, a Boosted Decision Tree [21] (c-BDT) is employed to efficiently separate the signal from this type of background. The c-BDT is based on 15 variables with high discriminating power, which describe the kinematics of the B-meson candidate, the secondary vertex displacement, the kinematic properties of the muons and the rest of the event (such as the isolation of the B candidate and of the two muon tracks with respect to the rest of the event). The c-BDT is trained and validated on the data mass sidebands.
The PRD background consists of decays where the two muons in the final state come from one of the following topologies: (a) 'cascade' transitions with the muons coming from the same ancestor (e.g. $b\to c\mu\nu\to s\mu\mu\nu\nu$), labelled same-side (SS) muons; (b) the same decay chain (e.g. $B\to J/\psi X$ or $B^0\to\mu\mu K^*$), labelled same-vertex (SV) muons; (c) $B_c\to J/\psi\mu\nu$ decays; (d) semileptonic B decays where a hadron $h$ ($\pi$, $K$ or proton) is misidentified as a muon (e.g. $B\to\mu h\nu$). All these types of background populate the low-mass sideband, with contributions extending into the dimuon mass signal region.
The peaking background consists of charmless two-body decays $B^0_{(s)}\to h^+h^{(\prime)-}$ ($h^{(\prime)}$ being a pion or a kaon) that are reconstructed as signal events because the hadrons are misidentified as muons. This background component falls in the signal region and closely resembles the $B^0\to\mu^+\mu^-$ signal. Its contribution has been studied with the help of a dedicated MC sample and validated in data in a region enriched in hadrons misidentified as muons. The resulting peaking background contribution is estimated to be $2.9 \pm 2.0$ events in the signal region.
To extract $\mathcal{B}(B^0_{(s)}\to\mu^+\mu^-)$ using Eq. 5, the yield of the reference channel and the efficiency ratio between the two channels need to be computed. The $B^+\to J/\psi K^+$ yield $N_{B^+}$ is obtained from an unbinned extended maximum-likelihood fit to the $\mu\mu K^+$ invariant mass distribution, where the shape parameters are fitted simultaneously in data and simulation.
The efficiency ratio between the signal and reference channels is computed from appropriate simulation samples within the fiducial volume of the analysis. These samples are reweighted so that they reproduce the distributions of the number of primary vertices (and therefore pile-up), $p_T(B)$, $|\eta(B)|$ and the trigger efficiencies (as a function of $p_T(\mu)$ and $|\eta(\mu)|$) as measured in data. Furthermore, a correction to the $B^0_s$ lifetime in the simulated signal sample is applied to match that of the heavy $B^0_s$ mass eigenstate, because the $B^0_s\to\mu^+\mu^-$ decay proceeds in the SM exclusively through the heavy $B^0_s$ mass eigenstate, as described in Section 1.
The yields of signal events $N_{B^0_{(s)}}$ are extracted simultaneously from an unbinned extended maximum-likelihood fit to the dimuon invariant mass distribution $m_{\mu\mu}$. To enhance the sensitivity of the analysis, four bins in the c-BDT output (in increasing order of signal-over-background ratio) are defined such that each bin has the same signal efficiency of 18%. The fit is performed simultaneously in the four c-BDT bins.
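The choice of bin boundaries with equal signal efficiency can be sketched as a quantile computation on the classifier score of simulated signal. The example below is only an illustration of that idea: the toy score distribution, the function names and the convention of taking the bins from the top of the score range are our assumptions, not details of the ATLAS implementation.

```python
import random

def equal_efficiency_edges(signal_scores, n_bins=4, eff_per_bin=0.18):
    """Bin edges such that each bin holds a fraction eff_per_bin of the signal.

    Bins are filled from the highest scores downwards, so signal below the
    first edge (here 1 - 4 * 0.18 = 28%) is not used.
    """
    scores = sorted(signal_scores)
    n = len(scores)
    edges = []
    for k in range(n_bins + 1):
        frac_above = eff_per_bin * (n_bins - k)          # signal fraction above this edge
        idx = min(round(n * (1.0 - frac_above)), n - 1)  # index of the quantile
        edges.append(scores[idx])
    return edges

def bin_efficiencies(signal_scores, edges):
    """Fraction of signal in each bin; the last bin includes its upper edge."""
    n = len(signal_scores)
    effs = []
    for i in range(len(edges) - 1):
        lo, hi = edges[i], edges[i + 1]
        is_last = i == len(edges) - 2
        count = sum(1 for s in signal_scores if lo <= s < hi or (is_last and s == hi))
        effs.append(count / n)
    return effs

random.seed(1)
toy_scores = [random.betavariate(5, 2) for _ in range(10_000)]  # toy signal scores
edges = equal_efficiency_edges(toy_scores)
effs = bin_efficiencies(toy_scores, edges)
print("edges:", [round(e, 3) for e in edges])
print("efficiencies:", [round(e, 3) for e in effs])
```

Defining the bins through signal quantiles keeps the expected signal fraction per bin fixed, while the background level decreases from the first to the last bin.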
The first c-BDT bin, which has the lowest signal-over-background ratio, is dominated by background. It is included in the fit to improve the background modelling and to reduce the related systematic uncertainties. The $B^0_{(s)}\to\mu^+\mu^-$ signals are parameterised by a double Gaussian function to account for the different dimuon invariant mass resolutions in different regions of the ATLAS detector. The shape and the relative signal efficiencies are assumed to be the same in all c-BDT bins. The continuum background is described by a first-order polynomial, while the background coming from SS and SV events is parameterised with an exponential function. These backgrounds are allowed to vary independently in each c-BDT bin. Finally, the description of the peaking background is based on the same model used to describe the signal, with a constraint on the total yield of $2.9 \pm 2.0$ events, equally distributed among the c-BDT bins.
The $\mathcal{B}(B^0_{(s)}\to\mu^+\mu^-)$ values are extracted through a simultaneous unbinned extended maximum-likelihood fit using the components of Eq. 5 and the $N_{B^0_{(s)}}$ event yields extracted from the invariant mass fits just described. The $\mathcal{B}(B^+\to J/\psi(\mu^+\mu^-)K^+)$ value is taken from the PDG world average [22], while the hadronisation probability ratio $f_s/f_d = 0.256 \pm 0.013$ is taken from the HFLAV average [23].
The measurements are dominated by statistical uncertainties. The most prominent sources of systematic uncertainty are the fit uncertainties (where the largest contributors are the mass scale and the $b\to\mu^+\mu^-X$ background parameterisation), the ratio $f_s/f_d$ (only for the $\mathcal{B}(B^0_s\to\mu^+\mu^-)$ measurement) and the reference channel yield. All systematic uncertainties are included in the likelihood as Gaussian constraints.
A Neyman construction [24] is employed to extract the 68.3%, 95.5% and 99.7% confidence intervals in the $\mathcal{B}(B^0_s\to\mu^+\mu^-)$-$\mathcal{B}(B^0\to\mu^+\mu^-)$ plane. The likelihood function from the described Run-2 result is then combined with the likelihood function from the Run-1 result [20]. The only common parameters in the combination are the fitted $\mathcal{B}(B^0_{(s)}\to\mu^+\mu^-)$ and the external inputs ($\mathcal{B}(B^+\to J/\psi(\mu^+\mu^-)K^+)$ and $f_s/f_d$). All remaining nuisance parameters are treated as uncorrelated between the two results.
The ATLAS results, obtained by combining 25 fb$^{-1}$ from the Run-1 and 26.1 fb$^{-1}$ from the Run-2 LHC campaigns, are reported in Ref. [19], with a significance for the $B^0_s\to\mu^+\mu^-$ signal of 4.6 standard deviations ($\sigma$). The 95% confidence level (CL) upper limit on $\mathcal{B}(B^0\to\mu^+\mu^-)$ is obtained with the Neyman procedure described in Ref. [24]. Fig. 1 shows the dimuon invariant mass distribution in the highest-score BDT bin (left) and the likelihood contours in the $\mathcal{B}(B^0_s\to\mu^+\mu^-)$-$\mathcal{B}(B^0\to\mu^+\mu^-)$ plane (right).

2.2. The $B^0_s\to\mu^+\mu^-$ effective lifetime measurement
Using the same dataset and the same configurations for the event selection and the simulated samples, ATLAS has subsequently performed a measurement of the $B^0_s\to\mu^+\mu^-$ effective lifetime $\tau_{\mu\mu}$ [25]. As explained in Section 1, the measurement of this quantity is complementary to the branching fraction measurement in the search for NP phenomena. The only difference between the two analyses lies in the selection applied to the c-BDT output: the c-BDT output is required to be larger than 0.365, while all other requirements are the same as in the branching fraction analysis. The value of this requirement was chosen through an optimisation procedure based on the maximisation of a signal-over-background figure of merit. The lifetime is then extracted in three steps.
In the first step, the dimuon invariant mass distribution, after all selection requirements described above, is fitted with a five-parameter model made of three probability density functions (PDFs): a double Gaussian to describe the $B^0_s\to\mu^+\mu^-$ component, a linear function to describe the combinatorial (or continuum) background component and an exponential function to describe the PRD component. Additional resonant and non-resonant backgrounds (such as $B\to hh'$, $B_c^\pm$ and semileptonic B decays), as well as the $B^0\to\mu^+\mu^-$ component, are neglected in this fit and considered as sources of systematic uncertainty, whose impact on $\tau_{\mu\mu}$ is evaluated through MC pseudo-experiments (as described later in the text). The fit yields $58 \pm 13$ events in the $B^0_s\to\mu^+\mu^-$ mass window. The second step exploits the sPlot statistical technique to extract the $B^0_s\to\mu^+\mu^-$ signal proper decay time distribution: the distribution is background-subtracted by means of per-event weights computed from the result of the invariant mass fit described above.
The third and final step consists of a binned $\chi^2$ fit to the proper decay time distribution extracted in the previous step. This distribution is considered in the interval 0-12 ps and divided into twelve bins of equal width. Simulated templates of the pure signal proper decay time, in the same interval and with the same binning, are generated for different values of $\tau_{\mu\mu}$, and a binned $\chi^2$ fit to the background-subtracted data is performed. The $\chi^2$ calculation takes into account both the statistical uncertainty on the weight-corrected MC and the Poissonian uncertainty in each data bin, as expected from the predicted MC content for that bin. The template minimising the $\chi^2$ corresponds to an observed lifetime $\tau^{\rm Obs}_{\mu\mu}$ of 1.07 ps. Studies with MC pseudo-experiments, generated for a lifetime of 1.624 ps (i.e. the SM predicted value), showed that the lifetime extraction procedure has a bias of 82 fs due to the low-statistics regime of the fit. This bias is found to be constant over the $B^0_s$ lifetime range considered in the analysis. Therefore, the quoted value for $\tau^{\rm Obs}_{\mu\mu}$ has been corrected for this effect. The statistical uncertainty on $\tau^{\rm Obs}_{\mu\mu}$ is extracted using a Neyman construction based on MC pseudo-experiments, yielding $\tau^{\rm Obs}_{\mu\mu} = 0.99^{+0.42}_{-0.07}$ (stat.) ps. Fig. 2 shows the signal proper decay time distribution extracted from data, superimposed with the MC template minimising the $\chi^2$ (left), and the Neyman construction based on MC pseudo-experiments used to estimate the statistical uncertainty of the measurement (right).
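The template scan described above can be illustrated with a deliberately simplified sketch. It assumes pure exponential templates without acceptance or resolution effects, uses the Poisson variance predicted by the template in each bin, and ignores the template statistical uncertainty; the pseudo-data are an idealised distribution built for a lifetime of 1.0 ps with 58 events. None of this reproduces the actual ATLAS templates, which come from full simulation.

```python
import math

N_BINS, T_MAX = 12, 12.0  # twelve equal-width bins over 0-12 ps, as in the text

def template(tau, n_events):
    """Expected events per bin for an exponential decay with lifetime tau (ps)."""
    width = T_MAX / N_BINS
    norm = 1.0 - math.exp(-T_MAX / tau)  # fraction of decays inside [0, T_MAX]
    return [n_events * (math.exp(-i * width / tau) - math.exp(-(i + 1) * width / tau)) / norm
            for i in range(N_BINS)]

def chi2(data, expected):
    """Chi-square with the Poisson variance taken from the template prediction."""
    return sum((d - e) ** 2 / e for d, e in zip(data, expected) if e > 0)

def scan_lifetime(data, taus):
    """Return the template lifetime that minimises the chi-square."""
    n_events = sum(data)
    return min(taus, key=lambda tau: chi2(data, template(tau, n_events)))

# Idealised pseudo-data: the expected distribution itself (no fluctuations).
pseudo_data = template(1.0, 58.0)
tau_grid = [round(0.5 + 0.01 * i, 2) for i in range(251)]  # 0.50 ... 3.00 ps
best_tau = scan_lifetime(pseudo_data, tau_grid)
print(f"best-fit lifetime: {best_tau:.2f} ps")
```

With fluctuating data and only a few tens of events, the minimum of such a scan is biased and its uncertainty is non-Gaussian, which is why the analysis relies on pseudo-experiments for both the bias correction and the confidence interval.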
The dominant systematic uncertainties for this measurement are related to the data-MC discrepancies (134 fs, evaluated in data by repeating the same fit procedure in the $B^\pm\to J/\psi K^\pm$ channel under the same statistical regime as the $B^0_s\to\mu^+\mu^-$ signal), to the background mass and lifetime models (86 fs), to the dependence of the fit on the lifetime used in the generation of the MC pseudo-experiments and on the admixture of $B^0_s$ eigenstates (15 fs, evaluated by generating MC pseudo-experiments in the $[\tau^{\rm SM}_L, \tau^{\rm SM}_H]$ lifetime interval) and to the neglected resonant and non-resonant backgrounds (12 fs). The total systematic uncertainty is obtained by summing in quadrature and symmetrising the impact of all individual sources on $\tau_{\mu\mu}$. This yields an observed value of $\tau^{\rm Obs}_{\mu\mu} = 0.99^{+0.42}_{-0.07}$ (stat.) $\pm\,0.17$ (syst.) ps. The value is compatible with the SM prediction of 1.624 ps ($A_{\Delta\Gamma} = 1$) as well as with the other experimental results described in this article.
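As a simple arithmetic illustration, the four dominant sources quoted above can be summed in quadrature. The dictionary keys are our shorthand labels, and only the sources listed in the text are included.

```python
import math

# Dominant systematic uncertainties quoted in the text, in femtoseconds.
systematics_fs = {
    "data-MC discrepancies": 134.0,
    "background mass and lifetime models": 86.0,
    "generated lifetime and eigenstate admixture": 15.0,
    "neglected backgrounds": 12.0,
}

# Quadrature sum, valid under the assumption of uncorrelated sources.
total_fs = math.sqrt(sum(value**2 for value in systematics_fs.values()))
print(f"quadrature sum of listed sources: {total_fs:.0f} fs")
```

The listed sources alone give about 0.16 ps, slightly below the quoted total of 0.17 ps; the difference presumably reflects additional smaller sources and the symmetrisation step, which are not itemised here.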

Measurement of $B^0_s\to\mu^+\mu^-$ decay properties and search for $B^0\to\mu^+\mu^-$ decay at CMS
The latest analysis by the CMS collaboration is based on the LHC Run-2 data collected in 2016-2018 at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 140 fb$^{-1}$ [28]. The studies based on the LHC Run-1 samples collected in 2011-2012 can be found in earlier publications [29]. No attempt is made to combine the latest publication with the results from the 2011-2012 data, as the expected gain in sensitivity is modest. This section discusses the latest CMS measurement of $\mathcal{B}(B^0_s\to\mu^+\mu^-)$, the search for the $B^0\to\mu^+\mu^-$ decay, and the effective lifetime measurement using $B^0_s\to\mu^+\mu^-$ events.
The experimental signature of $B^0_{(s)}\to\mu^+\mu^-$ decays comprises two muons originating from a single displaced vertex, isolated from other activity, with a dimuon momentum aligned with the flight direction and an invariant mass peaking at $M(B^0_s)$ or $M(B^0)$. The main backgrounds are combinatorial events, in which the two muons originate from different heavy quarks; partially reconstructed semileptonic decays, in which both muons come from the same B meson (with one of the muons being a misidentified charged hadron); and peaking backgrounds from charmless two-body hadronic B-meson decays.
The data were collected with a set of dimuon triggers: the L1 trigger required two oppositely charged muons within $|\eta| < 1.5$, while at the HLT the two muons are required to form a secondary vertex and to lie within specific mass ranges. The dimuon candidates are used to reconstruct B mesons for the signal and for the normalization channels $B^+\to J/\psi K^+$ and $B^0_s\to J/\psi\phi$. The selections are kept as similar as possible to allow a partial cancellation of systematic effects. In the offline analysis, muons are required to have a high-quality track fit in the tracker, a transverse momentum of at least 4 GeV and $|\eta| < 1.4$. To suppress misidentified muons from charged pion and kaon decays, an algorithm based on multivariate analysis (MVA) has been introduced. Additional kaons are required in the reconstruction of the normalization channels. A trajectory representing the B candidate is built from the decay vertex and the B-candidate momentum, and is extrapolated to its point of closest approach to each reconstructed primary vertex; the primary vertex with the smallest distance to the extrapolated trajectory is selected for the analysis.
Reducing the combinatorial and partially reconstructed backgrounds is the main challenge of the study. To enhance the analysis sensitivity, a dedicated MVA discriminator is introduced, which combines various discriminating observables into a single score ($d_{\rm MVA}$) using a boosted decision tree algorithm. The inputs to $d_{\rm MVA}$ include pointing angles, defined as the angles between the B momentum and the direction connecting the primary and secondary vertices; observables related to the secondary vertex, such as the quality of the vertex fit; and observables designed to identify nearby decay products in semileptonic decays of b and c hadrons. The $d_{\rm MVA}$ is trained with a gradient-boosting algorithm implemented in the XGBoost library [30]. The training uses a mix of $B^0_s\to\mu^+\mu^-$ signal events and background events selected from the data sidebands. After a fine-tuning of the input observables to align the kinematics of $B^0_s\to\mu^+\mu^-$ and $B^+\to J/\psi K^+$ decays (accounting for differences in the uncertainties of the dimuon vertex position), the control channel $B^+\to J/\psi K^+$ is employed to evaluate the performance of the $d_{\rm MVA}$ in data.
Charmless two-body decays $B^0_{(s)}\to h^+h^{(\prime)-}$, such as $B^0\to K^+\pi^-$ and $B^0_s\to K^+K^-$, can mimic the signal when both charged hadrons are misidentified as muons. The misidentification probabilities in data are measured using $K^0_S\to\pi^+\pi^-$, $\phi(1020)\to K^+K^-$, and $\Lambda\to p\pi^-$ decays, restricting the decay distance of the $K^0_S$ and $\Lambda$ candidates to be comparable to that expected from the lifetime of the B meson. Misidentification of pions and kaons primarily originates from their decays into muons. Data and simulation are found to agree for both pions and kaons. The proton misidentification rate is much smaller, hence the contributions from the associated processes are negligible. After the stringent multivariate muon identification requirement, the charmless two-body backgrounds are reduced to a negligible level.
Because of the limited precision of the b-quark production cross section at the LHC, a direct determination of the branching fraction $\mathcal{B}(B^0_{(s)}\to\mu^+\mu^-)$ would introduce a significant uncertainty. As is common practice, the signal branching fraction is therefore measured relative to the $B^+\to J/\psi K^+$ decay channel. In addition, $B^0_s\to J/\psi\phi$ decays, with $J/\psi\to\mu\mu$ and $\phi\to KK$, are considered as a cross-check; this normalization might become more precise if $\mathcal{B}(B^0_s\to J/\psi\phi(1020))$ is further improved by future B-factory studies. Another advantage of measuring branching fractions in a relative manner is the potential cancellation of systematic uncertainties common to the selection of the signal and normalization channels. The formulae for the $B^0_{(s)}\to\mu^+\mu^-$ branching fractions are similar to Eq. 5:
$$\mathcal{B}(B^0_s\to\mu^+\mu^-) = \mathcal{B}(B^+\to J/\psi(\mu^+\mu^-)K^+)\,\frac{N_{B^0_s\to\mu^+\mu^-}}{N_{B^+\to J/\psi K^+}}\,\frac{\epsilon_{B^+\to J/\psi K^+}}{\epsilon_{B^0_s\to\mu^+\mu^-}}\,\frac{f_u}{f_s},$$
$$\mathcal{B}(B^0\to\mu^+\mu^-) = \mathcal{B}(B^+\to J/\psi(\mu^+\mu^-)K^+)\,\frac{N_{B^0\to\mu^+\mu^-}}{N_{B^+\to J/\psi K^+}}\,\frac{\epsilon_{B^+\to J/\psi K^+}}{\epsilon_{B^0\to\mu^+\mu^-}}\,\frac{f_u}{f_d},$$
where the yields and the selection efficiencies for each process $X$ are denoted by $N_X$ and $\epsilon_X$. The production fractions for $B^+$, $B^0$, and $B^0_s$ mesons are represented by $f_u$, $f_d$, and $f_s$. The ratio $f_u/f_d$ is set to unity due to isospin symmetry, while the ratio $f_s/f_u$, together with $\mathcal{B}(B^+\to J/\psi K^+)$ and $\mathcal{B}(B^0_s\to J/\psi\phi)$, are external inputs.
The results are obtained with a simultaneous unbinned maximum-likelihood fit in multiple categories. For the measurement of the branching fractions, a two-dimensional fit using the dimuon invariant mass and its uncertainty as observables is introduced. The events are categorized according to data-taking period, signal purity based on $d_{\rm MVA}$, and $|\eta|$ of the most forward muon. The likelihood function includes five components: the $B^0_s\to\mu^+\mu^-$ and $B^0\to\mu^+\mu^-$ signals, the semileptonic background, the peaking two-body decays, and the combinatorial events. The signal components are represented by Crystal Ball functions for the dimuon mass. The width of these Crystal Ball functions is parameterized based on the per-event mass resolution. The mass resolution is modeled with a kernel estimation approach using Gaussian kernels. The semileptonic background is modeled by a Gaussian with parameters free in the fit to the data, while the peaking background is modeled by the sum of a Gaussian and a Crystal Ball function, with the shape parameters determined from simulated events. The yields of the semileptonic and peaking background components are first derived and then included in the fit, with the uncertainties from the hadron-to-muon misidentification rate treated as constrained nuisance parameters. The combinatorial background is modeled by a linear function with yields and slope free to vary in the fits.
For the branching fraction measurements, the main experimental uncertainties arise from the signal efficiency corrections due to mismodeling of $d_{\rm MVA}$, the charged kaon efficiency in the normalization channels, the trigger efficiencies, and the fit bias, while the remaining uncertainties are smaller than 1%. The mismodeling of the $d_{\rm MVA}$ distribution has been investigated through two distinct studies with $B^+\to J/\psi K^+$ events. In the first study, background-subtracted data, obtained with the sPlot technique [31] applied to the $B^+\to J/\psi K^+$ invariant mass distribution, are compared directly with the simulated distributions. The second study involves reweighting the simulated samples to match the data distributions, employing the XGBoost tool. The difference between the two studies is taken as a systematic uncertainty. The systematic uncertainty arising from the choice of background models is derived through pseudo-experiments, incorporating variations in the fit. The uncertainties in the input branching fractions of the normalization channels and in the $f_s/f_u$ ratio are implemented as external uncertainties.
The results incorporate external inputs, specifically $\mathcal{B}(B^+\to J/\psi K^+) = (1.020 \pm 0.019)\times 10^{-3}$, $\mathcal{B}(J/\psi\to\mu^+\mu^-) = (5.961 \pm 0.033)\times 10^{-2}$, and $f_s/f_u = 0.231 \pm 0.008$. The input $f_s/f_u$ value is derived from the $p_T$-dependent measurement by LHCb [32] and the $p_T$ distribution observed in this measurement. Figure 3 shows the dimuon invariant mass distributions from the categories with different signal purity; the results of the fit are superimposed. The profile likelihood contours enclosing the regions with different coverage are shown in Figure 4. Alternatively, the $B^0_s\to\mu^+\mu^-$ branching fraction is measured using the $B^0_s\to J/\psi\phi$ decays as the normalization; the last uncertainty of that result arises from the uncertainty in the $B^0_s\to J/\psi\phi$ branching fraction ($\mathcal{B}(B^0_s\to J/\psi\phi) = (1.04 \pm 0.04)\times 10^{-3}$). The lifetime of the $B^0_s$ meson also has a significant impact on the $B^0_s\to\mu^+\mu^-$ branching fraction; a scaling factor on the resulting branching fraction ($1.577 - 0.358\,\tau_{B^0_s}$, where $\tau_{B^0_s}$ is the $B^0_s$ lifetime in ps) is provided for lifetime hypotheses other than the SM value (1.61 ps). The upper limit on the $B^0\to\mu^+\mu^-$ decay is calculated to be $\mathcal{B}(B^0\to\mu^+\mu^-) < 1.9\times 10^{-10}$ at 95% confidence level, using the CL$_s$ method [33].
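The lifetime scaling factor quoted above is simple to evaluate. The sketch below only encodes the linear expression given in the text; the function name is ours, and the second lifetime value is an arbitrary illustration.

```python
def lifetime_scale_factor(tau_ps):
    """Scaling factor on B(B0s -> mu+ mu-) for an assumed B0s lifetime in ps."""
    return 1.577 - 0.358 * tau_ps

# At the SM lifetime hypothesis used in the analysis the factor is ~1.
print(f"tau = 1.61 ps: factor = {lifetime_scale_factor(1.61):.4f}")
# A shorter lifetime hypothesis (arbitrary example) scales the result upwards.
print(f"tau = 1.50 ps: factor = {lifetime_scale_factor(1.50):.4f}")
```

The factor is consistent with unity at 1.61 ps, as expected, and changes by about 0.036 per 0.1 ps of assumed lifetime.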

Measurement of the $B^0_s\to\mu^+\mu^-$ effective lifetime
The $B^0_s\to\mu^+\mu^-$ effective lifetime $\tau_{\mu\mu}$ is extracted with an unbinned maximum-likelihood fit in three dimensions: dimuon invariant mass, decay time, and decay time uncertainty. The decay time $t_{\mu^+\mu^-}$, which is calculated for each event, is defined as the product of the flight length and the invariant mass of the B candidate, divided by the magnitude of the B-candidate momentum. The events are again categorized according to data-taking period, purity based on $d_{\rm MVA}$, and the pseudorapidity of the most forward muon. The dimuon invariant mass distribution is modeled with the same functions introduced for the branching fraction measurements, while the decay time distribution for signal events is modeled by an exponential function convolved with the decay time resolution function. The decay time resolution function is parameterized with the measured decay time uncertainty. The acceptance as a function of the decay time is obtained from simulated events and corrected with $B^+\to J/\psi K^+$ events from data. The decay time distribution for combinatorial background events is obtained from high-mass sideband events. The decay time uncertainty models used in the fit are likewise obtained from simulation samples and mass sideband data.
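The per-event decay time defined above, $t = L\,m/|\vec{p}|$ (in natural units), can be sketched as follows. The unit choices, the function name and the example candidate are our assumptions; in particular, the text does not state whether the flight length is measured in three dimensions or in the transverse plane, in which case the transverse momentum would be used instead.

```python
C_CM_PER_PS = 0.0299792458  # speed of light in cm per picosecond

def decay_time_ps(flight_length_cm, mass_gev, momentum_gev):
    """Decay time t = L * m / (|p| * c), with L in cm and m, |p| in GeV."""
    return flight_length_cm * mass_gev / (momentum_gev * C_CM_PER_PS)

# Hypothetical candidate: 2 mm flight length, |p| = 20 GeV,
# reconstructed mass close to the B0s mass (about 5.367 GeV).
t = decay_time_ps(flight_length_cm=0.2, mass_gev=5.367, momentum_gev=20.0)
print(f"decay time = {t:.3f} ps")
```

Since the decay time scales with the measured flight length, its per-event uncertainty is driven by the vertex resolution, which is why the decay time uncertainty enters the fit as a third observable.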
The systematic uncertainties in the lifetime measurement are mostly driven by the correlations between $d_{\rm MVA}$ and the decay time, since key input variables of the $d_{\rm MVA}$ classifier, namely the pointing angle of the B candidate and its associated uncertainty, are strongly correlated with the decay time observable. Any mismodeling in the simulation therefore has a significant impact on the decay time distribution. A correction, defined as a ratio of the decay time distributions for different $d_{\rm MVA}$ requirements, is derived from $B^+\to J/\psi K^+$ events. This method introduces a bias of up to 0.1 ps for the data recorded in 2016, which is reduced in the later data sets. Possible biases arising from the fitting and modeling are also tested with $B^+\to J/\psi K^+$ events, but with a relaxed selection criterion. Other systematic uncertainties are minor, estimated to be smaller than 0.01 ps.
The resulting effective lifetime for B 0 s → µ + µ − events is: which is consistent with the SM prediction and the other experimental results described in this article.The decay time distribution for the candidates in the region of 5.28 < m µ + µ − < 5.48 GeV is shown in Fig. 5.

Analysis of B 0 (s) → µ + µ − decays with LHCb
The most recent analysis of B 0 (s) → µ + µ − with the LHCb experiment [34,35] was performed with the full pp-collision data collected in the LHC Run 1 and Run 2 campaigns. The total integrated luminosity corresponds to 1 fb −1 at √ s = 7 TeV, 2 fb −1 at √ s = 8 TeV and 6 fb −1 at √ s = 13 TeV. In total, the analysis comprises the search for and branching fraction measurements of the decays B 0 s → µ + µ − , B 0 → µ + µ − and B 0 s → µ + µ − γ with initial state radiation (B 0 s → µ + µ − γ was only investigated in the region m(µ + µ − ) > 4.9 GeV/c 2 ), as well as the measurement of the effective lifetime of the B 0 s → µ + µ − decay. A precise branching fraction measurement is achieved by normalising the signal decay to two high-statistics decay modes, B 0 → K + π − and B + → J/ψ K + with J/ψ → µ + µ − , similarly to what is done by the ATLAS and CMS Collaborations and shown in Eq. 5 for the B + → J/ψ K + channel. The decay modes B 0 → K + π − and B 0 s → K + K − are used as control modes for the effective lifetime measurement as well.
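The normalisation to a high-statistics mode can be sketched as follows. This assumes the standard relative-yield form (signal yield over normalisation yield, corrected by the efficiency ratio, the fragmentation fraction ratio and the normalisation branching fraction); the function name, argument names and all numbers are hypothetical and are not taken from the LHCb analysis.

```python
def signal_branching_fraction(n_sig, n_norm, eff_sig, eff_norm,
                              frag_ratio_norm_over_sig, bf_norm):
    """Branching fraction of the signal relative to a normalisation mode.

    n_sig, n_norm            : fitted yields of signal and normalisation mode
    eff_sig, eff_norm        : total selection efficiencies
    frag_ratio_norm_over_sig : e.g. f_d / f_s for a B0s signal and a B0 mode
                               (1.0 if both are the same b-hadron species)
    bf_norm                  : branching fraction of the normalisation mode,
                               including any sub-decay such as J/psi -> mu mu
    """
    return (bf_norm * (eff_norm / eff_sig)
            * frag_ratio_norm_over_sig * (n_sig / n_norm))


# Purely illustrative inputs, not measured values.
example = signal_branching_fraction(
    n_sig=100.0, n_norm=1.0e6,
    eff_sig=0.02, eff_norm=0.01,
    frag_ratio_norm_over_sig=4.0, bf_norm=6.0e-5,
)
```

Because only ratios of yields and efficiencies enter, uncertainties common to signal and normalisation mode largely cancel, while the fragmentation fraction ratio and the normalisation branching fraction enter as external inputs.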
Dominant background processes mimicking the signal arise, on the one hand, from random combinations of two muons from two different b-hadron decays in the same event. On the other hand, they can come from b-hadron decays where one or more final-state particles have been wrongly identified as a muon. Furthermore, b-hadron decays where part of the decay products have not been reconstructed can constitute a background. The selection of the signal decays is largely inherited from previous analyses of a subset of the data [36] and specifically targets the selection of B 0 (s) → µ + µ − decays over the aforementioned backgrounds, whereas the measurement of B 0 s → µ + µ − γ is a byproduct of the analysis. The LHCb detector, as used to collect the above-mentioned data, employed a two-stage online selection. Firstly, events are selected by a hardware trigger that requires at least one muon with a high transverse momentum. Secondly, a two-stage software trigger is applied, which performs a full event reconstruction. In the software trigger, events fulfilling minimum requirements on the muon momentum and its impact parameter are kept. Events in which these requirements are met by non-signal candidates are also kept to maximise the signal efficiency.
In the offline selection, candidate B 0 (s) → µ + µ − decays are selected by combining two well-reconstructed, oppositely charged particles identified as muons [37] with a transverse momentum in the range 0.25 < p T < 40 GeV/c. The common vertex is required to have a good vertex fit quality and to be clearly separated from the associated pp-collision vertex. The resulting B 0 (s) candidate is required to have a transverse momentum greater than 0.5 GeV/c. Candidates in the fully instrumented pseudorapidity region 2 < η < 5 are retained for analysis. A preliminary selection based on a Boosted Decision Tree (BDT) is applied to remove a large fraction of combinatorial background while maintaining a high signal efficiency. The BDT is trained with variables related to the decay topology of two particles originating from a vertex displaced with respect to the primary vertex. A highly efficient veto on combinations of a signal muon with another particle in the event identified as a muon that result in a dimuon mass close to the J/ψ mass effectively removes B + c → J/ψ µ + ν decays. A selection on a combination of particle identification algorithms is performed and tuned to maximise the …

The final selection is performed on a second BDT, called s-BDT in the following. Apart from variables related to the decay topology, this s-BDT notably includes isolation classifiers, specifically developed for this analysis, that inspect the closeness of the signal muon tracks to other tracks in the event that are either reconstructed in all tracking detector stations or only in the detector closest to the collision region. The B 0 (s) → µ + µ − yields are measured by fitting the dimuon invariant mass distribution in bins of this final selection s-BDT, discarding only the lowest bin (which corresponds to about 25 % of the signal) in order to maximise the signal sensitivity. The samples of B 0 → K + π − and B + → J/ψ K + are selected in a similar way, except for the trigger and particle identification criteria for the B 0 → K + π − mode and the removal of the J/ψ veto. For B 0 → K + π − , the muon identification criteria are replaced by hadron identification and a trigger selection independent of the candidate is required to achieve an unbiased selection.

Measurement of the branching fractions of B 0 (s) → µ + µ − and B 0 s → µ + µ − γ
In order to achieve unbiased branching fraction estimates, efficiencies are calculated either on corrected simulation or directly on data. Importantly, the fractions of the s-BDT bins are determined from B 0 (s) → µ + µ − simulation, where the B 0 (s) quantities and the number of tracks in the event are reweighted from data-simulation comparisons in high-statistics B + → J/ψ K + and B 0 s → J/ψ ϕ samples. The resulting corrected s-BDT fractions are then independently cross-checked with B 0 → K + π − data samples, corrected for the different trigger and particle identification response. Measuring the branching fractions relative to two modes, B 0 → K + π − and B + → J/ψ K + , allows for a stringent cross-check of the efficiencies by calculating the ratio between the estimated branching fractions of the two and comparing it to the ratio of the published branching fractions [38]. An excellent agreement is found.
The invariant mass shape of signal B 0 (s) → µ + µ − decays is described with two-sided Crystal Ball functions [39], where the mean of the Gaussian core is calibrated from B 0 s → K + K − and B 0 → K + π − data samples. The mass resolution of about 22 MeV/c 2 is determined from the interpolation of the measured resolutions of charmonium and bottomonium resonances. The tail parameters are estimated from simulation. Small differences in the resolution and the tail parameters are found to appear across the s-BDT bins and are accounted for in the final fit.
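As an illustration of the signal mass model, a two-sided Crystal Ball shape (Gaussian core with a power-law tail on each side) can be written as below. This is a minimal, unnormalised sketch; the tail parameters and the mean used in the example are hypothetical, and only the resolution of about 22 MeV/c 2 is taken from the text.

```python
import math


def two_sided_crystal_ball(m, mean, sigma, alpha_l, n_l, alpha_r, n_r):
    """Unnormalised two-sided Crystal Ball shape, equal to 1 at m == mean.

    alpha_l, alpha_r : positive transition points (in units of sigma)
    n_l, n_r         : power-law exponents of the left and right tails
    """
    z = (m - mean) / sigma
    if z < -alpha_l:
        # Left power-law tail, matched to the Gaussian core at z = -alpha_l.
        a = (n_l / alpha_l) ** n_l * math.exp(-0.5 * alpha_l ** 2)
        b = n_l / alpha_l - alpha_l
        return a * (b - z) ** (-n_l)
    if z > alpha_r:
        # Right power-law tail, matched to the Gaussian core at z = +alpha_r.
        a = (n_r / alpha_r) ** n_r * math.exp(-0.5 * alpha_r ** 2)
        b = n_r / alpha_r - alpha_r
        return a * (b + z) ** (-n_r)
    return math.exp(-0.5 * z * z)


# Hypothetical parameters: mean in MeV/c^2, sigma = 22 MeV/c^2 as quoted.
shape_at_peak = two_sided_crystal_ball(5366.9, 5366.9, 22.0, 1.5, 2.0, 2.0, 3.0)
```

The tails are constructed so that both the function and its first derivative are continuous at the transition points, which is what allows the core resolution and the tail parameters to be calibrated from different samples.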
Exclusive background decays remaining in the fully selected samples have been carefully studied with simulation, calibrated on data. A strong focus of the most recent analysis is the correct estimation of the misidentification of charged hadrons as muons. Decays of the form B 0 (s) → h + h (′)− (h = K, π) with both charged hadrons misidentified create a peaking structure very close to the B 0 → µ + µ − peak and therefore form the most relevant remaining background component. Misidentification occurs in the detector dominantly because the hadrons decay in flight into muons. The hadron misidentification rate is estimated with a dedicated procedure using D 0 → K − π + from D * + → D 0 π + decays from simulation and data. This procedure takes explicitly into account that the D 0 → K − π + invariant mass shape deforms significantly when hadrons decay in flight. As an additional cross-check, the misidentification rate is investigated with B 0 → K + π − data samples by determining the B 0 → K + π − yield in the π µ, K µ and π K mass distributions.
A summary of the final mass fit to obtain the signal branching fractions is displayed in Fig. 6. A precise measurement of the B 0 s → µ + µ − branching fraction is obtained to be where the first uncertainty is statistical and the second systematic. The systematic uncertainties are dominated by the knowledge of the ratio of fragmentation fractions of B 0 s and B 0 mesons, which enters the normalisation equation because the decay is measured relative to B 0 and B + decays. The B 0 → µ + µ − and B 0 s → µ + µ − γ decays are not observed and consequently upper limits on their branching fractions are set to at 95 % CL, respectively. Similarly, an upper limit on the branching fraction ratio R was determined at 95 % CL to
These values include systematic uncertainties, which are dominated by the knowledge of the background components that include misidentified hadrons. A correlation of 11 % is observed between the measurements of the B 0 → µ + µ − and B 0 s → µ + µ − components.

Figure 6. Left: B 0 (s) → µ + µ − candidates (black dots) with s-BDT > 0.5. The result of the fit is overlaid and the different components are detailed: …, and B 0(+) → π 0(+) µ + µ − (cyan dashed line). The solid bands around the signal shapes represent the variation of the branching fractions by their total uncertainty. Right: two-dimensional profile likelihood of the branching fractions for the B 0 (s) → µ + µ − decays. The measured central values of the branching fractions are indicated with a blue dot. The profile likelihood contours for 68 %, 95 % and 99 % CL regions of the result are shown as blue contours, while the brown contours indicate the previous measurement [36] and the red cross shows the SM prediction. Figures from Ref. [35].
4.2. Measurement of the effective lifetime of the B 0 s → µ + µ − decay
The effective lifetime of the B 0 s → µ + µ − decay has been measured on the same sample with a slightly different selection. Since there is effectively no background from hadron-muon misidentification in the B 0 s → µ + µ − mass peak region, the dimuon mass window is adapted to exclude these backgrounds and the particle identification requirements are loosened to increase the signal yield. The trigger requirements must be met either by the signal candidate itself or by the remainder of the event, which facilitates the modelling of the acceptance. Furthermore, the data are analysed in only two bins of the final selection s-BDT, chosen to maximise the sensitivity to the effective lifetime. The mass distributions in each s-BDT region are fitted independently to extract background-subtracted decay time distributions with the sPlot technique [27]. A simultaneous fit to the two background-subtracted decay-time distributions, as shown in Fig. 7, is employed to extract the effective lifetime.

In order to extract an unbiased lifetime measurement, the acceptance effects of the reconstruction and selection requirements have to be modelled. The decay time acceptance is modelled by fitting parametric functions to the efficiency distribution in simulation, where the simulation has been weighted to reduce data-simulation differences. The procedure is validated by measuring the lifetimes of B 0 s → K + K − and B 0 → K + π − in data, finding good agreement with the world average values [38]. The uncertainty of the measurement of the B 0 s → K + K − lifetime is taken as a systematic uncertainty. Further systematic effects, such as the sample contamination from B 0 → µ + µ − and B 0 (s) → h + h − decays, acceptance modelling, uncertainties in the background decay time distributions and B 0 s -B̄ 0 s production asymmetries, are investigated and found to have only sub-leading to negligible effects. The measured effective lifetime is found to be [35] where the first uncertainty is statistical and the second systematic. This value is outside the lifetime interval defined by the B 0 s light (A ∆Γ = −1) and heavy (A ∆Γ = 1) mass eigenstates, but is consistent with these values at the level of 2.2 and 1.5 standard deviations, respectively.

found to be negligible. Additionally, the dependence of f d / f s on the transverse momentum is checked and is found to be consistent within the assigned uncertainties.
The profiled likelihood for each experiment is then modeled with a two-dimensional variable-width Gaussian, which describes asymmetric likelihoods (and asymmetric uncertainties) as well as the correlation between the two branching fractions. This analytical function is found to be consistent with the original likelihood for each experiment. The log-likelihoods from the three measurements are summed across the B(B 0 s → µ + µ − ) -B(B 0 → µ + µ − ) grid points and then fitted using the variable-width Gaussians. By maximizing the modeled likelihood function the combined branching fractions and the associated uncertainties are derived: The upper limit on B(B 0 → µ + µ − ) is evaluated as < 1.6(1.9)× 10 −10 at 90% (95%) CL, which is obtained under the hypothesis of a positive B(B 0 → µ + µ − ) by renormalising the likelihood in the region of interest. The combined B(B 0 s → µ + µ − ) branching fraction is found to be lower than any single result, which is due to the strong anti-correlation between the two branching fractions. The individual profiled likelihoods (left) and the combined likelihood in the B(B 0 s → µ + µ − ) -B(B 0 → µ + µ − ) plane (right) are shown in Fig. 8. The compatibility with the SM predictions is estimated to be 2.4σ for B(B … These values are calculated assuming Wilks' theorem and with theoretical uncertainties included. In addition to the individual branching fractions, a combined estimate of the ratio of branching fractions R (see Eq. 2) is also derived: where the corresponding upper limit is evaluated to be R < 0.052 (0.060) at 90% (95%) CL.

The B 0 s → µ + µ − effective lifetime is measured in the latest analysis iteration by all three experiments, as reported in Sections 2-4. However, at the time when the combination was performed, only the CMS and LHCb Collaborations had a measurement of this quantity. Therefore, a combination has been carried out based only on their results, using a method similar to that of the B(B 0 (s) → µ + µ − ) combination. The LHCb analysis is carried out with a binned likelihood fit to the background-subtracted decay time distribution, while the CMS measurement is carried out with a two-dimensional likelihood fit to the decay time and dimuon invariant mass distributions. As the analyses are fully dominated by the statistical uncertainties, the combination is performed by describing the CMS and LHCb likelihoods (as a function of the effective lifetime τ B 0 s →µ + µ − ) with variable-width Gaussians; to determine their combined value, the two measurements are assumed to be uncorrelated. The resulting τ B 0 s →µ + µ − value and the corresponding uncertainty are: Both the CMS and LHCb Collaborations have recently released updated analyses, as discussed in Sections 3-4. Another iteration of the combination is foreseen in the near future, incorporating the results from all three experiments based on the full Run 2 LHC campaign data.
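The one-dimensional lifetime combination can be sketched with a variable-width Gaussian per measurement. The parameterisation below, in which the variance varies linearly with the parameter, is one common choice and is an assumption here, as the source does not state the exact functional form; the input values are purely illustrative and are not the CMS or LHCb results.

```python
def neg2_log_likelihood(x, central, sigma_plus, sigma_minus):
    """-2 ln L of a Gaussian whose variance varies linearly with x.

    Built so that -2 ln L = 1 at central + sigma_plus and central - sigma_minus.
    """
    variance = sigma_plus * sigma_minus + (sigma_plus - sigma_minus) * (x - central)
    if variance <= 0.0:
        return float("inf")  # outside the validity range of the model
    return (x - central) ** 2 / variance


def combine(measurements, lo, hi, steps=20000):
    """Scan the summed -2 ln L of uncorrelated measurements on a grid.

    measurements : iterable of (central, sigma_plus, sigma_minus)
    Returns (best value, upper uncertainty, lower uncertainty), where the
    uncertainties delimit the region with -2 Delta ln L <= 1.
    """
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    curve = [sum(neg2_log_likelihood(x, *m) for m in measurements) for x in grid]
    best = min(range(len(grid)), key=curve.__getitem__)
    inside = [x for x, c in zip(grid, curve) if c - curve[best] <= 1.0]
    return grid[best], max(inside) - grid[best], grid[best] - min(inside)


# Two hypothetical lifetime measurements in ps (illustrative numbers only).
value, err_up, err_down = combine([(1.9, 0.4, 0.3), (1.7, 0.3, 0.25)], 1.0, 3.0)
```

Summing the two curves without cross terms corresponds to the assumption stated in the text that the measurements are uncorrelated.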

Conclusion and prospects
In recent years the ATLAS, CMS and LHCb Collaborations have pushed towards precision measurements of the B 0 s → µ + µ − branching fraction, reaching relative uncertainties as low as 10 %. These measurements are the most precise to date. At the same time, all three Collaborations have begun measuring the effective lifetime of the decay to probe its CP structure. Contrary to the initial evidence in the first combination of CMS and LHCb measurements [42], the B 0 → µ + µ − decay has not been confirmed yet. All results are in good agreement with the SM, strongly constraining potential NP scenarios. To achieve even higher sensitivities, a community effort is ongoing to combine the results of all three experiments. The results of the previous combination have been presented in this review, but have been superseded by the legacy measurements of the CMS and LHCb Collaborations. Once the measurement of the ATLAS Collaboration with the full Run 2 data is published as well, this combination will be repeated to obtain the most precise picture possible from the Run 2 data.
After the LHC Run 2, in 2021 the experiments began to take data again with increased instantaneous luminosity until the end of 2025. After that, the High-Luminosity LHC phase will begin, which will bring increased pile-up conditions for all experiments and a massively increased total luminosity. The ATLAS and CMS experiments will substantially upgrade their detectors to cope with the increased pile-up conditions. In addition, they target a significant improvement of the dimuon mass resolution, by 20 % − 30 % (ATLAS) and 40 % − 50 % (CMS), respectively. The LHCb experiment is planning to follow with a major upgrade in 2031, to begin taking data with the LHC Run 5. By the end of the LHC lifetime, ATLAS and CMS aim to have collected 3000 fb −1 , while LHCb is estimating 300 fb −1 . Under these conditions and assuming the central values predicted by the Standard Model, the ATLAS, CMS and LHCb collaborations have extrapolated the expected sensitivity of future measurements [43,44]. For the ATLAS experiment, the sensitivity strongly depends on the trigger conditions for dimuon events with the upgraded detector. In the most conservative scenario the expected statistical-only (statistical and systematic) uncertainties reach 19 % (23 %) relative to the central value for the B 0 s → µ + µ − branching fraction and 134 % (135 %) for the B 0 → µ + µ − branching fraction, while in the most optimistic scenario they reach 5 % (13 %) for B 0 s → µ + µ − and 25 % (26 %) for B 0 → µ + µ − . The dominant systematic uncertainties in these projections arise from external inputs such as the uncertainty on the fragmentation fraction ratio f s / f d and the branching fraction uncertainty of the normalisation channel.
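To make the size of the systematic component in the ATLAS projections explicit, the short sketch below derives it from the quoted statistical-only and total uncertainties. It assumes that the total is the quadrature sum of a statistical and a systematic part, which the text does not state, and it reads the "(13)" of the optimistic B 0 s scenario as 13 %.

```python
import math


def implied_systematic(stat_percent, total_percent):
    """Systematic part implied by total^2 = stat^2 + syst^2 (assumption)."""
    return math.sqrt(total_percent ** 2 - stat_percent ** 2)


# (statistical-only, statistical and systematic) in percent, as quoted in the text.
atlas_projections = {
    "Bs conservative": (19.0, 23.0),
    "B0 conservative": (134.0, 135.0),
    "Bs optimistic": (5.0, 13.0),
    "B0 optimistic": (25.0, 26.0),
}

implied = {name: implied_systematic(stat, total)
           for name, (stat, total) in atlas_projections.items()}
```

Under this assumption the B 0 s → µ + µ − projection carries a systematic component of roughly 12 to 13 % in both scenarios, which illustrates why the optimistic scenario is limited by external inputs rather than by statistics.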
The CMS collaboration expects to reach uncertainties of 7 % on the branching fraction of B 0 s → µ + µ − and 16 % on the branching fraction of B 0 → µ + µ − .The expected uncertainty on the effective B 0 s → µ + µ − lifetime is 0.05 ps.This precision will allow stringent constraints on the parameter A µ + µ − ∆Γ and in particular break the degeneracy between possible scalar and pseudoscalar contributions beyond the SM to this decay.
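The link between the effective lifetime and the parameter A ∆Γ can be made concrete with the commonly quoted relation τ eff = τ Bs /(1 − y s ²) · (1 + 2 A ∆Γ y s + y s ²)/(1 + A ∆Γ y s ), with y s = τ Bs ∆Γ s /2. The sketch below assumes this relation and the sign convention used earlier in this review (A ∆Γ = +1 for the heavy eigenstate); the numerical inputs are placeholders, not world averages.

```python
def effective_lifetime(tau_bs, y_s, a_delta_gamma):
    """Effective lifetime for a given A_DeltaGamma (assumed standard relation)."""
    return (tau_bs / (1.0 - y_s ** 2)
            * (1.0 + 2.0 * a_delta_gamma * y_s + y_s ** 2)
            / (1.0 + a_delta_gamma * y_s))


def a_delta_gamma_from_lifetime(tau_bs, y_s, tau_eff):
    """Invert the relation above to obtain A_DeltaGamma from tau_eff."""
    r = tau_eff * (1.0 - y_s ** 2) / tau_bs
    return (1.0 + y_s ** 2 - r) / (y_s * (r - 2.0))


# Placeholder inputs: mean B0s lifetime in ps and relative width difference.
TAU_BS, Y_S = 1.5, 0.06
tau_heavy = effective_lifetime(TAU_BS, Y_S, +1.0)  # equals TAU_BS / (1 - Y_S)
tau_light = effective_lifetime(TAU_BS, Y_S, -1.0)  # equals TAU_BS / (1 + Y_S)
```

Since y s is small, the full range of A ∆Γ maps onto a lifetime interval of only about 2 y s τ Bs , which is why a precision of order 0.05 ps or better is needed before the measurement constrains A ∆Γ meaningfully.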
The LHCb collaboration expects to reach a statistical uncertainty on the B 0 s → µ + µ − branching fraction of 1.8 %; however, the analysis will be systematically limited by the external inputs of the fragmentation fraction ratios and the normalisation branching fractions, which are estimated to reach 4 % by then. In contrast, the ratio B(B 0 → µ + µ − )/B(B 0 s → µ + µ − ) is not expected to become systematically limited and is expected to reach a relative precision of 10 %. The measurement of the effective B 0 s → µ + µ − lifetime is expected to reach a precision of 0.033 ps. Both the CMS and LHCb Collaborations expect to establish the B 0 → µ + µ − decay signal at more than the 5σ level.
The expected large yield of B 0 s → µ + µ − decays will also give access to the CP parameter S µ + µ − , which describes the time-dependent CP violation in the decay [45]. Adding this parameter will complete the basis of CP observables and provide complementary constraints on physics beyond the SM that is not constrained by the other observables. A nonzero value of this parameter would be an immediate sign of a CP-violating phase beyond the SM. This parameter can only be determined by measuring the decay-time distributions of B 0 s and B̄ 0 s decays separately and thus requires the tagging of the B 0 s flavour. Assuming a flavour tagging performance similar to that of Run 2, the LHCb collaboration expects to reach a precision of 0.2 on this parameter. Provided a sufficient flavour tagging performance can be achieved, this analysis could potentially be performed by the CMS and ATLAS experiments as well.
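The dependence of such a tagged measurement on the flavour tagging performance can be illustrated with the usual definition of the tagging power, ε (1 − 2ω)², where ε is the tagging efficiency and ω the mistag fraction. The sketch below is generic and uses hypothetical numbers; it does not reproduce the LHCb projection.

```python
import math


def tagging_power(tag_efficiency, mistag_fraction):
    """Effective tagging efficiency: eps * (1 - 2 * omega)^2."""
    dilution = 1.0 - 2.0 * mistag_fraction
    return tag_efficiency * dilution ** 2


def effective_yield(n_signal, tag_efficiency, mistag_fraction):
    """Number of perfectly tagged decays with the same statistical power."""
    return n_signal * tagging_power(tag_efficiency, mistag_fraction)


def relative_uncertainty_scale(n_signal, tag_efficiency, mistag_fraction):
    """Statistical uncertainty scaling, proportional to 1 / sqrt(effective yield)."""
    return 1.0 / math.sqrt(effective_yield(n_signal, tag_efficiency, mistag_fraction))


# Hypothetical performance: 80 % of candidates tagged, 35 % of tags wrong.
power = tagging_power(0.8, 0.35)          # about 0.072
n_eff = effective_yield(1000.0, 0.8, 0.35)  # about 72 out of 1000 decays
```

With a tagging power of a few percent, only a small fraction of the signal yield contributes effectively, which is why S µ + µ − becomes accessible only with the very large samples expected at the end of the LHC programme.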
To achieve the projected sensitivities discussed in this section, and possibly surpass them, it will be important to maintain the basic assumptions. For the ATLAS and CMS experiments it will be crucial to design trigger strategies that keep the muon transverse momentum thresholds as low as possible in the high pile-up environment. Furthermore, the level of backgrounds from random combinations must be maintained or decreased, which might be achieved through the tracking detectors, the use of fast timing information in the reconstruction and the improvement of current selection algorithms based on Machine Learning tools. Fast timing information to disentangle pp-collision points will also facilitate the analysis of LHCb data and enable the flavour tagging of the B 0 s mesons. Further improvements over the projected sensitivities in this section, especially for B 0 → µ + µ − measurements, might be achieved by improvements in the muon identification and the momentum resolution, which will have a significant impact on the dimuon mass resolution.

Figure 1. (Left): Dimuon invariant mass distributions in data, for the bin with the highest scores of BDT output. Superimposed is the result of the maximum-likelihood fit. The total fit is shown as a continuous line, with the dashed lines corresponding to the observed signal component, the b → µ + µ − X background, and the continuum background. The signal components are grouped in one single curve, including both the B 0 s → µ + µ − and the (negative) B 0 → µ + µ − component. The curve representing the peaking B 0 (s) → hh ′ background lies very close to the horizontal axis. (Right): Likelihood contours for the combination of the Run 1 and 2015-2016 Run 2 results (shaded areas). The contours are obtained from the combined likelihoods of the two analyses, for values of −2∆ln(L) equal to 2.3, 6.2 and 11.8. The empty contours represent the result from 2015-2016 Run 2 data alone. The SM prediction with uncertainties is also indicated. Figures from Ref. [19].
√(S + B) figure-of-merit.

The B 0 s → µ + µ − effective lifetime is measured using a binned χ 2 fit to the proper decay time distribution of the B 0 s → µ + µ − signal component after the subtraction of the background. The proper decay time t µ + µ − is defined as

$$t_{\mu^+\mu^-} = \frac{L_{xy}\, m^{\mathrm{PDG}}_{B^0_s}}{p_{\mathrm{T}}^{B^0_s}},$$

where L xy is the decay length projected along the reconstructed B 0 s momentum in the transverse plane, m PDG B 0 s the world-average mass of B 0 s mesons from Ref. [26] and p B 0 s T the magnitude of the candidate's reconstructed transverse momentum. To extract τ µµ , three main steps have been completed:
• A fit to the dimuon invariant mass, in the same range as for the BR analysis
• The extraction of the t µ + µ − distribution of the B 0 s → µ + µ − component using the sPlot technique [27]
• A binned χ 2 fit to the t µ + µ − distribution comparing Monte-Carlo simulated effective lifetime templates corresponding to different values of τ µµ .
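The last step above, a binned χ 2 comparison against lifetime templates, can be sketched as follows. The analysis uses Monte-Carlo simulated templates that include resolution and acceptance effects; this sketch replaces them with analytic exponential templates, which is a simplifying assumption, and all names and binning choices are hypothetical. The per-bin uncertainty is taken as the Poisson fluctuation of the template prediction.

```python
import math


def exponential_template(bin_edges, tau, total):
    """Expected yield per bin for a pure exponential with lifetime tau.

    The template is normalised to `total` within the histogram range.
    """
    lo, hi = bin_edges[0], bin_edges[-1]
    norm = math.exp(-lo / tau) - math.exp(-hi / tau)
    return [total * (math.exp(-a / tau) - math.exp(-b / tau)) / norm
            for a, b in zip(bin_edges[:-1], bin_edges[1:])]


def chi2(observed, expected):
    """Chi-square with Poisson variance taken from the template prediction."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0.0)


def scan_lifetime(observed, bin_edges, tau_values):
    """Return the template lifetime with the smallest chi-square and that value."""
    total = sum(observed)
    scores = [chi2(observed, exponential_template(bin_edges, tau, total))
              for tau in tau_values]
    best = min(range(len(tau_values)), key=scores.__getitem__)
    return tau_values[best], scores[best]


# Hypothetical binning in ps and a grid of template lifetimes.
edges = [0.0, 1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 12.0]
taus = [1.0 + 0.1 * i for i in range(13)]
```

For background-subtracted (sWeighted) data the observed bin contents would be sums of weights rather than raw counts, while the scan itself is unchanged.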

Figure 2. (Left): Signal proper decay time distribution extracted with the sPlot background subtraction procedure applied to the B 0 s → µ + µ − invariant mass fit. The superimposed signal MC template is the result of the lifetime fit procedure discussed in the text. The uncertainties on the data points are calculated as Poisson fluctuations on the MC yield prediction (continuous red histogram) in the corresponding bin. (Right): 68% and 95% CL bands obtained with a Neyman construction based on MC pseudo-experiments for the signal and background components. The yellow lines interpolate the band boundaries in order to smooth the effects of the limited number of MC pseudo-experiments used. The dashed-dotted blue line corresponds to the average expected τ Obs µµ value at a given τ µµ value used to generate MC pseudo-experiments. The horizontal dashed black line corresponds to the experimentally observed value of τ Obs µµ = 0.99 ps, yielding a 68% CL band of [0.92, 1.41] ps (thick vertical dashed purple lines) and a 95% CL band of [0.77, 1.73] ps (thin vertical dashed purple lines). The same construction at the τ Obs µµ corresponding to τ µµ = 1.624 ps (the SM prediction) yields [1.44, 2.26] ps as 68% CL band. Figures from Ref. [25].

Figure 3. The dimuon invariant mass distributions for the candidates in the d MVA > 0.99 (left) and 0.99 > d MVA > 0.90 (right) categories. The solid blue curves are the projections of the fit model, while the individual components of the fit are also presented. Figures from Ref. [28].

Figure 5. The proper decay time distribution for the candidates in the region of 5.28 < m µ + µ − < 5.48 GeV, with the result of the fit superimposed. The solid blue curve is the sum of all fit components, while the shaded areas are the background components. Figure from Ref. [28].

Figure 7. The background-subtracted decay-time distributions with the fit model used to determine the B 0 s → µ + µ − effective lifetime superimposed. The distributions in the low and high BDT regions are shown in the left and right plot, respectively. Figures from Ref. [35].

Figure 8. Left plot: the two-dimensional likelihood contours for the B 0 (s) → µ + µ − decays from the ATLAS (red dashed line), CMS (green dot-dashed line), and LHCb (blue long-dashed line) experiments, together with contours for their combination (continuous line). The likelihood contours correspond to the values of −2∆ ln L = 2.3, 6.2, and 11.8, respectively. Right plot: the likelihood contours for the combination of the three results, corresponding to the values of −2∆ ln L = 2.3, 6.2, 11.8, 19.3, and 30.2, or 1 to 5 σ levels in a bidimensional Gaussian approximation. Figures from Ref. [40].