Searches for Lepton Flavor Violation in Tau Decays at Belle II

Lepton flavor violation in tau decays would be an unambiguous signature of new physics. Branching ratios of tau leptons at the level of $10^{-10}$ to $10^{-9}$ can be probed with the 50 ab$^{-1}$ of electron-positron annihilation data being collected by the Belle II experiment at the world's highest-luminosity accelerator, SuperKEKB, located at the High Energy Accelerator Research Organization (KEK) in Tsukuba, Japan. Searches with this expected sensitivity will either discover new physics or strongly constrain several new physics models.


Introduction
Lepton flavor conservation stands out in the Standard Model (SM) among all other symmetries because it is not associated with any underlying conserved current. Lepton flavor violation (LFV) in the charged sector is predicted by many new physics (NP) models. The small but finite mass of the neutrinos in the SM allows charged LFV in neutrino-less two-body decays, e.g., $\tau^- \to \ell_i^- \gamma$ decays (charge conjugate modes are implied throughout the text, unless otherwise specified), where $\ell_i^-$ ($i = 1, 2$) denotes the light charged leptons ($e^-$, $\mu^-$). However, such decays are suppressed by a factor of $(m_\nu^2/m_W^2)^2$ [1], which produces experimentally unreachable rates of the order of $10^{-54}$. For neutrino-less three-body decays, e.g., $\tau^- \to \ell_i^- \ell_j^+ \ell_a^-$ decays (where $a = i$ or $j$, and $i$ may or may not be equal to $j$), two conflicting predictions existed in the literature: one of the order of $10^{-55}$ [2] and another of the order of $10^{-14}$ [3]. Recently, the contributions due to finite neutrino masses for such decays were re-scrutinized and found to be in the range of [$10^{-56}$, $10^{-54}$] [4,5], thereby laying to rest the claim that such decays could be of the order of $10^{-14}$ in the SM. Thus, any observation of charged LFV is an unambiguous signature of NP.
More exotic decay modes, such as $\tau^- \to \pi^+ \ell^- \ell^-$ and $K^+ \ell^- \ell^-$, accompanied by a violation of the lepton number (LNV), are predicted at the level of $10^{-10}$ to $10^{-8}$ in several scenarios beyond the SM [30]. Several of these decay modes are expected to have branching ratios close to existing experimental limits in NP models, e.g., heavy Dirac neutrinos [31,32], supersymmetric processes [25,33], and flavor-changing $Z$ exchanges with non-universal couplings [34]. Wrong-sign $\tau^- \to \ell_i^+ \ell_j^- \ell_j^-$ decays are very intriguing because they are expected at rates only one order of magnitude below the present bounds in some NP models, e.g., the Littlest Higgs model with T-parity realizing an inverse seesaw [35].
Most models of baryogenesis, the hypothetical process in the early universe that produced the observed matter-antimatter asymmetry, require baryon number violation (BNV), which in charged lepton decays automatically implies LNV and LFV [36]. Angular momentum conservation requires the change in the difference between the net baryon number ($B$) and lepton number ($L$) to be equal to either 0 or 2. Although the SM conserves this difference, the symmetry group for the sum of baryon number and lepton number can be associated with an anomalous current. A set of models predicts baryogenesis that conserves $B-L$ but includes instanton-induced $B+L$ violating currents [37]. In a large class of models [38], BNV in $\tau$ decay modes containing baryons in the final state, for example, $\tau^- \to \pi^-\Lambda$, $\pi^-\bar{\Lambda}$, $K^-\Lambda$, $K^-\bar{\Lambda}$, $\bar{p}\gamma$, $\bar{p}\ell_i\bar{\ell}_j$ and $p\ell_i\ell_j$, are predicted at observable rates in the large $\tau$ data set that the Belle II detector will record over the coming years.

Belle II Experiment at SuperKEKB
The most restrictive limits on LFV in $\tau$ decays, at the level of $10^{-8}$, have been obtained by the first generation of B-Factory experiments, Belle and BABAR, where a large sample of $\tau$ leptons was produced thanks to the large and similar production cross-sections of $B$-meson and $\tau$-pairs around the $\Upsilon(4S)$ resonance, at the level of a nanobarn (nb) [39]. The Belle and BABAR experiments collected approximately one inverse attobarn (ab$^{-1}$) and half an ab$^{-1}$ of $e^-e^+$ annihilation data, respectively. The next-generation B-Factory experiment, Belle II, is expected to collect 50 ab$^{-1}$ of data over the next decade [40]. Such a huge data sample, corresponding to $10^{11}$ single $\tau$ decays, would lower the limits on LFV in $\tau$ decays by one or two orders of magnitude.

Luminosity Upgrade of SuperKEKB
The asymmetric beam energy $e^-e^+$ collider, SuperKEKB, is an upgrade of the KEKB accelerator facility in Tsukuba, Japan, and has a circumference of about 3 km. The main components of the SuperKEKB collider complex are a 7 GeV electron ring known as the high-energy ring (HER), a 4 GeV positron ring known as the low-energy ring (LER), and an injector linear accelerator with a 1.1 GeV positron damping ring [41]. The HER and the LER have four straight sections named Tsukuba, Oho, Fuji, and Nikko, with the interaction point in the straight section of Tsukuba, where the Belle II detector is located.
The target integrated luminosity of 50 ab$^{-1}$ to be collected by the Belle II experiment will be achieved by increasing the instantaneous luminosity by a factor of 30. Two major upgrades account for this increase: a modest two-fold increase in the beam currents, and a fifteen-fold reduction of the vertical beta function at the interaction point ($\beta_y^*$) from 5.9 mm to 0.4 mm, according to the "nano-beam" scheme described below.
Compared to KEKB, the asymmetry between the beam energies of the HER/LER beams was reduced from 8.0/3.5 GeV to 7.0/4.0 GeV, which reduces the beam loss due to Touschek scattering. This also improves the solid-angle acceptance of the experiment, which helps to analyze events with large missing energy. Additionally, the effects of synchrotron radiation resulting from higher currents are mitigated. Since synchrotron radiation is proportional to the product of the beam current and the fourth power of the beam energy, the HER at SuperKEKB emits $(7/8)^4 \approx 59\%$ as much synchrotron radiation per unit of beam current as at KEKB [42]. This allows the SuperKEKB collider to operate at a beam current twice that of KEKB.
The very high luminosity environment of SuperKEKB required significant upgrades to provide injection beams with high current and low emittance. The upgraded accelerator complex houses a new electron-injection gun and a new target for positron production. A new damping ring was installed for injection of the positron beam with low emittance, as well as for improving the simultaneous top-up injections needed for the high luminosity upgrade. The upgrade also features completely redesigned lattices for the LER and HER, replacement of short dipoles with longer ones in the LER, a new titanium nitride coated beam pipe with antechambers to suppress the electron-cloud effect, a modified RF system, and a completely redesigned interaction region [43].
The design of the beam parameters at SuperKEKB [44] follows the "nano-beam" and the "crab-waist" schemes, which were originally proposed for the SuperB-Factory in Italy [45]. Accordingly, the transverse sizes of the beam bunches in the horizontal plane ($\sigma_x$) are squeezed to very small values, and the bunches are made to collide at a larger horizontal crossing angle of $2\phi_x = 83$ mrad at Belle II, instead of 22 mrad at Belle. Thus, the effective size of the overlap region ($\sigma_z^{\rm eff}$) is much shorter than what it would have been in the case of a normal head-on collision, which is given by the longitudinal size of the beam bunches ($\sigma_z$) [46].
The vertical beta function at the interaction point (IP) is constrained due to the hour-glass effect as:
$$\beta_y^* \gtrsim \frac{\sigma_z}{\Phi},$$
where $\Phi$ is the large Piwinski angle. With $\sigma_z$ of the order of 6 mm, $\beta_y^*$ is thus squeezed down to about 400 µm, which is much shorter than the real bunch length. In addition to increasing the luminosity, a reduction of the interaction region of the colliding beams restricts the vertex position along the beam axis, thus providing the additional benefit of a more precise estimate of the primary vertex, which helps in the reconstruction of the complete event topology during physics analysis.
The instantaneous luminosity in an $e^-e^+$ collider is
$$L = \frac{N_{e^-} N_{e^+} f}{4\pi \sigma_x \sigma_y}\, R_L,$$
where $N_{e^-}$ is the number of electrons per bunch, $N_{e^+}$ is the number of positrons per bunch, $f$ is the collision frequency of the bunches, $\sigma_x$ and $\sigma_y$ are the transverse beam-profile sizes at the IP, and $R_L$ is the luminosity-reduction factor (of the order of unity) due to the finite beam-crossing angle. In terms of the beam currents $I_\pm = N_{e^\pm} e f$, the luminosity becomes
$$L = \frac{I_+ I_-}{4\pi e^2 f \sigma_x \sigma_y}\, R_L,$$
where $e$ is the charge of the electron. Each beam affects the stability of the other, which can be characterized by the beam-beam tune shift parameters given by
$$\xi_{(x,y)\pm} = \frac{r_e N_{e^\mp} \beta^*_{(x,y)}}{2\pi \gamma_\pm \sigma_{(x,y)} (\sigma_x + \sigma_y)}\, R_{\xi_{(x,y)}},$$
where $r_e$ is the classical radius of the electron, $\gamma_\pm$ is the relativistic gamma factor of the $e^\pm$ beams, $\beta^*_{(x,y)}$ are the horizontal and vertical beta functions at the IP, and $R_{\xi_{(x,y)}}$ is the geometric reduction factor (also of the order of unity) due to the hour-glass effect. Putting all these factors together, we arrive at the following expression for the instantaneous luminosity:
$$L = \frac{\gamma_\pm}{2 e r_e} \left(1 + \frac{\sigma_y}{\sigma_x}\right) \frac{I_\pm\, \xi_{y\pm}}{\beta_y^*}\, \frac{R_L}{R_{\xi_y}},$$
where the design parameters for the beam-beam tune shifts are $\xi_{(x,y)} = (0.0012, 0.0807)$ for the HER and $(0.0028, 0.0881)$ for the LER [44]. The horizontal/vertical beam sizes at the IP are reduced from $\sigma_x/\sigma_y$ = 170 µm/940 nm for the HER and 147 µm/940 nm for the LER at Belle to 10.7 µm/62 nm for the HER and 10.1 µm/48 nm for the LER at Belle II [41]. The beam currents for Belle were 1.19 A and 1.64 A for the HER and LER, respectively, compared to the design values of 2.6 A and 3.6 A for the HER and LER, respectively, at Belle II [41]. This allows one to improve the instantaneous luminosity from $L = 2.1 \times 10^{34}$ cm$^{-2}$ s$^{-1}$ at Belle to $6.5 \times 10^{35}$ cm$^{-2}$ s$^{-1}$ at Belle II [40].
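As a rough numerical cross-check of the design luminosity, the sketch below evaluates the commonly used beam-beam-limited expression $L = \frac{\gamma_\pm}{2 e r_e}\left(1 + \sigma_y/\sigma_x\right)\frac{I_\pm \xi_{y\pm}}{\beta_y^*}\frac{R_L}{R_{\xi_y}}$ for the LER design values quoted in the text. The function name, the choice $R_L/R_{\xi_y} = 1$, and the use of $\beta_y^* = 0.4$ mm are assumptions made purely for illustration.

```python
# Physical constants, expressed so that the result comes out in cm^-2 s^-1.
E_CHARGE_C = 1.602176634e-19   # elementary charge [C]
R_E_CM = 2.8179403262e-13      # classical electron radius [cm]
M_E_GEV = 0.51099895e-3        # electron mass [GeV]


def luminosity(beam_energy_gev, current_a, xi_y, beta_y_cm,
               sigma_x_cm, sigma_y_cm, reduction_ratio=1.0):
    """Beam-beam-limited luminosity estimate in cm^-2 s^-1.

    reduction_ratio stands for R_L / R_xi_y. It defaults to 1 as an
    assumption, since both factors are only stated to be of order unity.
    """
    gamma = beam_energy_gev / M_E_GEV
    return (gamma / (2.0 * E_CHARGE_C * R_E_CM)
            * (1.0 + sigma_y_cm / sigma_x_cm)
            * current_a * xi_y / beta_y_cm
            * reduction_ratio)


# LER design values from the text: 4 GeV, 3.6 A, xi_y = 0.0881,
# beta_y* ~ 0.4 mm, sigma_x / sigma_y = 10.1 um / 48 nm (converted to cm).
lumi_ler = luminosity(4.0, 3.6, 0.0881, 0.04, 10.1e-4, 48e-7)
print(f"LER-based estimate: {lumi_ler:.2e} cm^-2 s^-1")
```

With these inputs the estimate lands at several times $10^{35}$ cm$^{-2}$ s$^{-1}$, the same order as the quoted design value; the exact number depends on the reduction factors, which are not specified here.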

Detector Upgrade of Belle II
From the IP outward, the main components of the Belle II detector are the vertexing and tracking detectors, the particle identification systems, the calorimeter and the muon chambers, as shown in Figure 1 [47]. The tracking detectors consist of an inner silicon PiXel Detector (PXD), a Silicon Vertex Detector (SVD) and a Central Drift Chamber (CDC). Two dedicated particle identification systems are the Time-Of-Propagation (TOP) detector in the barrel region and the Aerogel Ring-Imaging CHerenkov detector (ARICH) in the forward endcap. These are surrounded by an Electromagnetic CaLorimeter (ECL) and a superconducting solenoid providing a homogeneous magnetic field of 1.5 T. The $K_L^0$ and Muon detector (KLM) is the largest and outermost part of the Belle II detector. Some upgrades of the Belle II detector [47,48] over the Belle detector [49] are:
• Vertexing: In Belle, the beam pipe was at a radius of 15 mm [50], the innermost layer of a 4-layer silicon vertex detector [51] was at 20 mm and the outermost layer of the vertex detector was at a radius of 88 mm. In Belle II, the beam pipe is at 10 mm, the inner two layers of the PXD, consisting of silicon pixels, are closer to the IP at 14 mm and 22 mm, respectively, and the outermost of the four layers of the SVD, consisting of silicon strips, extends to a larger radius of 140 mm. The PXD is based on the Depleted Field Effect Transistor (DEPFET) technology, which allows for thin sensors with 50 µm thickness. The readout of the new silicon strip detector is based on the APV25 chip, which has a much shorter shaping time than the VA1TA chip-based readout used at Belle, in order to accommodate the higher background rates in Belle II. As a result of these upgrades, considerably better performance is expected in Belle II than in Belle. For example, the vertex resolution at Belle II is improved by the excellent spatial resolution of the two innermost pixel detector layers.
• Tracking: The large volume CDC at Belle II, with 56 layers organized in 9 super-layers, has smaller drift cells than in Belle. The CDC starts just outside the expanded silicon strip detector, and extends to a larger radius of 1130 mm in Belle II as compared to 880 mm in Belle. The measured spatial resolution of the CDC is about 100 µm, while the relative precision of the dE/dx measurement for particles with an incident angle of 90° is around 12%. The angular resolution achieved between tracks is ∼4.5 mrad. The efficiency to reconstruct $K_S^0 \to \pi^-\pi^+$ decays in Belle II is also improved because the silicon strip detector occupies a larger volume.
• Particle Identification: Belle II has two completely new, more compact particle identification devices of the Cherenkov imaging type: TOP in the barrel and ARICH in the endcap regions. Both detectors are equipped with very fast read-out electronics, leading to very good kaon versus pion separation in the kinematic limits of the experiment. The two Cherenkov detectors are designed to differentiate between $K$ and $\pi$ particles over the entire momentum range, and also to differentiate among $\pi$, $\mu$, and $e$ below 1 GeV/c.

• Calorimetry: The ECL is made of CsI(Tl) scintillation crystals of size 6 cm × 6 cm each with high light output, a short radiation length, and good mechanical properties, covering the range of 12° < θ < 155° in the polar angle, i.e., 90% of solid angle coverage in the center-of-mass system. The ECL is divided into two parts: the barrel and the endcap. While the barrel part consists of 6624 crystals, the endcap part consists of 2112 crystals. The new electronics of the ECL are of the wave-form-sampling type, which has particular relevance in missing-energy studies by considerably reducing the noise due to pile-up. The ECL is able to detect neutral particles in a wide energy range, from 20 MeV up to 4 GeV, with a high resolution of $\sigma_E/E = 4\%$ at 100 MeV, and an angular resolution of 13 mrad (3 mrad) at low (high) energies. This gives a mass resolution for reconstructing $\pi^0 \to \gamma\gamma$ of about 4.5 MeV/c$^2$ [52].
• $K_L^0$ and Muon Detection: The $K_L^0$ and muon detector (KLM) at Belle was based on glass-electrode resistive plate chambers (RPC). Since larger backgrounds are expected in the high luminosity environment at Belle II, the upgraded KLM system consists of RPCs only in some parts of the barrel. The two innermost layers in the barrel and the entire endcap section of the KLM at Belle II consist of layers of scintillator strips with wavelength shifting fibers, read out by silicon photomultipliers (SiPMs) as light sensors [53]. Although the high neutron background can cause damage to the SiPMs, the upgraded KLM has been demonstrated to operate reliably during irradiation tests by appropriately setting the discrimination thresholds.

DAQ Upgrade of Belle II
The new data acquisition (DAQ) system [54] meets the requirements of considerably higher event rates at Belle II. It consists of a Level One (L1) trigger [55] and a High Level Trigger (HLT). The L1 trigger has a latency of 5 µs and a maximum trigger output rate of 30 kHz, limited by the read-in rate of the DAQ. The HLT must suppress online event rates to 10 kHz for offline storage using complete reconstruction with all available information from the entire detector. To enable readout from high-speed data transmission, a peripheral component interconnect express based readout module (PCIe40) with high data throughput of up to 100 Gigabytes/s was adopted for the upgrade of the Belle II DAQ system [56]. The trigger system at Belle II achieves almost 100% trigger efficiency for $\Upsilon(4S) \to B\bar{B}$ events and high efficiency for other physics processes of interest, e.g., $\tau$-pair events.

Event Topology
B-Factories typically operate at center-of-mass energies around the $\Upsilon(4S)$ resonance, i.e., 10.58 GeV. Tau-pair production via $e^-e^+$ annihilation in this energy regime leads to a cleanly separated event topology associated with the decay of each $\tau$ lepton, and is well simulated by state-of-the-art event generators: KK2F [57][58][59], Tauola [60,61] and Photos [62,63]. Searches for LFV in $\tau$ decays at B-Factories exploit these event characteristics, assuming that only one of the two $\tau$ leptons produced in the $e^-e^+ \to \tau^-\tau^+$ process decays in this rare mode, while the other $\tau$ decays via the allowed SM processes. By dividing the event into a pair of hemispheres separated by the plane perpendicular to the thrust axis [64,65] in the center-of-mass frame, the $\tau$ decay products can thus be identified as coming from the signal-side and the tag-side, corresponding to the LFV decay and the SM decay of the $\tau$ lepton, respectively.

Signal Characteristics
The characteristic feature of $\tau$ decays via LFV is that the final state does not contain a $\nu_\tau$. Thus, there is no missing momentum associated with the signal-side, and the kinematics of the signal $\tau$ lepton can be completely reconstructed from measurements of the final state particles. Simulation studies of more than a hundred possible LFV decays that can be searched for with such signal characteristics, for each sign of the $\tau$ lepton, are possible with recent updates of the Tauola event generator [61,66,67], which have been seamlessly integrated into the software of the Belle II experiment.
A very interesting feature of $\tau$-pair production in $e^-e^+$ annihilation is that the energy of each $\tau$ lepton is known to be exactly half of the center-of-mass (CM) energy of the collision, except for corrections due to initial and final state radiation. Therefore, the uncertainty on the energy of the $\tau$ lepton is independent of the performance of the detector, and is known from the beam energy spread of SuperKEKB to be approximately 5 MeV [48].
As a first example of the signal mode, let us consider $\tau^- \to \ell^-\gamma$ decays, which are predicted with rates just below the current experimental bounds in the widest variety of NP models and are hence regarded as "golden modes" in searches for LFV. The total energy in the CM frame of the $\tau$ decay products in the signal-side is $E^{\rm CM}_{\ell\gamma} = \sqrt{s}/2$, and the invariant mass of the $\ell\gamma$ pair can be calculated as $m_{\ell\gamma} = 2\sqrt{p_\ell E_\gamma}\,\sin(\theta_{\ell\gamma}/2)$, where $p_\ell$ is the magnitude of the three-momentum of the lepton, $E_\gamma$ is the energy of the photon, and $\theta_{\ell\gamma}$ is the opening angle between them. The invariant mass is ideal as a discriminating variable, because its resolution is given by [68]:
$$\left(\frac{\sigma_{m_{\ell\gamma}}}{m_{\ell\gamma}}\right)^2 = \left(\frac{\sigma_{p_\ell}}{2 p_\ell}\right)^2 + \left(\frac{\sigma_{E_\gamma}}{2 E_\gamma}\right)^2 + \left(\frac{\sigma_{\theta_{\ell\gamma}}}{2\tan(\theta_{\ell\gamma}/2)}\right)^2,$$
which simultaneously combines all the available experimental precision on the measured energy/momentum from the calorimeter/tracking systems with the measured uncertainties on the position measurements of the observable final state decay products. The resolution of this kinematic variable is further improved by considering the beam-energy-constrained mass, $M_{\rm bc}$, given as:
$$M_{\rm bc} = \sqrt{\left(E^{\rm CM}_{\rm beam}\right)^2 - \left|\vec{p}^{\,\rm CM}_{\ell\gamma}\right|^2},$$
where $E^{\rm CM}_{\rm beam} = \sqrt{s}/2$ and $\vec{p}^{\,\rm CM}_{\ell\gamma}$ is the sum of the lepton and photon momenta in the CM frame, because the resolution of $E^{\rm CM}_{\rm beam}$ comes from the accelerator instead of the detector. The beam-energy-constrained $\tau$ mass, labeled somewhat differently as $m_{\rm EC}$ for the BABAR search [69], is typically obtained from a kinematic fit that constrains the CM energy of the $\tau$ to be $\sqrt{s}/2$. Its resolution was further improved in the BABAR search by assigning the origin of the photon candidate to the point of closest approach of the signal lepton track to the $e^-e^+$ collision axis.
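The kinematic variables discussed above can be sketched numerically as follows. This is a minimal illustration, assuming the massless-lepton expression $m = 2\sqrt{pE}\sin(\theta/2)$, uncorrelated measurement errors, and $M_{\rm bc} = \sqrt{E_{\rm beam}^2 - |\vec{p}|^2}$; all function names are hypothetical and not part of any Belle II software.

```python
import math

M_TAU = 1.77686  # tau mass [GeV/c^2]


def invariant_mass(p_lepton, e_gamma, theta):
    """Lepton-photon invariant mass, neglecting the lepton mass."""
    return 2.0 * math.sqrt(p_lepton * e_gamma) * math.sin(theta / 2.0)


def relative_mass_resolution(p, sigma_p, e, sigma_e, theta, sigma_theta):
    """sigma_m / m from error propagation, assuming uncorrelated inputs."""
    return math.sqrt((sigma_p / (2.0 * p)) ** 2
                     + (sigma_e / (2.0 * e)) ** 2
                     + (sigma_theta / (2.0 * math.tan(theta / 2.0))) ** 2)


def beam_constrained_mass(e_beam_cm, p_sum_cm):
    """M_bc from the CM beam energy and the summed CM momentum (px, py, pz)."""
    p_squared = sum(component ** 2 for component in p_sum_cm)
    # Clamp at zero so mis-measured candidates do not raise a math error.
    return math.sqrt(max(e_beam_cm ** 2 - p_squared, 0.0))


# Toy check: a tau at rest decaying back-to-back into a massless lepton
# and a photon reconstructs to the tau mass.
m_rest = invariant_mass(M_TAU / 2.0, M_TAU / 2.0, math.pi)

# Toy check: a candidate carrying the full tau momentum at sqrt(s)/2 = 5.29 GeV.
p_tau = math.sqrt(5.29 ** 2 - M_TAU ** 2)
m_bc = beam_constrained_mass(5.29, (0.0, 0.0, p_tau))
```

Because $M_{\rm bc}$ uses the beam energy rather than the measured energy of the candidate, only the momentum resolution of the detector enters, which is the motivation given in the text for preferring it.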
Figure 2 shows a comparison study using a simulated sample of signal events.

The most distinguishing feature of signal events is obtained by considering the characteristic mass of the decay products of the LFV $\tau$ decays along with the normalized difference of their energy from half the center-of-mass energy in $e^-e^+$ annihilation, so that the search can be uniformly performed at energies other than the $\Upsilon(4S)$ peak, to take advantage of the larger luminosity by including all the recorded data:
$$\frac{\Delta E}{\sqrt{s}} = \frac{E^{\rm CM}_{\ell\gamma} - \sqrt{s}/2}{\sqrt{s}}.$$
The signal events are clustered around $M_{\rm bc} \sim m_\tau$ and $\Delta E/\sqrt{s} \sim 0$ in the two-dimensional plots of $\Delta E$ vs. $M_{\rm bc}$, as shown in Figures 3 and 4 for the $\tau^- \to e^-\gamma$ (left) and $\tau^- \to \mu^-\gamma$ (right) searches at the Belle [70] and BABAR [69] experiments, respectively, where the variable $m_{\rm EC}$ refers to the beam-energy-constrained mass in the latter, as mentioned earlier.

All analyses are developed in a blind manner, i.e., by optimizing the event selection before looking at the data events inside the signal region, to avoid experimental bias in the search for LFV $\tau$ decays. The search sensitivity can be optimized to give the smallest expected upper limit in the background-only hypothesis inside a 2σ ellipse, for example, amongst other possible choices. A typical 2σ signal region is defined as the following elliptical region:
$$\frac{\left(M_{\rm bc} - \mu_{M_{\rm bc}}\right)^2}{\left(2\sigma^{\rm high/low}_{M_{\rm bc}}\right)^2} + \frac{\left(\Delta E/\sqrt{s} - \mu_{\Delta E/\sqrt{s}}\right)^2}{\left(2\sigma^{\rm high/low}_{\Delta E/\sqrt{s}}\right)^2} < 1.$$
Here, $\mu_{M_{\rm bc}}$ and $\mu_{\Delta E/\sqrt{s}}$ are the peak positions, and $\sigma^{\rm high/low}_{M_{\rm bc}}$ and $\sigma^{\rm high/low}_{\Delta E/\sqrt{s}}$ are the widths on the higher/lower side of the peak, obtained by fitting the signal distribution to an asymmetric Gaussian function [71].
For the Belle search [70], the resolutions are σ The mass and energy kinematic variables typically have a small correlation arising from initial and final state radiation, as well as from energy/momentum scale calibration effects. For the BABAR search [69], the correlation was estimated to be −8.5% and −8.4% for the $\tau^- \to e^-\gamma$ and $\tau^- \to \mu^-\gamma$ decays, respectively, around the core region. Without the beam-energy constraint, the correlation between the invariant mass and energy variables is typically much higher.
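A membership test for such an asymmetric elliptical signal region could look like the sketch below. The function name and all numerical values (peak positions and widths) are hypothetical placeholders chosen for illustration; they are not the resolutions of any published search.

```python
def in_signal_ellipse(m_bc, delta_e, mu_m, mu_de,
                      sig_m_low, sig_m_high, sig_de_low, sig_de_high,
                      n_sigma=2.0):
    """Return True if (m_bc, delta_e) lies inside the n_sigma ellipse.

    The width used along each axis depends on whether the point sits on the
    lower or the higher side of the peak (asymmetric Gaussian widths).
    """
    sig_m = sig_m_high if m_bc >= mu_m else sig_m_low
    sig_de = sig_de_high if delta_e >= mu_de else sig_de_low
    distance = (((m_bc - mu_m) / (n_sigma * sig_m)) ** 2
                + ((delta_e - mu_de) / (n_sigma * sig_de)) ** 2)
    return distance < 1.0


# Hypothetical peak positions and widths (GeV/c^2 for the mass axis,
# dimensionless for the normalized energy difference).
REGION = dict(mu_m=1.777, mu_de=0.0,
              sig_m_low=0.012, sig_m_high=0.008,
              sig_de_low=0.006, sig_de_high=0.004)

at_peak = in_signal_ellipse(1.777, 0.0, **REGION)          # centre of the ellipse
low_side = in_signal_ellipse(1.759, 0.0, **REGION)         # 1.5 sigma_low below
high_side = in_signal_ellipse(1.801, 0.0, **REGION)        # 3 sigma_high above
```

The same absolute displacement from the peak can fall inside the region on the wide side and outside on the narrow side, which is why the widths are selected per point rather than averaged.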
LFV processes in $\tau$ decays containing a resonance in the final state are identified by the presence of a peak in the invariant mass of the daughter particles in the simulation of the signal process. For example, distributions of the invariant mass of the $\pi^+\pi^-$, $K^+K^-$, $\pi^+\pi^-\pi^0$, $K^+\pi^-$, $\pi^+K^-$ and $\pi^+\pi^-$ systems in the signal-side are studied and confirmed to contain the respective resonances in searches for $\tau^- \to \ell^-\rho^0$, $\ell^-\phi$, $\ell^-\omega$, $\ell^-K^{*0}$, $\ell^-\bar{K}^{*0}$ and $\tau^- \to \mu^- f_0(980)$ decays, respectively, performed by the Belle experiment [72,73]. The selected mass regions ensure that the signal is unambiguously selected in the corresponding searches.

Background Suppression
Background events containing leptons from decays of heavy quarks are easily suppressed by appropriate cuts on Fox-Wolfram moments [74] and on the invariant mass of all decay products on the tag-side. The characteristic difference between $\tau$-pair events with LFV decays and backgrounds consisting of generic $\tau$-pair, di-lepton, two-photon production and $q\bar{q}$ processes (where $q = u$, $d$ or $s$), in terms of the number of neutrinos in the signal-side and tag-side, as defined by the event topology in Section 3.1, is shown in Table 2.

# of ν's       LFV Decays    Generic τ-Pair    Other Backgrounds
Signal-side    0             1-2               0
Tag-side       1-2           1-2               0

Since the decay products of the LFV $\tau$ decay in the signal-side do not contain any neutrino, the direction of the $\tau$ lepton in the tag-side can be precisely obtained in the center-of-mass frame by reversing the total momentum of the signal-side. This allows for good kinematic reconstruction of the missing mass in the tag-side, assuming that in the CM frame, the tag-side $\tau$ momentum is opposite to that of the signal-side $\tau$ momentum and that its energy is constrained to be half the center-of-mass energy. Thus, the selection of events with small values of the square of the missing mass ($m_\nu^2$) in the tag-side plays an important role in the suppression of the background events [70].
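The tag-side missing-mass reconstruction described above can be sketched as follows, under the stated assumptions that the tag-side $\tau$ carries energy $\sqrt{s}/2$ and a CM momentum opposite to the fully reconstructed signal-side momentum. The function and argument names are hypothetical.

```python
def tag_side_missing_mass_squared(sqrt_s, p_signal, e_tag_visible, p_tag_visible):
    """Squared missing mass on the tag side in the CM frame [GeV^2].

    sqrt_s         : CM energy of the collision
    p_signal       : summed (px, py, pz) of the signal-side decay products
    e_tag_visible  : summed energy of the visible tag-side particles
    p_tag_visible  : summed (px, py, pz) of the visible tag-side particles
    """
    # Tag-side tau four-momentum inferred from the signal side.
    e_tau_tag = sqrt_s / 2.0
    p_tau_tag = [-component for component in p_signal]

    e_miss = e_tau_tag - e_tag_visible
    p_miss = [pt - pv for pt, pv in zip(p_tau_tag, p_tag_visible)]
    return e_miss ** 2 - sum(component ** 2 for component in p_miss)


# Toy example at sqrt(s) = 10.58 GeV: the signal tau flies along +z,
# and the visible tag-side system is a massless particle along -z.
M_TAU = 1.77686
p_tau = (5.29 ** 2 - M_TAU ** 2) ** 0.5
m2_miss = tag_side_missing_mass_squared(10.58, (0.0, 0.0, p_tau),
                                        2.0, (0.0, 0.0, -2.0))
```

A cut on this quantity is one way to exploit the neutrino counting summarized in the table: events without tag-side neutrinos cluster near zero, while generic tau-pair events with extra neutrinos on the signal side break the assumption that the signal-side momentum equals the tau momentum.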
Additional selection criteria are also used to suppress the backgrounds in the different LFV decay modes, which are mostly accidental in nature, except in the $\tau^- \to e^-\gamma$ and $\tau^- \to \mu^-\gamma$ searches. The dominant background in these searches arises from $\tau^+\tau^-$ events decaying via the $\tau^\pm \to e^\pm \nu_e \nu_\tau$ ($\tau^\pm \to \mu^\pm \nu_\mu \nu_\tau$) channel with a photon coming from initial-state radiation or beam background. The $e^+e^-\gamma$ and $\mu^+\mu^-\gamma$ events are subdominant, and are estimated to contribute <5% of the total background in the Belle search [70]. Contributions from other sources of background, such as two-photon and $q\bar{q}$ processes, are estimated to be quite small in the signal region.
Furthermore, each component of the background processes has distinctive features, visible in their respective two-dimensional distributions in the ($\Delta M$, $\Delta E$) plane, where $\Delta M$ denotes the difference between the characteristic mass of the system of $\tau$ daughters and the well-known mass of the $\tau$ lepton, $m_\tau = (1776.86 \pm 0.12)$ MeV [75], and $\Delta E$ is as defined above. The shapes of the leading backgrounds in the search for $\tau^- \to \mu^-\mu^+\mu^-$ decays performed at the BABAR experiment [76] are shown in Figure 5, where the red box indicates the rectangular boundaries of a generic region mostly populated by the signal processes. The SM $\tau\tau$ background events are generally restricted to small negative values of both the $\Delta M$ and $\Delta E$ variables, because the reconstruction of the signal event topology does not account for the neutrinos present in SM $\tau$ decays. QED background events are mostly dominated by di-lepton production as the main underlying hard process and typically lie within a narrow horizontal band across the $\Delta M$ variable centered around slightly positive values of $\Delta E$, due to the presence of a pair of extra charged particles in such events. The QCD background events from various $q\bar{q}$ processes tend to populate the plane uniformly across the $\Delta M$ variable and drop towards large values of the $\Delta E$ variable. The expected background rates inside the signal region can be obtained by fitting the observed data in the ($\Delta M$, $\Delta E$) plane to a sum of probability density functions. Such data-driven estimates, based on the shapes predicted by the respective simulation samples and validated in data control regions, scale well with larger data statistics. Thus, the background uncertainties can be controlled in a statistical manner, which is very useful in rare searches with high luminosity data sets.

Upper Limit Estimation
No excess of events has ever been observed in searches for LFV in $\tau$ decays. The upper limit at 90% confidence level (CL) on the signal branching fraction ($B^{90}_{\rm UL}$) is calculated as:
$$B^{90}_{\rm UL} = \frac{S^{90}_{\rm UL}}{2\, N_{\tau\tau}\, \varepsilon},$$
where $N_{\tau\tau}$ is the number of $\tau$-pairs produced, $\varepsilon$ is the reconstruction efficiency of the signal decay mode, and $S^{90}_{\rm UL}$ is the 90% CL upper limit on the number of signal events. The factor of two enters into the denominator because either one of the two $\tau$ leptons produced in the event can decay into the rare LFV signal channel. The number of $\tau$-pairs is the product of the integrated luminosity and the production cross-section $\sigma_{\tau\tau}$. The Belle experiment collected data with center-of-mass energies around the peak of the $\Upsilon(nS)$ resonances corresponding to luminosities of 5.7 fb$^{-1}$, 24.9 fb$^{-1}$ and 2.9 fb$^{-1}$ at $n$ = 1, 2 and 3, respectively, while the BABAR experiment collected luminosities of 13.6 fb$^{-1}$ and 28.0 fb$^{-1}$ at center-of-mass energies corresponding to the $\Upsilon(nS)$ peak at $n$ = 2 and 3, respectively, as reported in Table 3.2.1 of Ref. [52]. The statistical errors on these measured luminosities are much smaller than the systematic errors, which are estimated to be 1.4% at the Belle experiment, and 0.7% (0.6%) at the BABAR experiment for $n$ = 2 (3) [52].
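To get a feel for the numbers, the relation $B^{90}_{\rm UL} = S^{90}_{\rm UL}/(2 N_{\tau\tau}\varepsilon)$ with $N_{\tau\tau}$ taken as luminosity times cross-section can be evaluated as in the sketch below. The inputs are assumptions for illustration only: a $\tau$-pair cross-section of 0.919 nb near the $\Upsilon(4S)$, a signal efficiency of 5%, and $S^{90}_{\rm UL} = 2.3$ events, which is roughly what a background-free search with zero observed events would give.

```python
NB_INV_PER_AB_INV = 1.0e9  # 1 ab^-1 = 10^9 nb^-1


def branching_fraction_limit(s_ul, lumi_ab_inv, sigma_tautau_nb, efficiency):
    """Upper limit on the branching fraction from an upper limit on the yield."""
    n_tautau = lumi_ab_inv * NB_INV_PER_AB_INV * sigma_tautau_nb
    # Factor 2: either tau of the pair may decay into the signal channel.
    return s_ul / (2.0 * n_tautau * efficiency)


# Illustrative (assumed) inputs for the full Belle II data set.
limit_50ab = branching_fraction_limit(s_ul=2.3, lumi_ab_inv=50.0,
                                      sigma_tautau_nb=0.919, efficiency=0.05)
print(f"B_UL(90% CL) ~ {limit_50ab:.1e}")
```

Under these assumptions the limit comes out at a few times $10^{-10}$, consistent in order of magnitude with the sensitivity range quoted for 50 ab$^{-1}$.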
The estimated values of the total $\sigma_{\tau\tau}$ at the peak of the $\Upsilon(nS)$ resonances and at 60 MeV below the corresponding resonances (labelled with a "-off") are listed in Table 3.

Table 3. $\sigma_{\tau\tau}$ at different center-of-mass energies corresponding to data-taking at the B-Factories.

Efficiency of Signal Reconstruction
The signal reconstruction efficiency receives multiplicative reduction factors corresponding to the application of trigger, acceptance, and event topology requirements, particle identification criteria, background suppression and the choice of the signal region in the two-dimensional plane given by the mass versus the normalized difference between the energy of the $\tau$ decay products and $\sqrt{s}/2$. At Belle and BABAR, the signal efficiencies were estimated to lie approximately between 2% and 12%, depending on the decay channel. For example, the overall signal efficiency estimated for the search for $\tau^- \to e^-\eta'$ reconstructed via the $\eta' \to \rho(\to \pi^-\pi^+)\gamma$ and $\eta' \to \pi^-\pi^+\eta(\to \gamma\gamma)$ decay modes is (0.294 × 1 × 4.76 + 0.445 × 0.3943 × 4.27)% = 2.1% [77], while the search for $\tau^- \to \mu^+ e^- e^-$ decays has an efficiency of 11.5% [78]. In the Belle II experiment, an increase in the signal efficiency can be expected due to higher trigger efficiencies, improvements in the vertex reconstruction, charged track and neutral meson reconstruction, and particle identification. Refinements in the analysis techniques will produce a more accurate understanding of the physics backgrounds and would thus contribute to an increase in the signal detection efficiency, which directly translates into higher sensitivities in searches for LFV.
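The combined efficiency quoted above is a branching-fraction-weighted sum over the reconstructed sub-modes. A short check of that arithmetic, using only the numbers given in the text (the variable names are illustrative), is:

```python
# Each entry: (branching fraction of the first decay step,
#              branching fraction of the secondary decay,
#              reconstruction efficiency of that sub-mode in percent)
sub_modes = [
    (0.294, 1.0, 4.76),      # rho(-> pi pi) gamma mode
    (0.445, 0.3943, 4.27),   # pi pi eta mode, with eta -> gamma gamma
]

total_efficiency_percent = sum(bf_first * bf_second * eff
                               for bf_first, bf_second, eff in sub_modes)
print(f"overall efficiency = {total_efficiency_percent:.1f}%")
```

The two contributions are about 1.4% and 0.75%, so the first sub-mode dominates even though its stand-alone reconstruction efficiency is similar to the second.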

Upper Limit on the Number of Signal Events
In the case of searches with very low counts, the search becomes a single-bin counting experiment following a Poisson probability distribution, with the mean count given by the expected number of background events ($b$) and possibly some signal events ($s$). The likelihood function ($L$) is thus described by:
$$L(s, b) = \frac{(s + b)^N}{N!}\, e^{-(s + b)},$$
where $N$ is the number of observed events. If the experimental resolution of the discriminating variables allows multiple bins, the difference in shapes of the discriminating variables between the signal and background distributions can be exploited in an extended unbinned maximum likelihood fit using:
$$L(s, b) = \frac{e^{-(s + b)}}{N!} \prod_{i=1}^{N} \left( s\, {\rm PDF}^{\rm sig}_i + b\, {\rm PDF}^{\rm bkg}_i \right),$$
where $i$ indicates the $i$-th event, and ${\rm PDF}^{\rm sig}$ and ${\rm PDF}^{\rm bkg}$ are the probability density functions (PDF) for the signal and the sum of all background processes, respectively. $S^{90}_{\rm UL}$ is obtained by considering $L(s) = \int_0^\infty L(s, b)\, db$, and integrating the likelihood $L(s)$ up to the value that includes 90% of the total integral of the likelihood function, following a flat prior Bayesian prescription [79]. Alternatively, following a Frequentist prescription [80], a toy Monte Carlo approach is used to generate numerous samples whose sizes follow a Poisson distribution with a mean given by the number of observed events. Each sample is then fitted to obtain the number of signal and background events using the same extended unbinned maximum likelihood fit procedure as that applied to the data. $S^{90}_{\rm UL}$ is obtained by varying the true branching fraction of the signal such that 90% of the samples yield a fitted number of signal events greater than the number of signal events in the observed data sample. In the unified approach for finding confidence levels [81], the order of samples in the acceptance interval for a specific value of the number of signal events follows an ordering principle based on likelihood ratios, where the denominator is determined by the best fit value in each sample.
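For the single-bin counting case, the flat-prior Bayesian upper limit has a closed form that is easy to evaluate. The sketch below is a simplified illustration that assumes the background expectation $b$ is known exactly, rather than marginalized over as in the prescription above; the function names are hypothetical.

```python
import math


def truncated_exp_series(mean, n_obs):
    """Sum of mean^n / n! for n = 0..n_obs (Poisson CDF without exp(-mean))."""
    term, total = 1.0, 1.0
    for n in range(1, n_obs + 1):
        term *= mean / n
        total += term
    return total


def bayesian_upper_limit(n_obs, b, cl=0.90):
    """Flat-prior upper limit on s for n_obs observed events and known b."""

    def credibility(s):
        # Posterior probability that the true signal is below s.
        return 1.0 - (math.exp(-s) * truncated_exp_series(s + b, n_obs)
                      / truncated_exp_series(b, n_obs))

    low, high = 0.0, 1.0
    while credibility(high) < cl:      # bracket the solution
        high *= 2.0
    for _ in range(100):               # bisection
        mid = 0.5 * (low + high)
        if credibility(mid) < cl:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


limit_zero_observed = bayesian_upper_limit(n_obs=0, b=0.0)
limit_one_observed = bayesian_upper_limit(n_obs=1, b=0.0)
```

With zero observed events the limit equals $\ln 10 \approx 2.30$ signal events regardless of the assumed background, which is why background-free searches are often summarized by this single number.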
In order to have an unbiased estimate of the expected sensitivity, a blinding procedure should be followed to predict the expected background rate inside the signal region (SR) without depending on the observed data inside a blinded region (BR), defined as the part of a broad fit range (FR) that hides the data events inside the SR. For well-controlled modeling of the total background PDF, the number of expected background events ($N^{\rm bkg}_{\rm SR}$) inside the SR can then be estimated directly from the data $N^{\rm data}_{\rm FR-BR}$ outside the blinded region using the formula:
$$N^{\rm bkg}_{\rm SR} = N^{\rm data}_{\rm FR-BR} \times \frac{\int_{\rm SR} {\rm PDF}^{\rm bkg}}{\int_{\rm FR-BR} {\rm PDF}^{\rm bkg}},$$
where $\int_{\rm SR} {\rm PDF}^{\rm bkg}$ and $\int_{\rm FR-BR} {\rm PDF}^{\rm bkg}$ are the integrals of the background probability density function over the signal region and the non-blinded parts of the fit region, respectively.
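The sideband extrapolation can be illustrated with a one-dimensional toy, as sketched below. Real analyses use two-dimensional background PDFs obtained from fits; the one-dimensional ranges, the flat background shape, and all names here are simplifying assumptions made for illustration.

```python
def integrate(pdf, low, high, steps=10000):
    """Midpoint-rule integral of pdf over [low, high]."""
    width = (high - low) / steps
    return sum(pdf(low + (i + 0.5) * width) for i in range(steps)) * width


def expected_background_in_sr(n_data_sideband, pdf, fit_range,
                              blinded_range, signal_range):
    """Scale the sideband yield by the ratio of background PDF integrals.

    The blinded range is assumed to lie entirely inside the fit range,
    and the signal range inside the blinded range.
    """
    integral_sr = integrate(pdf, *signal_range)
    integral_sideband = (integrate(pdf, *fit_range)
                         - integrate(pdf, *blinded_range))
    return n_data_sideband * integral_sr / integral_sideband


def flat_background(mass):
    """Toy background shape: uniform in the mass variable."""
    return 1.0


# Hypothetical ranges in GeV/c^2 and a hypothetical sideband yield of 50 events.
n_bkg_sr = expected_background_in_sr(
    n_data_sideband=50,
    pdf=flat_background,
    fit_range=(1.60, 2.00),
    blinded_range=(1.70, 1.85),
    signal_range=(1.76, 1.79),
)
```

Only the ratio of integrals enters, so the background PDF does not need to be normalized, and the estimate never touches the data inside the blinded region.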

Systematic Uncertainties
In terms of the number of $\tau$ decays being studied, $t = 2 N_{\tau\tau}\varepsilon$, the number of signal events is written as $s = \mu t$, where $\mu$ is the branching fraction of the signal process and the normalization factor $t$ includes the uncertainties on the luminosity, the cross-section and the signal efficiency. The upper limit $B^{90}_{\rm UL}$ including all systematic effects, using the technique of Cousins and Highland [82], is calculated by propagating all the measured uncertainties onto the number of signal events ($s$) and background events ($b$).
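Implementation of systematic effects in the POLE (POisson Limit Estimator) program [83] is based on the following likelihood function, which is a convolution of a Poisson distribution with two Gaussian resolution functions corresponding to the signal normalization factor and the background, as described by the following formula:
$$L(\mu, t, b) = \frac{(\mu t + b)^N}{N!}\, e^{-(\mu t + b)} \times \frac{1}{\sqrt{2\pi}\,\sigma_t}\, e^{-\frac{(\hat{t} - t)^2}{2\sigma_t^2}} \times \frac{1}{\sqrt{2\pi}\,\sigma_b}\, e^{-\frac{(\hat{b} - b)^2}{2\sigma_b^2}},$$
where $\hat{t}$ and $\hat{b}$ are the average estimates corresponding to measured uncertainties of $\sigma_t$ and $\sigma_b$, respectively.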
In searches for rare processes such as LFV in $\tau$ decays, often a very small number of events is expected in the signal region. Sometimes the search cannot easily distinguish a very small number of signal events from the background-only hypothesis, and inappropriately tends to exclude an unusually small signal value. To overcome such difficulties, the upper limits can be calculated using the $CL_s$ method [84,85], where $CL_s$ is defined as the confidence level for the signal-plus-background hypothesis normalized by the confidence level for the background-only hypothesis. Asymptotic calculations of the likelihood ratios used as the test statistic in such methods allow for a computationally efficient estimate of the $CL_s$ intervals [86].
A Neyman construction [87] of CL_s upper limits including systematic uncertainties is provided by the HistFactory implementation [88,89], based on the likelihood function L(µ, θ_j) defined as:

$$L(\mu, \theta_j) = \prod_i \mathrm{Pois}(N_i \mid \mu\, t_i + b_i) \times \prod_j c(\theta_j),$$

where N_i is the number of events observed in the i-th bin, with the signal normalization factor and the background prediction given by t_i and b_i, respectively, and c(θ_j) denotes the constraint term for the nuisance parameter θ_j. The bins run over a multicategorical search describing, for example, different tag-side decay modes, each with a different sensitivity, over possibly multiple decay channels of the signal mode. The systematic uncertainties are constrained by the nuisance parameters θ_j, which correspond to various scale factors determined from dedicated calibration constants of efficiency measurements and are obtained from simulation studies or from the analysis of control regions in the data.
HistFactory allows for the calculation of upper limits in both Bayesian and Frequentist interpretations [90], with slightly different treatments of the nuisance parameters. In the Bayesian interpretation, the nuisance parameters are eliminated by marginalizing the posterior density, using, for example, Markov Chain Monte Carlo integration. In the Frequentist interpretation, the nuisance parameters are determined by profiling the likelihood function based on auxiliary measurements, such as control regions, side-bands, or dedicated calibration measurements. Some uncertainties arising from theoretical calculations or ad hoc estimates are not statistical in nature and thus are not associated with auxiliary measurements. By convention, however, log-normal probability density functions of the nuisance parameters are used to constrain all the uncertainties.
Bayesian limits can also be calculated using the Bayesian Analysis Toolkit [91].
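The profiling of a nuisance parameter in the Frequentist treatment described above can be sketched with a small binned likelihood. This is a hand-written toy, not the HistFactory code: it assumes a single nuisance parameter θ that scales the background in all bins with a log-normal response, a unit Gaussian constraint on θ, and a brute-force scan in place of a proper minimizer. All names and numbers are hypothetical.

```python
import math


def nll(mu, theta, observed, t, b, rel_sigma_b):
    """Negative log-likelihood of a binned counting model.

    Model (illustrative): N_i ~ Pois(mu * t_i + b_i * (1 + rel_sigma_b)**theta),
    with a unit Gaussian constraint on theta, which makes the background
    scale factor log-normally distributed. Requires mu >= 0 and b_i > 0.
    """
    total = 0.5 * theta * theta  # Gaussian constraint term, up to a constant
    scale = (1.0 + rel_sigma_b) ** theta
    for n_i, t_i, b_i in zip(observed, t, b):
        mean = mu * t_i + b_i * scale
        total += mean - n_i * math.log(mean) + math.lgamma(n_i + 1)
    return total


def profiled_nll(mu, observed, t, b, rel_sigma_b, theta_max=5.0, n_points=2001):
    """Minimize the NLL over theta on a fixed grid for a given mu."""
    step = 2.0 * theta_max / (n_points - 1)
    return min(
        nll(mu, -theta_max + k * step, observed, t, b, rel_sigma_b)
        for k in range(n_points)
    )


# Hypothetical two-bin search with a 20% background normalization uncertainty.
observed = [3, 1]
t = [2.0, 1.0]   # signal normalization per bin, arbitrary units
b = [3.0, 1.0]   # expected background per bin
for mu in (0.0, 0.5, 1.0, 2.0):
    delta = profiled_nll(mu, observed, t, b, 0.2) - profiled_nll(0.0, observed, t, b, 0.2)
    print(mu, round(delta, 3))
```

The printed differences form a profile-likelihood scan in µ, which is the ingredient used to build the test statistic for the limit. A production analysis would use a dedicated minimizer and one constraint term per nuisance parameter.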

Current Status and Future Prospects
A summary of the observed limits obtained by the CLEO, BABAR, Belle, ATLAS, CMS, and LHCb experiments [92] is shown in Table 4 and Figure 6, along with projections for two illustrative scenarios of integrated luminosity, L = 5 ab−1 and 50 ab−1, at the Belle II experiment [40]. The projections are extrapolated from the expected limits obtained at the Belle experiment. The expected limits for τ− → ℓ−γ decays are obtained from Ref. [70]. We assume the presence of irreducible backgrounds for τ− → ℓ−γ decays, so that the sensitivity to upper bounds scales approximately as 1/√L. For all other modes, given the expected number of background events in each channel from the previous searches at the Belle experiment and the improvements listed in Section 3.4.2, the background expectations corresponding to the integrated luminosity at Belle II are still of the order of unity or less. For such accidental backgrounds, the sensitivity to upper bounds is proportional to 1/L, as discussed in Section 8.2.1.3 of Ref. [93]. The projections for the corresponding upper limits at Belle II are estimated using the Feldman and Cousins approach [81].

Table 4. Current status of observed (obs) and expected (exp) upper limits (UL) [40,75].

A beam polarization upgrade of the SuperKEKB e−e+ collider can enhance the sensitivity to LFV in τ decays at the Belle II experiment to levels beyond the ones listed in Table 4 and Figure 6. The proposed upgrade [111] will result in ∼70% longitudinal polarization of the high-energy electron beam, which will influence the angular distribution of the τ decay products in the SM τ-pair backgrounds. The characteristic τ polarization dependence of the helicity angles of the τ decay products with beam polarization can then be used to further suppress the background in, for example, τ− → µ−γ searches, where one τ decays to a muon and a photon, while the other τ decays to a pion and a neutrino, the decay channel most sensitive to the polarization of the τ lepton. Similar background suppression can also be obtained with the other decay modes, which vary in their sensitivity to the τ polarization. In general, the maximal discriminating power is obtained by studying the polar angles in the center-of-mass frame times the charge of the τ decay.
The "irreducible background" from τ− → µ−νν̄γ decays is studied in Figure 7 [112]. While the distributions of the backgrounds show marked differences in the case of beam polarization with respect to the case of no beam polarization, the signal distribution, modeled by uniform phase space, does not change with beam polarization. By removing events where the distribution of the irreducible background shows a rising trend near unity, the background can be reduced significantly at the cost of a small loss in signal efficiency. An optimization study has demonstrated that this would result in approximately a 10% improvement in the sensitivity to LFV. Similar analyses are expected to yield comparable gains in sensitivity for other decay modes.

It is worth noting that the uniform phase space model of the signal distribution is chosen because the underlying theory behind LFV is not known. Different spin-dependent operators are predicted to give significantly different features in the Dalitz plane of the final state momenta distributions of, for example, τ− → µ−µ+µ− decays [113,114]. One of the most interesting aspects of having beam polarization is the possibility of distinguishing between these different new physics models in order to understand the helicity structure of the couplings producing LFV in τ decays, once such decays are observed.

Conclusions
LFV in τ decays is an unambiguous signature of new physics, and is thus of great experimental and theoretical interest. Many models, from supersymmetric scenarios to leptoquarks, predict LFV in τ decays at experimentally observable rates, which will be probed at Belle II. Searches for LFV in τ decays can discover new physics at the multi-TeV scale by identifying the underlying mechanism beyond the SM, or strongly constrain the flavor structure of TeV-scale extensions beyond the SM [127,128], as discussed in the context of different experimental efforts in Ref. [129]. The first-generation B-Factory experiments, Belle and BABAR, improved the upper limits on LFV in τ decays by two orders of magnitude, from the 10−6 level down to the 10−8 level. The Belle II experiment will continue to improve the sensitivity of searches for LFV in τ decays over the next decade. The projected sensitivity at Belle II for LFV in τ decays with 50 ab−1 of data is at the 10−10 to 10−9 level, which constitutes one to two orders of magnitude of improvement over the previous experiments.