A Review of Basic Energy Reconstruction Techniques in Liquid Xenon and Argon Detectors for Dark Matter and Neutrino Physics Using NEST

Detectors based upon the noble elements, especially liquid xenon as well as liquid argon, as both single- and dual-phase types, require reconstruction of the energies of interacting particles, both in the field of direct detection of dark matter (WIMPs, axions, et al.) and in neutrino physics. Experimentalists, as well as theorists who reanalyze/reinterpret experimental data, have used a few different techniques over the past few decades. In this paper, we review techniques based on solely the primary scintillation channel, the ionization or secondary channel available at non-zero drift electric fields, and combined techniques that include a simple linear combination and weighted averages, with a brief discussion of the application of profile likelihood, maximum likelihood, and machine learning. Comparing results for electron recoils (beta and gamma interactions) and nuclear recoils (primarily from neutrons) from the NEST simulation to available data, we confirm that combining all available information generates higher-precision means, lower widths (energy resolution), and more symmetric shapes (approximately Gaussian) especially at keV-scale energies, with the symmetry even greater when thresholding is addressed. Near thresholds, bias from upward fluctuations matters. For MeV-GeV scales, if only one channel is utilized, an ionization-only-based energy scale outperforms scintillation; channel combination remains beneficial. We discuss what major collaborations use.


Introduction
The noble elements especially xenon (Xe) and argon (Ar) as liquids have been instrumental in the field of dark matter (DM) direct detection, focused on identifying the missing ∼25% of the mass-energy content of the universe. They have also been key for neutrinos. In the former case, Xe [1][2][3] and Ar [4,5] are each used by distinct large collaborations, and used both to search for continuous spectra, such as the approximate falling exponential expected from the traditional WIMP (Weakly Interacting Massive Particle) [6], or monoenergetic peaks expected from dark photons or bosonic super-WIMPs [7]. In the latter case, argon is used in long-and short-baseline oscillation studies [8,9] and xenon in the search for neutrinoless double-beta decay, as either a liquid [10] or a gas [11]. In all of these cases, there is a clear need for high accuracy and high precision in energy reconstruction and good energy resolution in order to identify signals and backgrounds, and calibrate the detectors. Combining the data with high-fidelity arXiv:2102.10209v1 [hep-ex] 20 Feb 2021 Monte Carlo (MC) simulations can aid in this task. Xe and Ar produce scintillation light, and when an external electric field is applied then ionization electrons can be extracted as well. In a dual-phase time projection chamber (TPC) a gas stage converts the ionization into a secondary scintillation pulse [12] while a single-phase TPC reads out the charge directly [13,14]. Energy scales have been based in the past and present on the scintillation, on the ionization, and on their combination.
In this work, we will be reviewing each of these methods, contrasting them and enumerating their strengths and weaknesses, in terms of the mean, median, and mode (e.g. Gaussian peak centroid) of reconstructed energy best matching the true energy (MC truth energies and/or monoenergetic calibration peaks), the width, and the shape (symmetry). Multiple types of particles will be covered, addressing scattering from atomic electrons and nuclei, electron recoil (ER) and nuclear recoil (NR) respectively, and recoil energies from sub-keV up to GeV, from DM-WIMP-induced NR or coherent elastic neutrino nucleus scattering (CEνNS), to neutrino-induced ER. The summaries in each section make idealized recommendations, with detector-dependent caveats, for future DM/neutrino projects.

General
Examples of usages of each of the possible energy scale definitions are taken from empirical data wherever possible, but also compared to NEST (Noble Element Simulation Technique), which is also used by itself where data are lacking. NEST is a global, experiment-and detector-independent MC framework that allows simulation of scintillation and ionization yield averages and resolutions as functions of incoming or deposited energy, electric field, and interaction type [15].
The values of detector-specific parameters also need to be known in order to permit NEST to simulate the detectors effects for a specific experiment. The most important numbers are g 1 , g 2 , and the magnitudes of the drift and extraction electric fields, if applicable. g 1 and g 2 are respectively defined as the gains of the primary and secondary scintillation channels, the latter from ionization (again, only if applicable). g 1 is always between 0 and 1, and is an efficiency which combines the quantum efficiencies of one's photo-sensors with the geometric light collection efficiency. (It is also known as the photon detection efficiency.) It can include or exclude, depending on choice of units, the probability for certain photon detectors (especially in the vacuum ultraviolet or VUV) to produce more than one photoelectron (phe) for a single incoming photon [16]. Typical values across all experiments using Xe or Ar are ∼0.05-0.20 phe per photon [17][18][19][20][21][22][23]. g 2 is a combination of the electron extraction efficiency for a two-phase TPC with the gas gain, i.e. the number of photons produced per extracted electron times the photon detection efficiency in the gas phase [24]. A typical value is ∼10-30 phe per electron.
Some photons, especially in the ultraviolet range, carry sufficient energy in order to generate more than one phe per incident photon, in the photocathodes of certain photon detectors, and stochastically, not consistently. The definition of the unit "photons detected" or "detected photons" or phd for short is simply phe (alternatively called PE by some authors) divided by (1+p 2e ), where p 2e is the probability of producing two (photo)electrons, typically 0.1-0.3, depending on the manufacturer, temperature, and individual phototube: phd = phe / (1 + p 2e ), while a similar translation applies to the g 1 . While taking this effect into account is ideal for the resolving of peaks and achieving the best possible background discrimination for one's final analyses, the convention for how the final results are plotted varies by experimental collaboration, making comparisons more difficult (XENON and DARWIN prefer phe or PE but LUX and LZ prefer phd). PIXeY's 2-phe probability was reported as 17.5% [25], implying that division by 1.175 would convert phe into phd.
We define N ph and N e− , as the original numbers of photons and electrons generated from an interaction site, determined by MC truth (integer) or empirical reconstruction (float). N q is their sum.
In data, or advanced MC including detector effects like finite detection efficiencies not just mean yields: N ph = S1 c /g 1 and N e− = S2 c /g 2 . S1 and S2 are the primary and secondary scintillation pulse areas. Subscript 'c' denotes correction, primarily for XYZ-position effects, as light collection efficiency may depend significantly on 3D position, especially in a large-scale detector [26]. The energy dependence is intrinsic to the element, unrelated to the position inside a particular detector. It is thus useful to also define two additional terms, L y and Q y , which respectively refer to the N ph and N e− per unit of energy.
Our default guiding formula, at least for combined energy-scale reconstruction, is therefore: L is Lindhard factor, quantifying E "lost" (if a detector cannot see phonons) into heat (atomic motion) instead of observable quanta. Called also "quenching" historically, that word is not precise: quanta are not being quenched per se but not being created ab initio, unlike in quenching by impurities or ionization density. It depends on E, making Eqn. 1 circular. Circularity can be avoided by a good MC model like NEST, and by quasi-monoenergetic NR data, as in the LUX D-D analysis [27]. For ER, L is taken to be 1.0, not implying no heat loss but an approximately constant loss as a function of E that can be "rolled into" the definition of work function, essentially raising it. For neutrino experiments, E depositions are tracks not point-like, and dE/dx (energy loss per unit of distance) is more relevant.
The L quantifies the effectiveness of an initial NR at producing more elastic recoils as secondary interactions not inelastic, i.e., atomic excitations/ionizations. It can be modeled in other/better ways than Lindhard's, expected to break down at low E's, as seen in Si/Ge [28]. It should not be confused with L e f f used for S1 [17,[29][30][31] at zero field, and not accounting for 57 Co 122 keV γ-rays not being representative of even ER yields, at different E's [32,33].
(L e f f was the ratio of NR to ER light yields.) L(E) can be thought of as merging the scintillation and ionization efficiencies (whereas L e f f is only the scintillation efficiency, again compared to ER's L y ). However, as with the word quenching, it is better to avoid this terminology; it can be confused with the efficiency set in DAQ and/or data analysis, by for example requiring a certain number of PMTs to fire to count an S1 (2-fold coincidence for example in LUX: see Fig. 4 later, ER, where this, g 1 1, and other effects lead to a threshold). The E within the L-factor is the true energy of a nucleus recoiling from a neutron (or ν coherently, or DM hopefully in the future). When quoting imperfect reconstructed energy using Eqn. 1, with 0 < L < 1, the standard unit is keV nr to contrast with the unit defined assuming L = 1 for ER earlier: keV ee .
The subscript 'q' indicates that the W value is not based on scintillation or ionization alone. W q is effectively an average over microphysical processes producing excited/ionized atoms. We do not describe them, as they are within other works [34][35][36], nor establish NEST's accuracy here, using it for convenience where data are lacking. The physical details and how NEST captures them are beyond the scope of this paper. In this section in particular we summarize established methods [32,37].
At zero field one only has access to the S1, and for specialized searches for (sub-)GeV DM [38,39] to only S2 due to the low energies involved, as the ionization i.e. charge yield is typically larger and easier to detect, even as E goes to zero. The formula correspondingly morphs into one of these: where L y = N ph /E and Q y = N e− /E are functions of energy E and field E . Each differ for ER and NR; they are not fixed as W q was, for ER. The challenge of E reconstruction increases via an inherent non-linearity: e.g. 2x the E does not mean 2x the light or charge as it does with N q at least for ER. When considering resolution, this causes deviation from the Poisson expectation of 1/ √ E improvement with higher E, which only applies for the combined scale (1/ N q ). A more general power law then works best [40][41][42]. In LXe (liquid xenon) the fact L y is not flat vs. E was first demonstrated by Obodovskii & Ospanov [43] for ER (implying for fixed W that ER Q y is also not constant) and by Aprile et al. for NR Q y [44,45] but not well known in the DM field for ER background at least until much later publications [33,46]. In the case of experiments like EXO (0νββ decay) and DUNE (long-baseline ν's) using Xe/Ar, only the first half of (3) applies, as they measure Q directly in liquid using wire readout planes, instead of in gas. Also, for combined E (Eqn. 1) especially, S2 can be defined using a subset of photon sensors instead of all. A cylindrical TPC typically has two arrays, one at each end. The bottom one alone (subscript 'b' for differentiation from total S2) may be used for E reconstruction as in [47] to adjust for light loss created by inoperative phototubes in the gas, or saturation. Lastly, g 1 can be vastly smaller in kilotonne-scale ν experiments, unlike the range quoted earlier for DM: they rely on Q y .
For the purposes of reproducibility we note that all of the work presented here uses the latest stable NEST release at time of writing: Version 2.2.0 [48] for which the default detector parameter file is designed to mimic the first science run of LUX [49] but which we modify as needed to reproduce other experiments, focusing again on g 1 , g 2 , and drift electric field E as the three most salient inputs.

NEST-specific Details
The main assumptions utilized in NEST to reproduce efficiencies are briefly explained below. 1. A Fano-like factor sets variation in total quanta, with a binomial distribution for differentiating excitons/ions (inelastic scattering). In LXe it is not sub-Poissonian as in GXe, as experimentally verified in each phase [50]. 2. Recombination fluctuations [51][52][53]: the "slosh" of N ph vs. N e− caused by recombination probability for ionization e − 's, which may either recombine to make more S1, or escape to make S2. These are worse (i.e. larger) than expected naïvely (non-binomial). This is distinct from Fano factor, and canceled by combined E.
The above are general, but the below depend on the detector, all combining into the final efficiency. 1. g 1 is used to define a binomial distribution for the S1 photon detection efficiency with < S1 >= g 1 N ph . 2. For an S1 to be above the trigger threshold, most experiments require that O(0.1) phe must be observed in N PMTs for N-fold coincidence, where usually N = 2 or 3, within a coincidence window, of 50-150 ns, requiring a basic timing model for singlet and triplet states and photon propagation time. The 2 or 3-fold coincidence prevents triggering on photo-sensor dark counts. Baseline noise O(0.1) phe is also modeled. 3. The pulse areas of single phe are assumed to follow a truncated (negative phe are not possible) Gaussian distribution, with O(10%) resolution differing by photon-sensor, but a detector-wide average is used for NEST, as an approximation. If single phe detection efficiency is reported, it can be used instead of a threshold applied to a Gaussian random number generator, thus taking non-Gaussianity and other detector-specific idiosyncrasies into account. This and others numbers are collected from arXiv, publications, and theses. 4. Drifting, diffusing electrons are removed via an exponential electron lifetime, and are assumed to follow a binomial extraction efficiency, while the number of photons produced per surviving extracted electron depends on the gas density, electroluminescence electric field, and gas gap size, in a 2-phase TPC [54,55]. 5. A special Fano factor, typically also > 1 for S2 accounts for non-Poissonian behavior, due for example to grid wire sagging. 2-4 is normal [56]. S2 photons experience a similar binomial photon detection efficiency as S1 photons, moving along from photons to phe (for S2, from electrons to photons to phe). A raw, total S2 threshold O(100) phe removes the lowest-energy events, to avoid few-electron backgrounds [57]. 6. S1 and S2 XYZ variation is simulated in NEST if provided in analytical form, then realistically corrected back out, based upon finite position resolution, not MC truth positions, thus allowing not only for correct means but correct widths. (Z or drift correction applies only to S1, handled for S2 by the electron lifetime.) 7. N ph falls while N e− rises with drift field in anti-correlated fashion, and fields can be non-uniform.
A final step of noise is applied as an empirical smearing to the S1 and S2 pulse areas to match realistic experimental data; however, the above lists capture the vast majority of fluctuations that can shift the low-energy efficiency higher or lower. Additional noise is typically at a level of O(1%) from unknown sources, but likely due to position correction imperfection and other analysis-specific effects, discussed in [24,36,58]. This is uncorrelated noise, as it is applied separately to S1 and S2, while the variation induced by the Fano factor (Step 1 in first list) is correlated "noise" due to raising S1 and S2 together, and the recombination fluctuations constitute anti-correlated noise, as in raising S1 they lower S2, and vice versa. All fluctuations can move events above and/or below nominal thresholds.
All steps above are uncertain especially when it comes to a simulation software package such as NEST. It would be infeasible to discuss and address all uncertainties. Thus, we will mention only one that is often largest. The N ex and N i , followed by N ph and N e− , produced at Step (1.) in the first list above, depend on particle, energy, field, density via temperature and pressure, and phase, and at the lowest energies (sub-keV especially, and in particular for NRs) there is a non-negligible systematic uncertainty from the values assumed for the average yields (L y and Q y ) which are always the first step in NEST modeling. For ER Q y , the discrepancy between different data sets and models is as much as 20% below 1 keV, as illustrated by contrasting NEST [58] with PIXeY 37 Ar data [59], x-rays in LUX [60], and 3 H in XENON100 [61]. L y = 0 in NEST for sub-keV ER but not in XENON models, which thus predict higher efficiency [42]. The 50% efficiency point typically is "E threshold." For the purposes of this review paper, what is of greatest importance is that the assumptions, of the default NEST yields models, are not varied, when comparing different energy reconstruction techniques, so that at the very least a robust comparison can be made among them. That being said, we stress the accuracy of NEST in reproducing efficiency even for publications where the authors have no access to the original data is high (not only LUX [62]). XENON1T is an excellent example [58].

Results
Xe is examined first (ER then NR) followed by Ar. For Xe, the present-day relevant experiments seeking DM for whom this review is most pertinent include: LZ, XENON, PandaX. The use of LAr is divided, present and past, across DEAP, CLEAN, ArDM, and DarkSide (first two single-phase and zero field, latter two dual-phase and non-zero electric field) on the DM front, and DUNE, MicroBooNE, ArgoNeuT, ICARUS, plus many others, studying neutrinos. Enriched LXe is used by nEXO, a TPC but only one phase, and NEXT (GXe) for the hunt for 0νββ decays. α's and heavier ions different from the medium, with properties like additional quenching, modify the E reconstruction formulae, but we will only focus on basic ER and NR; other recoil types are already covered elsewhere [15,63].

Low Energy: keV-scale (Dark Matter Background, Signal) Basic Recon of Mono-E Peaks
The term electron recoil or ER refers to interactions with the electron cloud, such as from beta emissions and the Compton scattering or photoabsorption of gamma rays. In a WIMP search, ER is the primary background, but in a more general DM (or exotic physics) search a monoenergetic peak or even continuous ER spectrum can be the signal [64]. To illustrate the differences among the S1-only, S2-only or Q-only, and combined-E scales for reconstruction the first example is the lowest-energy ER calibration peak available at time of writing for LXe where we have S1 and S2 still: the electron capture decay of 37 Ar at 2.82 keV. It is also a timely example due to the recent XENON1T ER result [42,58].
While this E is not often in the regime of efficiency drop, we do address it in this section. Our source of data here was the seminal work led by McKinsey et al. at Yale/Berkeley [59] who constructed a small-scale calibration chamber, PIXeY, with g 1 = 0.097 ± 0.007 phe/photon and g 2 = (0.78 ± 0.05) * 36.88 = 28.77 ± 1.85 phe/e − (extraction efficiency times the single-e − pulse area). To replicate PIXeY precisely with NEST in Fig. 1, we use g 1 = 0.1015, g 2 = 30.65, only 0.6-1.0σ higher. We studied only 198 V/cm. W q was taken to be 13.5 eV, between Dahl [36] and neriX [65] measurements. In all plots, keV ee means reconstructed energy (ee standing for electron equivalent, for when NR is translated into this scale) as opposed to MC truth, or a known single E from a peak, reported in keV sans subscript.
While lower-E calibrations exist than 2.8 keV, this is the lowest where S1 and S2 are identified separately. Others become S2-only [60]. Fig. 1a demonstrates that the S1-only scale, used primarily on XENON10 [29] and continuing on in a subset of XENON100 analyses [66], performs the poorest, with an energy resolution σ/µ of 38.63% for data (38.59 NEST) in red. (Both XENON10/100 had similar g 1 's to PIXeY, but slightly lower, at ∼0.07.) Values in parentheses following values from data now are for NEST, on to subsequent pages, in the style of A (B). An S1 basis is further complicated by non-linearity. Next, in Fig. 1b the S2 only is plotted, leading to an 11.65% (12.20) resolution. While we see later that the combined scale, by including all available information, is typically best at higher energies, this is not the case any longer at keV-scale energies, as (c) indicates. Combined resolution is 15.84% (15.51). This is caused by data from these E's being comprised of upward S1 fluctuations above nominal S1 threshold, due to finite g 1 . An experimenter measures the right tail of S1s essentially.
Nevertheless, it is possible to mitigate S1 effects, still include S1 in the E calculation, and obtain the best possible resolution. One way is to fit a skew Gaussian (parameters explained in [53] and [58]): While this has been done for bins in S2 vs. S1 [53] and once for combined E [58] it is most effective for S1: see the improvement in Fig. 1a (blue vs. red). A skew fit, while including an error function er f and similar to the equation used in [59] that should account for triggering, still misses some points, as keV-level S1 becomes non-Gaussian and non-symmetric due to trigger efficiency dropping below 100%. In plot (c) however, the reduced χ 2 drops from O(100) for both data and NEST to 2.6 for the 1.5-5 keV ee range, still too high due to features in the data not captured even by a skew-normal fit, but more sensible. Asymmetries arise from both thresholding bias [27] and microphysics [53,58].

More Advanced Energy Reconstruction Strategies, from keV to MeV Scales, and Resolution
A superior mitigation strategy can be found upon realization that the optimal weights for the S1 and S2 pulse areas are no longer simply g 1 and g 2 at the O(keV) scale. We can recast this statement in terms of the (n)EXO-style combined-energy scale first developed by Conti [51]: instead of using a g 1 and g 2 it defines what is known as an angle of anti-correlation for summing S1 plus Q or S2. As energy decreases the angle becomes energy-dependent instead of being fixed as tan −1 (g 2 /g 1 ) [40] and thus no longer respecting "perfect" anti-correlation of quanta, with N ph and N e− always summing to N q = E/W q . Note there is no evidence of anti-correlation breakdown at least in LXe above 1 keV: this effect is caused by inability to reconstruct N ph well in data due to dropping S1 efficiency, as first suggested by Szydagis (2012) and first publicly applied in the PIXeY 37 Ar paper [59].
Parameter w 1 decreases the weight assigned to S1 for low E's, countering thresholding; one could increase the S2 weight, but this is equivalent. Multiplying S1 by one weight, and S2 by another would be redundant. Instead, one weight is applied to S1, and a second weight w 2 to the formula as a whole to bring the average of the energy being reconstructed back to the correct mean after the shift caused by adding w 1 , while simultaneously correcting for any efficiency bias near the S1 and/or S2 thresholds. It is not technically independent then, thus written in Eqn. 5 as a function of w 1 , which itself is a function of energy. To avoid a circular reference, Eqn. 1 can be used to determine its energy dependence, for use in 5, and the process can be iterative, defining E" after E', etc. Knowledge of the proper weights a priori is achievable via MC. Figure 1. Reconstruction of the 2.8224 keV 37 Ar peak in the PIXeY detector [59] compared to NEST. Real data always hollow black circles, NEST MC green squares. Gaussian fits in red, skew Gaussian (better fit) in blue, with the fits to data in long dash and NEST in short (indistinguishable due to NEST's fidelity). Number of events in data 7.4 × 10 4 , while 9.3 × 10 5 in the MC, after all cuts (i.e. all thresholds). (a.) Original, non-linear S1-only E scale used for LXe. Bins with non-zero counts begin very suddenly at the left in both NEST and data due to a cut-off created by triggering only on 3-fold PMT coincidence, and other threshold requirements. The results are highly skewed, driving the asymmetry within the combined-energy fit later. (b.) S2-only, which is quite symmetric, so that Gaussian and skew-Gaussian fits overlap. (c.) The combined-energy scale in common use now for LXe DM detectors. Gaussian fits in red are clearly poorer compared to skew fits, diverging from the histogram in the cases of NEST and data alike. (d.) An optimized combination for energy, as done on PIXeY. Both NEST and data, and Gaussian and skew-Gaussian fits alike, have all become indistinguishable for this stage. The best-fit mean energy has shifted from 3.03 keV in (c) to 2.82 keV for (d). This improvement in precision is also reflected in the sum of the mean quanta from (a) and (b) matching (d), but not (c), which is too high. The skew parameter α decreases from 3 for S1 only (a) to 2 for the combined scale in (c) and 1 in (b,d) Without an MC like NEST tuned on earlier calibration data, it is possible to empirically determine the two weights by calibrating an experiment with monoenergetic peaks (e − capture, x-ray, gamma-ray). In the case of PIXeY's 37 Ar measurement, the values which minimize the width of the Ar peak in reconstructed energy (optimum resolution) are w 1 = 0.19 and w 2 = 1.38 (NEST: 0. 23, 1.35). Looking back at Fig. 1d the very positive effect of applying Eqn. 5 is evident: the resolution is 10.60% (10.68%) close to S2-only (b) but a factor of ∼1.5 improvement over (c) the more traditional "plain" combined scale. Moreover, the Gaussian centroid has dropped from 3.03 keV ee (again, higher because of triggering on high-S1 fluctuations) to 2.83 (2.81) much closer to the true value of 2.82, while the asymmetry in the histogram has nearly vanished, with the best-fit skew Gaussian possessing a positive (right-hand) skewness parameter α < 1 (compared to ∼2-2.5 in (c), which used Eqn. 1). While it is not possible to remove all of the skew, as some is intrinsic (from the physics of recombination probability) [53], in panel (d) the Gaussian (red) and skew (blue) fits are nearly indistinguishable (unlike in plot 1c).
This technique, essentially equivalent to inverse-variance weighting, is not widespread for DM, although found in [25]. From its inception, LUX has relied on Eqn. 1 i.e. plot 1c's method, but a PLR (Profile Likelihood Ratio) analysis effectively takes into account energy bias by relying on NEST to produce non-analytic 2D PDFs for both background and signal, relying on MC truth energy converted to S1 and S2, not reconstructed energy [67]. XENON's own MCs for its PLR perform the same function, taking MC truth and/or calibrations as input, and outputting (S1, S2) PDFs mimicking data [68].
Many possible enhancements exist, like Maximum Likelihood [69] and Machine Learning [70]. These can take more than S1 and S2 into account, e.g. bypassing calibrated 3D corrections, feeding raw S1 & S2 plus positions into an artificial neural network (ANN) or Boosted Decision Tree (BDT) from which XYZ dependence emerges, given sufficient training. Ernst and Carrera suggest that it is possible to determine how much is sufficient [71]. Some examples of additional training variables include the E-dependent S1 pulse shape, usable given sufficient statistics [54,72], as well as breakdown of pulse areas into top and bottom arrays, capitalizing on anti-correlation of top vs. bottom light similar to that of S1 vs. S2, as used early on LUX [73], good for detectors sans S2 (0 V/cm). However, idiosyncrasies of individual detectors make it difficult to review such methods, which still rely on S1 and S2 as the two most important variables regardless. ANNs/BDTs are best trained by a combination of S1s and S2s from data and MC, per analysis, and have so far never been applied at < 1 MeV.
Given these difficulties, this paper focuses only on the energy reconstruction scales which can be formulated purely analytically with relative ease: again, S1-only (Eqn. 2), S2-only (Eqn. 3), combined (Eqn. 1), and the so-called optimized scale E (Eqn. 5). To broaden applicability to more detectors, we also consider variants. Fig. 2a shows S1-only in red and S2-only in cyan. The dashed red line illustrates how the S1 scale is poorer (the effect propagates into combined energy) when one does not account for the so-called "2-phe effect," mentioned earlier [16]. Accounting for this via dividing it out improves the resolution, as the additional phe do not provide any new information on the original number of photons produced N ph , even though they may be useful in lowering threshold and increasing the sensitivity to lower-mass DM [74]. The solid red NEST line demonstrates the improvement achieved in doing this, plus attempting to reconstruct the integer numbers of photons hitting the photomultiplier tubes (PMTs), instead of only reporting S1 pulse areas. This technique is known as photon counting or spike counting [62] and is easy/feasible only at low E. More importantly than the slight improvement in energy resolution, at only the lowest energies (< 10 keV), this reduces the leakage of background ER events into the WIMP (NR) region in S2 vs. S1 [18]. All points plotted are defined as raw σ/µ, but comparable results can be achieved with Gaussian/skew-fit centroids or medians.
Cyan lines represent S2, which as seen before can be better than S1 or even combined-E scales, but only at O(1) keV. It is at least comparable, which is important given the historical use of only S1 for E even in 2-phase TPCs with both channels, and continued usage for gas-less regions like the skin vetos of LZ and  [52]. Only combined resolutions published (black) but S1 (red) and S2 (cyan) scales included from NEST to show they are mutually comparable at O(10-1000) keV. The lowest-E point is optimized as done in PIXeY, so anomalously good (low). NEST combined scale blue (below S2 only in cyan except for 37 Ar) and optimized gold. w 1 in optimal scale varies from 0.26 to 1.0 from 2.8 to 662 keV, while w 2 falls from 1.45 to 1.00. (Lines are guides not fits.) (b.) Data from neriX (Columbia's small-scale calibration chamber, like PIXeY) as black points vs. measured recoil energies from Compton scatters [65] compared to NEST in blue vs. true energy known from MC, showing consistent deviation for both in reconstructed energies for a combined but non-optimized scale due to threshold bias (weights as used for NEST in gold in (a) correcting for this not applied purposely in blue in (b) to show this effect).
XENONnT. For S2, it is critical to understand the limitations, especially in low-mass-DM experiments like LBECA, where it is the only channel used. It can suffer from poor drift e − lifetime (impurities), incomplete extraction at a liquid-gas interface due to fields being too low, or both. The former effect (same as latter) is shown by the dash vs. solid cyan. Even when lifetime and extraction are known, along with single-e − pulse size, low values lead to high S2 area variation, although the effect is muted above 50 keV. The 41.55 keV "hiccups" are 83m Kr, a combination of 2 decays, at 9.4 and 32.1 keV [75]. An inverse square root is not a good fit to the S2 or S1 alone and not included; it is due to incomplete accounting of quanta, also to linear noise flattening the curves. Such noise impacts higher E's more and is defined in detail in Section 4.3 of [36] and Section III.A of [58]. As seen in the ∼straight lines in log-log, a power law (often plus constant for noise) is reasonable for combined E (blue) but below ∼5 keV even E − 1 2 breaks down, unless the power is free, preventing extrapolation from 10-100+ keV down to where behavior may even be non-analytic, and E resolution ill-defined.
The NEST combined scale is in solid blue in Fig. 2a compared to LUX data from its first science run as black circles. LUX used the same combined scale, which again is clearly advantageous compared to single-channel methods, with g 1 = 0.117 ± 0.003 and g 2 = 12.1 ± 0.9 [18,76]. The exception in the plot is the first point (2.8 keV) which should be compared to the optimized NEST in gold. The dashed and dotted blue lines are examples of further resolution improvements. First, by removing linear noise, in MC, as doing so is certainly not as easy in data (noise proportional to the S1 and S2 areas representing, e.g., imperfect position corrections). This is modeled as only 1.4% for LUX. It is typically O(1%) [36,58]. Second, respectively, by simulating a uniform electric field, when the real field varied with position, though not significantly in LUX's first WIMP search run (180 V/cm average) and the (much larger) variation was taken into account in the second [77]. The last few highest-energy points in data (black) do not overlap with MC due to not fully accounting for PMT saturation (S2 clipping) above 500 keV.
The advantages of the optimal scale (gold) disappear rapidly above 10 keV, comparing gold to blue. There is a benefit to this. It means that at sufficiently high energies the rotating/re-weighting of a peak in S1 and S2 to find the optimal resolution results in a derivation of g 1 and g 2 [49]. This abrogates the need for multiple peaks, arranged in S2 vs. S1 (means) in what is known as the Doke plot. Such a plot is a straight line due to the anti-correlation between N ph and N e− for ER [78] shown to work across at least four orders of magnitude in energy, and different fields [36,62,79,80]. Alternatively, if studies of anti-correlation both within peaks and across peaks for a given analysis in a certain experiment are possible, then these two methods for deriving the S1 and S2 gains can serve as cross-checks, on top of NEST comparisons and known-spectrum reproduction such as that from tritium betas [76]. As explained in the caption of Fig. 2, the S1 weight w 1 is decreasing toward 0 with decreasing E's while w 2 increases to compensate, but with increasing E's both w's asymptote to 1.0, as expected.
In the second plot plane (b) we focus on the mean instead of the width (resolution) demonstrating explicitly with both data from Compton scattering in black [65] and our NEST MCs (despite significant uncertainties) that the thresholding effects raise the reconstructed energy significantly above the true value. This phenomenon becomes most prevalent in the sub-keV regime, however, where resolution becomes ill-defined due to individual photon and electron quanta becoming resolvable, generating a multiple-peak structure [27]. The peaks become not just skewed-Gaussian, but entirely non-Gaussian, or even non-analytic [81]. For this reason, Fig. 2a stops at 2 keV on the x-axis. But Fig. 2b continues below that, focused on mean (i.e. ratio of reconstructed over known energy) not width however. We switch to neriX from LUX here, as LUX does not have a relevant plot published with which we can compare, and did not have direct, quasi-monoenergetic measurements below 1 keV. (Nevertheless, due to similar g 1 's and g 2 's in these and most experiments the results should be quite general.) Data uncertainties are driven in x (E axis) by finite resolution in the Ge detector used for independent energy determination, and in y by uncertainties in neriX's g 1 and g 2 (0.105 ± 0.003 phe/photon, 16.06 +0.9 −1.0 phe/e − ). NEST uncertainties are large only at sub-keV, and are not statistical due to large simulations. Instead they are due to the uncertainty on how to define a central value, using a mean or median or attempted Gaussian fit, due to the multi-peak effect mentioned (photon and e − discretization).
At low energy the benefits of not just a combined but optimally-combined (re-weighted) scale are significant: not just a built-in erasure due to w 2 of the growing discrepancy between the reconstructed and real values of energy illustrated effectively in Fig. 2b (see also neriX's Fig. 7 [65]) but a reduction in width that was 50% (relative) for 37 Ar in both PIXeY and LUX. Lastly, as illustrated in Fig. 1d the shape becomes more symmetric at individual energies, with the skew nearly disappearing. While this matters more for monoenergetic ER peak searches (for axion-like particles or ALPs, bosonic WIMPs, et al.) a benefit for a WIMP search, or for any analysis in fact, is better determination of the g 1 and g 2 through tighter windowing around single-energy calibration lines in 2D, in S2 vs. S1, which can occur iteratively, reducing the errors on g 1 and g 2 (5-10% typical) that often drive systematic uncertainties on both yield analyses and final physics results, especially in terms of S1 and S2 thresholds [18,58,76]. The only disadvantage is loss of the field-independence a combined scale usually has, as the yields change with field. As most experiments run at only one electric field however, that is not a true drawback.

High Energy: The MeV Scale (Neutrinoless Double-Beta Decay)
Far from the hard thresholds, we turn our attention next to 0νββ decay. Searches for this require great resolution for good background discrimination, at Q ββ = 2.458 MeV for 136 Xe specifically [82]. While resolution naturally improves with E due to the greater numbers of quanta produced, effects such as PMT saturation and different noise sources, including position-dependent effects, become more prominent. While machine learning can help a great deal as done on EXO-200 especially with detector-specific idiosyncrasies [70] the analytic optimum scale becomes degenerate with combined E above 0.1 MeV even already as illustrated earlier. Table 1 reviews the resolutions achieved in actual experiments: projections of future performance e.g. for LZ [83] are not included, in order to showcase only what has been demonstrated, or extrapolated with σ/E ∝ 1/ √ E (+ optional constant). In EXO, in its references cited below, a richer formulation was adopted: σ 2 = a + bE + cE 2 . It considers more detector noise sources. For a = c = 0, it simplifies to σ 2 = bE or σ/E = √ b/E. The reader must be cautioned not to conclude that one technology (XENON is two-phase, but others single) is better, as fiducial mass and total exposure time, position resolution, overall background rate in the region of interest and self-shielding, and enrichment in 136 Xe come into play. The XENON series of detectors have focused primarily on DM not 0νββ and so were not enriched. Their intrinsically better resolutions are due not necessarily to the addition of a gas stage (converting Q into S2) but the use of PMTs with single-photon resolution, while EXO used silicon photo-multipliers (SiPMs) with poorer single-phe resolution (not needed at MeV energies) required for their lower radioactivity that is superior to even the custom PMTs for LZ/LUX and XENON [93,[102][103][104][105][106]. Lastly, we do not explore GXe, for which there is much data: NEXT has achieved better resolution than reported here, 0.1-0.3% (0.30-0.74 FWHM) due to lower total-quanta Fano and lower recombination fluctuations [107,108] both accounted for in NEST [48]. (The question of high mass vs. superior resolution is beyond our scope.) What we can do is perform detailed MC scans to predict the best potential LXe resolution, for 2458 keV, but differing conditions; real experiments measure it at a nearby E e.g. 2615 keV from 208 Tl. Validations are not overlaid, but we point to successes in predicting resolution for XENON1T [41,42,58] and postdicting LUX. To narrow the enormous parameter space, infinite e − lifetime and 100% extraction efficiency (or, all-LXe detector like n/EXO) are assumed, with 0% noise in Q readout from the grids, but varying S1 noise level and wide E-field range for completeness. Second, an assumption is made of fixed medium g 1 = 0.1, a conservative baseline based on what is possible now, while NEST systematic uncertainty will be shown, stemming mainly from different assumptions for the Fano factor. Fig. 3a is resolution dependence on g 1 , from a pessimistic scenario of 1%, all the way up to 100%. Higher E-field is only better at low g 1 , due to NEST's strictly empirical Fano factor F q increasing with field. It is unphysical, but needed to match data claimed to be de-noised or low in noise [109]. This is important, given the rush to achieve higher field for better resolution [110] similar to the rush in the DM field, for lower leakage of ER backgrounds into the NR regime [53,56,111]. While L y and Q y are changing The middle-of-the-road default beta model was selected, and g 1 and field frozen at 0.1 and 500 V/cm, and the resolution as a function of the Fano factor assumed is presented, for different levels of noise in the S1 (primary scintillation) signal. As at left, combined E used, except for the dashed line (S1) and dotted (S2 or Q) for comparison to more detectors, as close to the max (worst) possible.
with E-field thus changing combined resolution, higher g 1 is naturally better, at least for non-zero field and combined E, due to more photons being collected. XENON1T, with its g 1 ≈ 0.13 and field 120 V/cm, appears to have achieved close to the best possible for those values [41] at 0.8% (Table 1), also best overall. NEST's theory prediction of 0.7% as best possible for XENON's g's and field rises slightly to match at 0.8 when applying XENON's e − lifetime and e − extraction efficiency [58]. In Fig. 3b is resolution's dependence on Fano factor, from a theoretical value [112] (sub-Poissonian, 1) up to the largest experimental one, of Conti et al. [51]. This governs standard deviation: F q N q . While the best-fit (world data) NEST value, for 2.5 MeV and 500 V/cm, is 14 by default, we treat the Fano factor as free in Fig. 3b, extending down to 0.2 due to NEST possibly absorbing detector-specific noises by mistake into the Fano value (even if this is not likely due to matching data across decades [41,109]). All the various EXO-200 results can be explained, as being between ∼2 and 5% noise levels in the detection of scintillation. The dotted cyan line holding steady at 4.4-4.5% explains precisely the seminal result with Q-only resolution of 4.5% in Table 1. It is not affected much by Fano factor, since when one considers only a single channel the recombination fluctuations, which move quanta between the scintillation and ionization signals, dominate [29,36,53]. It is very similar at different fields, since in the minimally ionizing regime (as ER energy approaches and exceeds 1 MeV) Q y and L y asymptote to constants [24,32,113]. The dashed line is too high to explain EXO's S1-only values, but S1 resolution improves with more light at lower E-field (higher E-field increases charge, at expense of light).
Regardless of whether it is achieved through ramping up g 1 (not unrealistic for the future given 100% QE devices [114] and high-quality reflectors [115]) or F q dropping to zero (it is not tuneable, at least not without doping of Xe with other materials; only feasible if the intrinsic value is already below Poisson i.e. 1) the best possible value for resolution appears to be 0.4%, a "basement" created by binomial fluctuations in excitation and ionization, combined with non-binomial recombination fluctuations [52,79] if there is no noise (versus fixed 2% example in pane b). Given realistic detector conditions a more reasonable estimate of the minimum possible here is 0.6% (comparable to gas).
Even if the total number of quanta is higher than assumed here, due to W being lower, as recently measured by EXO-200, 11.5 ± 0.5 eV [110] as opposed to 13.7 ± 0.2 eV (Dahl [36]) or 13.4 ± 0.4 eV (Goetzke, neriX [65]) or 13.8 ± 0.9 eV (Doke [78]), then this basement is unlikely to change significantly, as even F q = 0 was explored above. This discrepancy, observed in the light not charge channel and thus unlikely to be due to charge amp calibration differences, may be due to SiPMs being more sensitive (relative to PMTs) to wavelengths other than VUV, such as infrared (IR scintillation has been observed in LAr [116]). While at 2 nd order the fluctuation models in NEST would have to be revised, to 1 st order everything discussed here would remain the same, but with g 1 and g 2 estimates decreasing by ∼15%.

Energy Reconstruction and Efficiencies for a Continuous Spectrum
In searching for either 0νββ decay or the dark matter, a continuous-spectrum background can obscure any potential signal of beta decay or dark matter respectively, in addition to peaks in the background, or calibration peaks [90,[117][118][119][120]. In our final ER analysis, we return to optimal combined E, but consider a non-monoenergetic spectrum. A new challenge appears, as cross-contamination, e.g., between bins in a histogram, makes it difficult to separate the upward fluctuations of lower E's from simultaneous downward fluctuations from the higher bins.
This difficulty is exacerbated by the fact that energy resolution is not fixed, so this is not a flat or linear effect with which it is easy to deal analytically. The resolution of course degrades as energy goes to zero. Because the light and charge yields depend on energy, an additional problem is the fact a spectrum flat in (combined) energy is not flat in S1 nor in S2, and not all background spectra are going to be flat. That being said, this is approximately true at low energies for DM searches in LXe TPCs, after the contributions from all background radioisotopes are summed together, from Compton plateaus, neutrinos, and/or beta spectra, as in [24,42,53,58].
A naïve optimization attempt for a uniform spectrum that allows both w 1 , w 2 to vary distorts it more than normal. Better results are obtained fixing w 1 , unlike before. LUX is the example again: it is NEST's default. A generic flat ER background is simulated from 0-20 keV in real energy (it should not be taken to represent the true backgrounds found within [49,118]). An excellent analytic fit for the detection efficiency vs. E for a continuous spectrum is a modified Gompertz function suggested by a  Fig. 4a shows how the optimal scale, in gold again, is closer to the correct energies known from NEST MC in grey, relative to traditional combined energy in dark blue again. 3 H (tritium) betas are not flat in energy but their LUX trigger efficiency should be similar enough to a flat spectrum, so it is included in black to verify NEST's reasonableness [76]. A 3 H beta spectrum terminates (Q β ) at an 18.6 keV endpoint, but finite resolution causes the fluctuations around that energy. In gold, the w 1 is held constant at 1.0, but w 2 = 1.025 − e − E 0.35 , basically adjusting for the growing deviation between reconstructed and real energies as showcased in Fig. 2b using an S-shaped curve asymptoting close to 1.0 (without a shrinking w 1 , this weight w 2 has the opposite trend compared to Fig. 2a). E in keV can stem from either the traditional combined scale, or MC truth, which in an actual experiment can be validated with a series of monoenergetic calibration peaks. The χ 2 /DOF = 2.65 for blue compared directly bin by bin with no fit to grey at 0.5-17.5 keV, versus 1.68 for gold. Bin widths are 0.1 keV. The 50% fall-off point at high E's (perfect step function in true E in grey) shifts from 19.5 to 20 keV, and is thus more accurate in gold compared to blue, forcing the endpoint smearing to be symmetric. Fig. 4b reiterates once more how distorted S1/S2-only E scales can be, in red/cyan. While Fig. 1 did show S2-only can be best for low-E peaks, this is not the case for a continuum. Both S1/S2-only are non-uniform, despite the underlying spectrum being flat, and accounting for non-linearities in the underlying S1 and S2 yields, fitting quadratic not linear functions vs. (true) E. The flat top should be 0.005 as in the truth spectrum, due to normalization: bin width over range = (0.1 keV)/(20-0 keV) = 0.005. S1 is pulse area not spike, but in units of phd [16,19,62]. Despite this, there are unnatural peaks at both low and high E, caused by threshold and the maximum E simulated (20 keV) respectively. , and re-weighted optimal reconstruction (gold). Gold outperforms blue even visually, correcting underestimation of efficiency sub-keV, plus overestimation near 1.5 keV (see the text for quantitative goodness of fit comparisons). Tritiated methane (CH 3 T) is the black points, for validation against actual data. Its efficiency curve is markedly similar despite a non-flat spectrum. As it is continuous, the E is still only reconstructed, not known to infinite precision as in MCs, although an attempt to empirically account for smearing was made by LUX [76,79]. The solid black line is the Gompertz fit, superior to a more traditional er f , dashed, while the inset zooms on low energies for clarity, with a linear x-axis and log y now. (b.) The true E's repeated in grey, but now compared to S1 (red) and S2-only (cyan) scales (Eqns. 2, 3), with the former possessing unnatural peaks at left and right, and the latter grossly underestimating efficiency at keV scales. The default public β model (v2.2.0) is used here but comparable results occur with γ-rays.

Liquid Xenon Nuclear Recoil (Dark Matter Signal, and Boron-8 Background)
Pivoting toward nuclear recoil, the first hurdle is that for this type of recoil the total number of quanta per unit energy is not fixed, unlike what was shown for ER (first by Doke et al. at 1 MeV [78], and confirmed for energies of greater interest to DM experiments by Dahl [36]). This would seem to imply there is no anti-correlation between photons and electrons for NR and thus no benefit to using a combined energy scale for them. However, this does not appear to be the case, as the sum of quanta in actual data is well-fit by a power law, when combining all world data ever collected. This power law simply replaces the flat line (or general linear function but with no y-offset, if not dividing by energy: twice the energy means twice the N q ) that works so well for ER, given a fixed work function W q averaged over both flavors of quantum. This fit is related simply to the L(E) from Eqn. 1.
The mixing of units is possible, and depending upon whether one uses an S1-only, S2-only, or combined-energy scale, for NR the unit of keV ee can mean the beta, Compton, or photoabsorption event equivalent energy at which NR produces the same amount of S1, amount of S2, or the sum. In none of these three cases however is the conversion a simple constant or linear function, and can differ wildly, from keV ee being 2-10x smaller than keV nr , depending also on energy [15]. The reason it is smaller: it takes less energy for ER to produce the same number of quanta compared to NR for the same energy deposit (L < 1). For the combined-energy scale it is at least field-independent, as L should not depend on field, only the recoil energy, and the E resolution may be best via combination of information from both S1 and S2 again. (For additional clarity: some authors refer to L as f n [35].) Fig. 5 has all data available on N ph + N e− , from which we extract L. The plots suggest combined E may still be beneficial even for NR, due to anti-correlation. The evidence is indirect, but strong: >300 data points from >20 experiments across nearly 2 decades were combined, respecting the systematics of each (typically driven by how well g 1 and g 2 were known). Remarkably, within uncertainty at least at the 2-sigma level the vast majority of the the hundreds of data points lie along the same straight line in log-log space. Publications reporting continuous lines stemming from e.g. a modified NEST version or their own custom MCs are not ignored, but a few sample points at discrete energies are plotted. Fig. 5's plot style is similar to that pioneered by Sorensen & Dahl in [35] as well as in later works.
Uncovering direct evidence of anti-correlation in NR is challenging: monoenergetic neutron (n) sources exist, but internal monoenergetic NR sources do not. The common sources used such as AmBe and 252 Cf produce n's at energies O(1-10) MeV which lead to Xe recoils O(1-100) keV for calibrating DM detectors. While n's can be background [136] they are sub-dominant compared to ER [137,138] and appear more commonly as the stand-in for DM used to calibrate WIMPs, being neutral particles that primarily scatter elastically, as DM should do [6,139]. The closest one can get to separable recoil energies comes from determination of neutron angle and double scattering as done in LUX using a D-D neutron generator, but even in this most optimum situation the direct testing of anti-correlation was not possible: Q y , L y were reported at energies that did not match, and the former data included x-axis (E) error bars driven by uncertainty in angle, while the energy error for (single-scatter) L y data stemmed in turn from uncertainty from the Q y used to establish an in situ S2-only E scale [27,140].
As this is not a paper presenting any novel models (in NEST for instance) we do not focus on breakdown of the total quanta into N ph or L y and N e− or Q y , which numerous papers already discuss at great length [141]. We also pass over the Migdal Effect, which could increase the light and/or charge at keV scales due to additional ER from initial NR; there is no evidence of its existence at present, but it is predicted to describe the behavior of electrons "left over" after a nucleus recoils [142]. Additional phenomena like it are secondary when combining all individual channels. Returning our attention to Figs. 5a-c, the empirical total number of quanta is described by the following power law (black line): (6) N q = αE β , where α = 11.43 ± 0.13 and β = 1.068 ± 0.003; N q /E = αE β−1 ∼ 11 quanta keV Not only can summed data from [27,30,36,38,68,121,122,[125][126][127][128][129][131][132][133][134][135] be described simply with a power fit but with an exponent near 1, implying a nearly fixed number of quanta/keV, ∼11.5 (c f . 73 for ER, coming from 1/W) i.e. fixed L of ∼0.157. The conversion into L, re-arranging Eqn. 1, is: W q = 13.7 × 10 −3 keV is assumed here with no uncertainty as just an example, while the last term is the Taylor expansion to first order at 10 keV. In Figs. 5d-e, the fit to N q from data is compared to several models, starting with the traditional Lindhard approach [139,143]: Where Z = 54, A = 131.293 (average) for Xe. is called "reduced energy." It allows dimensionless L comparison across different elements. A Taylor expansion for Eqn. 8 at 10 keV is L(E) = 0.1790 + 0.0028E. At this E, the value of L for the expansion of Eqn. 7 is close, < 5% lower than the one for Lindhard (8). Eqn. 7's linear approximation is lower for both of its terms, but this can be explained by bi-excitonic and/or Penning quenching, which increases with higher dE/dx, which occurs with increasing E's for keV NR.  Where the E's did not match up (sometimes even within the same data set), a simple power law was used to spline (only interpolate not extrapolate) the numbers of photons first to add them to electrons. E-field does not cause measurable differences. (a.) Only directly measured yields using angular measurements to determine E [27,30,121,122]. These are handled as more "trustworthy" in the community due to being quasi-monoenergetic analyses, and thus given the most weight in the fit even if there were fewer points. One simple power law appears to describe the data points across over three orders of magnitude in E, depicted as the black line, dashed or solid, within every plot pane. For (a), an uncertainty band was included (2σ for clear visualization in green, not 1σ). (b.) Dahl's thesis data from the Xed detector taken from broad-spectrum shape spline fits ( 252 Cf) [36]. *Corr(ected) on the y-axis refers to correcting the data in our global meta-re-analysis for effects often not known at the time of data-taking, such as the 2-phe effect, or the extraction efficiency being much less than 100% than the data-takers had originally estimated [16,123,124]. The former can lower the L y measurements, depending on the analysis technique, while the latter raises Q y data points typically. The x-axis was corrected in the sense of energy estimates updated with a more modern combined-E scale whenever possible. (c.) More (indirect) measurements from continuous source bands, from XENON, ZEPLIN, and PandaX [35,68,80,[125][126][127][128][129][130]. Errors in data used whenever reported. For PandaX, L y was not provided, only Q y . The former was estimated using their AmBe bands, by the authors of this work. (d. plus e.) A review of models [27,48,[131][132][133][134][135]]. (f.) Low-E zoom of models, with data included, as larger-size points.
Colliding pairs of excitons may lead to de-excitation, and thus less S1 [131]. Some fraction of excitation may also be converted into ionization, adding to Q. Using the values in [30] or [133] it is even possible to show that NEST, similar to the data fit, follows Lindhard closely above O(1 keV) as long as additional quenching is added, for photons (see NR analysis note [15]). This is remarkable given that it was not expected that Lindhard would work even that high in energy [28,131,132]. Yet data, once summed, exhibit no significant deviation from the Lindhard model, down to sub-keV even. At higher energy, the work of Hitachi [45,131], who incorporates quenching, may be more appropriate however; it can be approximated using Lindhard but with k = 0.11 (blue). Fig. 5d also includes the fit to LUX D-D n gun data alone (green) which agrees with standard Lindhard at the 1σ level between 0.7-74 keV, given k = 0.174 ± 0.006, after accounting for extra quenching, separately [27,62]. Sorensen 2015 in yellow in Fig. 5e assumes standard-k Lindhard at high E's, and an atomic-physics motivated roll-off below 1 keV, with a free parameter q (in same units as ) we chose to best match all contemporary data, 1.1 × 10 −5 or 10.5 eV. Only the min (blue) and max (green) k (which is uncertain and can range from 0.1-0.2 according to [28,35,134]) easiest to justify are depicted (Fig. 5d). k = 0.166 would be between, as would k = 0.14 from Lenardo et al., best fit to data as of 2014 [144].
Any NEST version ≥2.0.1 has a power fit of 11E 1.1 (red, Fig. 5d) matching Eqn. 6 with rounding, but more importantly different because of considering not just raw yields but log 10 (S2 c /S1 c ) band means, and giving greater weight to the lowest-E data, of greatest importance to a DM search [139]. (Previously 12.6E 1.05 , before all statistical and systematic uncertainties were properly considered.) The dash purple line of Fig. 5d is not just the power law but the full NEST that also since v2.0.1 to today includes sigmoidal corrections, separate in both L y and Q y , to allow for modeling of NR violating strict macroscopic anti-correlation. These cause the N ph and N e− to realistically drop below the power law, conservatively accounting for non-Lindhard-like behavior below a few keV, and better matching data in this regime such as from the D-D calibration of LUX's second science run [122]. Nevertheless, all models agree very well at low energies, all of them extrapolating at 200 eV to 1-2 quanta (see inset). Below 0.2 keV, NEST conservatively assumes 0 quanta, justifiable from first principles [15,28,48,132].
For greater readability several models have been omitted from Figs. 5d-e, which do not include the work of Sarkis, Aguilar-Arevalo, and D'Olivo [28] nor of Wang and Mei [141]. This is not to say their approaches are not valuable, but the former is markedly similar to Sorensen [132] and to the complete NEST equation with a sub-keV roll-off despite starting with different formulae, while the latter is fit to the LUX data, and thus practically redundant with dashed green. In Fig. 5f we zoom in on energies < 5 keV only, of greatest interest not only for lower-mass WIMPs, but any rest mass-energy WIMP due to the falling-exponential nature of NR from WIMPs, in the Standard Halo Model [6]. The cut-off in quanta may not be as sharp as expected due to the Migdal Effect adding more light at least [145]. Below 3 keV 8 B solar neutrino CEνNS is of great significance as well, interesting in its own right for the first detection of the recently measured coherent scattering [146,147] but from solar neutrinos, and as a background to next-generation LXe-based WIMP experiments [1,148].
In following the ER section, the next step should be a discussion of the energy resolution as a function of energy. This has never been published for the combined scale with NR, however, to the best of our knowledge, except for plotting of the S1-and S2-only scales only once each as far as we know, by Plante [31,149] (but only at zero field, in a dissertation) and Verbus [27,140] respectively. In both situations these were only quasi-monoenergetic reconstructions, tagging neutron scatter angle with a Ge detector in the former case, as typical for all L e f f measurements, and in situ in the same Xe volume (LUX) in the latter. For NEST comparisons to both of these, see Figure 3 bottom in the LZ simulation paper [150]. We also point to potential examples of threshold bias "lifting" E's above correct values (or, Eddington bias, for continuous spectra), as manifesting in light yields [112,151,152] seen much earlier for ER in Fig. 2b but it is handled in later works [140,153] and thus not shown explicitly pre-correction.  Figure 6. (a.) Combined (blue) and optimal (yellow) scales for a 50 GeV standard WIMP spectrum, the MC truth for which is grey. Above 2 keV, bins are omitted in log fashion for clarity. LUX detector parameters (i.e. Run03, first WIMP search) used as earlier for similar ER plot, again for illustrative purposes (∼180 V/cm). Yellow corrects for sub-keV efficiency underestimation, and an overestimation at several keV, comparing yellow to grey and blue to grey. (These effects can change from detector to detector, with signal shape.) While a distinct functional form vis-à-vis ER, w 2 is again S-shaped. Better results may be possible with lower w 1 , kept fixed at 1.0 again here for simplicity. In actual data, with ER and NR mixed, backgrounds with potential signals, it is impossible to know a priori if applying keV nr or ee is more appropriate by event.
Lacking truly monoenergetic peaks, continuous spectra present an opportunity still for contrasting the S1, S2, combined, and optimized-E scales. Those from all known n sources are highly dependent on geometry, however; thus NEST is insufficient: a full-fledged Geant4 MC [154,155] would be required to model detectors. Instead, an example of a 50 GeV/c 2 mass WIMP will be shown with an unrealistically large cross-section of interaction (1 pb, and 1 kg-day exposure). While artificial, this illustrative example is valid given underlying assumptions for not just total quanta but individual photons and electrons, and resolution in both channels, verified by NEST comparison with data elsewhere [32,34,144] and real experiments will be able to test the ideas presented in future NR calibrations, in XENONnT and LZ. Fig. 6 is a repeat of Fig. 4, but is for NR. The WIMP spectrum is more relevant than a uniform-E spectrum would be, as even for a massive (multi-TeV mass-scale) WIMP it is a poor approximation to what is inherently an exponential spectrum, falling as energy increases. The drop off on the left is of course caused as before by a combination of separate threshold effects that remove the lowest-energy S1s and S2s. The optimal scale shows improvement again over combined, but by itself combined E is not dissimilar from S1-or S2-only for NR, because the lower-quantum/area signals are dominated by detector specifics such as (binomial or Poisson) light collection efficiency. S1 only is most common, used in every experiment starting with the seminal XENON10 result [29] due not to a better energy reconstruction, but signal (NR) versus background (ER) discrimination [36,53]. S2 only may be as good for discrimination if not better however, according to the work of Arisaka, Ghag, Beltrame, et al. [156].

Liquid Xenon Summaries
The key points of the LXe (ER) section (with detector-specific caveats) are: • A combined scale reconstructs monoenergetic ER peaks best for DM/ν projects, but below 3 keV at least this is not true according to an 37 Ar study with S2-only best (outperforming S1 as well) if e − lifetime is high. A combination can be established with two numbers, S1 and S2 gains, leading to a 1D histogram (XENON/LUX style) or equivalently a 2D rotation angle (Conti/EXO method). • An optimal weighting of S1 and S2 can result in better resolution than simple combined energy, down to O(1) keV even, and mitigation of threshold bias and skew. Higher, the best resolution occurs when the weights applied to the S1 and S2 are 1/g 1 and 1/g 2 , but machine learning is likely to outperform analytic methods, if more parameters (beyond S1, S2) are considered. • For neutrinoless double-beta decay, O(1%) resolution has been achieved in the relevant energy range by a multitude of different experiments and technologies, while the best feasible may be 0.4-0.6%, in liquid, which may be limited by a Fano factor (often confused with recombination fluctuations) that is higher than in gas. But no one experiment has yet reached its full potential. • For a continuous ER spectrum, the combined scale is a clear winner over S1-only and S2-only alike, at least for a uniform energy distribution (uniform in neither S1 nor S2, as L y and Q y are functions of energy, not flat). But optimization with re-weighting is still possible, just in a different manner than done for monoenergetic peaks, because of cross-contamination between bins.
Next we summarize NR; there is good agreement on total yield from different experimentalists.
• While impossible to obtain from truly monoenergetic lines, a summation of separate N ph and N e− data sets results in strong evidence of NR anti-correlation akin to ER's and no statistically significant difference from Lindhard even sub-keV, at least given additional high-E quenching. • Despite the point above, the advantages of a combined scale are not significant compared to the S1-only default (but S2 comparable) as so much E is lost to heat (>80%) decreasing pulse areas. • An optimized combination scale, which corrects for order-of-magnitude discrepancies in efficiency below 1 keV, is still best, but likely requires fine-tuning by energy spectrum. It is also likely to be highly detector-dependent and only important after a WIMP discovery is made, to fit the mass and cross-section the most precisely. A uniform spectrum is a bad approximation in any case.

Low Energy: keV-scale (Dark Matter Backgrounds / Calibrations) Monoenergetic Peaks
For ER in LAr the best example of a low-energy calibration line is the 83m Kr peak, at 41.5 keV, commonly used to calibrate both LXe and LAr experiments, but in this latter case there exists no evidence of the yields depending on the separation time between the individual 32.1 and 9.4 keV peaks [157], unlike in LXe [75], so the MC comparison is easier. The 2.82 keV electron capture from 37 Ar has also been studied in LAr but most commonly at zero electric field in single-phase (liquid) detectors, making it a less ideal choice for complete NEST comparisons to S1, S2, and combined-E histograms [158]. For a WIMP search ER is again the main background, due to 39 Ar in DarkSide-50 (DS-50) at Gran Sasso [159], DEAP-3600 (and formerly CLEAN precursors) at SNOLab [160,161], and ArDM (at Canfranc). Underground Ar depleted in this isotope reduces the background, but it remains dominant [162]. In neutrino experiments, ER is the signal, via neutral-current, charged-current, and elastic-scattering interactions. In this section, we begin by focusing on dark matter at the keV scale, later moving on to cover the GeV scale more relevant for accelerator neutrino experiments like DUNE [163], with the intermediate scale at MeV also important for supernova or solar neutrinos [164,165].
Predictions of NEST's LAr ER model in comparison to experimental data are shown in Fig. 7. The source of data here is DarkSide (DS), specifically several PhD theses of its students [21][22][23]. For this experiment, g 1 = 0.18 ± 0.01 phe/photon [21], higher than within LXe detectors likely because of conversion of VUV photons into visible light using wavelength shifter followed by detection in high-QE visible-light SiPMs/MPPCs/APDs. Another source gives this as g 1 = 0.1856 ± 0.0007 stat ± 0.0008 syst [23].
To most closely center NEST with respect to DS data in Fig. 7, we set g 1 = 0.181 and g2 = 23.8, both well within the above uncertainties. Only one example electric field was studied, of 200 V/cm, comparable to the electric fields studied earlier for LXe. The work function or W q was assumed to be 19.3 eV, a value justified later when discussing reconstruction for neutrino detectors. Reconstructed energy is again keV ee as opposed to keV (without the subscript) for known individual energies.
Unlike in LXe, the S1-only energy scale does not necessarily perform the poorest. Our particular 83m Kr example has an energy resolution σ/µ of 6.5%, in Fig. 7a. The S1-based scale is more reliable in LAr due to its greater linearity, wherein the light yield for different betas and gamma rays is quite flat in energy, starting at 40-50 photons/keV and falling as the electric field rises and more energy goes into charge production [78,157,[169][170][171][172][173][174]. Next, in Fig. 7b the S2-only scale is plotted, leading to a resolution of 25%. Reproducing this large width was done artificially by setting the linear noise in the S2 channel to 25.0% (only 2.9% for S1 to match that width, an effectively negligible value). Compare this to 6.0% (S2, so high due to e − trains/tails/bursts [57]) and 1.4% (S1) assumed by default for LUX Run03 within NEST. We interpret the cause of the large value for S2 in LAr as stemming from a lack of full 3D corrections in the initial DS analysis. There is also a clear offset in skew compared to the MC which may come from DS-specific effects difficult to fully capture in a custom MC like NEST that does not account for all detector idiosyncrasies. No (statistical) errors are depicted as they are negligible from this high-statistics calibration. Fig. 7c shows a combined resolution of 7.5%, not improving on the S1-only resolution, but still a marked improvement over S2-only, suggestive of anti-correlation of charge and light in LAr (more robustly established later in this section). The agreement is even worse with MC here, however, than in the S2 plot, but this can be explained by the peak being from a different analysis from (a) and (b), with the Kr calibration sitting on top of a large background, from 39 Ar inside natural Ar, which we did not model. (But this is the only example of a DS combined-energy peak we were able to locate.) In the original source the peak was not centered on 41.5 keV ee , likely due to a systematic offset in g 1 and/or g 2 (see contradictory values above) but we had no difficulty in NEST with this. For ease of comparison for at least the width, 2.5 keV ee was added to data, to force alignment to a 41.5 keV(ee) mean [23]. Fig. 7d is the combined-E resolution vs. E. NEST is in blue. The data have full position corrections from high-statistics 83m Kr calibrations [21][22][23] as in LXe. They are most crucial for S2, in XY (or radius and angle) as Z is already handled by electron lifetime. NEST points for comparison are at semi-log steps in E = 2 − 5000 keV. While retaining the 3% S1 noise term from earlier, they have had the 25% S2 noise from (b,c) removed, so they can be effectively treated as close to the MC-predicted min possible, at least for the given (DS-50) combination of g 1 , g 2 , and E-field. The combined-E resolution drops from the 7.5% in (c) to below 5% in this case at 41.5 keV in (d). The empirical S2 noise term was thus likely accounting for imperfect correction for the (2D) position-dependent S2 light collection.
There is no equivalent of Fig. 2b for LAr, as no example of Eddington-like bias could be found. An opposite effect (lower instead of higher reconstructed E) is recorded in Figure 3.11 of [23]. However, this can easily be interpreted as E "loss" into charge when using only S1, which while more linear in LAr compared to LXe, is still not a fixed flat L y at all energies, not even at null electric field [158,175].

High Energies: The MeV and GeV Scales (Neutrino Physics)
In moving from the keV scale of DM experiments to the GeV scale of the accelerator neutrino experiments, we study the MeV scale as a boundary case. A 1 MeV beta was chosen as an example of an ER interaction energy just beyond what is considered relevant for a DM search, but on the other hand at the extremely low-energy end for neutrino physics, potentially just barely above threshold in an experiment Figure 7. Reconstruction of the 41.5 keV 83m Kr peak in DS [23] compared to NEST, with data as black dots, NEST grey squares. For clarity, no fits overlaid, although most data are ∼Gaussian. (a.) S1-only E scale most common for LAr-based DM detectors [22]. Our work/conclusions for such a scale should apply not only to TPCs where ionization e − 's are drifted but also to 0 V/cm 1-phase detectors, where these electrons fail to recombine to add to the S1 in LAr for both ER and NR [158,166,167] just as in LXe [32], with the L y having the same shape vs. E as for non-zero fields as field goes to 0 in TPCs (see Doke et al. as well as Wang and Mei for possible reasons [78,141,168,169]). (b.) S2-only, with the slight right-hand asymmetry nearly reproduced by NEST [21]. (c.) Combined-E scale, now standard at least in DS. Optimization as done in LXe could be possible, but not shown. Resolution is poorer than in (a) instead of better due to poor S2 resolution in (b) and that S2 is being combined with S1, on top of 39 Ar. This is explained in the text, as likely due to lack of XY correction creating noise not correlated with S1. For E's below 1 MeV like here this is not likely due to delta rays not being simulated by NEST by itself, as no S2 noise (and little in S1) is needed in the next plot (d) and S1 is not as wide as S2 in (a), while delta rays would affect both. Combined E, canceling them, should still be better in general in Ar as in Xe. (d.) Resolution vs. E for monoenergetic calibrations/backgrounds studied on DS [23]. Only one set of resolutions covering a broad E range was found (black) to compare to combined E from NEST (blue).
like DUNE [176]. This is of the same order of magnitude but just slightly above the 39 Ar beta spectrum endpoint (565 keV), and is also near the same energies (976 keV and 1.05 MeV internal-conversion electron and gamma ray, respectively, from 207 Bi) studied by Doke et al. in their seminal 1989-2002 papers [78,169].
Our own re-analysis shows that if one sums L y and Q y vs. electric field, it is a constant number of quanta/keV within experimental uncertainties, and consistent with 1 / W q , given reasonable assumptions on g 1 and g 2 . Table 2. Best (lowest) possible resolution of a 1 MeV electron recoil at 0.5 kV/cm as a function of the S1 photon detection efficiency g 1 and wire noise (in semi-log steps). Each entry is averaged over 10 4 simulations in stand-alone NEST, with the effect of delta rays approximated analytically (based on G4). All photons and electrons are included from the interaction, which is thus being treated as a single site. 1 MeV is also associated with a track length at the border of position resolution in an experiment like DUNE, < 1 cm on average, creating "blips" instead of obvious tracks, straddling the point-like interactions observed in DM detectors vs. the track-like nature of most interactions in LAr TPC, or LArTPC for short, neutrino experiments. Its low dE/dx is also at the cusp of the minimally ionizing regime, making a 1 MeV electron a MIP (Minimally Ionizing Particle) like electrons from GeV-scale neutrino interactions prior to showering [177]. In neutrino detectors there is no g 2 however, as electron charge is directly measured, as on EXO, by wire planes instead of S2. They are 1-phase liquid TPCs with unit gain for charge readout. Table 2 scans different levels of energy resolution associated with noise from the wire readout, showing the best (lowest) resolution for different values of g 1 . Combined-energy and even S1-only resolution appear in the table, for high g 1 paired with high wire noise.
Resolution in a DUNE-like detector can be halved already at g 1 = 0.02 i.e. 2%, for 10% wire noise. Such noise, starting even at 1%, comprises Q-only resolution almost entirely. For 0% wire noise, the 0.46% resolution is driven by excitation and recombination, both contributing binomial fluctuations (unlike in LXe). F q drives combined-E results. Its theoretical value 0.1 was assumed, given no other data [168]. For S1 only, a g 1 -based binomial drives what is possible. 0% noise was assumed for S1.
Unfortunately, large LArTPC neutrino experiments like DUNE will achieve much lower values of g 1 [178], though g 1 = 2% is already much lower than any Xe or Ar DM experiment has achieved. At higher levels of noise, still realistic for future experiments but energy-dependent, even lower g 1 suffices for making combined energy superior to Q-only. When reading this table from the top down, if the value starts changing this means that the minimum resolution being quoted is from the combination of scintillation and ionization, and no longer just the ionization. If reading across: when the values stop changing that means (at high g 1 ) an S1-only scale is best, as at higher levels of wire noise not only does the Q-only energy scale become unreliable, but the benefit in utilizing the anti-correlation of charge and light washes out for the combined scale, leaving the S1-only scale.
The small-scale LArTPC R&D detector, LArIAT (Liquid Argon In A Test beam), has investigated the claim that the combined-energy scale, making use of both ionization charge and scintillation light, is more precise [179,180]. While not targeting the removal of the wire noise, it does effectively cancel out the exciton-ion and recombination fluctuations, shown using a sample of Michel electrons (from muon decay) at a scale of tens of MeV. Here, the definition of combination is more appropriately set for neutrino interactions using differential energy loss along the particle track, updating our earlier equations, starting with (1) but with L = 1 assumed as for all ER: where the number of photons N ph and number of electrons N e− are replaced for the first step by L y and Q y , specific yields per unit energy, each multiplied by energy. Next, to align our terminology with what is more common in the neutrino field we rewrite L y as dS/dE and Q y as dQ/dE [181] (instead of N ph /E and N e− /E, with S and Q standing respectively for scintillation and charge, the quantum values). S is used for the scintillation light instead of L to avoid confusion with Lindhard factor. The resultant formula above is still not what is most commonly used in the field. Instead, that is: This is equivalent to the S2-only scale for two-phase TPCs in Eqn. (3) where E = N e− /Q y (E, E ), except divided by dx and g 2 = 1, given no need for extraction from liquid to gas, nor any photons created by electrons as a secondary process in gas (S2 i.e. electroluminescence). Although, the electron lifetime must still be high, ideally much larger than the full drift time across a TPC, and also well-measured, so that the Q can be corrected in the same way as S2. Q y (N e− /E) is often parameterized with Birks' Law [32,175,182] in terms of dE/dx instead of E: where W i (sometimes called W e ) is not the same as the work function W q defined much earlier, but trivially related. Being defined as E/N i , not E/(N ex + N i ), the convention of ionization W is related to the overall or total work function by W i = (1 + N ex /N i )W q . (N ex /N i are excitons/ions). For LAr, this means W i = (1 + 0.21)W q = 1.21 * (19.5 eV) = 23.6 eV, approximately [78,183]. Note LAr's W q has been labeled W max ph [78], as it is related to the maximum possible L y , when Q y = 0, but this is not possible even at 0 V/cm (it does become possible as ionization density from interactions goes to ∞ and forces recombination in both LXe and LAr [78,184]). Drift electric field is E while k B is known as Birks' constant, and A is the correction factor explained later. Q 0 is effectively the maximum possible charge, at infinite E , defined as E/W i or N i . While Birks is not the only possible parameterization (e.g. there is also the Thomas-Imel box model [185,186]) the focus of this work is on energy reconstruction not the various microphysics models. We take this to be only one, representative example here.
Another approach, taking into account not just excitation vs. ionization, but e − -ion recombination, begins the same as for LXe [35,36,144], and LAr experiments used for DM instead of neutrinos: Here r refers to the recombination probability and depends not only on energy and electric field but also the particle or interaction type, N ex is the number of excited atoms initially produced, and N i the number of e − -ion pairs. In this case, N ph + N e− is also N ex + N i . Applying the same revision to Birks' Law suggested in 2011 by Szydagis et al. for Xe [32] to Ar recasts it in terms of first principles: with k(E ) = k B /E only one possible parameterization of recombination's E-field dependence, and in turn charge and light yields, with a more general negative power law possible [32,187]. Only the A from ICARUS [188] is lacking in terms of robust justification, but it is likely a correction needed only when the secondary particle production range cut in the Geant simulation is set too high to allow for delta-ray formation down to keV-scale energies. Delta rays are lower-energy, higher-dE/dx tracks with greater recombination, hence more light, at the expense of charge, explaining why A < 1. Increased simulation time and memory usage associated with lowering the secondary particle production range cut often lead this cut to be set too high in Geant simulations to capture this effect. In Fig. 8a the number of ionization electrons produced from 1 MeV primary electrons is plotted versus both the Geant4 secondary production range cut ("length scale factor") as well as the associated energy threshold for delta rays. The ratio between the values of the two plateaus at the left and right extremes in Q is almost exactly equal to the ICARUS best-fit value of the renormalization constant A (0.800 ± 0.003 in Amoruso et al. 2003 [188]). The default secondary production range cut of 0.7 mm used in the LArSoft (Geant4) simulation allows for only ∼270+ keV delta rays to form [189].  [190]. The default threshold is typically too high by >10 times, in length and/or E, in neutrino simulations [189]. (b.) Demonstration of S versus Q anti-correlation from our reanalysis of Doke et al. 1988 and2002 [78,169], confirmed by the recent LArIAT study [180]. 1 MeV ER at different E-fields, as opposed to many E's at one field (as done for a contemporary "Doke plot"). (c.) Noiseless charge-only resolution vs. fixed electron E using Geant4, with G4 secondary production range cuts of 1 µm (blue) and 0.7 mm (black, in LArSoft), compared to data points at isolated E's on recombination fluctuations with delta rays from [186] (red) and [191] (green, no measurement uncertainty provided). Fig. 8b (upper right) shows that the lower plateau in the large left plot (a) in its lower left corner is closer to correct. The zero-electric-field light yield for a 1 MeV beta, or other electron, is approximately 41 photons/keV [169] while the red line in (b) shows a reduction to ∼0.6 of that value by 500 V/cm, to 24.6 photons/keV. If the work function is 19.5 ± 1.0 eV, as reported also by Doke, this implies a total (S and Q summed) of 51.3 +2.8 −2.5 quanta/keV (an S value different than 41, potentially higher, only strengthens our following arguments by lowering Q). The charge yield is total minus light, leading us to 51.3 − 24.6 = 26.7 +2.8 −2.5 e − /keV. The NEST (2012) value is thus very much within the error envelope of actual data for the lower level, but outside it for the higher one, which also never quite flattens (Fig. 8a  upper right). Including more, relevant measurement uncertainties does not sufficiently explain this discrepancy. A similar conclusion can be reached by converting relative charge (green curve in Fig. 8b) to absolute, pointing again toward the lower value of Q being the more correct one. Fig. 8c demonstrates fully accounting for delta rays is also important for correctly predicting ionization-only E resolution for primary electrons at these energies (∼1 MeV).
The 2012-2013 version of the NEST LAr ER model was a combination of the Birks-Law and Box (Thomas-Imel) models of recombination, with LET < 25 (MeV·cm 2 )/g driven primarily by Birks, as given by Eqn. 13 (with the Thomas-Imel box still called for any accompanying delta rays). There was no need for renormalization as in Eqn. 11 (so A = 1) due to using a significantly smaller secondary production range cut in Geant4, and the Birks constant electric field dependence was k(E ) = 0.07/E 0.85 (c f . Amoruso's k(E ) = (0.0486 ± 0.0006)/E ). The different power on the electric field dependence, less than 1, on top of the different constant within the numerator is likely due to using Birks to model the recombination directly instead of R = 1 − r or Q Q 0 . In addition, it was based on higher-LET dark matter detector data and extrapolated lower. This is also nearly identical to Equation 8 from Obodovskiy's comprehensive report [187].
One point of confusion to address is the most common quantity used for data/MC comparison in the neutrino field, the recombination factor, R. It is actually an "escape" factor for electrons from ionized atoms: [183]. See the NEST 2013 comparison to ICARUS data (from both muons and protons mixed) in Fig. 9, where a good match is observed. This is a different NEST version from the dark matter experiment (DarkSide) comparisons earlier at lower energies (higher dE/dx) because at time of writing the ER model for LAr in the latest NEST version has not been recast yet into a dE/dx basis for a robust comparison [192,193] assuming a typical LAr density of 1.4 g/cm 3 . Minor discrepancies are observed at low dE/dx, which will be addressed with planned improvements to the NEST LAr ER model in the near future.
The corrections discussed here are important not only for energy reconstruction considerations but also for background discrimination, such as for differentiating neutral pions, photons, and e + e − pair production from single electrons which comprise the signal of interest for accelerator neutrino experiments. This particle identification can be carried out with the use of dE/dx by first measuring dQ/dx [194]. Potentially one can add in dS/dx for a low-enough energy threshold, if the g 1 is high enough, as discussed earlier. The exclusion of delta rays common in typical Geant4 simulations for LArTPC neutrino experiments affects not only the mean yields, which can still be simulated accurately with a 20% correction almost independently of electric field and LET, but also resolution: delta rays will degrade the energy resolution and require more complicated corrections if they are deactivated in Monte Carlo simulations (in Fig. 8c). Incorrectly-modeled energy, and thus dE/dx, resolution can lead to potential biases between data and simulation when using reconstructed dE/dx for particle identification: electron/photon discrimination in LArTPC neutrino experiments for example.

Liquid Argon Nuclear Recoil (Dark Matter Signal, and CEvNS)
The final topic is low-energy (keV-scale) nuclear recoil or NR within LAr, which is important not only for dark matter in experiments such as DS [195] but also for CEνNS on collaborations like COHERENT [196]. The sum of quanta in data is again fit well by a power law, combining all global data on total yields, related to L(E) from the beginning, Eqn. 1.
The fit in Fig. 10 is quite similar to that from LXe earlier in Fig. 5 in both the base and exponent. Here we plot total quanta/keV instead of just quanta for greater clarity: in terms of quanta the power would be  Fig. 10a, as most LAr data for DM traditionally were 0 V/cm (e.g. microCLEAN, DEAP) thus lacking N e− to add to N ph . L e f f data exist from many sources, collected in [161,[197][198][199] but these do not directly inform L, being just L y .
Lindhard works surprisingly well according to Fig. 10b, staying within 2σ for the power fit to data below 50 keV. Deviation below Lindhard at high E is easy to understand as bi-excitonic quenching again as per Hitachi, while (some) deviation at sub-keV is also understandable, as Lindhard's approach should break down at < 1 keV regardless. The total number of quanta per keV is remarkably flat from 1-300 keV, around 15 quanta/keV. Breaking the total down into light and charge, a similar behavior is observed as in LXe, with L y increasing in energy, while Q y decreases [175,200]. Confusion between L y and total yield has led to past claims Lindhard does not fit well [200,201]. Regardless of whether Lindhard fits well or does not in some energy regime, Fig. 10a is indicative of a combined-E scale being worthwhile (in place of S1 only) as in LXe, as argued both by DS [23] and LArIAT [180].
The final plot, Fig. 11, displays the breakdown into the S1 and S2 pulse areas for one example energy coming from the same quasi-monoenergetic technique cited earlier for table-top LXe yield measurements (for both ER and NR) relying upon coincidence between a noble-element detector and a second detector, typically Ge [31,65,175]. For NEST MC, the means of their values for the gains have been assumed: g 1 = 0.104 ± 0.006 PE/photon and g 2 = 3.1 ± 0.3 PE/e − . The drift electric field was 193 V/cm and sample energy 16.9 keV, selected due to being the lowest value for which the S1 and S2 histograms are displayed at an identical energy anywhere within the existing literature.
NEST seems to overestimate the L y , but a larger issue is at play: contradictory data exist for NR from over the past ∼two decades: some experiments showed the amount of scintillation staying flat, others decreasing, and several even increasing with decreasing recoil energy [161,197,198], physically possible if one adapts the logic from Bezrukov, Kahlhoefer, and Lindner from LXe on screening [133]. Those showing increase cannot be easily explained as caused by threshold bias, as several distinct sets of results agreed upon the increase [195,200]. Other hypotheses include different collection efficiencies, especially as wavelength shifters are generally used in LAr, and different ones [204], some of which may have efficiencies higher than 100% effectively due to creation of multiple visible photons per individual input photon in the extreme UV, though the data on the existence of this are also contradictory [205]. A related effect may be something in photo-sensors exposed to LAr analagous to the 2-PE effect seen in LXe. A compounding problem is the discrepancy for the zero-field L y for 57 Co (or 83m Kr) used as in LXe to set L e f f . Historically, it is stated as near 40 photons/keV [169] but more recent work, with g 1 known by Doke plot, suggests closer to 50, nearer the max (1/W q ) if Q y → 0, sensible for E → 0 [166] (though as explained earlier, unrecombined e − 's remain). Systematic uncertainty in L y at ∼20 keV may be as high as 40% (L e f f = 0.35, or 0.25) making the 20% shift (x0.8) needed in Fig. 11a reasonable.
At time of writing, NEST's compromise solution to the contradiction is to follow the predictions of not just Lindhard but all existing first-principles models, as well as the most recent data (ARIS) suggesting a flat-ish L y building into a mild increase with increasing NR energy, but at a higher level than the traditional ∼10 photons/keV (0.25 L e f f x40), better matching the claims of high L y at low E [175]. (Earlier versions would switch between differing, mutually exclusive solutions.) Fig. 11b suggests an S2-only scale may ironically be more reliable for (NR-based) DM searches with LAr. Contradictions are fewer between data and different models, including NEST, even at other energies, and multi-ms e − lifetime may be easier to achieve through high-level purification [159,206]; however, a note of caution that this may be due to the paucity of non-zero-field data, for measuring Q y , at multiple E-fields. Columnar recombination, not currently simulated, which changes the yield depending on electric field orientation, is a further complication [200]. The combined-E scale essentially removes (fit to all data) Figure 10. (a.) The (E-field-independent) total number of quanta N q per keV for NR in liquid argon: L y + Q y , or (N ph + N e− )/E. The best-fit power law, used in the current NEST's NR model for LAr, is (11.10 ± 1.10)E 0.087±0.025 , surprisingly similar to LXe, with 1σ/2σ error bands in green/yellow. Kimura data [167] are given as fit to data in original paper; SCENE and ARIS points are taken from [200] and [175]. (b.) A model review, collected from [132,143,201] of Lindhard/Lindhard-like approaches. Solid lines assume 45 photons/keV for the (0 V/cm) L y of ER in LAr, with upper and lower dashes covering 50 and 40 respectively to span the uncertainty in light, before addition with Q y . (c.) As in (a, b) NEST repeated here, with two additional model comparisons, from PARIS (used by DS [202]) and [203].  Figure 11. (a.) S1 peak and (b.) S2 peak for 16.9 keV NR in LAr from Cao et al. [200]. Data are in black with errors, with original MC in red (fit borders for it as vertical blue lines) and NEST overlaid in grey over original paper plots: defaults solid, while altered to match data in dash. Large noise levels needed in NEST are comparable to those assumed by SCENE (R 1 and R 2 in legends imply S1 30%, S2 25%). the effect of these and any other recombination fluctuations, which cause the variance in the original numbers of photons and electrons to be identical "at birth" prior to any fluctuations due to propagation and/or detection [36,51]. Such a scale will also remove the effect of delta rays (ER) on the ultimate energy resolution achievable, if driven by anti-correlated fluctuations (recombination).
S2 is generally easier to measure, however, than the S1 is: electrons drift upward in a TPC along nearly-straight lines (slight diffusion occurs) from the liquid to the wires or to the wires followed by gas, where many photons are produced per electron. Losses due to the impurities along the drift length can be quantified simply, with an exponential. On the other hand, scintillation photons are produced in all directions and are affected by not only attenuation (not necessarily exponential, but difficult to quantify analytically) but geometric collection efficiency driven by reflection and refraction, and QE. As energy decreases, Q y increases for both NR and ER, for both LAr and LXe, at least down to the keV level before turning around, while L y appears to decrease toward 0. For this reason, NEST models are created using the total yield first, then charge yield, and light reconstructed by subtraction in the code. (The same is true for LXe.)

Liquid Argon Summaries
In conclusion, the key points of the LAr ER section, coupled to reasonable/common detector parameter values including/especially g 1 and E-field E , are: • A combined S1 + S2 scale continues to reconstruct ER energies best for DM/neutrino experiments, due to anti-correlation between channels, but not if g 1 is very low ( 1%) or g 2 very high (e.g., 2-phase TPC). An additional challenge is created by sitting on top of a continuous background like the beta decay of 39 Ar for combined energies, but noise in Q can make S1 more favorable. • dE/dx is more important than just E at the GeV scales of greatest relevance to neutrino projects and it is most commonly reconstructed utilizing dQ/dx (ignoring dS/dx). • A correction (∼0.8) must be inserted into the simulation of charge yields for use in the traditional Q-only scale, lowering the Q that is output, if the delta-ray production threshold is set above the e − -ion thermalization radius O(1 µm) in MC. Energy resolution may also be affected, not just mean yields, and high-energy, low-dE/dx (MIP) interactions are not immune to this problem due to secondary particle production, handled with e.g. Geant4. • Due to differences in delta rays and other secondaries, an analytical fit may be impossible across all particle types, leading to different recombination probabilities even if you consider only the averages versus dE/dx or energy.
• Either escape probability or recombination can be modeled as a function of the dE/dx (or the LET, which includes the effects of density).
Next, we summarize the key points of the LAr NR section. We note that because yields change slowly with increasing field especially for the dense tracks of NR that our S1 examples from two-phase TPCs (DS, ARIS, SCENE) should be relevant/applicable to 0 V/cm single-phase detectors as well.
• While possible to measure for only approximately monoenergetic peaks, a summation of the few available N ph plus N e− data sets results in evidence for NR anti-correlation (akin to ER's) and modest agreement with Lindhard. This is important for both DM and CEνNS. • Due to uncertainty in the scintillation yield, an S2-only scale may be beneficial, but exploration of combined E may still be interesting in the future (as stated above). Non-zero-field measurements are not as plentiful for charge yields as zero-field light-only ones for NR in liquid argon.

Discussion and General Conclusions
We have reviewed mainly LXe and LAr, in 2-phase TPCs, and manage to extract insights spanning dark matter and neutrinos, proving once again the remarkable consistency obtained across numerous data sets, and the utility of NEST to probe at least the simple reconstruction methods reviewed. From an historical perspective, it is intriguing that the usage of noble elements in the DM field began with scintillation-only E measurements, with charge primarily used for position reconstruction, while for neutrinos the opposite occurred: ionization-only E scale, with initial scintillation used as a trigger for event activity of interest. Moving forward in both fields, it will be interesting to see how charge and light are combined together to improve E measurements further than what has already been achieved.
The new insights we have gleaned from our review and meta-analyses with our own simulations, based on the global, cross-experiment framework of NEST, first for liquid xenon, include: 6. A comprehensive compilation of all existing data and models for NR in terms of total yield not just light, beyond 0 V/cm.

Funding:
The research of Prof. Levy and Szydagis at the University at Albany SUNY (the State University of New York) was funded by the U.S. Department of Energy (DOE) under grant number DE-SC0015535. Prof. Mooney and his students, Alex Flesher and Justin Mueller, were funded through university start-up funds. Ms. Kozlova was funded through the Russian Science Foundation (contract number 18-12-00135) and Russian Foundation for Basic Research (projs. 20-02-00670a).