Article

Estimating Relative Uncertainty of Radiative Transition Rates

Atomic Spectroscopy Group, Quantum Measurements Laboratory, National Institute of Standards and Technology (ret.); Gaithersburg, MD 20899-8422, USA
Atoms 2014, 2(4), 382-390; https://doi.org/10.3390/atoms2040382
Received: 31 January 2014 / Revised: 16 September 2014 / Accepted: 9 October 2014 / Published: 25 November 2014

Abstract

We consider a method to estimate relative uncertainties of radiative transition rates in an atomic spectrum. Few of these many transitions have had their rates determined by more than two reference-quality sources. One could estimate uncertainties for each transition, but analyses with only one degree of freedom are generally fraught with difficulties. We pursue a way to empirically combine the limited uncertainty information in each of the many transitions. We “pool” a dimensionless measure of relative dispersion, the “Coefficient of Variation of the mean,” \(C_{V}^{n} \equiv s/(\bar{x}\sqrt{n})\). Here, for each transition rate, “s” is the standard deviation, and “\(\bar{x}\)” is the mean of “n” independent data sources. (\(C_{V}^{n}\) is bounded by zero and one whenever the determined quantity is intrinsically positive.) We scatter-plot the \(C_{V}^{n}\) as a function of the “line strength” (here a more useful radiative transition rate than transition probability). We find a curve through comparable \(C_{V}^{n}\)s that envelops a specified percentage of the \(C_{V}^{n}\)s (e.g., 95%). We take this curve to represent the expanded relative uncertainty of the mean. The method is most advantageous when the number of determined transition rates is large while the number of independent determinations per transition is small. The transition rate data of Na III serve as an example.
Keywords: line strengths; normalized; relative uncertainty; transition probabilities
Our goal is to optimize uncertainty estimates for published radiative transition rates by “pooling” relative uncertainties of these rates within a spectrum. Most evaluated transition rates have been accurately determined by two or fewer sources (experimental and/or theoretical). In such cases it is generally difficult to make useful distribution-based estimates of uncertainty for individual transitions. Fortunately, there are many transitions in any given atomic spectrum. Here we propose a graphical method for empirically estimating relative uncertainties by “pooling” relative standard deviations of transition rates within a spectrum.
As we shall see, this “heterogeneous” pooling can result not only in improved uncertainty estimates of transition rates, but also in detecting outliers and interpolating transition rate uncertainties that have only one reference-quality determination. First we must find an appropriate measure of relative dispersion.
Consider a given transition with an ensemble \(\{x_i\}\) of “n” independent determinations of radiative line strength. Each ensemble has a mean “\(\bar{x}\)” and unbiased standard deviation “s”. The Coefficient of Variation, \(C_V\), is defined as:
\[ C_V \equiv \frac{s}{\bar{x}} \qquad (\bar{x} \neq 0) \tag{1} \]
For our purposes, it will be more useful to consider a slightly modified quantity.
Consider a large-n population \(\{x_i\}\). Say we take different samples “j” from this population. The central limit theorem holds that the means of such samples, \(\{\bar{x}_j\}\), will have a standard deviation of approximately \(s/\sqrt{n}\), the standard deviation of the mean. Here, \(\bar{x}_j = \sum_i x_{ij}/n_j\) is the jth sample mean and \(s_j^2 = \sum_i (x_{ij} - \bar{x}_j)^2/(n_j - 1)\). Therefore, as a matter of nomenclature, we refer to:
\[ C_V^n \equiv \frac{s}{\sqrt{n}\,\bar{x}} = \frac{C_V}{\sqrt{n}} \tag{2} \]
as the Coefficient of Variation of the mean. The properties of \(C_V^n\) are discussed in Kelleher [1]. That article presents the required proofs and considers in detail the advantages and limitations of heterogeneous pooling.
For a population \(\{x_i\}\) of size n, \(C_V^n\) has the useful limit (with condition):
\[ C_V^n \in [0, 1] \quad \text{if} \quad \{x_i > 0\} . \tag{3} \]
That is, the coefficient of variation of the mean is intrinsically bounded by zero and one if all determined quantities (such as line strengths (S)) are positive. This normalization property of \(C_V^n\) is useful. One knows automatically the quantitative significance of \(C_V^n\) with regard to one. Also, in Equation (3), the upper limit is independent of n. This makes it feasible to pool multiple sources with different n values.
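For each transition rate, when n = 2:
\[ C_V^{n=2} = \frac{x_> - x_<}{x_> + x_<} \tag{4a} \]
where \(x_>\) and \(x_<\) are the larger and smaller of the two determinations. Note that the upper bound on Equation (4a) is 1 whenever both determinations are positive, consistent with Equation (3).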
In general, for each transition rate j having \(n_j\) independent determinations, and mean \(\bar{x}_j\), the Coefficient of Variation of the mean is:
\[ C_V^n(j) = \frac{1}{\bar{x}_j} \left[ \frac{\sum_i (x_{ij} - \bar{x}_j)^2}{n_j (n_j - 1)} \right]^{1/2} \tag{4b} \]
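As a minimal sketch of Equations (4a) and (4b), the following Python fragment computes the Coefficient of Variation of the mean for one transition and checks that the general expression reduces to the two-source closed form. The function names and the numerical values are illustrative assumptions, not part of the original work.

```python
import math

def cv_of_mean(values):
    """Coefficient of Variation of the mean, Equation (4b), for one transition."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two independent determinations")
    mean = sum(values) / n
    if mean == 0:
        raise ValueError("mean must be non-zero")
    # Sum of squared deviations over n(n-1) is the squared standard
    # deviation of the mean; dividing its root by the mean gives CVn.
    ss = sum((x - mean) ** 2 for x in values)
    return math.sqrt(ss / (n * (n - 1))) / mean

def cv_of_mean_pair(a, b):
    """Closed form for n = 2, Equation (4a)."""
    hi, lo = max(a, b), min(a, b)
    return (hi - lo) / (hi + lo)

# Hypothetical line strengths from two sources (illustrative numbers only).
pair = [0.12, 0.10]
print(cv_of_mean(pair), cv_of_mean_pair(*pair))
```

For positive inputs both functions return a value between zero and one, which is the normalization property stated in Equation (3).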
As for a specific transition rate, we choose to pool the variation of the line strength (S), rather than transition probabilities or oscillator strengths. This is necessary because S is proportional to the radial matrix element only. Thus, transition energies and statistical weights do not enter (the former can be quite large). This allows us to pool transitions with different transition energies and statistical weights. (The term “transition rate” is used loosely here to refer collectively to the three radiative entities, S, A, and f. Each of the three can be converted to either of the others by using expressions in [2], for example.)
Below we describe the three steps of this empirical pooling method. We begin by scatter-plotting the \(C_V^n\) of different transitions of the same spectrum as a function of the logarithm of the mean line strength. Next, we fit a least-squares curve through the \(C_V^n\) data to determine the “slope” of the data. Finally, using this slope, we iteratively derive a second curve which envelops the specified fraction p of \(C_V^n\) data points. We take this envelope curve to represent the expanded relative uncertainty of the mean, \(U_{p+}^{\mathrm{rel}}(\log \bar{S})\). (The term “expanded” refers to the specified value of p (e.g., 95%); the “+” denotes the upper bound. To the extent possible, we use the terminology of the International Organization for Standardization [3]. This “GUM” does not consider the coefficient of variation.) We use \(T_{p+}\) to indicate the upper confidence bound of the Coefficient of Variation, and \(U_{p+}^{\mathrm{rel}}\) to designate the expanded relative uncertainty of the mean. Such plots can also illuminate systematic trends and outliers.
In Figure 1, we plot the \(C_V^n\) for 192 radiative transition rates of the spectral lines of doubly ionized sodium (Na III). These rates were calculated by two different theoretical approaches: the “Multi-configuration Hartree–Fock” method (Tachiev and Froese Fischer [4]), and the “Configuration Interaction” method (McPeake and Hibbert [5]). The \(C_V^n\) for each transition in this figure, given by Equation (4a), contains only one degree of freedom (n = 2, ν = 1). Only transitions from energy levels above 415,000 cm⁻¹ are included in this figure.
The two parameters of the lower fit curve are determined initially by least-squares fitting. Except in special cases, there is no a priori relationship between \(C_V^n\) and the sample mean. However, for \(C_V^n\) satisfying Equation (3), any fitting function \(\Phi_0\) must have asymptotic bounds of one and zero. We have chosen:
\[ \Phi_0(\bar{x}) = \frac{1}{2}\,\mathrm{erfc}\!\left[ \beta^{-1} \log_{10}\!\left( \frac{\bar{x}}{\bar{x}_{1/2}} \right) \right] . \tag{5} \]
See Figure A1 in Appendix A. Here erfc is the complementary error function, erfc = 1 − erf. An algorithm for computing erf is given in Appendix B. \(\bar{x}_{1/2}\) and β are fit parameters analogous to “intercept” and “slope”, respectively. \(\Phi_0(\bar{x})\) has asymptotic values of one and zero, consistent with Equation (3). This ad hoc fit curve has the same functional form as the complementary cumulative distribution function (one minus the cumulative distribution function) of the log-normal distribution.
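The behaviour of the fitting function can be checked numerically. The sketch below evaluates Equation (5) with Python's standard `math.erfc`; the parameter values are arbitrary illustrations rather than fitted results, and the function name `phi0` is an assumption made here.

```python
import math

def phi0(x_mean, x_half, beta):
    """Equation (5): 0.5 * erfc( log10(x_mean / x_half) / beta )."""
    return 0.5 * math.erfc(math.log10(x_mean / x_half) / beta)

# Illustrative parameters only (not fitted values from the article).
x_half, beta = 1.0e-3, 4.0

# Far below x_half the curve tends to one, at x_half it equals one half,
# and far above x_half it tends to zero.
for x in (1.0e-30, 1.0e-3, 1.0e20):
    print(f"{x:8.1e}  {phi0(x, x_half, beta):.6f}")
```

A larger β spreads the transition from one to zero over more decades of the mean, so β acts as a width in \(\log_{10}\bar{x}\).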
\[ C_V^n(j) \approx \frac{1}{2}\,\mathrm{erfc}\!\left[ \beta^{-1}[\mathrm{LS}]\, \log_{10}\!\left( \frac{\bar{S}_j}{\bar{x}_{1/2}[\mathrm{LS}]} \right) \right] \qquad \text{Least Squares (LS) fit to scatter-plot} \tag{6} \]
After the scatter-plot, the second step in our heterogeneous pooling procedure involves the determination of the parameters β[LS] and \(\bar{x}_{1/2}[\mathrm{LS}]\). We perform a Levenberg–Marquardt nonlinear LS fit for Equation (6) using the implementation in the “C” language by Press et al. [6]. This enables us to evaluate the two parameters for the LS curve (lower). (An equation like Equation (6) exists for each “Coefficient of Variation of the mean,” \(C_V^n(j)\), and line strength \(\bar{S}_j\). In Equation (6), the “≈” symbol is meant to indicate a LS “best-fit” for the two LS fit parameters.)
\[ C_V^n(j) < \frac{1}{2}\,\mathrm{erfc}\!\left[ \beta^{-1}[\mathrm{LS}]\, \log_{10}\!\left( \frac{\bar{S}_j}{\bar{x}_{1/2}[p]} \right) \right] \qquad \text{for } p\,\% \text{ of } C_V^n(j) \tag{7} \]
Finally, keeping this same fit value of β[LS], we estimate an expanded relative uncertainty of the mean, \(U_{p+}^{\mathrm{rel}}(\log \bar{S})\), by iterating \(\bar{x}_{1/2}[p]\) in Equation (7) to determine a second, higher curve such that a specified fraction p of the \(C_V^n\)s fall beneath it. (Using a different β for the LS and \(U_{p+}^{\mathrm{rel}}(\log \bar{S})\) curves would lead to the physically impossible consequence that the two curves would cross at some point.)
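The two fitting steps can be sketched as follows. This is an illustration under stated assumptions, not the implementation used for the figures: a coarse grid search stands in for the Levenberg–Marquardt fit, bisection on \(\log_{10}\bar{x}_{1/2}[p]\) stands in for the iteration, outlier weighting is omitted, and the data points are synthetic. All function names are hypothetical.

```python
import math

def phi0(s_mean, x_half, beta):
    """Equation (5) evaluated at the mean line strength."""
    return 0.5 * math.erfc(math.log10(s_mean / x_half) / beta)

def ls_fit(points, betas, log_x_halves):
    """Grid search for (beta, log10 x_half) minimizing squared residuals.
    A simple stand-in for a Levenberg-Marquardt fit of Equation (6)."""
    best = None
    for beta in betas:
        for lx in log_x_halves:
            x_half = 10.0 ** lx
            sse = sum((cv - phi0(s, x_half, beta)) ** 2 for s, cv in points)
            if best is None or sse < best[0]:
                best = (sse, beta, lx)
    return best[1], best[2]

def envelope_log_x_half(points, beta, p, lo=-40.0, hi=40.0, iters=60):
    """Bisect on log10 x_half, at fixed beta, until a fraction p of the
    points lies strictly below the curve of Equation (7)."""
    def covered(lx):
        below = sum(cv < phi0(s, 10.0 ** lx, beta) for s, cv in points)
        return below / len(points)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if covered(mid) >= p:
            hi = mid      # curve is high enough; try lowering it
        else:
            lo = mid      # too few points covered; raise the curve
    return hi

# Synthetic (mean line strength, CVn) pairs; purely illustrative, not Na III data.
points = []
for k in range(24):
    s = 10.0 ** (-4.0 + 0.25 * k)
    scatter = 0.5 + 0.05 * ((7 * k) % 10)   # deterministic factor in [0.5, 0.95]
    points.append((s, scatter * phi0(s, 1.0e-5, 4.0)))

betas = [2.0 + 0.25 * i for i in range(25)]
log_x_halves = [-8.0 + 0.1 * i for i in range(61)]
beta_ls, lx_ls = ls_fit(points, betas, log_x_halves)
lx_p = envelope_log_x_half(points, beta_ls, p=0.95)
print(beta_ls, 10.0 ** lx_ls, 10.0 ** lx_p)
```

Because raising \(\bar{x}_{1/2}\) at fixed β lifts the curve everywhere, the covered fraction is monotonic in \(\log_{10}\bar{x}_{1/2}\), which is what makes a one-dimensional search sufficient for the envelope step.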
Figure 1. The mean line strength (\(\bar{S}\)) vs. the Coefficient of Variation of the mean, \(C_V^n\), for radiative transition rates in Na III for which the energy of the upper level is greater than 415,000 cm⁻¹. Seven outliers, three of which are off the scale, were given very small weights. Because only two data sources are used, the \(C_V^n\) for each transition is given by Equation (4a).
This empirical method for “covering” any specified percentage of \(C_V^n\) does not require random data or the existence of any particular pdf. It does depend on the validity of heterogeneously pooling the \(C_V^n\) of different line strengths within the spectrum. We take the curve \(U_{p+}^{\mathrm{rel}}\) to be an empirical analog to the upper confidence bound of the mean for the normally-distributed Coefficient of Variation of the mean, \(T_{p+}/\sqrt{n}\), with the distinction that \(U_{p+}^{\mathrm{rel}}\) incorporates, to some level of approximation, the effects of both random and systematic errors. (A website by Verrill [7] calculates \(T_{p+}\) for Normal and log-Normal distributions.) LS fit curves can generally be determined more precisely than, for example, 95% envelope curves, because the latter are heavily influenced by the small percentage of points lying outside them. If relative uncertainties for any “outliers” are assigned, this should be done independently of the heterogeneous pooling considered here.
This heterogeneous pooling method has proven effective for critical evaluations of transition rates in recent NIST compilations of more than 90 different atomic and ionic spectra, for which we use p = 0.90. (See, for example, Kelleher and Podobedova [2].) We apply a correspondence table between the value on the envelope curve, \(U_{p+}^{\mathrm{rel}}(\log \bar{S})\), and the letter-grade given in the compilation. We also interpolate this envelope curve when only one data source is available (n = 1) for a given transition. (The number of degrees of freedom vanishes for \(C_V^n\), but not for the curve \(U_{p+}^{\mathrm{rel}}(\log \bar{S})\).)
Thus far we have described a method for estimating the envelope curve that represents the expanded relative uncertainty of the mean as a function of the mean line strength, \(U_{p+}^{\mathrm{rel}}(\log \bar{S})\). The key assumption in the heterogeneous pooling of the coefficient of variation of the mean (relative standard deviations of the mean) is that pooled \(C_V^n\)s are comparable. Below we consider ways to discern whether any \(C_V^n\) members are not appropriate contributors to \(U_{p+}^{\mathrm{rel}}(\log \bar{S})\). They may belong to a separate subgroup or they may be outliers. The latter usually appear as isolated points on a \(C_V^n\) plot.
Comparison of Figure 2 with Figure 3 illustrates the importance of pooling comparable \(C_V^n\)s. The data in Figure 3 are limited to transitions from lower-lying levels, for which the energy of the upper level is less than 415,000 cm⁻¹. These levels belong to the same Na III spectrum, but are more widely spaced than the upper levels of the Figure 2 transitions. Thus level “mixing” is less for transitions from lower-lying levels. This results in smaller computational discrepancies between the two methods. If we had not separated out the data for Figure 3, we would have overestimated the relative uncertainty for these data.
Figure 2. The \(C_V^n\) data points are the same as in Figure 1, to which two curves have been added. Seven outliers, three of which are off the scale, were given small weights. Because only two data sources were used, the \(C_V^n\) for each transition is given by Equation (4a). For the lower curve, parameters β[LS] and \(\bar{x}_{1/2}[\mathrm{LS}]\) in Equation (6) were evaluated by a LS fit, and β[LS] was found to be 4.2 (“slope”). Keeping the same β, \(\bar{x}_{1/2}[p]\) was then adjusted in Equation (7) until 95% of the points (excluding outliers) lie under the upper curve, for which \(\bar{x}_{1/2}[p] = 5.1 \times 10^{-5}\) (“intercept”). Compare to Figure 3.
Figure 3. Coefficient of Variation of the mean (\(C_V^n\)) for radiative transitions in Na III in which the energy of the upper level is less than 415,000 cm⁻¹, in contrast to Figure 1 and Figure 2. One value was weighted as an outlier. Each point is the \(C_V^n\) for a different radiative transition. These data were taken from the same sources as in Figure 2, and the scales are the same. Ninety-five percent of the points lie beneath the upper curve, for which β = 13.8 and \(\bar{x}_{1/2} = 2.3 \times 10^{-19}\).
Subsets of \(C_V^n\) can sometimes lie significantly below the envelope curve, \(U_{p+}^{\mathrm{rel}}(\log \bar{S})\). It may be that some of these are best treated separately from heterogeneous pooling, if justified by expert knowledge or self-consistency of \(C_V^n\) within the subset. We can test for subgroups of \(C_V^n\)s by computing for each transition rate the ratio of its \(C_V^n\) to the envelope curve (\(C_V^n / U_{p+}^{\mathrm{rel}}\)), and sorting the ratio.
In the curve fitting, all \(C_V^n\) values are given a weight of one except those which lie more than a factor of three above or below the \(U_{p+}^{\mathrm{rel}}(\log \bar{S})\) envelope curve. Such “outliers” are given a weight inversely proportional to the square of their distance from this envelope curve, and thus have negligible influence on the fit. Distant outliers can destabilize a LS solution.
\(C_V^n\)s that lie above the pooled envelope (but do not qualify as outliers) can also be identified by inspection of the \(C_V^n\) plot, such as in Figure 2 and Figure 3. For the fraction 1 − p of \(C_V^n\) values that lie above the envelope curve, we approximate their relative uncertainties by equating them to their individual \(C_V^n\)s.
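The assignment rules described above can be summarized in a short sketch. It assumes that a transition lying beneath the envelope takes the envelope value, that a transition above it takes its own \(C_V^n\), and that a single-source transition (n = 1) is read off the envelope; outliers are meant to be handled separately and are not treated here. The function names are hypothetical, the envelope parameters are those quoted in the Figure 2 caption, and the data points are invented for illustration.

```python
import math

def phi0(s_mean, x_half, beta):
    """Equation (5) evaluated at the mean line strength."""
    return 0.5 * math.erfc(math.log10(s_mean / x_half) / beta)

def relative_uncertainty(s_mean, cv_n, x_half_p, beta):
    """Envelope value, unless the point's own CVn lies above the envelope.
    cv_n=None marks a transition with only one source (n = 1)."""
    envelope = phi0(s_mean, x_half_p, beta)
    if cv_n is None:
        return envelope
    return max(envelope, cv_n)

def sorted_ratios(points, x_half_p, beta):
    """Ratio CVn / envelope for each transition, sorted ascending, as a
    simple aid for spotting subgroups well below the envelope."""
    return sorted(cv / phi0(s, x_half_p, beta) for s, cv in points)

beta, x_half_p = 4.2, 5.1e-5            # values quoted in the Figure 2 caption
points = [(1.0, 0.02), (0.1, 0.30), (0.01, 0.05)]   # invented (S_mean, CVn) pairs
for s, cv in points:
    print(s, cv, relative_uncertainty(s, cv, x_half_p, beta))
print(sorted_ratios(points, x_half_p, beta))
```

Ratios clustered far below one would suggest a subgroup that might be pooled separately; ratios above one mark the transitions that keep their individual \(C_V^n\).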
In summary, we describe a pooling method to estimate expanded relative uncertainties of the mean for atomic transition rates. We consider cases where a small number of independent determinations have been made for a substantial number of transition rates from the same spectrum. Let \(C_V^n \equiv s/(\bar{x}\sqrt{n})\) where, for each transition, s is the standard deviation and \(\bar{x}\) is the mean of n independent determinations of the transition rate. Scatter-plotting \(C_V^n\)s from these sources vs. \(\log(\bar{S})\) typically enables one to heterogeneously pool them. By finding a curve which covers a specified fraction “p” (e.g., 95%) of the \(C_V^n\), one estimates the expanded relative uncertainty of the mean, \(U_{p+}^{\mathrm{rel}}\), as a function of \(\log \bar{S}\). It includes random and systematic errors. The \(U_{p+}^{\mathrm{rel}}(\log \bar{S})\) envelope curve can also facilitate the detection of outliers and interpolation of relative uncertainties for which only one transition strength has been determined. Estimating relative uncertainties for theoretical transition rates of Na III serves as an example of the method.

Acknowledgments

The author gratefully acknowledges vital contributions by Mark Vangel and William Guthrie.

Appendix A:

Figure A1 displays Equation (5) as a function of \(\bar{x}/\bar{x}_{1/2}\) for seven values of β ranging from 1.5 to 10. The only required properties of Equation (5) are that it has the appropriate asymptotes and has a weak monotonic dependence on \(\bar{x}_{1/2}\). Except for poor data, smaller \(\Phi_0(\bar{x})\) prevail (i.e., \(C_V^n \lesssim 0.3\)). \(\Phi_0\) has proven robust in fitting a wide range of different data types, including those that span essentially the entire \(C_V^n\) range of 0 to 1. Other functional forms may work better for different data sets. The whole procedure could also be performed graphically, without the explicit use of a parametric fitting function.
Figure A1. The fitting function \(\Phi_0(\bar{x})\), Equation (5), for the Coefficient of Variation of the mean \(C_V^n\) as a function of \(\bar{x}/\bar{x}_{1/2}\) for seven values of β. The mean is represented by \(\bar{x}\), and \(\bar{x}_{1/2}\) is the value of the mean for which \(C_V^n = 0.5\).
Anomalous cases can arise, for example, when the \(C_V^n\) does not approach zero at asymptotically large values of \(\bar{x}\). In some cases we add a term that approaches the constant γ as \(\Phi_0(\bar{x})\) approaches zero (large \(\bar{x}\)) and vanishes as \(\Phi_0(\bar{x})\) approaches one, such as:
\[ \Phi(\bar{x}) = \Phi_0(\bar{x}) + \gamma\,[1 - \Phi_0(\bar{x})] . \tag{8} \]
We recommend using Equation (8) only when the data clearly indicate an asymptote greater than zero. This can happen when one data set (usually inferior) lies consistently above the others. Such a data set is unlikely to be a good candidate for pooling. One should always plot the individual candidates for pooling, to check against systematic differences.
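A brief numerical check of Equation (8), written as a sketch with arbitrary parameter values (γ = 0.1 is an assumption chosen only for illustration), shows that the modified curve still tends to one at small means but levels off at γ rather than zero at large means.

```python
import math

def phi0(x_mean, x_half, beta):
    """Equation (5)."""
    return 0.5 * math.erfc(math.log10(x_mean / x_half) / beta)

def phi(x_mean, x_half, beta, gamma):
    """Equation (8): Phi0 plus a floor term gamma * (1 - Phi0)."""
    p0 = phi0(x_mean, x_half, beta)
    return p0 + gamma * (1.0 - p0)

# Illustrative parameters only.
x_half, beta, gamma = 1.0e-3, 4.0, 0.1
for x in (1.0e-40, 1.0e-3, 1.0e30):
    print(f"{x:8.1e}  {phi(x, x_half, beta, gamma):.6f}")
```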

Appendix B:

Error Function:
The error function is defined as:
\[ \mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\,dt . \]
An efficient numerical recipe, accurate to \(1.5 \times 10^{-7}\), is given by Hastings [8] (slightly modified here). For z ≥ 0:
\[ \mathrm{erf}(z) = 1 - p\,\exp(-z^2), \]
where:
\[ p = f\,(0.254829592 + f\,(-0.284496736 + f\,(1.421413742 + f\,(-1.453152027 + 1.061405429\,f)))), \]
and
\[ f = \frac{1}{1 + 0.3275911\,|z|} . \]
Also, if z < 0, erf(z) = −erf(|z|).
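The recipe translates directly into a few lines of code. The sketch below uses the coefficients exactly as printed above and compares the result with Python's built-in `math.erf`; the function name `erf_hastings` is an assumption made here.

```python
import math

def erf_hastings(z):
    """Polynomial approximation to erf(z) using the coefficients given above."""
    f = 1.0 / (1.0 + 0.3275911 * abs(z))
    # Horner evaluation of the fifth-degree polynomial in f.
    p = f * (0.254829592 + f * (-0.284496736 + f * (1.421413742
            + f * (-1.453152027 + f * 1.061405429))))
    value = 1.0 - p * math.exp(-z * z)
    # erf is odd, so negative arguments reuse the positive-argument result.
    return value if z >= 0 else -value

# Largest deviation from the library erf on a grid from -5 to 5.
worst = max(abs(erf_hastings(k / 100.0) - math.erf(k / 100.0))
            for k in range(-500, 501))
print(worst)
```

The complementary function needed in Equation (5) then follows as erfc(z) = 1 − erf(z).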

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Kelleher, D.E. (National Institute of Standards and Technology, Gaithersburg, USA). Empirical Estimates of Relative Uncertainty. Unpublished work. 2014.
  2. Kelleher, D.E.; Podobedova, L.I. Atomic transition probabilities of sodium and magnesium. A critical compilation. J. Phys. Chem. Ref. Data 2008, 37, 267–706.
  3. International Organization for Standardization. Guide to the Expression of Uncertainty in Measurement (GUM); International Organization for Standardization: Geneva, Switzerland, 1995; Section 6, Annex G.
  4. Tachiev, G.; Froese Fischer, C. Breit Pauli energy levels, lifetimes, and transition data: Na II and Na III. Available online: http://nlte.nist.gov/MCHF/view.html (accessed on 30 December 2002).
  5. McPeake, D.; Hibbert, A. Transitions in Na III. J. Phys. B 2000, 33, 2809–2832.
  6. Press, W.H.; Teukolsky, S.A.; Vetterling, W.T.; Flannery, B.P. Numerical Recipes in C; Cambridge University Press: Cambridge, UK, 1992.
  7. Verrill, S. Exact confidence bounds for a normal distribution coefficient of variation. Available online: http://www1.fpl.fs.fed.us/covnorm.dcd.html (accessed on 30 December 2002).
  8. Hastings, C. Approximations for Digital Computers; Princeton University Press: Princeton, NJ, USA, 1955.