# Improving the Utilization of STRmix™ Variance Parameters as Semi-Quantitative Profile Modeling Metrics


## Abstract


log_{10} of each variance parameter against the log_{10} of the template amount of the highest-level contributor (Tc) for a large set of mixture data amplified under standard conditions. We observed nonlinear trends in these plots, which we regressed to fourth-order polynomials, and used the regression data to establish typical ranges for the variance parameters over the Tc range. We then compared the typical variance parameter ranges to log_{10}(variance parameter) v log_{10}(Tc) plots for mixtures amplified and interpreted under a variety of challenging conditions. We observed several distinct patterns to variance parameter shifts in the challenged data interpretations in comparison to the unchallenged data interpretations, as well as distinct shifts in the unchallenged variance parameters away from their prior gamma distribution modes over specific ranges of Tc. These findings suggest that employing empirically determined working ranges for variance parameters may be an improved means of detecting whether aberrations in the interpretation were meaningful enough to trigger greater scrutiny of the electropherogram and genotype interpretation.

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Construction, Amplification, Capillary Electrophoresis, and Analysis of Unchallenged DNA Samples

#### 2.2. Construction, Amplification, CE, and Analysis of Challenged DNA Samples

**Table 1.** Unchallenged single-source samples. Upon examination of the CE data, a signal-saturated peak was observed in one of the 8 ng replicate amplifications; for purposes of plotting, data for this amplification was grouped with the saturated mixtures in Table 6.

Study | Number of Samples | Input Amounts | Replicate Amplifications | Replicate STRmix Interpretations |
---|---|---|---|---|
Single-source, nominal-input | 14 | 2 ng | 1 | 10 |
Single-source dilution series (higher level) | 2 | 8 ng, 4 ng, 2 ng, 1 ng, 500 pg, 250 pg | 2 | 1 |
Single-source dilution series (lower level) | 2 | 125 pg, 63 pg | 4 | 1 |

**Table 2.** Composition of unchallenged mixtures. The input amounts listed are for total DNA. Note that, to include STRmix run-to-run variation in the dataset, ten replicate interpretations were performed for each mixture, at each template level.

Donor Number (Donor Set) | Mixture Ratio | Input Amounts | Replicate Amplifications |
---|---|---|---|
2-person (set 1) | 9:1 | 2 ng, 1 ng, 870 pg, 750 pg, 500 pg, 380 pg, 250 pg, 125 pg, 63 pg | 2 |
2-person (set 1) | 49:1 | 2.5 ng, 1.9 ng, 1.25 ng, 625 pg, 313 pg | 2 |
2-person (set 1) | 99:1 | 2.5 ng, 1.25 ng, 625 pg | 2 |
2-person (set 2) | 1:1 | 800 pg, 400 pg, 200 pg, 100 pg, 50 pg, 25 pg | 1 |
2-person (set 2) | 3:1 | 800 pg, 400 pg, 348 pg, 300 pg, 200 pg, 152 pg, 100 pg, 50 pg, 25 pg | 1 |
3-person (set 1) | 3:2:1 | 1.2 ng, 600 pg, 522 pg, 450 pg, 300 pg, 228 pg, 150 pg, 75 pg, 38 pg | 2 |
3-person (set 1) | 10:5:1 | 3.2 ng, 1.6 ng, 1.4 ng, 1.2 ng, 800 pg, 608 pg, 400 pg, 200 pg, 100 pg | 2 |
3-person (set 1) | 100:100:4 | 1.28 ng, 625 pg, 325 pg | 2 |
3-person (set 2) | 1:1:1 | 1.2 ng, 600 pg, 300 pg, 150 pg, 75 pg, 38 pg | 1 |
3-person (set 2) | 3:2:1 | 1.2 ng, 522 pg, 300 pg, 150 pg, 38 pg | 2 |
3-person (set 2) | 10:5:1 | 3.2 ng, 1.4 ng, 800 pg, 400 pg, 100 pg | 2 |
3-person (set 2) | 100:100:4 | 1.28 ng, 638 pg, 319 pg | 2 |
4-person (set 1) | 4:3:2:1 | 2 ng, 1 ng, 870 pg, 750 pg, 500 pg, 380 pg, 250 pg, 125 pg, 63 pg | 2 |
4-person (set 1) | 10:5:2:1 | 3.6 ng, 1.8 ng, 1.6 ng, 1.4 ng, 900 pg, 684 pg, 450 pg, 225 pg, 113 pg | 2 |
4-person (set 1) | 100:100:100:6 | 1.28 ng, 625 pg, 325 pg | 2 |
4-person (set 2) | 1:1:1:1 | 1.6 ng, 800 pg, 400 pg, 200 pg, 100 pg, 50 pg | 1 |
4-person (set 2) | 4:3:2:1 | 2 ng, 870 pg, 500 pg, 250 pg, 63 pg | 2 |
4-person (set 2) | 10:5:2:1 | 3.6 ng, 1.6 ng, 900 pg, 450 pg, 113 pg | 2 |
4-person (set 2) | 100:100:100:6 | 1.28 ng, 638 pg, 319 pg | 2 |
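As a reading aid for the mixture designs in Table 2, the nominal DNA mass contributed by each donor follows from the mixture ratio and the total input. The sketch below is illustrative only: the function name `contributor_inputs_pg` is hypothetical, and the result is a nominal mass in pg, which should not be confused with the template parameter Tc that the paper reports in RFU.

```python
def contributor_inputs_pg(ratio, total_pg):
    """Split a total DNA input (pg) among contributors according to a mixture ratio.

    ratio    -- string such as "9:1" or "100:100:4" (as written in Table 2)
    total_pg -- total DNA input in picograms
    """
    parts = [float(p) for p in ratio.split(":")]
    total_parts = sum(parts)
    # Each contributor receives its share of the total input.
    return [total_pg * p / total_parts for p in parts]


# 2-person 9:1 mixture at 2 ng total input
print(contributor_inputs_pg("9:1", 2000))
# 3-person 100:100:4 mixture at 1.28 ng total input
print(contributor_inputs_pg("100:100:4", 1280))
```

At the lowest inputs of the high-ratio mixtures, the minor contributor's nominal share falls to a few tens of picograms or less, which is why these designs probe the low end of the template range.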

**Table 3.** Sample composition of inhibited mixtures. All inhibited mixtures were amplified and interpreted once at a total DNA input amount of 3 ng.

Donor Number | Mixture Ratio | Treatment |
---|---|---|
2-person | 3:1 | Hematin: 400 µM, 475 µM, 550 µM, 625 µM, 700 µM |
2-person | 10:1 | Hematin: 400 µM, 475 µM, 550 µM, 625 µM, 700 µM |
2-person | 3:1 | Humic acid: 200 ng/µL, 250 ng/µL, 300 ng/µL, 350 ng/µL, 400 ng/µL |
2-person | 10:1 | Humic acid: 200 ng/µL, 250 ng/µL, 300 ng/µL, 350 ng/µL, 400 ng/µL |
3-person | 3:2:1 | Hematin: 400 µM, 475 µM, 550 µM, 625 µM, 700 µM |
3-person | 10:5:1 | Hematin: 400 µM, 475 µM, 550 µM, 625 µM, 700 µM |
3-person | 3:2:1 | Humic acid: 200 ng/µL, 250 ng/µL, 300 ng/µL, 350 ng/µL, 400 ng/µL |
3-person | 10:5:1 | Humic acid: 200 ng/µL, 250 ng/µL, 300 ng/µL, 350 ng/µL, 400 ng/µL |
4-person | 4:3:2:1 | Hematin: 400 µM, 475 µM, 550 µM, 625 µM, 700 µM |
4-person | 10:5:2:1 | Hematin: 400 µM, 475 µM, 550 µM, 625 µM, 700 µM |
4-person | 4:3:2:1 | Humic acid: 200 ng/µL, 250 ng/µL, 300 ng/µL, 350 ng/µL, 400 ng/µL |
4-person | 10:5:2:1 | Humic acid: 200 ng/µL, 250 ng/µL, 300 ng/µL, 350 ng/µL, 400 ng/µL |

**Table 4.** Dry heat treatment numbers and exposure times.

Dry Heat Treatment Number | Dry Heat Exposure Time |
---|---|
1 | 5.75 h |
2 | 12.13 h |
3 | 19.42 h |
4 | 27.73 h |
5 | 37.32 h |
6 | 48.50 h |
7 | 61.70 h |
8 | 77.52 h |
9 | 96.83 h |
10 | 120.93 h |
11 | 151.85 h |
12 | 192.97 h |
13 | 250.32 h |
14 | 335.88 h |

**Table 5.** Sample composition of degraded samples. Dry heat treatment numbers refer to those in Table 4 and, for mixtures, are listed in order according to which treated components were paired together. All degraded single-source samples were amplified at a DNA input amount of 2 ng, and all degraded mixtures were amplified at a total DNA input of 8 ng.

Donor Number | Mixture Ratio | C1 Dry Heat Treatments | C2 Dry Heat Treatments | C3 Dry Heat Treatments | C4 Dry Heat Treatments |
---|---|---|---|---|---|
Single source #1 | - | 1,3,4,5,6,7,9,13,10 | - | - | - |
Single source #2 | - | 1,2,3,4,5,6,8,10,14 | - | - | - |
Single source #3 | - | 1,2,4,5,6,9,10,11,12 | - | - | - |
Single source #4 | - | 1,2,4,5,8,9,11,12,13 | - | - | - |
2-person | 3:1 | 1,3,4,5,6,7,9,13,10 | 1,2,3,4,5,6,8,10,14 | - | - |
2-person | 10:1 | 1,3,4,5,6,7,9,13,10 | 1,2,3,4,5,6,8,10,14 | - | - |
3-person | 3:2:1 | 1,3,4,5,6,7,9,13,10 | 1,2,3,4,5,6,8,10,14 | 1,2,4,5,6,9,10,11,12 | - |
3-person | 10:5:1 | 1,3,4,5,6,7,9,13,10 | 1,2,3,4,5,6,8,10,14 | 1,2,4,5,6,9,10,11,12 | - |
4-person | 4:3:2:1 | 1,3,4,5,6,7,9,13,10 | 1,2,3,4,5,6,8,10,14 | 1,2,4,5,6,9,10,11,12 | 1,2,4,5,8,9,11,12,13 |
4-person | 10:5:2:1 | 1,3,4,5,6,7,9,13,10 | 1,2,3,4,5,6,8,10,14 | 1,2,4,5,6,9,10,11,12 | 1,2,4,5,8,9,11,12,13 |

**Table 6.** Sample composition of signal-saturated mixtures. Note that the sets of donors used for these amplifications are the same as the donor sets in Table 1.

Donor Number (Donor Set) | Mixture Ratio | Input Amounts (Total DNA) |
---|---|---|
2-person (set 1) | 9:1 | 28 ng |
2-person (set 1) | 99:1 | 25.5 ng |
2-person (set 2) | 1:1 | 29.3 ng |
2-person (set 2) | 3:1 | 20.9 ng |
3-person (set 1) | 3:2:1 | 24 ng |
3-person (set 1) | 10:5:1 | 29 ng |
3-person (set 1) | 100:100:4 | 28.5 ng |
3-person (set 2) | 1:1:1 | 20.4 ng |
3-person (set 2) | 3:2:1 | 17.8 ng |
3-person (set 2) | 10:5:1 | 12.7 ng |
3-person (set 2) | 100:100:4 | 17.1 ng |
4-person (set 1) | 4:3:2:1 | 32.5 ng |
4-person (set 1) | 10:5:2:1 | 29.4 ng |
4-person (set 1) | 100:100:100:6 | 30 ng |
4-person (set 2) | 1:1:1:1 | 20.3 ng |
4-person (set 2) | 4:3:2:1 | 15.4 ng |
4-person (set 2) | 10:5:2:1 | 11.9 ng |
4-person (set 2) | 100:100:100:6 | 22.0 ng |

**Table 7.** Sample composition of mixtures analyzed with STRmix set to one or two contributors fewer than the ground truth number, as indicated.

Ground Truth Donor Number (Donor Set) | STRmix Donor Number Setting (NOC-1 or NOC-2) | Mixture Ratio | Input Amounts (Total DNA) |
---|---|---|---|
2-person (set 1) | 1 | 49:1 | 313 pg |
2-person (set 1) | 1 | 99:1 | 2.5 ng, 1.25 ng |
3-person (set 1) | 2 | 3:2:1 | 1.2 ng, 600 pg, 522 pg, 450 pg, 300 pg, 228 pg, 150 pg, 75 pg, 38 pg |
3-person (set 2) | 2 | 3:2:1 | 38 pg |
3-person (set 1) | 2 | 10:5:1 | 800 pg, 608 pg, 400 pg, 200 pg, 100 pg |
3-person (set 2) | 2 | 10:5:1 | 100 pg |
3-person (set 1) | 2 | 100:100:4 | 1.28 ng, 625 pg, 325 pg |
3-person (set 2) | 2 | 100:100:4 | 1.28 ng, 625 pg, 325 pg |
4-person (set 1) | 2 | 4:3:2:1 | 125 pg, 63 pg |
4-person (set 1) | 2 | 10:5:2:1 | 113 pg |
4-person (set 2) | 2 | 10:5:2:1 | 113 pg |
4-person (set 1) | 3 | 4:3:2:1 | 2 ng, 1 ng, 870 pg, 750 pg, 500 pg, 380 pg, 250 pg |
4-person (set 2) | 3 | 4:3:2:1 | 63 pg |
4-person (set 1) | 3 | 10:5:2:1 | 3.6 ng, 1.8 ng, 1.6 ng, 1.4 ng, 684 pg, 450 pg, 225 pg |
4-person (set 2) | 3 | 10:5:2:1 | 450 pg |
4-person (set 1) | 3 | 100:100:100:6 | 1.28 ng, 638 pg, 319 pg |
4-person (set 2) | 3 | 100:100:100:6 | 1.28 ng, 625 pg, 325 pg |

**Table 8.** Sample composition of cell line DNA mixtures. The cell lines listed in the first column correspond to the mixture ratios listed in column 2.

Donor Number | Mixture Ratio | Input Amounts | Replicate Amps |
---|---|---|---|
2-person (CEPH 1347-02, HL60) | 9:1 | 2 ng, 1 ng, 870 pg, 750 pg, 500 pg, 380 pg, 250 pg, 125 pg, 63 pg | 2 |
2-person (CEPH 1347-02, HL60) | 49:1 | 2.5 ng, 1.9 ng, 1.25 ng, 625 pg, 313 pg | 2 |
2-person (CEPH 1347-02, HL60) | 99:1 | 2.5 ng, 1.25 ng, 625 pg | 2 |
2-person (CEPH 1347-02, HL60) | 1:1 | 800 pg, 400 pg, 200 pg, 100 pg, 50 pg, 25 pg | 1 |
2-person (CEPH 1347-02, HL60) | 3:1 | 800 pg, 400 pg, 348 pg, 300 pg, 200 pg, 152 pg, 100 pg, 50 pg, 25 pg | 1 |
3-person (2800M, HL60, CEPH 1347-02) | 3:2:1 | 1.2 ng, 600 pg, 522 pg, 450 pg, 300 pg, 228 pg, 150 pg, 75 pg, 38 pg | 2 |
3-person (2800M, HL60, CEPH 1347-02) | 10:5:1 | 3.2 ng, 1.6 ng, 1.4 ng, 1.2 ng, 800 pg, 608 pg, 400 pg, 200 pg, 100 pg | 2 |
3-person (2800M, HL60, CEPH 1347-02) | 100:100:4 | 1.28 ng, 625 pg, 325 pg | 2 |
3-person (2800M, HL60, CEPH 1347-02) | 1:1:1 | 1.2 ng, 600 pg, 300 pg, 150 pg, 75 pg, 38 pg | 1 |
3-person (2800M, HL60, CEPH 1347-02) | 3:2:1 | 1.2 ng, 522 pg, 300 pg, 150 pg, 38 pg | 2 |
3-person (2800M, HL60, CEPH 1347-02) | 10:5:1 | 3.2 ng, 1.4 ng, 800 pg, 400 pg, 100 pg | 2 |
3-person (2800M, HL60, CEPH 1347-02) | 100:100:4 | 1.28 ng, 638 pg, 319 pg | 2 |

## 3. Results

#### 3.1. Trends in Allele and Stutter Variances with Increasing Peak Height

Figure 2a–c show plots of log_{10}(variance parameter) against log_{10}(Tc) for the unchallenged data set in relation to allele variance, reverse stutter variance, and forward stutter variance, respectively. To varying degrees, the trends across the full Tc range are nonlinear for all three variances. Using the LINEST function of Excel, we found that a fourth-order polynomial regression provided the best visual fit to each dataset. Quantitative assessments of how well the regression predicts the log_{10}(variance) can be found in Figure S1, Table 9 and Table S3. No significant deviations from normality around the regression lines were observed in the residuals of the log_{10}(variance) data when evaluated using the Jarque-Bera test [13] (see Table 9; all p > 0.01). Figure S1 displays the 99% 2-sided confidence intervals around the regression lines. Table S3 includes the coefficients of determination (R^{2}) and F statistic p-values for the overall regressions, as well as the 99% 2-sided confidence intervals and T statistic p-values for the individual polynomial regression coefficients (β_{i}, i = 0 to 4). All R^{2} values are below 0.5, but the F statistics suggest they are all significantly different from 0 (p < 0.01). The individual coefficients for the allele and reverse stutter regressions are also significantly different from 0 (p < 0.01; 99% confidence intervals never included 0). For the forward stutter, only coefficient β_{4} is significantly different from 0. This is consistent with a visual assessment of the forward stutter data, where a clear bias away from the gamma mode is observed, but Tc appears to have no effect on the variance for log_{10}(Tc) < 3.
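For readers who want to reproduce the normality check outside Excel, the Jarque-Bera statistic can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' spreadsheet: the function name `jarque_bera` and the sample residuals are hypothetical, the moments use the population (divide-by-n) form, and the p-value uses the asymptotic chi-square distribution with 2 degrees of freedom.

```python
import math


def jarque_bera(residuals):
    """Return (JB statistic, asymptotic p-value) for a sequence of residuals."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central moments, population form (divide by n).
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Under normality, JB is asymptotically chi-square with 2 degrees of
    # freedom, whose survival function is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Hypothetical residuals (observed minus fitted log10 variance), for illustration only.
example_residuals = [-0.12, 0.05, 0.08, -0.03, 0.10, -0.07, 0.01, -0.02]
statistic, p = jarque_bera(example_residuals)
print(f"JB = {statistic:.4f}, p = {p:.4f}")
```

A p-value above the chosen significance level (0.01 in the text) means the test does not reject normality of the residuals. The asymptotic approximation is rough for small samples.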

**Table 9.** A summary of fourth-order polynomial regression information for the allele, reverse stutter, and forward stutter variance plots from Figure 2 (corresponding to red lines).

Statistic | Allele | Reverse Stutter | Forward Stutter |
---|---|---|---|
Polynomial regression formula | y = 0.1122x^{4} − 1.2206x^{3} + 4.8535x^{2} − 8.2618x + 5.5312 | y = −0.2892x^{4} + 3.3258x^{3} − 13.619x^{2} + 23.456x − 13.404 | y = −0.0348x^{4} + 0.313x^{3} − 1.0891x^{2} + 1.7016x − 0.1678 |
Jarque-Bera test for normality of the residuals | p = 0.1503 | p = 0.2395 | p = 0.0275 |
99th Percentile (+2.326 SD) | +0.2156 | +0.2755 | +0.1779 |
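The regression formulas in Table 9 can be turned into a Tc-dependent upper limit for each variance parameter. The sketch below assumes that x is log_{10}(Tc) with Tc in RFU, that y is log_{10}(variance parameter), and that the "+2.326 SD" row is an offset added to the regression line in log_{10} units; the function names are hypothetical. The coefficients are rounded as published, so evaluated values are approximate, and the polynomial should not be extrapolated beyond the Tc range of the unchallenged data.

```python
import math

# Coefficients copied from Table 9, highest power first (x^4 ... x^0).
REGRESSION = {
    "allele": (0.1122, -1.2206, 4.8535, -8.2618, 5.5312),
    "reverse_stutter": (-0.2892, 3.3258, -13.619, 23.456, -13.404),
    "forward_stutter": (-0.0348, 0.313, -1.0891, 1.7016, -0.1678),
}

# "99th Percentile (+2.326 SD)" offsets from Table 9, in log10 units.
OFFSET_99 = {
    "allele": 0.2156,
    "reverse_stutter": 0.2755,
    "forward_stutter": 0.1779,
}


def expected_log10_variance(kind, tc_rfu):
    """Evaluate the Table 9 polynomial at x = log10(Tc) using Horner's rule."""
    x = math.log10(tc_rfu)
    y = 0.0
    for coefficient in REGRESSION[kind]:
        y = y * x + coefficient
    return y


def upper_99_log10(kind, tc_rfu):
    """Regression value plus the 99th percentile offset (log10 units)."""
    return expected_log10_variance(kind, tc_rfu) + OFFSET_99[kind]


def is_flagged_high(kind, variance, tc_rfu):
    """True if the reported variance parameter exceeds the Tc-specific limit."""
    return math.log10(variance) > upper_99_log10(kind, tc_rfu)


# Tc values taken from the Figure 9 and Figure 10 captions.
for tc in (6882, 4097):
    limit = 10 ** upper_99_log10("reverse_stutter", tc)
    print(f"Tc = {tc} RFU: reverse stutter limit ~ {limit:.1f}")
```

Because the limit depends on Tc, two interpretations with similar variance parameter values can be flagged differently, which is the behavior described for Figures 9 and 10.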

**Table 10.** A summary of prior gamma distribution information for the allele, reverse stutter, and forward stutter variance parameters used to generate the data plotted in Figure 2 (corresponding to black lines).

Parameter | Allele | Reverse Stutter | Forward Stutter |
---|---|---|---|
α | 3.891 | 1.557 | 1.526 |
β | 1.131 | 6.436 | 4.552 |
Mode | 3.270 | 3.585 | 2.394 |
99th Percentile | 11.16 | 37.24 | 26.06 |
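The mode and 99th percentile rows of Table 10 can be checked from α and β. The sketch below reads α as the gamma shape and β as the gamma scale, an assumption that reproduces the tabulated values to within rounding. It uses only the standard library: the CDF comes from the series expansion of the regularized lower incomplete gamma function, and the quantile is found by bisection. All function names are hypothetical.

```python
import math

# (alpha, beta) from Table 10, read here as gamma shape and scale (assumption).
PRIORS = {
    "allele": (3.891, 1.131),
    "reverse_stutter": (1.557, 6.436),
    "forward_stutter": (1.526, 4.552),
}


def gamma_mode(shape, scale):
    """Mode of a gamma distribution; valid for shape >= 1."""
    return (shape - 1.0) * scale


def gamma_cdf(x, shape, scale):
    """Gamma CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    for n in range(1, 1000):
        term *= z / (shape + n)
        total += term
        if term < 1e-16 * total:
            break
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape))


def gamma_quantile(p, shape, scale):
    """Invert the CDF by bisection on a generously wide bracket."""
    lo = 0.0
    hi = scale * (shape + 20.0 * math.sqrt(shape) + 20.0)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, shape, scale) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for name, (alpha, beta) in PRIORS.items():
    mode = gamma_mode(alpha, beta)
    p99 = gamma_quantile(0.99, alpha, beta)
    print(f"{name}: mode ~ {mode:.3f}, 99th percentile ~ {p99:.2f}")
```

Where SciPy is available, `scipy.stats.gamma.ppf(0.99, a=alpha, scale=beta)` should give the same percentile with less code.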

#### 3.2. Trends in Allele and Stutter Variance under Challenging Amplification/Interpretation Conditions

Plotting log_{10}(variance parameter) v log_{10}(Tc) for the challenged datasets allowed for a direct visualization of variance shifts.

Figure 3 shows the log_{10}(variance parameter) v log_{10}(Tc) plots for inhibited mixtures. From these plots it is apparent that exposure to an inhibitor increased the reverse stutter variances above the 99th percentile line from the unchallenged plot for a majority (~56.67%) of treated mixtures, while leaving the allele variance and forward stutter variance largely unchanged or even lower than the regression line. A very small percentage (3.33%) of reverse stutter variance parameters exceeded the prior gamma 99th percentile line, and no allele or forward stutter variance data fell above this line.

Figure 4 shows the log_{10}(variance parameter) v log_{10}(Tc) plots for mixtures with an underestimated NOC, overlaid with the bands of expected variation from the unchallenged plots. Increased allele variance was observed for a substantial number of these mixtures; overall, ~28.30% of the mixtures had diagnostic values above the unchallenged 99th percentile line, mostly for higher level contributors, while the reverse and forward stutter variance plots tracked the unchallenged regression closely, with only ~3.77% of each data set above the unchallenged 99th percentile line. The only data above the line defined by the 99th percentile of the prior gamma distributions was a small percentage (~1.89%) of the allele variance diagnostic values.

Figure 5 shows the log_{10}(variance parameter) v log_{10}(Tc) plots for degraded mixtures. A moderate proportion of all three variance parameters for these mixtures (~21.11% for alleles, ~23.33% for reverse stutters, and ~12.22% for forward stutters) were above the unchallenged 99th percentile line. Much smaller proportions of each variance parameter exceeded the 99th percentile of the prior gamma distribution (6.67% for alleles, 2.22% for reverse stutters, and 1.11% for forward stutters).

Figure 6 shows the log_{10}(variance parameter) v log_{10}(Tc) plots for signal-saturated mixtures. These mixtures showed the most striking deviation from expectation, with ~40.00% of allele variance parameters, ~80.00% of reverse stutter variance parameters, and ~70.00% of forward stutter variance parameters exceeding the unchallenged 99th percentile line. Many of these variances also exceeded the prior gamma distribution 99th percentile line (~45.00% for allele, ~75.00% for reverse stutter, and ~5.00% for forward stutter).
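The percentages quoted in this section are exceedance rates: the share of interpretations whose variance parameter lies above a given benchmark line. A minimal sketch of that tally follows. The function name is hypothetical and the input values are made up. The count of 34 out of 60 is back-calculated from the reported ~56.67% and from the 12 mixture types at five inhibitor concentrations in Table 3; it is not stated in the source.

```python
def percent_exceeding(observed, limits):
    """Percentage of observed values strictly above their paired limits.

    observed -- log10(variance parameter) values, one per interpretation
    limits   -- the benchmark evaluated at each interpretation's Tc
    """
    if len(observed) != len(limits):
        raise ValueError("observed and limits must have the same length")
    above = sum(1 for value, limit in zip(observed, limits) if value > limit)
    return 100.0 * above / len(observed)


# Made-up data: 34 of 60 values sit above a flat limit of 0.5.
observed = [1.0] * 34 + [0.0] * 26
limits = [0.5] * 60
print(round(percent_exceeding(observed, limits), 2))
```

In practice each limit would be the benchmark value at that interpretation's own Tc, so the list of limits is not constant when the benchmark is the regression-based 99th percentile.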

## 4. Discussion

log_{10}(variance parameter) v log_{10}(Tc) plots for the unchallenged data demonstrates the value of establishing an empirically determined working range for STRmix variance parameters instead of assuming that the observed average variance parameter values will always align with the prior gamma distribution.

log_{10}(Tc) value of ~2.79, or a Tc of ~617 RFU, corresponds to the approximate point at which stutters begin to be detected. Considering a typical range of reverse stutter ratios to be ~0.05 to 0.1, a Tc of ~617 RFU would equate to reverse stutter peak heights in the range of ~31–62 RFU, which straddles our detection thresholds of 35–71 RFU. Notice that beyond log_{10}(Tc) values of ~2.79, the reverse stutter variance parameters once again move upward, while the allele variance parameters stay relatively flat. A plausible explanation for this trend is that expected stutter peak heights in STRmix are determined by static per-allele stutter ratios and therefore do not have the same degree of model flexibility as expected allele peak heights, which vary with the STRmix template parameter during interpretation. Similar to the reverse stutter variance at a log_{10}(Tc) value of ~2.79, the forward stutter variance parameter values begin trending down significantly at a log_{10}(Tc) value of ~3, or a Tc value of ~1000 RFU, which would equate to forward stutter peak heights of ~1–50 RFU (typical forward stutter ratio range of ~0.001 to 0.05), again straddling our detection thresholds and suggesting a change in the modeling fit.
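The arithmetic behind the stutter detection argument above can be checked directly. The sketch below uses only numbers quoted in the text (log_{10}(Tc) values, stutter ratio ranges, and the 35–71 RFU detection thresholds); the function names are hypothetical.

```python
def expected_stutter_heights(log10_tc, ratio_low, ratio_high):
    """Expected stutter peak height range (RFU) for a given log10(Tc)."""
    tc = 10.0 ** log10_tc
    return tc * ratio_low, tc * ratio_high


def ranges_overlap(a_low, a_high, b_low, b_high):
    """True if the closed intervals [a_low, a_high] and [b_low, b_high] overlap."""
    return a_low <= b_high and b_low <= a_high


DETECTION_THRESHOLDS_RFU = (35.0, 71.0)

# Reverse stutter: log10(Tc) ~ 2.79, stutter ratios ~0.05 to 0.1
reverse = expected_stutter_heights(2.79, 0.05, 0.1)
# Forward stutter: log10(Tc) ~ 3, stutter ratios ~0.001 to 0.05
forward = expected_stutter_heights(3.0, 0.001, 0.05)

for label, (low, high) in (("reverse", reverse), ("forward", forward)):
    overlaps = ranges_overlap(low, high, *DETECTION_THRESHOLDS_RFU)
    print(f"{label}: {low:.1f}-{high:.1f} RFU, overlaps thresholds: {overlaps}")
```

In both cases the expected stutter heights overlap the detection threshold range, so some stutter peaks are detected while others must be modeled as undetected, consistent with the change in modeling fit suggested in the text.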

log_{10}(variance parameter) v log_{10}(Tc) plots of the various challenged datasets also show distinct data patterns that can be associated with biological causes or elements of the STRmix profile model. For the inhibited data set, the most affected of the three variance parameters was reverse stutter, likely because the inhibition required STRmix to model undetected reverse stutter peaks across the profile due to reduced locus yields; in contrast, most of the alleles were still detected at the affected loci. More moderate but significant effects on the reverse stutter variance parameters were observed in the degraded data, which also requires STRmix to model undetected reverse stutter at higher molecular weight loci as peak heights decrease with degradation. However, moderate effects on the allele variance parameters were also observed; these effects are attributable to the difficulty of modeling high levels of degradation, particularly if the value of the exponential decay term in the STRmix degradation model approaches its user-defined ceiling (which occurred with many of the highly degraded mixtures in this set) [14]. The cell line and underestimated NOC data have similar effects on the variance parameters, in that the allele variance was the most affected of the three. This is a sensible result, given the apparent allele modeling issues with the cell line data and the intralocus allele imbalances that may result from NOC underestimation. Signal saturation, meanwhile, often had a pronounced effect on all three variance parameters. At the peak heights observed in saturated mixtures, tolerance for any peak height deviation from expectation is extremely low, and such deviation is more likely with the loss of linearity between peak height and template, leading to a cascade of effects on the variance parameters. However, not all of the saturated mixtures resulted in elevated variances, because there was variation both in the total number of off-scale peaks detected and the degree of saturation.
The more nuanced variance expectations we have presented here can be useful in determining whether the extent of signal saturation observed in a profile has had a discernable effect on its interpretation.

log_{10}(allele variance parameter) v log_{10}(Tc) regression, which serves as a prompt for closer scrutiny of the interpretation, as well as contributing to an explanation for the aberrant LR result. Figure 9 and Figure 10 are two further examples of how the unchallenged regression data could be applied for routine use in the assessment of STRmix variance parameters from a case result. The interpretations assessed in both figures are from the inhibited mixture data set. In both cases, the allele variance parameter is below the 99th percentile of the unchallenged regression, but in Figure 9, neither stutter parameter is flagged as high, while in Figure 10 both are flagged. Notice in particular how similar the reverse stutter parameters are between the two interpretations; despite this similarity, a higher threshold for the reverse stutter variance parameter was applied to the 2-person data because it had a higher Tc than the 4-person data.

## 5. Conclusions

## Supplementary Materials

## Author Contributions

## Funding

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

1. Clayton, T.M.; Whitaker, J.P.; Sparkes, R.; Gill, P. Analysis and interpretation of mixed forensic stains using DNA STR profiling. Forensic Sci. Int. **1998**, 91, 55–70.
2. Taylor, D.; Bright, J.A.; Buckleton, J. The interpretation of single source and mixed DNA profiles. Forensic Sci. Int. Genet. **2013**, 7, 516–528.
3. Taylor, D.; Buckleton, J.; Bright, J.A. Factors affecting peak height variability for short tandem repeat data. Forensic Sci. Int. Genet. **2016**, 21, 126–133.
4. Bille, T.W.; Weitz, S.M.; Coble, M.D.; Buckleton, J.; Bright, J.A. Comparison of the performance of different models for the interpretation of low level mixed DNA profiles. Electrophoresis **2014**, 35, 3125–3133.
5. Bieber, F.R.; Buckleton, J.; Budowle, B.; Butler, J.M.; Coble, M.D. Evaluation of forensic DNA mixture evidence: Protocol for evaluation, interpretation, and statistical calculations using the combined probability of inclusion. BMC Genet. **2016**, 17, 125.
6. Buckleton, J.; Bright, J.A.; Gittelson, S.; Moretti, T.; Onorato, A.J.; Bieber, F.R.; Budowle, B.; Taylor, D. The Probabilistic Genotyping Software STRmix: Utility and Evidence for its Validity. J. Forensic Sci. **2019**, 64, 393–405.
7. Bleka, Ø.; Storvik, G.; Gill, P. EuroForMix: An open source software based on a continuous model to evaluate STR DNA profiles from a mixture of contributors with artefacts. Forensic Sci. Int. Genet. **2016**, 21, 35–44.
8. Bright, J.A.; Taylor, D.; Curran, J.; Buckleton, J. Developing allelic and stutter peak height models for a continuous method of DNA interpretation. Forensic Sci. Int. Genet. **2013**, 7, 296–304.
9. STRmix v2.8 User’s Manual (September 2020); Institute of Environmental Science and Research Limited: Wellington, New Zealand, 2020.
10. Russell, L.; Cooper, S.; Wivell, R.; Kerr, Z.; Taylor, D.; Buckleton, J.; Bright, J. A guide to results and diagnostics within a STRmix report. WIREs Forensic Sci. **2019**, 1, e1354.
11. Butler, J.M.; Iyer, H.; Press, R.; Taylor, M.K.; Vallone, P.M.; Willis, S. DNA Mixture Interpretation: A NIST Scientific Foundation Review. NISTIR 8351-DRAFT; 2021. Available online: https://nvlpubs.nist.gov/nistpubs/ir/2021/NIST.IR.8351-draft.pdf (accessed on 17 November 2022).
12. Duke, K.; Cuenca, D.; Myers, S.; Wallin, J. Compound and conditioned likelihood ratio behavior within a probabilistic genotyping context. Genes **2022**, 13, 2031.
13. How to Perform a Normality Test in Excel (Step-by-Step). Available online: https://www.statology.org/normality-test-excel/ (accessed on 12 December 2022).
14. Duke, K.; Myers, P. Systematic evaluation of STRmix™ performance on degraded DNA profile data. Forensic Sci. Int. Genet. **2020**, 44, 102174.

**Figure 1.** Examples of prior gamma distributions for the allele variance (**a**), reverse stutter variance (**b**), and forward stutter variance (**c**) parameters found in a STRmix v2.8 Interpretation Report. The average value of each variance parameter for the completed interpretation is indicated with a black dot on each distribution.

**Figure 2.** Plots of log_{10}(variance parameter) v log_{10}(Tc) for allele variance (**a**), reverse stutter variance (**b**), and forward stutter variance (**c**) of the unchallenged data set, overlaid with fourth-order polynomial regression lines (red solid lines) and the prior gamma modes (solid black lines). Also shown are the 99th percentiles of the normal distributions around the fourth-order polynomial regressions (dotted red lines) and the 99th percentiles of the prior gamma distributions for the variance parameters (dotted black lines).

**Figure 3.** Plots of log_{10}(variance parameter) v log_{10}(Tc) for allele variance (**a**), reverse stutter variance (**b**), and forward stutter variance (**c**) of the inhibited data set.

**Figure 4.** Plots of log_{10}(variance parameter) v log_{10}(Tc) for allele variance (**a**), reverse stutter variance (**b**), and forward stutter variance (**c**) of the underestimated NOC data set.

**Figure 5.** Plots of log_{10}(variance parameter) v log_{10}(Tc) for allele variance (**a**), reverse stutter variance (**b**), and forward stutter variance (**c**) of the degraded data set.

**Figure 6.** Plots of log_{10}(variance parameter) v log_{10}(Tc) for allele variance (**a**), reverse stutter variance (**b**), and forward stutter variance (**c**) of the signal-saturated data set.

**Figure 7.** Plots of log_{10}(variance parameter) v log_{10}(Tc) for allele variance (**a**), reverse stutter variance (**b**), and forward stutter variance (**c**) of the cell line DNA mixture set.

**Figure 8.** Prior gamma distributions for allele (**a**), reverse stutter (**b**) and forward stutter (**c**) variance parameters, along with the prior modes and the STRmix Interpretation Report variance parameter values for an 870 pg 9:1 mixture that resulted in a 0 LR for the true minor contributor. The allele variance parameter for the interpretation is flagged as high in this case.

**Figure 9.** Prior gamma distributions for allele (**a**), reverse stutter (**b**) and forward stutter (**c**) variance parameters, along with the prior modes and the STRmix Interpretation Report variance parameter values for a 3 ng 2-person 10:1 mixture spiked with 475 µM hematin (Tc = 6882). None of the three variance parameters is flagged as high in this case.

**Figure 10.** Prior gamma distributions for allele (**a**), reverse stutter (**b**) and forward stutter (**c**) variance parameters, along with the prior modes and the STRmix Interpretation Report variance parameter values for a 3 ng 4-person 10:5:2:1 mixture spiked with 625 µM hematin (Tc = 4097). Both the reverse and forward stutter variance parameters are flagged as high in this case.

**Table 11.** Percentage of variance parameter values exceeding bands of expected variation for allele variance, reverse stutter variance, and forward stutter variance under unchallenged and challenged amplification and/or interpretation conditions. The first two benchmarks for each variance type are based on information in Table 9, while the third and fourth benchmarks are based on information in Table 10.

Variance Parameter | % Greater Than… | Unchallenged | Inhibited | NOC −1 or −2 | Degraded | Saturated | Cell Lines |
---|---|---|---|---|---|---|---|
Allele Variance | Polynomial Regression | 48.90% | 21.67% | 67.92% | 53.33% | 55.00% | 90.50% |
Allele Variance | 99th Percentile (regression) | 1.05% | 3.33% | 28.30% | 21.11% | 40.00% | 21.84% |
Allele Variance | Mode (prior gamma) | 47.09% | 41.67% | 58.49% | 84.44% | 100.00% | 80.38% |
Allele Variance | 99th Percentile (prior gamma) | 0.00% | 0.00% | 1.89% | 6.67% | 45.00% | 0.00% |
Reverse Stutter Variance | Polynomial Regression | 50.91% | 100.00% | 39.62% | 57.78% | 85.00% | 45.60% |
Reverse Stutter Variance | 99th Percentile (regression) | 1.15% | 56.67% | 3.77% | 23.33% | 80.00% | 1.92% |
Reverse Stutter Variance | Mode (prior gamma) | 95.28% | 100.00% | 96.23% | 98.89% | 100.00% | 97.37% |
Reverse Stutter Variance | 99th Percentile (prior gamma) | 0.00% | 3.33% | 0.00% | 2.22% | 75.00% | 0.00% |
Forward Stutter Variance | Polynomial Regression | 53.15% | 48.33% | 28.30% | 44.44% | 100.00% | 37.71% |
Forward Stutter Variance | 99th Percentile (regression) | 1.00% | 16.67% | 3.77% | 12.22% | 70.00% | 0.51% |
Forward Stutter Variance | Mode (prior gamma) | 99.76% | 100.00% | 100.00% | 96.67% | 95.00% | 100.00% |
Forward Stutter Variance | 99th Percentile (prior gamma) | 0.00% | 0.00% | 0.00% | 1.11% | 5.00% | 0.00% |


© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Duke, K.; Myers, S.; Cuenca, D.; Wallin, J. Improving the Utilization of STRmix™ Variance Parameters as Semi-Quantitative Profile Modeling Metrics. *Genes* **2023**, *14*, 102.
https://doi.org/10.3390/genes14010102
