Detection of Low-Frequency Mutations and Identification of Heat-Induced Artifactual Mutations Using Duplex Sequencing

We present a genome-wide comparative and comprehensive analysis of three different sequencing methods (conventional next generation sequencing (NGS), tag-based single strand sequencing (e.g., SSCS), and Duplex Sequencing for investigating mitochondrial mutations in human breast epithelial cells. Duplex Sequencing produces a single strand consensus sequence (SSCS) and a duplex consensus sequence (DCS) analysis, respectively. Our study validates that although high-frequency mutations are detectable by all the three sequencing methods with the similar accuracy and reproducibility, rare (low-frequency) mutations are not accurately detectable by NGS and SSCS. Even with conservative bioinformatical modification to overcome the high error rate of NGS, the NGS frequency of rare mutations is 7.0 × 10−4. The frequency is reduced to 1.3 × 10−4 with SSCS and is further reduced to 1.0 × 10−5 using DCS. Rare mutation context spectra obtained from NGS significantly vary across independent experiments, and it is not possible to identify a dominant mutation context. In contrast, rare mutation context spectra are consistently similar in all independent DCS experiments. We have systematically identified heat-induced artifactual variants and corrected the artifacts using Duplex Sequencing. Specific sequence contexts were analyzed to examine the effects of neighboring bases on the accumulation of heat-induced artifactual variants. All of these artifacts are stochastically occurring rare mutations. C > A/G > T, a signature of oxidative damage, is the most increased (170-fold) heat-induced artifactual mutation type. Our results strongly support the claim that Duplex Sequencing accurately detects low-frequency mutations and identifies and corrects artifactual mutations introduced by heating during DNA preparation.


Introduction
Next-generation sequencing (NGS) has rapidly transformed entire areas of basic research and therapeutic applications by making large scale genomic studies feasible through reduced cost and faster turnaround time [1,2]. Conventional NGS has been extensively used to study clonal (high-frequency) mutations, but not subclonal (low-frequency) mutations. A major impediment in investigating subclonal (low-frequency) mutations is that conventional NGS methods have high error rates (10 −2 to 10 −3 ), which obscure true mutations that occur less frequently than errors [3,4]. These subclonal mutations may account for the genetic heterogeneity of tumors and tumor recurrence, as well as provide a reservoir for the rapid development of resistance to chemotherapy [5].
Conventional sequencing technologies sequence only a single strand of DNA. In contrast, Duplex Sequencing examines both strands of DNA and scores mutations only if they are present on both strands of the same DNA molecule as complementary substitutions. This significantly reduces sequencing error rates to <5 × 10 −8 [6][7][8][9]. In the first report of Duplex Sequencing, accuracy and sensitivity of mutation detection were demonstrated mainly in M13mp2 bacteriophage by comparing untreated/control DNA and DNA incubated with hydrogen peroxide, a radical generator, in the presence of iron [6].
While overall frequencies and types of mutations from conventional NGS, SSCS, and Duplex Sequencing have been compared in previous studies [5,10], those studies focused on detection limits of low-frequency mutations only and did not compare the sequencing methods' ability to detect high-frequency mutations. In addition, influences of neighboring nucleotide base context on mutations (mutation context spectra) have not been investigated.
In the current study, we systematically compared mutation frequencies, types, positions, and sequence context spectra of the whole mitochondrial (mt) DNA in human breast epithelial cells using three different sequencing protocols: conventional NGS, tag-based single strand consensus sequencing (e.g., SSCS), and Duplex Sequencing. We applied the three sequencing methods to categorize and analyze high-frequency and low-frequency mutations, separately. Furthermore, analyses were done with several independent DNA library preparation experiments of an identical biological sample to evaluate the detection consistency, reproducibility, and validity of each sequencing method. Heating samples, a common practice in preparing DNA for molecular biology experiments, can introduce such artifactual mutations [11]. Herein, we present heat-induced artifactual mutations identified using Duplex Sequencing and specific nucleotide contexts that contribute to a high level of heat-induced artifactual mutations.

Results
Duplex Sequencing generates both SSCS and DCS analysis results. In Duplex Sequencing, both strands of DNA are individually tagged and strands with identical tag sequence, the product of the same DNA template, are grouped together after PCR amplification. SSCS analysis differs from DCS analysis in that complementary tag sequences are not identified, and so complementary strands are not grouped [6]. The SSCS method represents a tag-based single strand sequencing procedure and is comparable to that of Safe Sequencing System (SafeSeqS) in that each single-stranded DNA molecule is uniquely labeled before PCR amplification, allowing strands of the same derivatives to be grouped [9,12].
The average number of nucleotides sequenced at each genome position (depth) of all conventional NGS, SSCS, and DCS analyses were calculated as the total number of nucleotides sequenced divided by the mtDNA size of 16,569 bases. The average depths for conventional NGS, SSCS, and DCS analyses for normal human breast cells and immortalized cells are presented in Tables S1 and S2. The highest depths of conventional NGS, SSCS and DCS that were processed under the same data processing conditions were 458441, 40421 and 6803, respectively (Table S1).
As an attempt to overcome the high error rates of conventional NGS, more conservative bioinformatical conditions than those applied for SSCS and DCS, referred to as conventional NGS (Q30 r ) hereinafter (See Section 4.4.2. Materials and Methods), were applied to conventional NGS datasets. Results of conventional NGS (the same bioinformatical conditions as to SSCS and DCS) and conventional NGS (Q30 r ) are presented in Supplementary Materials Section, Figures S1-S4.   Table S3) and rare (0-1%: Figures 2-5, Figures S3-S6, Tables S1, S2, S4-S7) mutations based on the mutation occurrence (%) at each genome position. Mutation frequencies were calculated by dividing the number of variants by the total number of nucleotides sequenced.

Homoplasmic Mutations are Detectable by All Three Methods (Conventional NGS, Tag-Based Single Strand Sequencing, and Duplex Sequencing) with the Similar Accuracy and Reproducibility
The overall frequencies ( Figure 1A) of homoplasmic point mutations and frequencies of each mutation type ( Figure 1B) are almost identical across all independent experiments of conventional NGS (Q30 r ), SSCS, and DCS analyses.
In our study, 35 identical homoplasmic unique mutations were detected in all independent experiments regardless of sequencing methods used (Table S3). Taken together, all three sequencing methods are accurate enough to study highly prevalent mutations such as germline mutations of the nuclear genome and homoplasmic mutations of the mitochondrial genome.

Rarely Occurring Mutations are Neither Accurately Detectable by Conventional NGS Methods nor Tag-Based Single Strand DNA Sequencing, but are Accurately Detectable by Duplex Sequencing
Rare mutation frequencies of immortalized human breast cells were determined using conventional NGS, SSCS, and DCS methods. The average rare mutation frequencies of the independent experiments are significantly lower in SSCS (1.30 × 10 −4 ) and DCS (1.04 × 10 −5 ) by 5-fold and 67-fold, respectively than that of conventional NGS (Q30 r ) (7.00 × 10 −4 ) ( Figure 2A, Table S1). This indicates that Duplex Sequencing removes false-positive artifactual mutations and significantly reduces the rare mutation frequencies. For this study, we have defined homoplasmic (90-100%: Figure 1, Figures S1 and S2, The overall frequencies ( Figure 1A) of homoplasmic point mutations and frequencies of each mutation type ( Figure 1B) are almost identical across all independent experiments of conventional NGS (Q30 r ), SSCS, and DCS analyses.
In our study, 35 identical homoplasmic unique mutations were detected in all independent experiments regardless of sequencing methods used (Table S3). Taken together, all three sequencing methods are accurate enough to study highly prevalent mutations such as germline mutations of the nuclear genome and homoplasmic mutations of the mitochondrial genome.

Rarely Occurring Mutations are Neither Accurately Detectable by Conventional NGS Methods nor
Tag-Based Single Strand DNA Sequencing, but are Accurately Detectable by Duplex Sequencing Rare mutation frequencies of immortalized human breast cells were determined using conventional NGS, SSCS, and DCS methods. The average rare mutation frequencies of the independent experiments are significantly lower in SSCS (1.30 × 10 −4 ) and DCS (1.04 × 10 −5 ) by 5-fold and 67-fold, respectively than that of conventional NGS (Q30 r ) (7.00 × 10 −4 ) ( Figure 2A, Table S1). This indicates that Duplex Sequencing removes false-positive artifactual mutations and significantly reduces the rare mutation frequencies.
The frequencies of rare mutations are highly variable in independent experiments analyzed with conventional NGS (Q30 r ) (Figure 2A), whereas frequencies of rare mutations show reproducible results in independent experiments of DCS of Duplex Sequencing (Figure 2A,C). It is noted that conventional NGS (Q30 r ) datasets were processed under more conservative conditions (See Section 4.4.2. Materials and Methods) than those of SSCS and DCS; however, these bioinformatical modifications only lowered rare mutation frequency by, on average, 35% ( Figures S3, S4, and Table S1). Furthermore, variations in rare mutation frequencies are inconsistent between "the conventional NGS results with the bioinformatical default conditions" and "the conventional NGS results with the more conservative bioinformatical modification (Q30 r )" across the four independent experiments. Before applying more conservative conditions of the bioinformatical modifications, the rare mutation frequencies of experiments #2 and #4 are significantly greater than those of experiments #1 and #3 respectively ( Figure  S3). Moreover, the frequencies are more comparable for experiments #1 and #2 compared to #3 and #4 ( Figure S3). However, after the application of bioinformatical modification (Q30 r ), the rare mutation frequencies of experiments #1 and #3 are significantly greater than those of experiments #2 and #4, respectively and the frequencies of experiments #3 and #4 are comparable ( Figure S3). This indicates that the bioinformatical modification alone is not possible to overcome the high error rate of NGS.  Table S1). Furthermore, variations in rare mutation frequencies are inconsistent between "the conventional NGS results with the bioinformatical default conditions" and "the conventional NGS results with the more conservative bioinformatical modification (Q30 r )" across the four independent experiments. Before applying more conservative conditions of the bioinformatical modifications, the rare mutation frequencies of experiments #2 and #4 are significantly greater than those of experiments #1 and #3 respectively ( Figure S3). Moreover, the frequencies are more comparable for experiments #1 and #2 compared to #3 and #4 ( Figure S3). However, after the application of bioinformatical modification (Q30 r ), the rare mutation frequencies of experiments #1 and #3 are significantly greater than those of experiments #2 and #4, respectively and the frequencies of experiments #3 and #4 are comparable ( Figure S3). This indicates that the bioinformatical modification alone is not possible to overcome the high error rate of NGS.
Frequencies of each type of rare mutations reveal other significant differences between conventional NGS (Q30 r ), SSCS, and DCS analyses. In conventional NGS (Q30 r ) results, C > T/G > A transitions and C > A/G > T transversions are identified at high frequencies ( Figure 2D-G). In SSCS results, C > A/G > T transversions are the most predominant mutation type followed by C > T/G > A transitions ( Figure 2H,I). In contrast, DCS results indicate that C > T/G > A transitions and C > A/G > T transversions are no longer predominant and no particular type is more prominent than others ( Figure 2J,K). Our data suggest that C > A/G > T transversions appear to be the most prevalent type of artifactual variants that are scored by both conventional NGS (Q30 r ) ( Figure 2D-G) and SSCS ( Figure 2H,I) methods. Frequencies of each type of rare mutations reveal other significant differences between conventional NGS (Q30 r ), SSCS, and DCS analyses. In conventional NGS (Q30 r ) results, C > T/G > A transitions and C > A/G > T transversions are identified at high frequencies ( Figure 2D-G). In SSCS results, C > A/G > T transversions are the most predominant mutation type followed by C > T/G > A transitions ( Figure 2H,I). In contrast, DCS results indicate that C > T/G > A transitions and C > A/G > T transversions are no longer predominant and no particular type is more prominent than others ( Figure 2J,K). Our data suggest that C > A/G > T transversions appear to be the most prevalent type of artifactual variants that are scored by both conventional NGS (Q30 r ) ( Figure 2D-G) and SSCS ( Figure 2H,I) methods.
Proportions (%) of each type of rare mutations were analyzed. The prevalent rare mutation types differ under the three sequencing methods. C > A/G > T transversions are the most dominant type of rare mutation with conventional NGS (Q30 r ) data ( Figure 3A); however, the fraction (%) of C > A/G > T transversions vary widely across the four independent conventional NGS experiments. In contrast, comparable fractions (%) of each mutation type are observed in both DCS independent experiments ( Figure 3A).
Influences of neighboring bases on mutations were examined by conducting a mutation context spectra analysis. This analysis identifies bases immediately 5 and 3 to a mutated base (i.e., the mutation appears at the second position of each trinucleotide) and enables classifying observed substitutions into 96 categories (4 bases × 6 substitutions × 4 bases) [13,14]. Significant variations are observed for rare mutation context spectra of conventional NGS (Q30 r ) data across four independent experiments ( Figure 3B-E) and it is not possible to identify a dominant mutation context among them. In contrast, rare mutation context spectra are similar in all independent DCS experiments. For example, C > T transitions in contexts ACA and ACT occur at persistently high proportions in DCS data ( Figure 3H,I).
Influences of neighboring bases on mutations were examined by conducting a mutation context spectra analysis. This analysis identifies bases immediately 5′ and 3′ to a mutated base (i.e., the mutation appears at the second position of each trinucleotide) and enables classifying observed substitutions into 96 categories (4 bases × 6 substitutions × 4 bases) [13,14]. Significant variations are observed for rare mutation context spectra of conventional NGS (Q30 r ) data across four independent experiments ( Figure 3B-E) and it is not possible to identify a dominant mutation context among them. In contrast, rare mutation context spectra are similar in all independent DCS experiments. For example, C > T transitions in contexts ACA and ACT occur at persistently high proportions in DCS data ( Figure 3H,I).

Duplex Sequencing Identifies and Corrects the Heat-Induced Artifactual Variants Introduced During DNA Sample Preparation
We investigated which specific types of artifactual variants are introduced during DNA sample preparation such as heat treatments and to what extent these artifacts can be corrected by Duplex Sequencing. DNA was isolated from normal human breast primary cells (II) and an aliquot of DNA was incubated at 65 • C for 9 h. Unheated DNA served as the control. Libraries of heated and control DNA were prepared for Duplex Sequencing. To identify heat-induced specific variant types, we performed both SSCS and DCS analyses. The average SSCS and DCS depths of the whole mtDNA genome were similar for control DNA and heated DNA: SSCS (control: 12,257 and heated: 11,622) and DCS (control 2248 and heated: 2510) (Table S2). To closely examine the heat-induced artifactual variants, which are not detectable or distinguishable by conventional sequencing methods, we investigated the rare variants that occur at a frequency of 1% or less using Duplex Sequencing.
The rare mutation frequencies of SSCS are significantly higher than those of DCS ( Figure 4A). This higher SSCS mutation frequency could be due to heat-induced DNA damage and/or errors during PCR-amplification. Such artifactual variants are present on only one of the two DNA strands and thus they are not scored in DCS of Duplex Sequencing. While the incubation of DNA at 65 • C significantly increased the rare mutation frequency in SSCS analysis ( Figure 4A: the first and second bars: p-value < 2.2 × 10 −16 ), both heated and control DNA displayed identical frequencies of rare mutations in DCS analysis ( Figure 4A: the third and fourth bars). Our results clearly indicate that DCS analysis by Duplex Sequencing is not affected by heat-induced DNA damage introduced during DNA sample preparation and correctly represents true mutations. mtDNA genome were similar for control DNA and heated DNA: SSCS (control: 12,257 and heated: 11,622) and DCS (control 2248 and heated: 2510) (Table S2). To closely examine the heat-induced artifactual variants, which are not detectable or distinguishable by conventional sequencing methods, we investigated the rare variants that occur at a frequency of 1% or less using Duplex Sequencing.
The rare mutation frequencies of SSCS are significantly higher than those of DCS ( Figure 4A). This higher SSCS mutation frequency could be due to heat-induced DNA damage and/or errors during PCR-amplification. Such artifactual variants are present on only one of the two DNA strands and thus they are not scored in DCS of Duplex Sequencing. While the incubation of DNA at 65 °C significantly increased the rare mutation frequency in SSCS analysis ( Figure 4A: the first and second bars: p-value < 2.2 × 10 −16 ), both heated and control DNA displayed identical frequencies of rare mutations in DCS analysis ( Figure 4A: the third and fourth bars). Our results clearly indicate that DCS analysis by Duplex Sequencing is not affected by heat-induced DNA damage introduced during DNA sample preparation and correctly represents true mutations. Overall rare mutation frequency (A) and frequencies of rare mutation types (B,C) for heated versus control DNA were determined by SSCS and DCS analyses. Error bars represent the Wilson score 95% confidence intervals. The significant differences in rare mutation frequencies between the control DNA and the heated DNA are indicated (*** p < 5 × 10 −5 by the Chi-square test).

Duplex Sequencing Identifies the Specific Mutation Spectra of Heat-Induced Artifacts
We further examined which specific variant type(s) contributed to the elevated SSCS rare mutation frequency in heated DNA. In SSCS, but not DCS, the heated DNA ( Figure 4B) shows a significant increase in rare variant frequencies of C > A/G > T, C > T/G > A, C > G/G > C versus control DNA ( Figure 4B). In contrast, the 65 °C incubation (heating DNA) did not affect the rare mutation spectra of DCS results ( Figure 4C). For example, the SSCS rare variant frequency of C > A Overall rare mutation frequency (A) and frequencies of rare mutation types (B,C) for heated versus control DNA were determined by SSCS and DCS analyses. Error bars represent the Wilson score 95% confidence intervals. The significant differences in rare mutation frequencies between the control DNA and the heated DNA are indicated (*** p < 5 × 10 −5 by the Chi-square test).

Duplex Sequencing Identifies the Specific Mutation Spectra of Heat-Induced Artifacts
We further examined which specific variant type(s) contributed to the elevated SSCS rare mutation frequency in heated DNA. In SSCS, but not DCS, the heated DNA ( Figure 4B) shows a significant increase in rare variant frequencies of C > A/G > T, C > T/G > A, C > G/G > C versus control DNA ( Figure 4B). In contrast, the 65 • C incubation (heating DNA) did not affect the rare mutation spectra of DCS results ( Figure 4C). For example, the SSCS rare variant frequency of C > A in heated DNA is 3.26 × 10 −5 . This heat-induced artifactual variant type is significantly reduced by 170-fold to 1.88 × 10 −7 in DCS analysis.
Fractions (%) of each type ( Figure 5A) and each context spectrum of rare mutations ( Figure 5B-E) were examined for heated versus control DNA. The heat-induced DNA damage results in increases in C > G/G > C in SSCS analysis (gray-black bars in Figure 5A-C). Out of the 96 possible mutation sequence contexts, 28 are significantly changed after the 65 • C incubation in SSCS analysis ( Figure 5B-C, Table S4). Particularly, CCC, TCC and CCA contexts of C > G mutations showed the most significant increase in the heated DNA compared to the control (unheated) DNA in SSCS analysis ( Figure 5C, Table S4). In contrast, these DNA damage-dependent changes of specific mutation context spectra are not observed in DCS analysis, irrespective of neighboring nucleotides ( Figure 5A,D,E).
( Figure 5B-C, Table S4). Particularly, CCC, TCC and CCA contexts of C > G mutations showed the most significant increase in the heated DNA compared to the control (unheated) DNA in SSCS analysis ( Figure 5C, Table S4). In contrast, these DNA damage-dependent changes of specific mutation context spectra are not observed in DCS analysis, irrespective of neighboring nucleotides ( Figure 5A,D,E).

Independent Experiments of Duplex Sequencing Reproducibly Identify the Heat-Induced Artifactual Mutaiton Profiles
Two independent cell culture and DNA library experiments (I and II) for the incubation of DNA at 65 • C for 9 h were conducted with DNA isolated from normal human epithelial cells (I and II) derived from breast tissue of the same woman. The average SSCS and DCS depths were similar between the cells I and cells II: SSCS (I: 11,622 and II: 11,045) and DCS (I: 2510 and II: 2460) (Table S5).
Rare mutation frequencies of heated DNA of the cells I ( Figure S5) were calculated for both SSCS and DCS, and the analysis reveals the same pattern observed with heated DNA of the cells II used for the main result Figure 4 experiment (Figure S5). Both the overall rare mutation frequencies and the rare mutation frequency of each mutation type are observed at similar levels between the heated DNA of the cells I and II ( Figure S5). Furthermore, fractions (%) of each type ( Figure S6A) and context spectra ( Figure S6B-E) of rare mutations are almost identical, strengthening the finding that Duplex Sequencing is capable of identifying and correcting the heat-induced artifactual mutations and the results are reproducible in independent experiments.

All Identified Heat-Induced Artifactual Variants are Stochastically Occurring Variants Throughout the Whole Mitochondrial Genome
A total of 3383 heat-induced artifactual unique variants were identified, all these are in the variant occurrence (%) range of 0-1% ( Figure 6A). This clearly indicates that all of the heat-induced artifactual variants introduced during DNA preparation are rarely occurring variants, thus not accurately and reliably detectable by conventional DNA sequencing methods.

S5).
Rare mutation frequencies of heated DNA of the cells I ( Figure S5) were calculated for both SSCS and DCS, and the analysis reveals the same pattern observed with heated DNA of the cells II used for the main result Figure 4 experiment (Figure S5). Both the overall rare mutation frequencies and the rare mutation frequency of each mutation type are observed at similar levels between the heated DNA of the cells I and II ( Figure S5). Furthermore, fractions (%) of each type ( Figure S6A) and context spectra ( Figure S6B-E) of rare mutations are almost identical, strengthening the finding that Duplex Sequencing is capable of identifying and correcting the heat-induced artifactual mutations and the results are reproducible in independent experiments.

All Identified Heat-Induced Artifactual Variants are Stochastically Occurring Variants Throughout the Whole Mitochondrial Genome
A total of 3383 heat-induced artifactual unique variants were identified, all these are in the variant occurrence (%) range of 0-1% ( Figure 6A). This clearly indicates that all of the heat-induced artifactual variants introduced during DNA preparation are rarely occurring variants, thus not accurately and reliably detectable by conventional DNA sequencing methods. Of the all identified artifactual variants, about 92% of them are found on coding regions of mtDNA and 69% of them are found within the 13 protein-coding regions of mtDNA ( Figure 6A, Table S6). The percentages of artifactual variants found on coding regions and 13 protein-coding regions of mtDNA closely match the percentages these two regions occupy in the whole mtDNA, which are about 92% and 68%, respectively [15]. In addition, the number of artifactual variants identified shows a strong positive correlation with the sizes of 13 protein-coding genes (Pearson's correlation coefficient r = 0.99 and p = 2.36 × 10 −10 ), which indicates that more artifactual variants are found on larger genes ( Figure 6B). We examined the 13 protein-coding genes of mtDNA to see if any particular gene is relatively more or less prone to heat-induced variants. For each gene, we calculated the percent of variants by dividing the numbers of variants by each gene size (bases). Among the 13 genes, MT-CO2 is slightly more mutated at 23.39% and MT-ND3 is mutated the least at 13.01% (Table S7). The majority (11 out of 13) of these genes are mutated at similar variant occurrences (average 20%), which is consistent with the finding that artifactual variants occur stochastically.

Discussion
In this study, we sequenced the entire mitochondrial DNA genome of human breast cells via three different sequencing protocols: conventional NGS, tag-based single strand sequencing (e.g., SSCS), and Duplex Sequencing. We systematically compared high-frequency and low-frequency mutations obtained from the three methods. We demonstrate advantages of Duplex Sequencing over other sequencing methods for studying rarely occurring mutations. In addition, we identified the heat-induced artifactual variants. Although Duplex Sequencing has been used in a previous study to show an increased level of mutation frequency of small selected regions of nuclear genome in DNA incubated at 65 • C [11], the same temperature used in this study, exact identities of the heat-induced artifactual variants have not been presented. Moreover, while types of artifactual variants have been previously examined, influences of neighboring nucleotide base context on artifactual variants have not been investigated. To our knowledge, this is the first study to present the exact identities of the heat-induced artifactual variants and the specific nucleotide context spectra for these artifacts.
Our data show that rare mutation frequencies are significantly lower in DCS analysis in comparison to conventional NGS and SSCS analyses, suggesting that large number of rare mutations detected by conventional NGS and SSCS are mostly artifacts (Figure 2A). Particularly, C > A/G > T transversions, which have been previously reported to be a predominant result of DNA oxidation [11,16], showed the greatest decrease in frequencies with DCS analyses of Duplex Sequencing ( Figure 2J,K). These findings validate that DCS can identify and correct artifacts and be applied for accurately detecting rarely occurring mutations. Furthermore, the comparison of rare mutation frequencies between multiple independent experiments (Figure 2) demonstrates the ability of DCS in producing reproducible results. However, the rare mutation data by conventional NGS and SSCS shows high variability across independent experiments, which indicates the lack of the capability of conventional NGS and SSCS to produce reliable and reproducible results. The comparison of rare mutation context spectra analyses (Figure 3) further distinguishes DCS from conventional NGS and SSCS by showcasing its advantage to accurately and consistently detect rarely occurring mutations.
Artifactual variants can be generated as a result of copying damaged DNA bases. Such variants are present on only one of the two DNA strands and are scored by conventional NGS and SSCS but not by DCS. In the present study, we have identified heat-induced artifactual variants by performing both SSCS and DCS analyses. Our results indicate that C > A/G > T is the most predominantly enhanced artifactual variant followed by C > T/G > A and C > G/G > C. Previous studies reported that heating DNA can damage DNA bases by forming oxygen free radicals, (specifically 8-hydroxy-2 -deoxyguanine (8-Oxo-dG)), which deaminate cytosine to uracil, and increasing mitochondrial superoxide anion, which also leads to oxidation of DNA [17][18][19]. The 8-Oxo-dG is generated by DNA oxidation under physiopathological conditions or environmental stress. It is also a by-product of normal cellular metabolism [20]. The formation of 8-oxoguanine, particularly 8-oxo-dG has been reported to cause a high level of C > A/G > T mutations [11,16,[20][21][22][23], whereas deamination of cytosine to uracil is known to produce high levels of C > T/G > A mutations [11,19]. In one study, conventional NGS analysis was done on tumors and matching normal tissues of melanoma and an enzyme-linked immunosorbent assay (ELISA) for 8-Oxo-dG found that the CCG > CAG context have a high potential for being a target of DNA oxidation [22]. This context is observed at high mutation frequencies (2-fold increase) in our current study. Thus, it is likely that the high prevalence of C > A/G > T transversions in our data is caused by 8-Oxo-dG, suggesting that the biggest contributor of heat-induced artifactual variants is oxidative damage to mtDNA.
To our knowledge, this study is the first to examine mutation context spectra of heat-induced artifactual variants in the whole mitochondrial DNA of human breast epithelial cells. Our SSCS analysis results showed that out of 96 possible mutation sequence contexts, the fraction of 28 rare mutation sequence contexts were significantly changed after the 65 • C incubation ( Figure 5B,C, Table S4). Among the affected 28 mutation context spectra, CCC, TCC and CCA contexts of C > G variants showed the most significant increase. These mutation contexts could be more prone to DNA damage and may have a high potential for being targeted by molecular reactions which result from the heat-damaged DNA.
In summary, we present a genome-wide comprehensive and comparative analysis of mitochondrial DNA mutations for conventional NGS and tag-based methods (single-strand sequencing and Duplex Sequencing) and demonstrate the identification and removal of heat-induced artifactual variants using Duplex Sequencing. Our results indicate that all of the heat-induced artifactual variants are stochastically occurring rare variants. Thus, these artifactual variants are not accurately detectable by conventional sequencing methods. Even the application of more conservative bioinformatical modification on conventional NGS datasets is not enough to overcome the inherently high error rate of conventional NGS methods. Our data establishes that Duplex Sequencing: (1) accurately and reproducibly detect rare (low-frequency) mutations; (2) is not affected by damage introduced by heating during DNA preparation; (3) identifies and removes the DNA damage-induced artifactual variants.

Cell Culture
Human breast epithelial cells were provided by Drs. Chia-Cheng Chang at Michigan State University in East Lansing, MI, USA. Normal human breast cells used in this study were isolated from breast tissues of a healthy (cancer-free) women 21-29 years of age obtained during reduction mammoplasty at Sparrow Hospital in Lansing, MI, USA. The donors were given ID #13 and #31 for the cells used in this study. Immortalized human breast cells (M13SV1) used for the experiments comparing the three sequencing methods were derived from the parental normal stem cells of woman ID #13 by transforming with SV40 T antigen [24][25][26][27][28][29]. The normal cells used for the heat-induced DNA damage experiments were from woman ID #31. Written consents were received from patients. The use of human breast cells was approved by Michigan State University Institutional Review Board and a Material Transfer Agreement was approved by both Michigan State University and University of Washington. The cells were cultured as previously described [30,31] and were authenticated by short tandem repeat (STR) DNA profiling (Genetica DNA Laboratories, Labcorp brand, Burlington, NC, USA).
For Duplex Sequencing, the sheared DNA was subjected to end-repair with 3 -dT-tailing for adapters with A-overhang or with 3 -dA-tailing for adapter with T-overhang. Before the ligation of DNA with the adapters, DNA was quantitated using quantitative real-time RT-PCR. Then, twenty-fold molar excess of adapter relative to DNA was added for the ligation process and the whole mitochondrial genome was captured using Agilent SureSelect XT target enrichment set protocol version 1.6 (Agilent Technologies Inc., Santa Clara, CA, USA). The DNA libraries were sequenced via Illumina HiSeq 2500 (Illumina Inc., San Diego, CA, USA). A comprehensive and complete list of all materials and equipment used for Duplex Sequencing experiments is available on page 2594 of the manuscript Detecting ultralow-frequency mutations by Duplex Sequencing by Kennedy et al. 2014 [7].
For WES, the extracted DNA was sonicated to approximately 200 bp fragments then used to make a library for paired-end sequencing on an Illumina HiSeq 2500 platform (Illumina Inc., San Diego, CA, USA). The DNA library was captured by SeqCap EZ Exome v2 (Roche NimbleGen Inc., Madison, WI, USA) pools and then was hybridized to biotinylated capture probes.

Heat-Induced DNA Damage
Normal human breast epithelial cells were used for the heat-induced DNA damage experiments (Figures 4-6, Figures S5 and S6). Before DNA library preparation, an aliquot of DNA was incubated at 65 • C for 9 h to induce DNA damage. Then, same experimental protocols were applied for both heated and unheated DNA as described in Section 4.2. Materials and Methods.

Conventional NGS Datasets and conventional NGS, SSCS and DCS Data Processing
Two of four independent conventional NGS datasets (Experiments #1 and #2 in Figure 1 through Figure 3) were obtained by extracting the whole mitochondrial genome data from our WES results. WES is a commonly used next-generation sequencing method that is able to read the entire exome of DNA as well as the mitochondrial genome [34]. The fastq data files for the two WES datasets were processed as previously described [33] with some modifications. Our in-house script was modified to align the reads with the human mitochondria reference file and to include the mitochondrial genome only.
Two additional independent conventional NGS datasets were obtained by modifying our in-house Duplex Sequencing script to simulate conventional NGS processing on DNA libraries prepared for Duplex Sequencing (Experiments #3 and #4 in Figure 1 through Figure 3). The DNA libraries were prepared for Duplex Sequencing; however, the Duplex Sequencing script was modified to proceed only through alignment with a human mtDNA reference but to not do single strand consensus sequence (SSCS) and duplex consensus sequence (DCS) data alignment steps. This negates the individuality of complementary DNA strands when processing sequence data and produces bioinformatically simulated NGS data that is comparable to conventional NGS.
SSCS and DCS datasets were processed as described previously [31]. All datasets were aligned to the Revised Cambridge Reference sequence (rCRS) reference genome, sequence number NC_012920, using BWA and genome analysis toolkit (GATK) software as described previously [31].

Base Quality and PCR Duplicates
An illumina ® base quality score of 30 (Q30) is considered the benchmark for a correct base call in NGS. This score refers to a 1 in 1000 chance of an incorrect base call (error probability of 0.1% versus error probability of 5% with the default score of 13) [35]. Our NGS (Q30 r ) datasets, therefore, were processed with the base quality filtering adjusted to 30 from the default value of 13 by adding "-Q30" to the pileup command (i.e., samtools mpileup -B -d 500,000 -Q30 -f [reference] [input] [output]). However, introduction of artifacts in early stages of PCR amplification are not detectable as errors and are embedded in multiple PCR duplicates [6]. PCR duplicates were removed for conventional NGS (Q30 r ) data by taking the combined sequence-1 and sequence-2 files in bam format and using the command "samtools rmdup -s" of SAMtools software (Genome Research Ltd., Hinxton, UK) [36]. Both bioinformatical modifications accommodate high background error rates of conventional NGS. For DCS and SSCS data analyses, the default base quality score of 13 (error probability of 5%) was used and PCR duplicate removal was not applied since SSCS and DCS analyses have significantly lower error frequencies (1 × 10 −5 and <5 × 10 −8 or 1 × 10 −8 , respectively) than that of conventional NGS (10 −2 to 10 −3 ) [4,9]. Another reason the PCR duplicate removal step is not needed for SSCS and DCS analyses is that the molecular tags mark the duplicates for SSCS and DCS analyses.
The results of conventional NGS analysis obtained under the same bioinformatical conditions as SSCS and DCS (Q13 and no PCR duplicate removal) were presented in Table S1. The effects of bioinformatical modifications (Q13 and no PCR duplicate removal versus Q30 and PCR duplicate removal) on conventional NGS mutation results are presented in Supplementary Materials, Figures S1-S4.

Comparison of Variant Positions
To identify the heat-induced artifactual variants, variant positions were compared after removing the common variants (present in both control and heated DNA) identified by SSCS and DCS analyses. Then, variants found only in the heated DNA from SSCS analysis were considered as heat-induced artifactual variants. Only genome positions that had minimum sequence read (depth) of 20 in both samples were considered.

Counting Mutations
For calculating mutation frequencies, total number of variant reads observed were divided by total number of sequenced reads. For all other analyses, including fractions (%) of each mutation type and mutation context spectra, mutants were scored only once at each position of the genome (i.e., mutants were counted as 1 for each position regardless of number of variant reads observed in that position).

Statistical Analysis
Differences in mutation frequencies and mutation context fractions between control and heated DNA were analyzed by performing a Chi-square test using R (program version 3.4.4). Association of the number of identified unique variants in each of the thirteen protein-coding genes of mtDNA and the sizes of the corresponding thirteen genes was analyzed by Pearson correlation using Sigma Plot (version 12.0, Systat Software, Inc., San Jose, CA, USA). Differences between the groups were considered significant at p < 0.05.