Hidden in the Noise: Low-Variant Allele Frequency Mutations and Their Impact on Precision Oncology

Knebel, Paytin; Harris, Jacob; Steveson, Isaac; Kearns, Bridger; Todeschini, Andrew S.; Perrett, Lindsay; Anderson, DeLaney; Beltran, Erick; Leary, Bryson; Settle, Jonah; Carlson, Isaac; Christensen, Hudson; Trujano, Alberto; Alton, Abraham B.; Dixon, Ken; Barrott, Jared J.

doi:10.3390/jgbg1010004


by Paytin Knebel ¹,², Jacob Harris ¹, Isaac Steveson ¹, Bridger Kearns ¹,², Andrew S. Todeschini ¹, Lindsay Perrett ¹, DeLaney Anderson ¹,², Erick Beltran ¹, Bryson Leary ¹, Jonah Settle ¹, Isaac Carlson ¹, Hudson Christensen ¹, Alberto Trujano ¹, Abraham B. Alton ¹, Ken Dixon ³ and Jared J. Barrott ¹,²,³,*

¹ Department of Cell Biology & Physiology, Brigham Young University, Provo, UT 84602, USA

² Simmons Center for Cancer Research, Brigham Young University, Provo, UT 84602, USA

³ Specicare, 690 Medical Park Ln, Gainesville, GA 30501, USA

* Author to whom correspondence should be addressed.

J. Genome Biotechnol. Genet. 2026, 1(1), 4; https://doi.org/10.3390/jgbg1010004

Submission received: 7 February 2026 / Revised: 24 March 2026 / Accepted: 31 March 2026 / Published: 3 April 2026


Abstract

Intratumoral heterogeneity is a defining feature of cancer, yet standard sequencing and reporting practices often overlook somatic variants present at low variant allele frequencies (VAFs), commonly below 5%. Increasing evidence indicates that these rare alleles can represent clinically meaningful subclones involved in tumor evolution, therapeutic resistance, minimal residual disease, and metastatic dissemination. However, detecting and interpreting low-VAF variants is technically and analytically challenging because background error rates, library artifacts, genomic context, and caller assumptions increasingly overlap with true signal as allele fraction decreases. In this review, we integrate biological and clinical evidence supporting the relevance of low-VAFs and evaluate constraints across sequencing strategies, including whole genome and whole exome approaches and deep targeted panels. We discuss why detectability depends strongly on variant class and genome architecture, with SNVs generally more tractable than indels and structural variants. We then summarize practical approaches that improve sensitivity and specificity beyond increasing depth, including proper tissue handling, molecular enrichment, unique molecular identifiers, duplex-consensus methods, advanced error modeling, and orthogonal validation. Finally, we highlight emerging single-cell, spatial, and multiomic technologies that resolve rare variants in a cellular context. Collectively, these advances support incorporating low-VAF detection into precision oncology frameworks.

Keywords: variant allele frequency; DNA sequencing; precision medicine; cancer; cryopreservation

1. Introduction

Cancer genomes are mosaics of evolving clonal and subclonal populations shaped by mutational processes, chromosomal instability, therapy selection, and microenvironmental pressures [1,2]. Although clinical sequencing workflows have traditionally emphasized variants present at higher variant allele frequencies (VAFs) due to artifacts in clinically available formalin-fixed, paraffin-embedded (FFPE) tissue, a growing body of evidence indicates that biologically and therapeutically consequential alterations often exist below conventional reporting thresholds, frequently at VAFs < 5% [3,4,5]. VAF represents the proportion of sequencing reads covering a locus that support a given variant. Low-VAF mutations (≤5%) typically reflect subclonal tumor populations, while very low-VAF variants (≤1%) approach the background error rate of many sequencing platforms and require specialized detection methods. These low-frequency variants can mark emergent resistant subclones, minimal residual disease, and metastatic precursors, yet they are often filtered as technical noise due to declining signal-to-noise ratios at low allele fractions. The challenge is further influenced by tissue preservation, variant class, and genomic context. Single-nucleotide variants (SNVs) are generally detectable at far lower VAFs than insertions and deletions (indels) and structural variants (SVs), while repetitive and low-complexity regions remain difficult even at substantial depth [6]. This review synthesizes current evidence for the biological and clinical significance of low-VAF variants, outlines the technical and analytical constraints that limit their detection, and evaluates emerging strategies to improve their interpretation in precision oncology.
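As a minimal illustration of the definitions in this paragraph, the sketch below computes a VAF from read counts and bins it using the ≤5% and ≤1% thresholds stated above. The function names and the example read counts are hypothetical and are not taken from any cited study.

```python
def vaf(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads covering a site that support the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must be between 0 and total_reads")
    return alt_reads / total_reads


def vaf_tier(value: float) -> str:
    """Bin a VAF using the thresholds named in the text (<=1% and <=5%)."""
    if value <= 0.01:
        return "very low"
    if value <= 0.05:
        return "low"
    return "above low-VAF range"


# Hypothetical counts: 12 variant-supporting reads out of 400 total reads.
example = vaf(12, 400)
print(f"VAF = {example:.3f} ({vaf_tier(example)})")
```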

Low-frequency variants arise naturally from the evolutionary dynamics of cancer, driven by ongoing mutation, selective pressures from the tumor microenvironment, and therapeutic intervention. As tumors expand and diversify, new subclonal populations emerge while others are suppressed or eliminated, resulting in a constantly shifting distribution of variant allele frequencies. At any given time point, subclones may be present at low or ultralow-VAF either because they represent newly emerging populations or residual clones following treatment [7]. These dynamics create a biological expectation that clinically relevant mutations will often exist at low abundance. However, detecting these variants remains challenging due to sequencing noise, preservation-induced artifacts, and sampling limitations, which can obscure true subclonal signals or generate false positives (see Section 3). Accordingly, understanding how intratumoral heterogeneity gives rise to low-VAF variants provides the foundation for interpreting their clinical significance, as discussed in the following section.

2. Biological and Clinical Context

Different cancer types exhibit substantial intratumoral genetic heterogeneity, manifested as variable gene expression across individual cells and the emergence of genetically distinct subclonal populations [2]. A large-scale analysis of over 5000 tumors from The Cancer Genome Atlas (TCGA) used PhyloWGS to infer clonal structure and demonstrated wide variation in clonal diversity across 32 cancer types, with bladder cancer, lung adenocarcinoma, and ovarian cancer exhibiting the highest mean number of clones per tumor [1]. Importantly, clonal diversity correlated positively with mutation burden and copy number alterations, indicating that both mutational and chromosomal instability drive intratumoral heterogeneity. Because subclones often comprise only a fraction of the tumor mass, many biologically relevant mutations are expected to occur at low-VAFs.

Subclonal heterogeneity is evident across neoplasms and is particularly well characterized in acute lymphoblastic leukemia (ALL), where nearly half of pediatric diagnostic samples harbor exclusively subclonal alterations [8]. Similar patterns have been observed in solid tumors, including hepatocellular carcinoma, where subclone-specific gene expression changes were obscured by bulk RNA sequencing but resolved when subpopulations were examined individually [9]. These findings underscore that bulk analyses systematically underestimate tumor complexity and preferentially miss low-VAF events.

Intratumoral heterogeneity also has clear prognostic implications. Quantitative measures such as Mutant-allele Tumor Heterogeneity (MATH) have been associated with significantly worse overall survival in multiple cancer types, including bladder and pancreatic cancers [10,11]. Moreover, heterogeneity is dynamic: selective pressures from the tumor microenvironment and therapy drive temporal shifts in subclonal composition, allowing low-frequency variants to expand and mediate recurrence or therapeutic resistance [12]. This microcosm of natural selection provides a rationale for how low VAFs can emerge to cause recurrence and resistance in the face of traditional and targeted therapies (Figure 1). Thus, it is expedient to support technologies and tissue management that allow for the most accurate detection of low-VAFs.
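The MATH measure mentioned above can be sketched in a few lines. This assumes the commonly used definition, in which MATH is 100 times the ratio of the scaled median absolute deviation (MAD, with the usual 1.4826 consistency constant) to the median of the VAFs of a tumor's somatic variants. The two VAF lists are invented for illustration only.

```python
from statistics import median


def math_score(vafs):
    """MATH = 100 * (1.4826 * MAD) / median, computed over a tumor's VAFs.

    Assumes the commonly used definition; the inputs are per-variant VAFs.
    """
    values = list(vafs)
    if not values:
        raise ValueError("at least one VAF is required")
    center = median(values)
    if center == 0:
        raise ValueError("median VAF is zero; MATH is undefined")
    deviations = [abs(v - center) for v in values]
    scaled_mad = 1.4826 * median(deviations)
    return 100.0 * scaled_mad / center


# Hypothetical tumors: one with tightly clustered VAFs, one with a wide spread.
homogeneous = [0.40, 0.42, 0.38, 0.41, 0.39]
heterogeneous = [0.45, 0.05, 0.30, 0.12, 0.02]
print(math_score(homogeneous), math_score(heterogeneous))
```

A wider spread of VAFs around the median yields a higher score, which is why the measure is used as a proxy for subclonal diversity.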

Assay choice critically determines whether low-VAF variants are detected. Comparative analyses of whole-genome sequencing (WGS), whole-exome sequencing (WES), transcriptome sequencing, and targeted panels show that each approach yields distinct therapy recommendations [14]. While WGS provides the most comprehensive genomic context, its distributed sequencing depth limits sensitivity for low-VAF variants, with reliable detection often falling off below 5% VAF even at high coverage [15]. In contrast, targeted panel sequencing concentrates depth on selected loci, enabling more accurate quantification of low-VAF mutations. Clinical studies consistently show superior detection of subclonal and resistance-associated variants using panels, including EGFR T790M mutations that are frequently present at < 5% VAF in lung cancer [4,5,16]. Additional panel-based workflows showed 100% sensitivity for variants above 3% VAF [17], further illustrating the diagnostic strength of focused sequencing methods.

WES represents an intermediate strategy, offering deeper coverage of coding regions than WGS and improved sensitivity for low-VAF variants within those regions, but at the expense of detecting structural variants and non-coding alterations. As a result, WES is commonly used in precision oncology research when coding mutations are the primary focus, whereas WGS is reserved for applications requiring broader genomic characterization. However, the increased sensitivity of narrower assays comes at the cost of reduced discovery potential. Targeted panels are inherently limited to predefined regions and may miss novel structural variants or unexpected genomic events that are detectable by WGS or transcriptome sequencing [18,19,20]. Collectively, these findings highlight a fundamental tradeoff between genomic breadth and low-VAF sensitivity. Because clinically meaningful subclones often exist at low allele fractions, effective precision oncology requires sequencing strategies that deliberately balance coverage depth and genomic scope to ensure that biologically and therapeutically relevant variants are not overlooked.
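The breadth-versus-depth tradeoff described above is, at its simplest, arithmetic: a fixed sequencing output spread over a smaller target yields proportionally higher mean depth. The sketch below is an idealized illustration only. The target sizes (about 3.1 Gb for WGS, 35 Mb for an exome, 1 Mb for a panel), the 100 Gb output, and the assumption that every base lands on target are all hypothetical simplifications, and duplicates and uneven coverage are ignored.

```python
def mean_depth(total_bases: float, target_bp: float, on_target: float = 1.0) -> float:
    """Idealized mean depth: usable sequenced bases divided by target size."""
    if target_bp <= 0 or not 0 < on_target <= 1:
        raise ValueError("target_bp must be positive and on_target in (0, 1]")
    return total_bases * on_target / target_bp


# Hypothetical target sizes in base pairs (assumptions, not values from the text).
TARGETS = {"WGS": 3.1e9, "WES": 3.5e7, "panel": 1.0e6}
OUTPUT_BASES = 1.0e11  # hypothetical 100 Gb of sequencing output

for assay, size in TARGETS.items():
    depth = mean_depth(OUTPUT_BASES, size)
    # Expected number of variant-supporting reads for a 1% VAF variant.
    expected_alt = depth * 0.01
    print(f"{assay}: mean depth {depth:,.0f}x, expected alt reads at 1% VAF {expected_alt:,.1f}")
```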

3. Technical and Analytical Considerations

3.1. Impact of Tissue Storage Format and FFPE-Induced Artifacts

FFPE preservation, while standard in pathology, chemically alters and degrades genetic material. While protocols can vary with each facility, FFPE typically requires a multi-day process where the tissue is placed in a formalin solution, dehydrated with varying alcohol solutions, and then embedded in paraffin wax. This process preserves tissue architecture, but studies have shown that it can alter genetic material. The formalin fixation process causes DNA crosslinking, fragmentation, and base damage such as cytosine deamination, which lowers DNA quantity and quality and introduces sequencing artifacts when compared to fresh, flash-frozen, or cryopreserved tissues [21,22]. Consequently, FFPE samples consistently exhibit poorer sequencing quality metrics, increased background noise, and greater discordance in variant calls compared with fresh or frozen tissue [23,24]. These effects translate into practical limitations, with next-generation sequencing (NGS) failing in 20–40% of FFPE samples and up to 60% of FFPE tumor specimens rejected in clinical trial settings due to inadequate quality [25].

These limitations disproportionately impact the detection of low-VAFs. DNA damage and reduced library complexity necessitate stringent artifact filtering, which preferentially suppresses low-VAF signals while allowing high-VAF mutations to remain detectable. In contrast, flash-frozen and cryopreserved tissues preserve DNA and RNA in a near-native state, yielding higher molecular weight nucleic acids and improved sequencing performance [23,26].

Empirical evidence underscores the magnitude of this effect. In our recent analysis of 50 matched samples, WGS of cryopreserved tissue consistently detected more structural variants and oncogenic driver mutations than matched FFPE specimens [21]. FFPE samples showed inflated tumor mutational burden (13.7 vs. 6.4 mutations/Mb in frozen tissue), suggesting false-positive artifact calls, and only 43.5% overlap in variants with VAF > 5% between paired samples. Notably, discordance increased further at lower allele fractions, and even clinically actionable mutations were inconsistently detected in FFPE. Similar findings were reported by the 100,000 Genomes Project, where 16% of FFPE-derived samples failed sequencing outright and concordance with frozen tissue was modest: 71% for SNVs and only 44% for copy-number alterations, although high-VAF hotspot mutations showed higher concordance [26].

Alternative preservation strategies highlight the tradeoffs between clinical feasibility and molecular integrity. Fresh tissue provides the highest-quality DNA but is rarely practical in routine workflows due to rapid degradation. Cryopreservation maintains DNA and RNA integrity with minimal chemical modification and preserves cell viability, enabling downstream functional assays, though it requires specialized storage. Flash-freezing provides similar molecular quality but compromises cell viability. Despite logistical challenges, both approaches outperform FFPE in preserving sequence fidelity and enabling reliable detection of low-frequency variants.

Collectively, these findings demonstrate that FFPE preservation introduces both false negatives and false positives and is particularly ill-suited for resolving low-VAF subclonal mutations. Although FFPE-specific error filters can partially rescue high-confidence variants, they further risk discarding true low-frequency events that resemble artifacts. When accurate detection of rare variants is a priority, cryopreserved tissue provides a substantially more reliable substrate for genomic profiling.

3.2. Bioinformatic Pipelines as a Primary Source of Error

Independent of sequencing chemistry, bioinformatic interpretation constitutes a major source of error in low-VAF variant detection. A multi-laboratory assessment using synthetic plasmids encoding challenging pathogenic variants demonstrated that most laboratories generated sufficient sequencing data, yet failed to detect clinically relevant mutations due to limitations in variant-calling pipelines rather than read evidence [27]. Manual review revealed that supporting reads were frequently present but were excluded by overly stringent filters or caller assumptions. Approximately 13% of pathogenic variants across a cohort of more than 470,000 patients met criteria for being “challenging,” reflecting sequence context, repetitive content, or structural complexity.

Benchmarking studies further show that modality-aware variant callers substantially outperform bulk callers repurposed for specialized data types. When bulk variant callers are applied to single-cell or ultra-deep sequencing data without tailored filtering, false-positive rates increase sharply, whereas callers designed for specific sequencing modalities achieve superior performance at allele fractions below 1% [28]. These findings underscore that analytical pipelines must be co-designed with sequencing strategy and expected VAF range.

3.3. Mutational Signatures, Reference Dependence, and Hidden Bias

Analytical bias also arises during the interpretation of mutational signatures. Signature detection depends on prior knowledge of mutational processes, meaning that novel or low-prevalence signatures may remain undetected if absent from reference catalogs [29]. Half of all cataloged variants in ClinVar are classified as variants of uncertain significance. Accuracy is further influenced by tumor mutational burden, signature similarity, and exposure strength, with low-level signatures often masked by dominant processes. Systematic differences between WGS and WES have been observed, reinforcing that sequencing strategy can bias downstream biological interpretation, particularly in heterogeneous or low-mutation tumors [29].

3.4. Lack of Standardization and Limits of Detection in Clinical NGS

The absence of standardized practices for NGS assay design, variant calling, and reporting further complicates low-VAF detection. Although targeted panels are increasingly used to identify actionable mutations, clinical workflows often focus on a limited set of predefined genes, and there is no universally accepted threshold for reporting low-VAF variants [30]. As a result, clinically relevant subclonal mutations may be detected but not reported or inconsistently interpreted across laboratories.

Interlaboratory variability further highlights this challenge. While high-confidence variant calls are generally reproducible across centers using standardized pipelines, concordance declines markedly as VAF decreases [31]. This variability is driven in part by the stochastic nature of sequencing coverage, where low read depth and uneven sampling reduce confidence in low-frequency variant calls. A methodological review of clinical NGS workflows demonstrated that even at 100× coverage with a nominal requirement of 10 supporting reads, false-negative rates approached 45%, corresponding to an effective limit of detection near 10% VAF [32]. Increasing depth to 500× improved performance but still resulted in inconsistent detection across laboratories, with reported limits of detection ranging from 5 to 15% VAF. Only at depths exceeding 1650× did low-frequency variants near 3% VAF become reliably detectable, highlighting the escalating cost and diminishing returns of depth-based approaches alone.
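The depth figures above can be related to a simple sampling model. The sketch below treats the number of variant-supporting reads as a binomial draw with success probability equal to the VAF, and asks how often a minimum read threshold is met. This is an idealized model that ignores sequencing error, uneven coverage, and tumor purity, and it is not the method of the cited review; the function name is hypothetical. Under this model, 100× depth with a 10-read requirement misses a 10% VAF variant about 45% of the time, consistent with the figure quoted above.

```python
from math import comb


def detection_probability(depth: int, vaf: float, min_alt_reads: int) -> float:
    """P(at least min_alt_reads variant reads) when alt reads ~ Binomial(depth, vaf)."""
    if depth < 0 or min_alt_reads < 0 or not 0.0 <= vaf <= 1.0:
        raise ValueError("invalid depth, threshold, or VAF")
    # Sum the probability of seeing fewer than the required number of reads.
    p_below = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min(min_alt_reads, depth + 1))
    )
    return 1.0 - p_below


# 100x coverage, 10% VAF, 10 supporting reads required (values from the text).
miss_rate = 1.0 - detection_probability(100, 0.10, 10)
print(f"Idealized false-negative rate: {miss_rate:.3f}")

# How the same 10-read requirement behaves for a 3% VAF variant at other depths.
for depth in (100, 500, 1650):
    print(depth, round(detection_probability(depth, 0.03, 10), 4))
```

Real assays fall short of these idealized probabilities because duplicate reads, strand bias, and artifact filters all reduce the effective depth.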

In parallel, variability in reporting guidelines further contributes to inconsistency in clinical interpretation. Multiple sets of oncological variant reporting standards have been proposed by different professional organizations, leading to differences in how low-VAF variants are classified and communicated to clinicians [33]. Without harmonized thresholds for detection, validation, and reporting, low-VAF variant interpretation remains highly dependent on institutional practices.

Collectively, these factors demonstrate that low-VAF detection is not only a technical challenge, but also a problem of standardization and reproducibility. Addressing these issues will require coordinated efforts to define consistent limits of detection, establish reporting guidelines, and align bioinformatic pipelines across clinical laboratories.

4. Variant-Class–Specific Challenges in Low-VAF Detection

Building on the technical constraints described in Section 3, the detectability of low-frequency variants is strongly influenced by variant class, as different types of genomic alterations generate distinct signal characteristics and error profiles. Across sequencing platforms and assay designs, a consistent hierarchy in low-VAF detectability emerges: SNVs are most readily detected, followed by indels, while structural variants (SVs) remain the most challenging.

4.1. SNVs: Highest Sensitivity at Low-VAF

SNVs are consistently the most tractable class at low-VAFs, including below 1% VAF. This reflects the relative simplicity of SNV signals, the maturity of short-read sequencing technologies, and extensive development of error-aware computational frameworks. Methods such as RareVar demonstrate high precision and recall near 1% VAF under deep sequencing conditions, while machine-learning-based approaches like DETexT further extend SNV detectability even at reduced sequencing depths [34,35]. These studies highlight that SNV detection at low-VAF is both technically feasible and increasingly robust, even under conditions of low tumor purity or high intratumoral heterogeneity (Table 1).

4.2. Indels: Intermediate Detectability with Size-Dependent Limitations

Indels exhibit intermediate detectability, with performance strongly influenced by event size and sequencing modality. Short-read callers generally perform well for small indels but show declining sensitivity as insertion or deletion length increases, whereas long-read approaches improve recall across a broader size range [6]. Targeted error-minimization strategies, such as svCapture, have enabled reliable detection of indels and SV junctions near 1% VAF, though false-positive rates increase below this threshold [36]. Importantly, detection of larger indels is further constrained by DNA fragmentation, rendering fresh or frozen tissue a prerequisite for accurate analysis (Table 1).

4.3. Structural Variants: Lowest Sensitivity at Low-VAF

Structural variants represent the most analytically challenging class at low allele fractions. Benchmarking studies using synthetic mosaic samples demonstrate that SV detection sensitivity drops sharply below 5% VAF across sequencing platforms, with recall remaining limited below 1% even at cumulative coverage exceeding 2000× [37]. Long-read technologies outperform short-read approaches by spanning breakpoints and repetitive regions, but gains diminish as VAF decreases and require higher DNA input and cost [37,38]. Short-read WGS, in contrast, frequently fails to anchor reads spanning insertions or complex rearrangements, leading to systematic under-detection of low-frequency SVs [39,40]. While single-cell and breakpoint-targeted approaches can partially overcome these limitations, they introduce substantial experimental complexity and limited throughput [41] (Table 1).

Table 1. Common variants and their descriptions.

Variant Class	Typical Size Range	Relative Detectability	Key Technical Strengths	Major Detection Limitations	Sequencing/Methods with Best Performance	Citations
SNVs	1 bp	Highest (detectable <1%, sometimes ≤0.1%)	Mature error models; extensive algorithmic optimization; effective use of UMIs and ML-based callers	Sensitivity drops sharply at reduced depth; false positives without error modeling	Short-read WGS/WES with error-aware callers (RareVar, DeepVariant); ML-based methods (DETexT); deep targeted panels	[34,35,42]
Indels	1–50 bp (larger indels > 50 bp overlap with SVs)	Intermediate (reliable 1–5%, size-dependent)	Improved performance with long-read sequencing; targeted error suppression reduces artifacts	Poor detection of larger indels in short reads; algorithm performance highly size-dependent	Long-read WGS (PacBio HiFi, ONT); error-minimized targeted capture (svCapture)	[6,36,42]
SVs	≥50 bp (insertions, deletions, inversions, duplications, translocations)	Lowest (sharp sensitivity loss < 5%; limited < 1%)	Long-read sequencing resolves breakpoints; single-cell and breakpoint-based methods improve resolution	High false-positive rates; poor performance in repetitive regions; lack of gold standards; high cost and DNA input	Long-read WGS (PacBio HiFi > ONT > short-read); single-cell breakpoint-based approaches	[37,38,41,43]
All Variant Classes (WES context)	—	Unreliable below 5% VAF	Broad coverage; cost-efficient for high-VAF variants	High false-positive rate at low-VAF; poor concordance across sample types	Not recommended without UMIs or ultra-deep targeting	[44]
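The size ranges in Table 1 can be expressed as a small classifier over reference and alternate alleles. This is a rough sketch under stated assumptions: alleles are given as explicit VCF-style sequences, the 50 bp boundary is treated as the indel/SV cutoff even though the table notes that the two classes overlap near it, and same-length multi-base substitutions are labeled "other" because the table does not cover them. Symbolic alleles, inversions, and translocations are outside its scope, and the function name is hypothetical.

```python
SV_MIN_SIZE = 50  # bp; boundary taken from Table 1, treated here as a hard cutoff


def classify_variant(ref: str, alt: str) -> str:
    """Assign a coarse variant class from explicit REF and ALT sequences."""
    if not ref or not alt:
        raise ValueError("ref and alt must be non-empty sequences")
    if ref == alt:
        raise ValueError("ref and alt are identical; not a variant")
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    size = abs(len(ref) - len(alt))  # net inserted or deleted bases
    if size == 0:
        return "other"  # multi-base substitution, not covered by Table 1
    return "SV" if size >= SV_MIN_SIZE else "indel"


print(classify_variant("A", "G"))             # single-base substitution
print(classify_variant("A", "ATG"))           # 2 bp insertion
print(classify_variant("A", "A" + "T" * 60))  # 60 bp insertion
```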

4.4. Implications for Low-VAF Variant Interpretation

These variant-class–specific limitations are mirrored in clinical assays. Comparative evaluations of circulating tumor DNA (ctDNA) panel sequencing show consistently higher sensitivity for SNVs, intermediate performance for indels, and markedly reduced sensitivity for SVs at low-VAF, even under optimized sequencing depth and deduplication [42]. RNA fusion sequencing (RNA-FS) panels similarly outperform classical cytogenetics for detecting low-frequency or cryptic fusions, as demonstrated in acute myeloid leukemia, where RNA-FS doubled the detection rate of clinically relevant fusion events compared with karyotyping and fluorescence in situ hybridization (FISH) [45].

Finally, the limitations of WES further emphasize the importance of variant-class–aware strategies. More than half of variants identified by WES below 5% VAF fail orthogonal confirmation, with concordance dropping below 1% for low-VAF calls in FFPE samples [44]. While high-VAF driver mutations remain detectable, subclonal variants, particularly indels and SVs, are disproportionately affected, reinforcing that standard WES is poorly suited for confident low-VAF detection without molecular error correction or ultra-deep targeting.

Collectively, these findings establish that variant class is a primary determinant of low-VAF detectability. While advances in error suppression and sequencing technology have made low-frequency SNVs increasingly accessible, indels and especially SVs remain constrained by biological complexity, sequencing chemistry, and analytical limitations. For precision oncology, this hierarchy has direct clinical implications. Actionable subclonal events may be reliably detected or entirely missed depending on both the variant type and the chosen assay. Therefore, effective strategies require deliberate alignment of sequencing modality, tissue quality, and analytical framework with the variant classes most relevant to patient management.

5. Approaches to Improve Low-VAF Detection

Accurate identification of low-frequency variants plays a pivotal role in applications such as early cancer detection, minimal residual disease monitoring, and longitudinal tracking of tumor evolution. Beyond merely increasing sequencing depth, several complementary strategies have emerged that significantly improve both sensitivity and specificity. Among these, molecular enrichment, molecule-level tagging, duplex consensus sequencing, advanced computational pipelines, and orthogonal assay validation have each demonstrated strong performance in detecting variants below 1% VAF. Collectively, these methods provide a robust framework for overcoming the limitations of standard NGS in clinical and research contexts.

Selective enrichment of rare alleles before sequencing can substantially improve mutant detection, particularly through approaches such as molecular barcoding with unique molecular identifiers (UMIs), which suppress sequencing errors and enable detection of low-frequency variants [46]. Blocker displacement amplification (BDA) strategies use oligonucleotide blockers to suppress wild-type amplification and increase the relative abundance of mutant templates. Studies implementing long blocker displacement amplification (LBDA) have achieved reliable detection of mutations down to 0.5% VAF and revealed clinically relevant differences in colorectal cancer patient samples [47]. Expanding this approach, multiplexed BDA (mBDA) assays targeting up to 80 regions demonstrated quantification of rare variants at frequencies as low as 0.019% with only 250× sequencing depth [48]. These results indicate that targeted pre-sequencing enrichment can offer a cost-effective alternative to deep sequencing without compromising detection accuracy.
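To see why enrichment can substitute for depth, consider a deliberately simplified model in which mutant templates are favored over wild-type templates by a uniform fold factor during amplification. The fold value of 1000 below is hypothetical, and the model does not describe how the cited BDA assays actually quantify the original allele frequency. At 0.019% VAF and 250× depth, fewer than one mutant read is expected without enrichment, whereas the enriched library carries the variant at a readily observable fraction.

```python
def enriched_vaf(vaf: float, fold_enrichment: float) -> float:
    """Observed VAF after mutant templates are favored by `fold_enrichment`.

    Simplified model (an assumption): uniform enrichment at every locus.
    """
    if not 0.0 <= vaf <= 1.0 or fold_enrichment <= 0:
        raise ValueError("vaf must be in [0, 1] and fold_enrichment positive")
    return fold_enrichment * vaf / (fold_enrichment * vaf + (1.0 - vaf))


def original_vaf(observed_vaf: float, fold_enrichment: float) -> float:
    """Invert enriched_vaf to recover the pre-enrichment VAF under the same model."""
    if not 0.0 <= observed_vaf <= 1.0 or fold_enrichment <= 0:
        raise ValueError("observed_vaf must be in [0, 1] and fold_enrichment positive")
    return observed_vaf / (observed_vaf + fold_enrichment * (1.0 - observed_vaf))


DEPTH = 250          # sequencing depth quoted in the text
TRUE_VAF = 0.00019   # 0.019% VAF quoted in the text
FOLD = 1000.0        # hypothetical enrichment factor

print("expected mutant reads without enrichment:", DEPTH * TRUE_VAF)
observed = enriched_vaf(TRUE_VAF, FOLD)
print("expected mutant reads with enrichment:", DEPTH * observed)
print("back-calculated VAF:", original_vaf(observed, FOLD))
```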

UMIs and structured barcode systems represent another critical innovation for low-VAF variant detection. Tagging each input molecule with a unique sequence before amplification enables reconstruction of original molecules and elimination of polymerase and sequencing artifacts (Figure 2). UMI-based workflows applied to cell-free DNA (cfDNA) have detected variants down to 0.09% VAF with high confidence [49]. Moreover, structured UMI designs that are engineered to prevent index misassignment allow detection of variants at or below 0.01% VAF [50]. Similarly, barcode-enabled consensus algorithms significantly reduce false positives and show strong concordance with orthogonal assays [51]. These improvements underscore the advantage of molecular tagging for distinguishing authentic low-frequency events from technical noise.
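The consensus step behind UMI-based error suppression can be sketched as follows: reads sharing a UMI are treated as copies of one original molecule, and a base is accepted for that molecule only if the family is large enough and sufficiently unanimous. This is a toy illustration at a single genomic position, not the algorithm of any cited workflow. The thresholds, UMI strings, and read data are hypothetical, and real pipelines also handle UMI sequencing errors and mapping coordinates.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_family_size=3, min_agreement=0.7):
    """Collapse (umi, base) observations at one position into per-molecule calls.

    Families smaller than min_family_size, or without a base reaching
    min_agreement, are discarded rather than called.
    """
    families = defaultdict(list)
    for umi, base in reads:
        families[umi].append(base)

    consensus = {}
    for umi, bases in families.items():
        if len(bases) < min_family_size:
            continue  # too few copies to separate errors from signal
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_agreement:
            consensus[umi] = base
    return consensus


# Hypothetical reads at one site where the reference base is C.
reads = (
    [("AACG", "C")] * 4                      # wild-type molecule
    + [("GGTA", "T")] * 3                    # molecule carrying a true variant
    + [("CCAT", "C")] * 3 + [("CCAT", "T")]  # wild type with one erroneous read
    + [("TTGC", "T")]                        # singleton family, uninformative
)
calls = umi_consensus(reads)
alt_molecules = sum(1 for base in calls.values() if base == "T")
print(calls, f"{alt_molecules} of {len(calls)} molecules support T")
```

Counting molecules rather than reads removes the erroneous T in the third family and the unsupported singleton, both of which would inflate a raw read-level allele fraction.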

Strand-aware duplex sequencing provides another layer of precision by independently tagging both DNA strands and only calling variants observed on both. This method eliminates most single-strand damage artifacts, achieving theoretical background error rates near 10⁻⁹ per base [52]. In clinical settings, duplex sequencing has revealed clinically actionable variants below 0.01% VAF in pediatric leukemia cases that conventional NGS failed to detect [53]. Complementing these wet-lab innovations, advanced computational pipelines now employ machine learning, contextual error modeling, and depth-aware binomial filters to improve variant classification. For instance, a hybrid pipeline using XGBoost classification identified intrahost human papillomavirus (HPV) variants down to 0.3% VAF with strong precision [54], while benchmarking of variant callers confirms that UMI-aware methods outperform raw-read-based approaches below 1% VAF [55,56].
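One way to read the "depth-aware binomial filters" mentioned above is as a test of whether the observed number of variant reads exceeds what the background error rate alone would produce at that depth. The sketch below is a generic illustration of that idea, not the filter used in any cited pipeline. The per-base error rate of 0.1% and the significance cutoff are hypothetical, and the tail probability is summed in log space to stay numerically stable at high depth.

```python
from math import exp, lgamma, log, log1p


def binomial_tail(depth: int, alt_reads: int, error_rate: float) -> float:
    """P(X >= alt_reads) for X ~ Binomial(depth, error_rate), with 0 < error_rate < 1."""
    if not 0.0 < error_rate < 1.0 or not 0 <= alt_reads <= depth:
        raise ValueError("need 0 < error_rate < 1 and 0 <= alt_reads <= depth")
    total = 0.0
    for k in range(alt_reads, depth + 1):
        log_term = (
            lgamma(depth + 1) - lgamma(k + 1) - lgamma(depth - k + 1)
            + k * log(error_rate) + (depth - k) * log1p(-error_rate)
        )
        total += exp(log_term)
    return min(total, 1.0)


def passes_error_filter(depth, alt_reads, error_rate=1e-3, alpha=1e-6):
    """Keep a call only if background error alone is an unlikely explanation."""
    return binomial_tail(depth, alt_reads, error_rate) < alpha


# The same 1% VAF is judged differently depending on depth (hypothetical counts).
print(passes_error_filter(depth=200, alt_reads=2))
print(passes_error_filter(depth=2000, alt_reads=20))
```

The design point is that a fixed VAF cutoff ignores how much evidence stands behind the fraction, whereas a test against the error model scales with depth.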

Finally, droplet digital PCR (ddPCR) remains a powerful orthogonal validation tool, offering near-digital quantification and detection limits as low as 0.005% VAF [57]. When integrated with enrichment, molecular barcoding, duplex-consensus sequencing, and computational filtering, ddPCR provides an independent benchmark for confirming rare variants. Collectively, these strategies represent a shift away from depth-centric sequencing toward holistic optimization of library preparation, error suppression, and analytical interpretation, defining the current standard for accurate low-frequency variant detection in precision genomics.
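The quantification behind ddPCR rests on Poisson statistics: because a droplet can contain more than one template, the mean number of copies per droplet is estimated from the fraction of positive droplets as -ln(1 - positive/total). The sketch below applies this standard correction to mutant and wild-type channels to estimate a VAF. The droplet counts are hypothetical, and practical considerations such as false-positive droplets, thresholding, and confidence intervals are omitted.

```python
from math import log


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean template copies per droplet."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total")
    return -log(1.0 - positive / total)


def ddpcr_vaf(mutant_positive: int, wildtype_positive: int, total: int) -> float:
    """Estimate VAF as mutant copies over mutant plus wild-type copies."""
    lam_mut = copies_per_droplet(mutant_positive, total)
    lam_wt = copies_per_droplet(wildtype_positive, total)
    if lam_mut + lam_wt == 0:
        raise ValueError("no positive droplets in either channel")
    return lam_mut / (lam_mut + lam_wt)


# Hypothetical run: 20,000 droplets, 8,000 wild-type positive, 4 mutant positive.
estimate = ddpcr_vaf(mutant_positive=4, wildtype_positive=8000, total=20000)
print(f"Estimated VAF: {estimate:.5%}")
```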

6. Biological Validation and Clinical Relevance

Low-VAF detection has direct clinical utility across several key applications, including targeted therapy selection, minimal residual disease (MRD) monitoring, early detection of recurrence, identification of resistance mutations, and longitudinal disease tracking through circulating tumor DNA (ctDNA). In these contexts, low-frequency variants provide clinically actionable information by informing therapeutic decisions, predicting relapse prior to radiographic progression, and enabling dynamic assessment of disease burden over time.

The choice of detection modality is closely aligned with the clinical objective. For targeted therapy selection and resistance mutation identification, deep targeted sequencing panels—often incorporating molecular barcoding or duplex-consensus error suppression—provide high sensitivity and specificity across predefined actionable loci. In contrast, applications requiring ultra-high sensitivity, such as MRD detection and early recurrence monitoring, typically rely on ddPCR or ultra-deep, error-corrected sequencing approaches capable of detecting variants at or below ~0.1% VAF. For ctDNA-based disease tracking, targeted sequencing of plasma-derived cfDNA, combined with molecular error suppression and longitudinal sampling, enables non-invasive monitoring of tumor dynamics. Across these use cases, orthogonal validation—commonly via ddPCR or independent targeted assays—remains critical when low-VAF findings directly inform clinical decision-making. While targeted panels and ddPCR are established in clinical workflows, many ultra-deep and error-corrected sequencing approaches remain in transition from research to clinical validation.

These technical considerations have clear clinical consequences. In a targeted sequencing study of more than 5000 tumor samples, a substantial fraction of clinically actionable mutations that included EGFR, KRAS, PIK3CA, and BRAF were present below 5% VAF, a range where standard short-read NGS workflows frequently fail [4]. Notably, patients harboring such low-frequency mutations nonetheless derived clinical benefit from targeted therapy, including a metastatic lung cancer patient with an EGFR T790M mutation at 3–4% VAF who achieved partial remission (Table 2). These cases illustrate that low-VAF variants are not merely technical noise but can be biologically and therapeutically decisive.

Improved detection sensitivity has also reshaped estimates of mutation prevalence and treatment eligibility. Ultra-sensitive detection of ESR1 Y537S and D538G mutations down to 0.003% VAF increased their apparent prevalence in primary breast cancer from ~1% to over 12%, directly expanding the population eligible for selective estrogen receptor degrader (SERD) therapies [58]. Similarly, duplex sequencing in acute myeloid leukemia uncovered extensive subclonal heterogeneity with variants below 1% VAF. These rare allele variants were present in up to 53% of the blast population once the cancer relapsed, suggesting subclonal expansion [59]. These findings highlight the extent of clinically meaningful variation that is systematically missed by conventional sequencing approaches.

Low-VAF detection is particularly impactful in ctDNA analysis. Conventional ctDNA assays often fail to detect variants below 5% VAF due to background noise and dilution by cell-free DNA, especially in early disease or minimal residual disease (MRD) settings [60]. Advanced methods, including joint-genotype modeling, ultra-deep sequencing, and integrative error-suppression strategies, have enabled reliable detection down to 0.1% VAF or lower, with some workflows achieving sensitivity below 0.01% while maintaining high specificity [61,62,63,64]. Across studies, ctDNA levels and measured VAF correlate with tumor burden and clinical outcomes, supporting ctDNA as a quantitative biomarker of disease status when shedding is sufficient [65,66,67,68].

However, ctDNA detection remains constrained by biological factors. In early disease and MRD, tumor-derived DNA may constitute less than 1% of total cell-free DNA (cfDNA), increasing susceptibility to noise and false negatives [69,70]. While high-depth sequencing can partially mitigate these limitations, low-shedding tumors and temporal clonal drift can still obscure detection [60,62,71,72,73]. Accordingly, a negative ctDNA result should not be interpreted as absence of disease, and sampling timing relative to tumor evolution is critical [65,74]. Notably, ctDNA VAF has been shown to correlate with survival, further supporting its value in clinical monitoring [75].
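
The sampling constraint described above can be made concrete with a short calculation. The sketch below is illustrative only: it assumes roughly 3.3 pg of DNA per haploid human genome and Poisson sampling of mutant molecules, and the function names and the 10 ng input are hypothetical choices rather than values taken from the cited studies.

```python
import math

# Assumption: ~3.3 pg of DNA per haploid human genome (about 303 copies per ng).
PG_PER_HAPLOID_GENOME = 3.3


def genome_equivalents(cfdna_ng: float) -> float:
    """Approximate number of haploid genome copies in a given cfDNA mass."""
    return cfdna_ng * 1000.0 / PG_PER_HAPLOID_GENOME


def prob_at_least_one_mutant(cfdna_ng: float, vaf: float) -> float:
    """Probability that the input contains at least one mutant molecule.

    Models the number of mutant copies at the locus as Poisson with
    mean = genome copies * VAF (an idealized, loss-free assumption).
    """
    expected_mutant = genome_equivalents(cfdna_ng) * vaf
    return 1.0 - math.exp(-expected_mutant)


if __name__ == "__main__":
    for vaf in (0.01, 0.001, 0.0001):
        p = prob_at_least_one_mutant(cfdna_ng=10.0, vaf=vaf)
        print(f"VAF {vaf:.2%}: P(at least one mutant copy in 10 ng) = {p:.3f}")
```

Under these assumptions, a 10 ng input holds roughly 3000 haploid genome copies, so a variant at 0.01% VAF is expected to contribute fewer than one mutant molecule. Additional sequencing depth cannot recover a molecule that was never sampled, which is one reason a negative result does not establish absence of disease.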

Beyond cross-sectional concordance, ultra-low-VAF detection has proven especially valuable for monitoring MRD and therapeutic resistance. Persistence or reemergence of low-frequency ctDNA variants following treatment consistently predicts relapse months before radiographic or clinical progression [76,77,78]. For example, in acute myeloid leukemia, patients with detectable low-VAF ctDNA following remission had significantly higher relapse rates compared to those without detectable variants [79]. Similarly, in colorectal cancer, low-VAF ctDNA detection identified recurrence a median of 112 days earlier than imaging-based diagnosis [80] (Table 2).

Moreover, low-VAF mutations detected in primary tumors or early ctDNA samples often mark pre-existing resistant subclones, such as those carrying EGFR, ESR1, or KRAS mutations, that later expand under treatment pressure [5,13,81]. Early identification of these variants enables anticipation of therapeutic failure and supports adaptive treatment strategies [82] (Table 2).

Table 2. Clinically relevant variants and their descriptions.

Gene	Cancer Context	Typical VAF Context	Clinical Implication	Key Finding
TP53 (subclonal)	Chronic lymphocytic leukemia (CLL)	~1–5%	Prognostic/relapse	Subclonal TP53 mutations (~2% VAF) are frequently missed by standard methods but confer a relapse risk similar to that of clonal TP53 alterations [83,84].
TP53 (low-VAF)	Follicular lymphoma (FL)	<10%	Prognostic	Low-VAF TP53 mutations are associated with increased treatment resistance and disease progression [85].
TP53 and KRAS (subclonal)	T-cell acute lymphoblastic leukemia	<10%	Prognostic/treatment stratification	Subclonal TP53 and KRAS mutations may identify high-risk patients and inform treatment intensification strategies [86].
TP53 (subclonal)	Multiple (CLL, FL)	<10%	Resistance	Subclonal TP53 alterations are consistently associated with treatment resistance across cancer types [87,88].
KRAS (subclonal)	Metastatic colorectal cancer	<5%	Resistance	Low-frequency KRAS mutations drive resistance to anti-EGFR therapy and expand under treatment pressure [13,80,89,90,91].
FLT3-ITD (allelic frequency/ratio)	Acute myeloid leukemia (AML)	Variable	Prognostic	Higher allelic burden correlates with worse prognosis and increased relapse risk [92,93].
DNMT3A (subclonal)	Acute myeloid leukemia (AML)	<10%	Prognostic/relapse	DNMT3A mutations persist in pre-leukemic clones and are associated with increased relapse risk and reduced overall survival [94].
IDH1/IDH2 (subclonal)	Acute myeloid leukemia (AML)	<10%	Prognostic	Subclonal IDH mutations contribute to clonal evolution and may influence relapse dynamics [59,95].
NPM1 (low-VAF)	NPM1-mutated AML	MRD-level	MRD marker	Persistent low-VAF NPM1 mutations are widely used as markers of minimal residual disease and relapse prediction [79,96].
BCR-ABL1 (low-VAF)	Chronic myeloid leukemia	Ultra-low	MRD/resistance	Low-level BCR-ABL1 detection enables early identification of relapse or treatment failure [97,98].
EGFR T790M (low-VAF)	Non-small cell lung cancer	<5%	Resistance/predictive	Low-VAF T790M mutations predict resistance to EGFR inhibitors and influence response to osimertinib; lower VAF is associated with poorer outcomes [99,100,101].

Together, these studies demonstrate that low-VAF variants detected in tissue and ctDNA are biologically meaningful indicators of tumor evolution, relapse risk, and resistance. Their reliable detection is therefore central to precision oncology, informing treatment selection, disease monitoring, and timely therapeutic intervention.

7. Orthogonal Validation of Low-VAF Calls

As VAFs approach the intrinsic error rates of NGS, independent confirmation becomes essential for distinguishing true low-VAF variants from technical artifacts. Orthogonal validation strategies therefore play a critical role in establishing confidence in low-VAF calls, particularly when such variants inform clinical decision-making or longitudinal disease monitoring.
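
To illustrate why calls near the error floor need independent confirmation, the sketch below asks how often sequencing error alone would produce at least the observed number of variant-supporting reads. It is a minimal example under a simplifying assumption that errors at a site are independent with a single, known per-base rate; the function name and the 1e-4 error rate are hypothetical and are not drawn from any cited assay.

```python
import math


def binom_sf(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summing the upper tail in log space."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if k <= 0:
        return 1.0
    tail = 0.0
    for i in range(k, n + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        tail += math.exp(log_pmf)
    return min(1.0, tail)


if __name__ == "__main__":
    depth = 20_000       # reads covering the site
    error_rate = 1e-4    # assumed per-base background error (hypothetical)
    for alt_reads in (4, 20):
        p_value = binom_sf(alt_reads, depth, error_rate)
        print(f"{alt_reads} alt reads at {depth}x: P(error alone) = {p_value:.3g}")
```

With these illustrative numbers, four supporting reads are readily explained by background error, whereas twenty are not. Real error rates vary by base change, sequence context, and library chemistry, so a single fixed rate is only a starting point.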

7.1. Long-Read Sequencing Technology

Long-read sequencing provides one approach to orthogonal validation by resolving genomic contexts that are difficult to interrogate with short reads. Platforms such as PacBio HiFi and Oxford Nanopore Technologies (ONT) differ in their tradeoffs between read length and per-read accuracy. PacBio HiFi sequencing achieves higher base-calling accuracy through circular consensus sequencing, whereas ONT provides substantially longer reads that improve detection of structural variants and complex genomic regions [102,103,104,105]. While long-read approaches are generally less sensitive for ultra-low-VAF detection than error-corrected short-read methods, they provide complementary validation by confirming variant structure and breakpoint resolution in regions prone to alignment ambiguity [106,107].

7.2. ddPCR

Highly sensitive targeted assays remain central to orthogonal confirmation of low-VAF variants. Droplet digital PCR (ddPCR) enables near-digital quantification of mutant alleles through reaction partitioning, achieving detection limits as low as 0.006–0.16% VAF in clinical settings [108]. However, ddPCR requires prior knowledge of the variant and is therefore best suited for confirming predefined, clinically actionable mutations rather than for discovery.
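
The partition-based quantification mentioned above rests on a Poisson correction, since a positive droplet may contain more than one target molecule. The sketch below shows that standard calculation; the function names and droplet counts are illustrative assumptions, and instrument-specific thresholding and confidence intervals are omitted.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean copies per droplet: lambda = -ln(1 - positive/total)."""
    if not 0 <= positive < total:
        raise ValueError("require 0 <= positive < total droplets")
    if positive == 0:
        return 0.0
    return -math.log(1.0 - positive / total)


def fractional_abundance(mut_positive: int, wt_positive: int, total: int) -> float:
    """Mutant fraction = lambda_mut / (lambda_mut + lambda_wt)."""
    lam_mut = copies_per_droplet(mut_positive, total)
    lam_wt = copies_per_droplet(wt_positive, total)
    if lam_mut + lam_wt == 0.0:
        return 0.0
    return lam_mut / (lam_mut + lam_wt)


if __name__ == "__main__":
    # Hypothetical run: 20,000 accepted droplets, 8,000 wild-type positive, 10 mutant positive.
    fa = fractional_abundance(mut_positive=10, wt_positive=8_000, total=20_000)
    print(f"Estimated mutant fraction: {fa:.4%}")
```

In this hypothetical run the mutant fraction comes out near 0.1%, and the estimate rests on only ten positive droplets. That small count is why input amount and the number of partitions bound the achievable detection limit.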

7.3. MIPP-Seq

To improve scalability, multiplexed validation approaches extend beyond single-variant assays. MIPP-Seq (Multiple Independent Primer PCR Sequencing) combines ultra-deep sequencing with multiple nonoverlapping amplicons per locus, reducing locus-specific artifacts through replication and enabling simultaneous confirmation of multiple low-VAF SNVs and indels with reported sensitivity near 0.025% VAF [109]. These approaches are particularly useful when validating larger sets of candidate variants.
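
The replication logic behind multi-amplicon validation can be sketched as a simple concordance check. This is not the published MIPP-Seq pipeline: the thresholds, the function name, and the rule of requiring support in several independent amplicons are assumptions chosen only to illustrate how locus-specific artifacts fail to replicate.

```python
from statistics import mean


def replicate_support(amplicons, min_vaf=0.00025, min_supporting=3):
    """Summarize evidence for one variant across independent amplicons.

    amplicons: list of (alt_reads, depth) tuples, one per primer pair.
    Returns (confirmed, mean_vaf). A variant counts as confirmed only if
    at least `min_supporting` amplicons each reach `min_vaf` (both hypothetical).
    """
    vafs = [alt / depth for alt, depth in amplicons if depth > 0]
    supporting = [v for v in vafs if v >= min_vaf]
    confirmed = len(supporting) >= min_supporting
    return confirmed, (mean(vafs) if vafs else 0.0)


if __name__ == "__main__":
    true_variant = [(52, 100_000), (47, 100_000), (60, 100_000)]
    primer_artifact = [(300, 100_000), (2, 100_000), (1, 100_000)]
    print(replicate_support(true_variant))     # consistent signal in all amplicons
    print(replicate_support(primer_artifact))  # signal confined to one amplicon
```

A signal that appears in only one amplicon is more likely to reflect a primer- or amplification-specific artifact than a true variant, and that reasoning is what replication across nonoverlapping amplicons exploits.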

7.4. Enriched Sanger Sequencing

Additional enrichment-based methods provide rapid and cost-effective confirmation for targeted variants. Blocker displacement amplification coupled with Sanger sequencing has demonstrated detection of mutations down to ~0.2% VAF with concordance to ddPCR and NGS results [110]. Similarly, strand displacement reaction-based approaches have reported detection near 0.1% VAF with high specificity in targeted applications [111]. While these methods are limited to known variants, they offer efficient confirmation for clinically relevant hotspots.

7.5. CODEC

At the sequencing level, duplex-based error suppression strategies offer an alternative form of internal validation. Concatenating Original Duplex for Error Correction (CODEC) enforces strand concordance, achieving substantially reduced error rates and enabling detection of rare variants using fewer reads than conventional duplex sequencing [112]. Although not an independent assay in the traditional sense, duplex-consistency constraints provide strong internal evidence for distinguishing true variants from technical noise.
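
The strand-concordance principle can be sketched in a few lines. The example below illustrates only the generic duplex rule (accept a base when both strands of the original molecule independently agree) and does not reproduce the CODEC read structure or its software; the thresholds and function names are assumptions, and bottom-strand bases are assumed to be already expressed in reference orientation.

```python
from collections import Counter


def strand_consensus(bases, min_reads=2, min_fraction=0.9):
    """Majority base for one strand's read family, or None if support is weak."""
    if len(bases) < min_reads:
        return None
    base, count = Counter(bases).most_common(1)[0]
    return base if count / len(bases) >= min_fraction else None


def duplex_call(top_bases, bottom_bases):
    """Return a base only when both strand consensuses exist and agree."""
    top = strand_consensus(top_bases)
    bottom = strand_consensus(bottom_bases)
    if top is None or bottom is None or top != bottom:
        return None
    return top


if __name__ == "__main__":
    print(duplex_call("TTT", "TTTT"))  # both strands support T
    print(duplex_call("TTT", "CCC"))   # strands disagree, as with single-strand damage
    print(duplex_call("TTTC", "TTT"))  # top-strand family too noisy to call
```

Damage or polymerase errors that affect only one strand produce discordant consensuses and are rejected, which is why duplex-consistent calls carry stronger evidence than calls supported by a single strand.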

7.6. Computational Tools

Computational frameworks further complement experimental validation by prioritizing likely true variants. The Cancer-associated Variant Enrichment (CAVE) method integrates positional and recurrence patterns to estimate the likelihood that low-frequency calls represent true biological signal and to prioritize variants for confirmation, achieving strong concordance with ddPCR for variants below 1% VAF [113]. These approaches are most effective when used in combination with experimental validation rather than as stand-alone methods.

Collectively, these orthogonal strategies provide complementary approaches to validate low-VAF variants across clinical and research settings. While no single method is universally applicable, integrating targeted confirmation, duplex-level error suppression, and computational prioritization substantially improves confidence in low-frequency variant calls. As low-VAF detection becomes increasingly incorporated into clinical workflows, orthogonal validation will remain essential for ensuring analytical rigor and preventing misinterpretation of rare but clinically consequential mutations [114,115].

8. Emerging Technologies, Limitations, and Future Directions

Recent advances in sequencing chemistry, amplification fidelity, and analytical frameworks have made it increasingly feasible to resolve low-frequency somatic variants at cellular resolution. Single-cell DNA sequencing, multiomic profiling, and spatial mutation mapping now enable discrimination of true rare variants from technical noise while linking them to cell state, lineage, and microenvironmental context. This is an essential capability for interpreting low-VAF events that are obscured in bulk analyses. A summary of these emerging technologies is provided in Table 3.

Multiomic single-cell approaches exemplify this progress by jointly capturing genomic and transcriptional information from individual cells. Methods such as DEFND-seq (DNA and Expression Following Nucleosome Depletion Sequencing) and related co-sequencing strategies have demonstrated the ability to detect rare SNVs and copy-number variants, resolve low-abundance subclones, and associate these alterations with cell-specific transcriptional programs [116]. Parallel improvements in amplification chemistry, including primary template-directed amplification (PTA), have reduced coverage bias and allele dropout, enabling more uniform genome representation and improved detection of low-frequency SNVs, indels, and SVs in single cells [117,118].

Targeted DNA–RNA co-profiling and breakpoint-aware strategies further enhance sensitivity by concentrating sequencing depth on loci of interest and interrogating variant structure directly. Approaches such as single-cell DNA-RNA sequencing (SDR-seq) enable accurate per-cell zygosity inference for variants present at low allele fractions [119]. Breakpoint-specific amplicon sequencing has demonstrated that read depth alone is insufficient to capture the full spectrum of somatic structural variation, particularly for complex rearrangements [41]. These methods highlight how variant class fundamentally shapes the strategies required for low-VAF detection.

Spatially resolved assays add a critical dimension by mapping rare variants back to histological and microenvironmental context. Spatial mutation profiling has revealed that low-frequency driver mutations can be confined to discrete tumor regions and cell populations, uncovering spatial heterogeneity that is invisible to bulk sequencing [120,121]. Such findings reinforce that low-VAF variants are often biologically localized rather than uniformly distributed across tumors.

Advances in bioinformatics have been equally important in translating these technologies into reliable discovery tools. Modality-aware variant callers and statistical noise-modeling frameworks aggregate weak signals across large single-cell datasets to distinguish true rare variants from artifacts, consistently outperforming bulk callers repurposed for single-cell data at low allele fractions [28,122]. Integrative multiomic platforms further link genomic alterations to phenotypic states, revealing how clonal fitness and selection operate even when subclones persist at low abundance [7,123].

Despite these advances, summarized in Table 3, several barriers remain to the routine implementation of low-VAF detection. Reliable identification of rare variants often requires ultra-deep sequencing coverage, increasing cost, computational burden, and data storage demands. In addition, sequencing and PCR error rates overlap with true low-frequency signals, necessitating robust error-suppression strategies such as molecular barcoding and duplex consensus methods. Detection is further complicated by genomic context, including repetitive regions and complex structural rearrangements, as well as variability in bioinformatic pipelines and a lack of standardized detection thresholds. Finally, approaches that achieve the highest sensitivity often sacrifice genomic breadth, whereas genome-wide methods frequently lack sufficient depth for confident low-VAF detection.
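
The depth requirement noted above can be approximated with a binomial model. The sketch below searches for the smallest depth at which a variant of a given VAF yields at least a chosen number of supporting reads with a chosen probability. The five-read threshold, the 95% target, and the function names are illustrative assumptions, and the model ignores sequencing error, duplicates, and uneven coverage.

```python
import math


def prob_at_least_k(depth: int, k: int, vaf: float) -> float:
    """P(at least k variant reads) when reads ~ Binomial(depth, vaf)."""
    cdf = 0.0
    for i in range(min(k, depth + 1)):
        log_pmf = (
            math.lgamma(depth + 1) - math.lgamma(i + 1) - math.lgamma(depth - i + 1)
            + i * math.log(vaf) + (depth - i) * math.log1p(-vaf)
        )
        cdf += math.exp(log_pmf)
    return max(0.0, 1.0 - cdf)


def min_depth(vaf: float, min_alt_reads: int = 5, power: float = 0.95) -> int:
    """Smallest depth giving >= min_alt_reads variant reads with probability >= power."""
    if not 0.0 < vaf < 1.0:
        raise ValueError("vaf must lie strictly between 0 and 1")
    lo = hi = min_alt_reads
    while prob_at_least_k(hi, min_alt_reads, vaf) < power:
        hi *= 2                      # grow until the target is reachable
    while lo < hi:                   # binary search for the smallest sufficient depth
        mid = (lo + hi) // 2
        if prob_at_least_k(mid, min_alt_reads, vaf) >= power:
            hi = mid
        else:
            lo = mid + 1
    return lo


if __name__ == "__main__":
    for vaf in (0.05, 0.01, 0.001):
        print(f"VAF {vaf:.1%}: about {min_depth(vaf)} unique reads needed")
```

Under these assumptions the required depth scales roughly inversely with VAF, reaching on the order of 9000 unique, error-free reads at 0.1% VAF. This scaling explains why genome-wide assays at typical coverage lack the depth for confident low-VAF detection.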

In summary, these emerging technologies demonstrate that robust interpretation of low-frequency variants requires more than increased sequencing depth. By integrating high-fidelity amplification, variant-class–aware detection, spatial context, and multiomic analysis, next-generation workflows place rare variants within their appropriate cellular and evolutionary framework. As these approaches mature and scale, they will play a central role in translating low-VAF detection into actionable biological and clinical insight.

9. Conclusions

Low-frequency somatic variants are a fundamental consequence of intratumoral heterogeneity and tumor evolution rather than peripheral technical artifacts. Across cancer types, variants present below conventional reporting thresholds encode critical information about subclonal architecture, therapeutic resistance, minimal residual disease, and relapse risk, and their clinical relevance has been repeatedly demonstrated in both tissue and circulating tumor DNA analyses. This review highlights that sensitivity at VAF < 5% is governed not by sequencing depth alone, but by the combined effects of tissue preservation, read length, genomic context, library preparation, error modeling, and variant-class–specific detectability. A consistent hierarchy emerges in which SNVs are most readily detected at low VAF, followed by indels, while SVs remain the most challenging. Accordingly, assay selection must be aligned with the biological and clinical questions being addressed, balancing genomic breadth with the sensitivity required to detect clinically meaningful subclonal variation. While WGS and WES provide essential discovery and contextual information, targeted and error-corrected approaches are required for reliable detection of low-VAF variants with direct clinical impact. Moving forward, effective clinical integration of low-VAF information will depend on standardized performance benchmarks, modality-aware bioinformatic frameworks, and scalable orthogonal validation strategies. As single-cell, spatial, and multiomic technologies mature, low-frequency variants can be interpreted within their proper cellular, clonal, and microenvironmental context, enabling precision oncology to progress beyond dominant-clone paradigms toward truly subclonal-informed diagnosis and treatment.

Author Contributions

Conceptualization, K.D. and J.J.B.; methodology, J.J.B.; writing—original draft preparation, P.K., J.H., I.S., B.K., A.S.T., L.P., D.A., E.B., B.L., J.S., I.C., H.C., A.T., A.B.A., K.D. and J.J.B.; writing—review and editing, P.K., J.H., I.S., B.K., A.S.T., L.P., D.A., E.B., B.L., J.S., I.C., H.C., A.T., A.B.A., K.D. and J.J.B.; visualization, H.C. and A.T.; supervision, J.J.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created.

Acknowledgments

We would like to acknowledge the Simmons Center for Cancer Research for its generous support of student fellowships. Also, publications would not be possible without the financial support from the Department of Cell Biology and Physiology and the College of Life Sciences at Brigham Young University.

Conflicts of Interest

Authors Ken Dixon and Jared Barrott were employed by the company Specicare. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ALL	Acute Lymphoblastic Leukemia
BDA	Blocker Displacement Amplification
CAVE	Cancer-associated Variant Enrichment
cfDNA	Cell-free DNA
CODEC	Concatenating Original Duplex for Error Correction
ctDNA	Circulating Tumor DNA
ddPCR	Droplet Digital PCR
DEFND-seq	DNA and Expression Following Nucleosome Depletion Sequencing
DNA	Deoxyribonucleic Acid
FFPE	Formalin-fixed paraffin-embedded
FISH	Fluorescence In Situ Hybridization
HPV	Human papillomavirus
Indels	Insertions or Deletions
LBDA	Long Blocker Displacement Amplification
MATH	Mutant-Allele Tumor Heterogeneity
mBDA	Multiplexed Blocker Displacement Amplification
MIPP-Seq	Multiple Independent Primer PCR Sequencing
MRD	Minimal Residual Disease
NGS	Next-Generation Sequencing
ONT	Oxford Nanopore Technologies
PCR	Polymerase Chain Reaction
PTA	Primary Template-directed Amplification
RNA	Ribonucleic Acid
RNA-FS	RNA Fusion Sequencing
SDR-seq	Single-cell DNA-RNA Sequencing
SERD	Selective Estrogen Receptor Degrader
SNV	Single Nucleotide Variant
SV	Structural Variant
TCGA	The Cancer Genome Atlas
UMI	Unique Molecular Identifiers
VAF	Variant Allele Frequency
WES	Whole Exome Sequencing
WGS	Whole Genome Sequencing

References

Raynaud, F.; Mina, M.; Tavernari, D.; Ciriello, G. Pan-cancer inference of intra-tumor heterogeneity reveals associations with different forms of genomic instability. PLoS Genet. 2018, 14, e1007669.
Dentro, S.C.; Leshchiner, I.; Haase, K.; Tarabichi, M.; Wintersinger, J.; Deshwar, A.G.; Yu, K.; Rubanova, Y.; Macintyre, G.; Demeulemeester, J.; et al. Characterizing genetic intra-tumor heterogeneity across 2,658 human cancer genomes. Cell 2021, 184, 2239–2254.e39.
Starks, E.R.; Swanson, L.; Docking, T.R.; Bosdet, I.; Munro, S.; Moore, R.A.; Karsan, A. Assessing Limit of Detection in Clinical Sequencing. J. Mol. Diagn. 2021, 23, 455–466.
Shin, H.T.; Choi, Y.L.; Yun, J.W.; Kim, N.K.D.; Kim, S.Y.; Jeon, H.J.; Nam, J.Y.; Lee, C.; Ryu, D.; Kim, S.C.; et al. Prevalence and detection of low-allele-fraction variants in clinical cancer samples. Nat. Commun. 2017, 8, 1377.
Sisoudiya, S.D.; Tukachinsky, H.; Keller-Evans, R.B.; Schrock, A.B.; Huang, R.S.P.; Gjoerup, O.; Pishvaian, M.J.; Shroff, R.; Sokol, E.S.; Dennis, L.; et al. Tissue-based genomic profiling of 300,000 tumors highlights the detection of variants with low allele fraction. npj Precis. Oncol. 2025, 9, 190.
Kosugi, S.; Terao, C. Comparative evaluation of SNVs, indels, and structural variations detected with short- and long-read sequencing data. Hum. Genome Var. 2024, 11, 18.
Leppä, A.M.; Grimes, K.; Jeong, H.; Huang, F.Y.; Andrades, A.; Waclawiczek, A.; Boch, T.; Jauch, A.; Renders, S.; Stelmach, P.; et al. Single-cell multiomics analysis reveals dynamic clonal evolution and targetable phenotypes in acute myeloid leukemia with complex karyotype. Nat. Genet. 2024, 56, 2790–2803.
Antić, Ž.; Yu, J.; Van Reijmersdal, S.V.; Van Dijk, A.; Dekker, L.; Segerink, W.H.; Sonneveld, E.; Fiocco, M.; Pieters, R.; Hoogerbrugge, P.M.; et al. Multiclonal complexity of pediatric acute lymphoblastic leukemia and the prognostic relevance of subclonal mutations. Haematologica 2021, 106, 3046–3055.
Jeon, A.J.; Teo, Y.Y.; Sekar, K.; Chong, S.L.; Wu, L.; Chew, S.C.; Chen, J.; Kendarsari, R.I.; Lai, H.; Ling, W.H.; et al. Multi-region sampling with paired sample sequencing analyses reveals sub-groups of patients with novel patient-specific dysregulation in Hepatocellular Carcinoma. BMC Cancer 2023, 23, 118.
Zhou, R.; Liang, J.; Chen, Q.; Tian, H.; Yang, C.; Liu, C. Development and validation of an intra-tumor heterogeneity-related signature to predict prognosis of bladder cancer: A study based on single-cell RNA-seq. Aging 2021, 13, 19415–19441.
Peng, J.; Sun, B.F.; Chen, C.Y.; Zhou, J.Y.; Chen, Y.S.; Chen, H.; Liu, L.; Huang, D.; Jiang, J.; Cui, G.S.; et al. Single-cell RNA-seq highlights intra-tumoral heterogeneity and malignant progression in pancreatic ductal adenocarcinoma. Cell Res. 2019, 29, 725–738.
Gerlinger, M.; Horswell, S.; Larkin, J.; Rowan, A.J.; Salm, M.P.; Varela, I.; Fisher, R.; McGranahan, N.; Matthews, N.; Santos, C.R.; et al. Genomic architecture and evolution of clear cell renal cell carcinomas defined by multiregion sequencing. Nat. Genet. 2014, 46, 225–233.
Diaz, L.A., Jr.; Williams, R.T.; Wu, J.; Kinde, I.; Hecht, J.R.; Berlin, J.; Allen, B.; Bozic, I.; Reiter, J.G.; Nowak, M.A.; et al. The molecular evolution of acquired resistance to targeted EGFR blockade in colorectal cancers. Nature 2012, 486, 537–540.
Kerle, I.A.; Gross, T.; Kögler, A.; Arnold, J.S.; Werner, M.; Eckardt, J.N.; Möhrmann, E.E.; Arlt, M.; Hutter, B.; Hüllein, J.; et al. Translational and clinical comparison of whole genome and transcriptome to panel sequencing in precision oncology. npj Precis. Oncol. 2025, 9, 9.
Daniels, C.A.; Abdulkadir, A.A.; Cleveland, M.H.; McDaniel, J.H.; Jáspez, D.; Rubio-Rodríguez, L.A.; Muñoz-Barrera, A.; Lorenzo-Salazar, J.M.; Flores, C.; Yoo, B.; et al. Characterization of subclonal variants in HG002 Genome in a Bottle reference material as a resource for benchmarking variant callers. Cell Genom. 2025, 101104.
Sudha, P.; Ahsan, A.; Ashby, C.; Kausar, T.; Khera, A.; Kazeroun, M.H.; Hsu, C.C.; Wang, L.; Fitzsimons, E.; Salminen, O.; et al. Myeloma Genome Project Panel is a Comprehensive Targeted Genomics Panel for Molecular Profiling of Patients with Multiple Myeloma. Clin. Cancer Res. 2022, 28, 2854–2864.
Das, K.; Tay, M.L.I.; Yong, E.Y.; Chuah, K.L. A targeted next-generation sequencing panel for identification of clinically relevant mutation profiles in solid tumours. Sci. Rep. 2025, 15, 20740.
Bora, E.; Caglayan, A.O.; Koc, A.; Cankaya, T.; Ozkalayci, H.; Kocabey, M.; Kemer, D.; Aksoy, S.; Alicikus, Z.A.; Akin, I.B.; et al. Evaluation of hereditary/familial breast cancer patients with multigene targeted next generation sequencing panel and MLPA analysis in Turkey. Cancer Genet. 2022, 262–263, 118–133.
Ramarao-Milne, P.; Kondrashova, O.; Patch, A.M.; Nones, K.; Koufariotis, L.T.; Newell, F.; Addala, V.; Lakis, V.; Holmes, O.; Leonard, C.; et al. Comparison of actionable events detected in cancer genomes by whole-genome sequencing, in silico whole-exome and mutation panels. ESMO Open 2022, 7, 100540.
The ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Pan-cancer analysis of whole genomes. Nature 2020, 578, 82–93.
Dixon, K.; Lee, J.H.; Miller, R.; Booker, D.; Anderson, D.; Okojie, J.; Kirkham, M.; Lee, E.K.; Bao, C.; Tuncay, I.O.; et al. Cryopreserved Tissue Biospecimens Offer Superior Quality for Whole-Genome Sequencing of Various Cancers Compared to Paired Formalin-Fixed Paraffin-Embedded Tissues. Int. J. Mol. Sci. 2025, 26, 11038.
Okojie, J.; O’Neal, N.; Burr, M.; Worley, P.; Packer, I.; Anderson, D.; Davis, J.; Kearns, B.; Fatema, K.; Dixon, K.; et al. DNA Quantity and Quality Comparisons between Cryopreserved and FFPE Tumors from Matched Pan-Cancer Samples. Curr. Oncol. 2024, 31, 2441–2452.
Basyuni, S.; Heskin, L.; Degasperi, A.; Black, D.; Koh, G.C.C.; Chmelova, L.; Rinaldi, G.; Bell, S.; Grybowicz, L.; Elgar, G.; et al. Large-scale analysis of whole genome sequencing data from formalin-fixed paraffin-embedded cancer specimens demonstrates preservation of clinical utility. Nat. Commun. 2024, 15, 7731.
Gao, X.H.; Li, J.; Gong, H.F.; Yu, G.Y.; Liu, P.; Hao, L.Q.; Liu, L.J.; Bai, C.G.; Zhang, W. Comparison of Fresh Frozen Tissue with Formalin-Fixed Paraffin-Embedded Tissue for Mutation Analysis Using a Multi-Gene Panel in Patients with Colorectal Cancer. Front. Oncol. 2020, 10, 310.
Steiert, T.A.; Parra, G.; Gut, M.; Arnold, N.; Trotta, J.-R.; Tonda, R.; Moussy, A.; Gerber, Z.; Abuja, P.M.; Zatloukal, K.; et al. A critical spotlight on the paradigms of FFPE-DNA sequencing. Nucleic Acids Res. 2023, 51, 7143–7162.
Robbe, P.; Popitsch, N.; Knight, S.J.L.; Antoniou, P.; Becq, J.; He, M.; Kanapin, A.; Samsonova, A.; Vavoulis, D.V.; Ross, M.T.; et al. Clinical whole-genome sequencing from routine formalin-fixed, paraffin-embedded specimens: Pilot study for the 100,000 Genomes Project. Genet. Med. 2018, 20, 1196–1205.
Lincoln, S.E.; Hambuch, T.; Zook, J.M.; Bristow, S.L.; Hatchell, K.; Truty, R.; Kennemer, M.; Shirts, B.H.; Fellowes, A.; Chowdhury, S.; et al. One in seven pathogenic variants can be challenging to detect by NGS: An analysis of 450,000 patients with implications for clinical sensitivity and genetic test implementation. Genet. Med. 2021, 23, 1673–1680.
Wiens, M.; Farahani, H.; Scott, R.W.; Underhill, T.M.; Bashashati, A. Benchmarking bulk and single-cell variant-calling approaches on Chromium scRNA-seq and scATAC-seq libraries. Genome Res. 2024, 34, 1196–1210.
Abbasi, A.; Alexandrov, L.B. Significance and limitations of the use of next-generation sequencing technologies for detecting mutational signatures. DNA Repair 2021, 107, 103200.
Gong, B.; Li, D.; Kusko, R.; Novoradovskaya, N.; Zhang, Y.; Wang, S.; Pabón-Peña, C.; Zhang, Z.; Lai, K.; Cai, W.; et al. Cross-oncopanel study reveals high sensitivity and accuracy with overall analytical performance depending on genomic regions. Genome Biol. 2021, 22, 109.
Xiao, W.; Ren, L.; Chen, Z.; Fang, L.T.; Zhao, Y.; Lack, J.; Guan, M.; Zhu, B.; Jaeger, E.; Kerrigan, L.; et al. Toward best practice in cancer mutation detection with whole-genome and whole-exome sequencing. Nat. Biotechnol. 2021, 39, 1141–1150.
Petrackova, A.; Vasinek, M.; Sedlarikova, L.; Dyskova, T.; Schneiderova, P.; Novosad, T.; Papajik, T.; Kriegova, E. Standardization of Sequencing Coverage Depth in NGS: Recommendation for Detection of Clonal and Subclonal Mutations in Cancer Diagnostics. Front. Oncol. 2019, 9, 851.
Pavlick, D.C.; Frampton, G.M.; Ross, J.R. Understanding variants of unknown significance and classification of genomic alterations. Oncologist 2024, 29, 658–666.
Hao, Y.; Xuei, X.; Li, L.; Nakshatri, H.; Edenberg, H.J.; Liu, Y. RareVar: A Framework for Detecting Low-Frequency Single-Nucleotide Variants. J. Comput. Biol. 2017, 24, 637–646.
Zheng, T. DETexT: An SNV detection enhancement for low read depth by integrating mutational signatures into TextCNN. Front. Genet. 2022, 13, 943972.
Wilson, T.E.; Ahmed, S.; Higgins, J.; Salk, J.J.; Glover, T.W. svCapture: Efficient and specific detection of very low frequency structural variant junctions by error-minimized capture sequencing. NAR Genom. Bioinform. 2023, 5, lqad042.
Zhang, Y.; English, A.C.; Paulin, L.F.; Grochowski, C.M.; Maheshwari, S.; Mack, T.; Berselli, M.; Veit, A.D.; Fu, Y.; Park, P.J.; et al. Comprehensive benchmarking of somatic structural variant detection at ultra-low allele fractions. bioRxiv 2025.
Liu, Z.; Roberts, R.; Mercer, T.R.; Xu, J.; Sedlazeck, F.J.; Tong, W. Towards accurate and reliable resolution of structural variants for clinical diagnosis. Genome Biol. 2022, 23, 68.
Pei, Y.; Tanguy, M.; Giess, A.; Dixit, A.; Wilson, L.C.; Gibbons, R.J.; Twigg, S.R.F.; Elgar, G.; Wilkie, A.O.M. A Comparison of Structural Variant Calling from Short-Read and Nanopore-Based Whole-Genome Sequencing Using Optical Genome Mapping as a Benchmark. Genes 2024, 15, 925.
Moustakli, E.; Christopoulos, P.; Potiris, A.; Zikopoulos, A.; Mavrogianni, D.; Karampas, G.; Kathopoulis, N.; Anagnostaki, I.; Domali, E.; Tzallas, A.T.; et al. Long-Read Sequencing and Structural Variant Detection: Unlocking the Hidden Genome in Rare Genetic Disorders. Diagnostics 2025, 15, 1803.
Easton, J.; Gonzalez-Pena, V.; Yergeau, D.; Ma, X.; Gawad, C. Genome-wide segregation of single nucleotide and structural variants into single cancer cells. BMC Genom. 2017, 18, 906.
Li, W.; Huang, X.; Patel, R.; Schleifman, E.; Fu, S.; Shames, D.S.; Zhang, J. Analytical evaluation of circulating tumor DNA sequencing assays. Sci. Rep. 2024, 14, 4973.
Haga, Y.; Sakamoto, Y.; Arai, M.; Suzuki, Y.; Suzuki, A. Long-Read Whole-Genome Sequencing Using a Nanopore Sequencer and Detection of Structural Variants in Cancer Genomes. In Nanopore Sequencing; Humana: New York, NY, USA, 2023; Volume 2632, pp. 177–189.
Yan, Y.H.; Chen, S.X.; Cheng, L.Y.; Rodriguez, A.Y.; Tang, R.; Cabrera, K.; Zhang, D.Y. Confirming putative variants at ≤ 5% allele frequency using allele enrichment and Sanger sequencing. Sci. Rep. 2021, 11, 11640.
Hoffmeister, L.M.; Suttorp, J.; Walter, C.; Antoniou, E.; Behrens, Y.L.; Göhring, G.; Awada, A.; von Neuhoff, N.; Reinhardt, D.; Schneider, M. Panel-based RNA fusion sequencing improves diagnostics of pediatric acute myeloid leukemia. Leukemia 2024, 38, 538–544.
Hirotsu, Y.; Otake, S.; Ohyama, H.; Amemiya, K.; Higuchi, R.; Oyama, T.; Mochizuki, H.; Goto, T.; Omata, M. Dual-molecular barcode sequencing detects rare variants in tumor and cell free DNA in plasma. Sci. Rep. 2020, 10, 3391.
Si, Y.; Wang, X.; Su, X.; Weng, Z.; Hu, Q.; Li, Q.; Fan, C.; Zhang, D.Y.; Wang, Y.; Luo, S.; et al. Extended Enrichment for Ultrasensitive Detection of Low-Frequency Mutations by Long Blocker Displacement Amplification. Angew. Chem. Int. Ed. Engl. 2024, 63, e202400551.
Song, P.; Chen, S.X.; Yan, Y.H.; Pinto, A.; Cheng, L.Y.; Dai, P.; Patel, A.A.; Zhang, D.Y. Selective multiplexed enrichment for the detection and quantitation of low-fraction DNA variants via low-depth sequencing. Nat. Biomed. Eng. 2021, 5, 690–701.
Hermann, B.T.; Pfeil, S.; Groenke, N.; Schaible, S.; Kunze, R.; Ris, F.; Hagen, M.E.; Bhakdi, J. DEEPGEN(TM)-A Novel Variant Calling Assay for Low Frequency Variants. Genes 2021, 12, 507.
Micallef, P.; Santamaría, M.L.; Escobar, M.; Andersson, D.; Österlund, T.; Mouhanna, P.; Filges, S.; Johansson, G.; Fagman, H.; Vannas, C.; et al. Digital sequencing is improved by using structured unique molecular identifiers. Genome Biol. 2025, 26, 37.
Akahori, D.; Inoue, Y.; Inui, N.; Karayama, M.; Yasui, H.; Hozumi, H.; Suzuki, Y.; Furuhashi, K.; Fujisawa, T.; Enomoto, N.; et al. Comparative assessment of NOIR-SS and ddPCR for ctDNA detection of EGFR L858R mutations in advanced L858R-positive lung adenocarcinomas. Sci. Rep. 2021, 11, 14999. [Google Scholar] [CrossRef]
Schmitt, M.W.; Kennedy, S.R.; Salk, J.J.; Fox, E.J.; Hiatt, J.B.; Loeb, L.A. Detection of ultra-rare mutations by next-generation sequencing. Proc. Natl. Acad. Sci. USA 2012, 109, 14508–14513. [Google Scholar] [CrossRef]
Pilheden, M.; Ahlgren, L.; Hyrenius-Wittsten, A.; Gonzalez-Pena, V.; Sturesson, H.; Hansen Marquart, H.V.; Lausen, B.; Castor, A.; Pronk, C.J.; Barbany, G.; et al. Duplex Sequencing Uncovers Recurrent Low-frequency Cancer-associated Mutations in Infant and Childhood KMT2A-rearranged Acute Leukemia. Hemasphere 2022, 6, e785. [Google Scholar] [CrossRef] [PubMed]
Mishra, S.K.; Nelson, C.W.; Zhu, B.; Pinheiro, M.; Lee, H.J.; Dean, M.; Burdett, L.; Yeager, M.; Mirabello, L. Improved detection of low-frequency within-host variants from deep sequencing: A case study with human papillomavirus. Virus Evol. 2024, 10, veae013. [Google Scholar] [CrossRef] [PubMed]
Xiang, X.; Lu, B.; Song, D.; Li, J.; Shu, K.; Pu, D. Evaluating the performance of low-frequency variant calling tools for the detection of variants from short-read deep sequencing data. Sci. Rep. 2023, 13, 20444. [Google Scholar] [CrossRef] [PubMed]
Maruzani, R.; Brierley, L.; Jorgensen, A.; Fowler, A. Benchmarking UMI-aware and standard variant callers for low frequency ctDNA variant detection. BMC Genom. 2024, 25, 827. [Google Scholar] [CrossRef]
Liu, Y.; Han, C.; Li, J.; Xu, S.; Xiao, Z.; Guo, Z.; Rao, S.; Yao, Y. Laboratory-developed Droplet Digital PCR Assay for Quantification of the JAK2 (V617F) Mutation. Glob. Med. Genet. 2024, 11, 132–141. [Google Scholar] [CrossRef]
Hashimoto, Y.; Masunaga, N.; Kagara, N.; Abe, K.; Yoshinami, T.; Tsukabe, M.; Sota, Y.; Miyake, T.; Tanei, T.; Shimoda, M.; et al. Detection of Ultra-Rare ESR1 Mutations in Primary Breast Cancer Using LNA-Clamp ddPCR. Cancers 2023, 15, 2632. [Google Scholar] [CrossRef]
Kamath-Loeb, A.S.; Shen, J.C.; Schmitt, M.W.; Kohrn, B.F.; Loeb, K.R.; Estey, E.H.; Dai, J.; Chien, S.; Loeb, L.A.; Becker, P.S. Accurate detection of subclonal variants in paired diagnosis-relapse acute myeloid leukemia samples by next generation Duplex Sequencing. Leuk. Res. 2022, 115, 106822. [Google Scholar] [CrossRef]
Iams, W.T.; Mackay, M.; Ben-Shachar, R.; Drews, J.; Manghnani, K.; Hockenberry, A.J.; Cristofanilli, M.; Nimeiri, H.; Guinney, J.; Benson, A.B., 3rd. Concurrent Tissue and Circulating Tumor DNA Molecular Profiling to Detect Guideline-Based Targeted Mutations in a Multicancer Cohort. JAMA Netw. Open 2024, 7, e2351700. [Google Scholar] [CrossRef]
Li, S.; Noor, Z.S.; Zeng, W.; Stackpole, M.L.; Ni, X.; Zhou, Y.; Yuan, Z.; Wong, W.H.; Agopian, V.G.; Dubinett, S.M.; et al. Sensitive detection of tumor mutations from blood and its application to immunotherapy prognosis. Nat. Commun. 2021, 12, 4172. [Google Scholar] [CrossRef]
Razavi, P.; Li, B.T.; Brown, D.N.; Jung, B.; Hubbell, E.; Shen, R.; Abida, W.; Juluru, K.; De Bruijn, I.; Hou, C.; et al. High-intensity sequencing reveals the sources of plasma circulating cell-free DNA variants. Nat. Med. 2019, 25, 1928–1937. [Google Scholar] [CrossRef]
Zhao, J.; Reuther, J.; Scozzaro, K.; Hawley, M.; Metzger, E.; Emery, M.; Chen, I.; Barbosa, M.; Johnson, L.; O’Connor, A.; et al. Personalized Cancer Monitoring Assay for the Detection of ctDNA in Patients with Solid Tumors. Mol. Diagn. Ther. 2023, 27, 753–768. [Google Scholar] [CrossRef]
Elliott, M.J.; Howarth, K.; Main, S.; Fuentes Antrás, J.; Echelard, P.; Dou, A.; Amir, E.; Nadler, M.B.; Shah, E.; Yu, C.; et al. Ultrasensitive Detection and Monitoring of Circulating Tumor DNA Using Structural Variants in Early-Stage Breast Cancer. Clin. Cancer Res. 2025, 31, 1520–1532. [Google Scholar] [CrossRef]
Stankunaite, R.; George, S.L.; Gallagher, L.; Jamal, S.; Shaikh, R.; Yuan, L.; Hughes, D.; Proszek, P.Z.; Carter, P.; Pietka, G.; et al. Circulating tumour DNA sequencing to determine therapeutic response and identify tumour heterogeneity in patients with paediatric solid tumours. Eur. J. Cancer 2022, 162, 209–220. [Google Scholar] [CrossRef]
Abbosh, C.; Birkbak, N.J.; Wilson, G.A.; Jamal-Hanjani, M.; Constantin, T.; Salari, R.; Le Quesne, J.; Moore, D.A.; Veeriah, S.; Rosenthal, R.; et al. Corrigendum: Phylogenetic ctDNA analysis depicts early-stage lung cancer evolution. Nature 2018, 554, 264. [Google Scholar] [CrossRef] [PubMed]
Smith, J.T.; Balar, A.; Lakhani, D.A.; Kluwe, C.; Zhao, Z.; Kopparapu, P.; Almodovar, K.; Muterspaugh, A.; Yan, Y.; York, S.; et al. Circulating Tumor DNA as a Biomarker of Radiographic Tumor Burden in SCLC. JTO Clin. Res. Rep. 2021, 2, 100110. [Google Scholar] [CrossRef] [PubMed]
Kalashnikova, E.; Aushev, V.N.; Malashevich, A.K.; Tin, A.; Krinshpun, S.; Salari, R.; Scalise, C.B.; Ram, R.; Malhotra, M.; Ravi, H.; et al. Correlation between variant allele frequency and mean tumor molecules with tumor burden in patients with solid tumors. Mol. Oncol. 2024, 18, 2649–2657. [Google Scholar] [CrossRef] [PubMed]
Ganesamoorthy, D.; Robertson, A.J.; Chen, W.; Hall, M.B.; Cao, M.D.; Ferguson, K.; Lakhani, S.R.; Nones, K.; Simpson, P.T.; Coin, L.J.M. Whole genome deep sequencing analysis of cell-free DNA in samples with low tumour content. BMC Cancer 2022, 22, 85. [Google Scholar] [CrossRef]
Mizuno, K.; Akamatsu, S.; Sumiyoshi, T.; Wong, J.H.; Fujita, M.; Maejima, K.; Nakano, K.; Ono, A.; Aikata, H.; Ueno, M.; et al. eVIDENCE: A practical variant filtering for low-frequency variants detection in cell-free DNA. Sci. Rep. 2019, 9, 15017. [Google Scholar] [CrossRef]
Hwang, S.; Woo, S.; Kang, B.; Kang, H.; Kim, J.S.; Lee, S.H.; Kwon, C.I.; Kyung, D.S.; Kim, H.P.; Kim, G.; et al. Concordance of ctDNA and tissue genomic profiling in advanced biliary tract cancer. J. Hepatol. 2025, 82, 649–657. [Google Scholar] [CrossRef]
Knappskog, S.; Grob, T.; Venizelos, A.; Amstutz, U.; Hjortland, G.O.; Lothe, I.M.; Kersten, C.; Hofsli, E.; Sundlöv, A.; Elvebakken, H.; et al. Mutation Spectrum in Liquid Versus Solid Biopsies from Patients with Advanced Gastroenteropancreatic Neuroendocrine Carcinoma. JCO Precis. Oncol. 2023, 7, e2200336. [Google Scholar] [CrossRef]
Zhong, J.; Jiang, H.; Liu, X.; Liao, H.; Xie, F.; Shao, B.; Jia, S.; Li, H. Variant allele frequency in circulating tumor DNA correlated with tumor disease burden and predicted outcomes in patients with advanced breast cancer. Breast Cancer Res. Treat. 2024, 204, 617–629. [Google Scholar] [CrossRef]
Bettegowda, C.; Sausen, M.; Leary, R.J.; Kinde, I.; Wang, Y.; Agrawal, N.; Bartlett, B.R.; Wang, H.; Luber, B.; Alani, R.M.; et al. Detection of circulating tumor DNA in early- and late-stage human malignancies. Sci. Transl. Med. 2014, 6, 224ra24. [Google Scholar] [CrossRef]
Shaya, S.; Uche-Ikonne, O.; Kilerci, B.; Stevenson, J.; Greystoke, A.; Cook, N.; Thistlethwaite, F.C.; Carter, L.; Graham, D.M.; Krebs, M.G. Circulating Tumor DNA as a Prognostic Biomarker for Selecting Participants to Early Phase Clinical Trials. J. Immunother. Precis. Oncol. 2025, 8, 222–232. [Google Scholar] [CrossRef]
Tie, J.; Wang, Y.; Tomasetti, C.; Li, L.; Springer, S.; Kinde, I.; Silliman, N.; Tacey, M.; Wong, H.L.; Christie, M.; et al. Circulating tumor DNA analysis detects minimal residual disease and predicts recurrence in patients with stage II colon cancer. Sci. Transl. Med. 2016, 8, 346ra92. [Google Scholar] [CrossRef] [PubMed]
Abbosh, C.; Frankell, A.M.; Harrison, T.; Kisistok, J.; Garnett, A.; Johnson, L.; Veeriah, S.; Moreau, M.; Chesh, A.; Chaunzwa, T.L.; et al. Tracking early lung cancer metastatic dissemination in TRACERx using ctDNA. Nature 2023, 616, 553–562. [Google Scholar] [CrossRef] [PubMed]
Nguyen, S.T.; Nguyen Hoang, V.A.; Nguyen Trieu, V.; Pham, T.H.; Dinh, T.C.; Pham, D.H.; Nguyen, N.; Vinh, D.N.; Do, T.T.T.; Nguyen, D.S.; et al. Personalized mutation tracking in circulating-tumor DNA predicts recurrence in patients with high-risk early breast cancer. npj Breast Cancer 2025, 11, 58. [Google Scholar] [CrossRef]
Dillon, L.W.; Higgins, J.; Nasif, H.; Othus, M.; Beppu, L.; Smith, T.H.; Schmidt, E.; Valentine, C.C., III; Salk, J.J.; Wood, B.L.; et al. Quantification of measurable residual disease using duplex sequencing in adults with acute myeloid leukemia. Haematologica 2024, 109, 401–410. [Google Scholar] [CrossRef]
Stasik, S.; Mende, M.; Schuster, C.; Mahler, S.; Aust, D.; Tannapfel, A.; Reinacher-Schick, A.; Baretton, G.; Krippendorf, C.; Bornhäuser, M.; et al. Sensitive Quantification of Cell-Free Tumor DNA for Early Detection of Recurrence in Colorectal Cancer. Front. Genet. 2021, 12, 811291. [Google Scholar] [CrossRef] [PubMed]
Martín-Arana, J.; Gimeno-Valiente, F.; Henriksen, T.V.; García-Micó, B.; Martínez-Castedo, B.; Gambardella, V.; Martínez-Ciarpaglini, C.; Palomar, B.; Huerta, M.; Camblor, D.G.; et al. Whole-exome tumor-agnostic ctDNA analysis enhances minimal residual disease detection and reveals relapse mechanisms in localized colon cancer. Nat. Cancer 2025, 6, 1000–1016. [Google Scholar] [CrossRef]
Chakravarty, D.; Johnson, A.; Sklar, J.; Lindeman, N.I.; Moore, K.; Ganesan, S.; Lovly, C.M.; Perlmutter, J.; Gray, S.W.; Hwang, J.; et al. Somatic Genomic Testing in Patients With Metastatic or Advanced Cancer: ASCO Provisional Clinical Opinion. J. Clin. Oncol. 2022, 40, 1231–1258. [Google Scholar] [CrossRef]
Rossi, D.; Gaidano, G. The clinical implications of gene mutations in chronic lymphocytic leukaemia. Br. J. Cancer 2016, 114, 849–854. [Google Scholar] [CrossRef]
Rossi, D.; Khiabanian, H.; Spina, V.; Ciardullo, C.; Bruscaggin, A.; Famà, R.; Rasi, S.; Monti, S.; Deambrogi, C.; De Paoli, L.; et al. Clinical impact of small TP53 mutated subclones in chronic lymphocytic leukemia. Blood 2014, 123, 2139–2147. [Google Scholar] [CrossRef]
Burack, W.R.; Li, H.; Adlowitz, D.; Spence, J.M.; Rimsza, L.M.; Shadman, M.; Spier, C.M.; Kaminski, M.S.; Leonard, J.P.; Leblanc, M.L.; et al. Subclonal TP53 mutations are frequent and predict resistance to radioimmunotherapy in follicular lymphoma. Blood Adv. 2023, 7, 5082–5090. [Google Scholar] [CrossRef]
Kempter, T.; Richter-Pechańska, P.; Michel, K.; Rausch, T.; Erarslan-Uysal, B.; Eckert, C.; Zimmermann, M.; Stanulla, M.; Schrappe, M.; Cario, G.; et al. Subclonal TP53 and KRAS variants combined with poor treatment response identify ultrahigh-risk pediatric patients with T-ALL. Blood Adv. 2025, 9, 1267–1279. [Google Scholar] [CrossRef]
Malcikova, J.; Pavlova, S.; Kunt Vonkova, B.; Radova, L.; Plevova, K.; Kotaskova, J.; Pal, K.; Dvorackova, B.; Zenatova, M.; Hynst, J.; et al. Low-burden TP53 mutations in CLL: Clinical impact and clonal evolution within the context of different treatment options. Blood 2021, 138, 2670–2685. [Google Scholar] [CrossRef] [PubMed]
Nadeu, F.; Delgado, J.; Royo, C.; Baumann, T.; Stankovic, T.; Pinyol, M.; Jares, P.; Navarro, A.; Martín-García, D.; Beà, S.; et al. Clinical impact of clonal and subclonal TP53, SF3B1, BIRC3, NOTCH1, and ATM mutations in chronic lymphocytic leukemia. Blood 2016, 127, 2122–2130. [Google Scholar] [CrossRef] [PubMed]
Misale, S.; Yaeger, R.; Hobor, S.; Scala, E.; Janakiraman, M.; Liska, D.; Valtorta, E.; Schiavo, R.; Buscarino, M.; Siravegna, G.; et al. Emergence of KRAS mutations and acquired resistance to anti-EGFR therapy in colorectal cancer. Nature 2012, 486, 532–536. [Google Scholar] [CrossRef]
Siravegna, G.; Mussolin, B.; Buscarino, M.; Corti, G.; Cassingena, A.; Crisafulli, G.; Ponzetti, A.; Cremolini, C.; Amatu, A.; Lauricella, C.; et al. Clonal evolution and resistance to EGFR blockade in the blood of colorectal cancer patients. Nat. Med. 2015, 21, 795–801. [Google Scholar] [CrossRef]
Takeda, M.; Yoshida, S.; Inoue, T.; Sekido, Y.; Hata, T.; Hamabe, A.; Ogino, T.; Miyoshi, N.; Uemura, M.; Yamamoto, H.; et al. The Role of KRAS Mutations in Colorectal Cancer: Biological Insights, Clinical Implications, and Future Therapeutic Perspectives. Cancers 2025, 17, 428. [Google Scholar] [CrossRef] [PubMed]
Ayala, R.; Carreño-Tarragona, G.; Barragán, E.; Boluda, B.; Larráyoz, M.J.; Chillón, M.C.; Carrillo-Cruz, E.; Bilbao, C.; Sánchez-García, J.; Bernal, T.; et al. Impact of FLT3-ITD Mutation Status and Its Ratio in a Cohort of 2901 Patients Undergoing Upfront Intensive Chemotherapy: A PETHEMA Registry Study. Cancers 2022, 14, 5799. [Google Scholar] [CrossRef]
Smith, C.C.; Levis, M.J.; Perl, A.E.; Hill, J.E.; Rosales, M.; Bahceci, E. Molecular profile of FLT3-mutated relapsed/refractory patients with AML in the phase 3 ADMIRAL study of gilteritinib. Blood Adv. 2022, 6, 2144–2155. [Google Scholar] [CrossRef]
Rothenberg-Thurley, M.; Amler, S.; Goerlich, D.; Köhnke, T.; Konstandin, N.P.; Schneider, S.; Sauerland, M.C.; Herold, T.; Hubmann, M.; Ksienzyk, B.; et al. Persistence of pre-leukemic clones during first remission and risk of relapse in acute myeloid leukemia. Leukemia 2018, 32, 1598–1608. [Google Scholar] [CrossRef]
Ok, C.Y.; Loghavi, S.; Sui, D.; Wei, P.; Kanagal-Shamanna, R.; Yin, C.C.; Zuo, Z.; Routbort, M.J.; Tang, G.; Tang, Z.; et al. Persistent IDH1/2 mutations in remission can predict relapse in patients with acute myeloid leukemia. Haematologica 2019, 104, 305–311. [Google Scholar] [CrossRef]
Li, Y.; Solis-Ruiz, J.; Yang, F.; Long, N.; Tong, C.H.; Lacbawan, F.L.; Racke, F.K.; Press, R.D. NGS-defined measurable residual disease (MRD) after initial chemotherapy as a prognostic biomarker for acute myeloid leukemia. Blood Cancer J. 2023, 13, 59. [Google Scholar] [CrossRef] [PubMed]
Parker, W.T.; Yeoman, A.L.; Jamison, B.A.; Yeung, D.T.; Scott, H.S.; Hughes, T.P.; Branford, S. BCR-ABL1 kinase domain mutations may persist at very low levels for many years and lead to subsequent TKI resistance. Br. J. Cancer 2013, 109, 1593–1598. [Google Scholar] [CrossRef] [PubMed]
Marin, A.M.; Wosniaki, D.K.; Sanchuki, H.B.S.; Munhoz, E.C.; Nardin, J.M.; Soares, G.S.; Espinace, D.C.; de Holanda Farias, J.S.; Veroneze, B.; Becker, L.F.; et al. Molecular BCR::ABL1 Quantification and ABL1 Mutation Detection as Essential Tools for the Clinical Management of Chronic Myeloid Leukemia Patients: Results from a Brazilian Single-Center Study. Int. J. Mol. Sci. 2023, 24, 10118. [Google Scholar] [CrossRef] [PubMed]
Arcila, M.E.; Oxnard, G.R.; Nafa, K.; Riely, G.J.; Solomon, S.B.; Zakowski, M.F.; Kris, M.G.; Pao, W.; Miller, V.A.; Ladanyi, M. Rebiopsy of lung cancer patients with acquired resistance to EGFR inhibitors and enhanced detection of the T790M mutation using a locked nucleic acid-based assay. Clin. Cancer Res. 2011, 17, 1169–1180. [Google Scholar] [CrossRef]
Ogawa, K.; Kaneda, H.; Koh, Y.; Matsumoto, Y.; Sawa, K.; Tamiya, M.; Ishikawa, N.; Minami, K.; Suzuki, H.; Eguchi, Y.; et al. Relationship Between T790M Allele Frequency and Therapeutic Effects Before and After EGFR-TKI Administration Using Droplet Digital PCR in Non-small-cell Lung Cancer With EGFR Mutation. Cancer Diagn. Progn. 2025, 5, 285–299. [Google Scholar] [CrossRef]
Reita, D.; Pabst, L.; Pencreach, E.; Guérin, E.; Dano, L.; Rimelen, V.; Voegeli, A.C.; Vallat, L.; Mascaux, C.; Beau-Faller, M. Molecular Mechanism of EGFR-TKI Resistance in EGFR-Mutated Non-Small Cell Lung Cancer: Application to Biological Diagnostic and Monitoring. Cancers 2021, 13, 4926. [Google Scholar] [CrossRef]
Tvedte, E.S.; Gasser, M.; Sparklin, B.C.; Michalski, J.; Hjelmen, C.E.; Johnston, J.S.; Zhao, X.; Bromley, R.; Tallon, L.J.; Sadzewicz, L.; et al. Comparison of long-read sequencing technologies in interrogating bacteria and fly genomes. G3 Genes Genomes Genet. 2021, 11, jkab083. [Google Scholar] [CrossRef]
Wagner, G.E.; Dabernig-Heinz, J.; Lipp, M.; Cabal, A.; Simantzik, J.; Kohl, M.; Scheiber, M.; Lichtenegger, S.; Ehricht, R.; Leitner, E.; et al. Real-Time Nanopore Q20+ Sequencing Enables Extremely Fast and Accurate Core Genome MLST Typing and Democratizes Access to High-Resolution Bacterial Pathogen Surveillance. J. Clin. Microbiol. 2023, 61, e0163122. [Google Scholar] [CrossRef]
Hon, T.; Mars, K.; Young, G.; Tsai, Y.C.; Karalius, J.W.; Landolin, J.M.; Maurer, N.; Kudrna, D.; Hardigan, M.A.; Steiner, C.C.; et al. Highly accurate long-read HiFi sequencing data for five complex genomes. Sci. Data 2020, 7, 399. [Google Scholar] [CrossRef]
Wenger, A.M.; Peluso, P.; Rowell, W.J.; Chang, P.C.; Hall, R.J.; Concepcion, G.T.; Ebler, J.; Fungtammasan, A.; Kolesnikov, A.; Olson, N.D.; et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat. Biotechnol. 2019, 37, 1155–1162. [Google Scholar] [CrossRef] [PubMed]
Mimosa, M.L.; Al-Ameri, W.; Simpson, J.T.; Nakhla, M.; Boissinot, K.; Munoz, D.G.; Das, S.; Feilotter, H.; Fattouh, R.; Saleeb, R.M. A Novel Approach to Detect IDH Point Mutations in Gliomas Using Nanopore Sequencing: Test Validation for the Clinical Laboratory. J. Mol. Diagn. 2023, 25, 133–142. [Google Scholar] [CrossRef]
Kaplun, L.; Krautz-Peterson, G.; Neerman, N.; Stanley, C.; Hussey, S.; Folwick, M.; McGarry, A.; Weiss, S.; Kaplun, A. ONT long-read WGS for variant discovery and orthogonal confirmation of short read WGS derived genetic variants in clinical genetic testing. Front. Genet. 2023, 14, 1145285. [Google Scholar] [CrossRef] [PubMed]
Rassner, M.; Waldeck, S.; Follo, M.; Jilg, S.; Philipp, U.; Jolic, M.; Wehrle, J.; Jost, P.J.; Peschel, C.; Illert, A.L.; et al. Development of Highly Sensitive Digital Droplet PCR for Detection of cKIT Mutations in Circulating Free DNA That Mediate Resistance to TKI Treatment for Gastrointestinal Stromal Tumor (GIST). Int. J. Mol. Sci. 2023, 24, 5411. [Google Scholar] [CrossRef] [PubMed]
Doan, R.N.; Miller, M.B.; Kim, S.N.; Rodin, R.E.; Ganz, J.; Bizzotto, S.; Morillo, K.S.; Huang, A.Y.; Digumarthy, R.; Zemmel, Z.; et al. MIPP-Seq: Ultra-sensitive rapid detection and validation of low-frequency mosaic mutations. BMC Med. Genom. 2021, 14, 47. [Google Scholar] [CrossRef]
Cheng, L.Y.; Haydu, L.E.; Song, P.; Nie, J.; Tetzlaff, M.T.; Kwong, L.N.; Gershenwald, J.E.; Davies, M.A.; Zhang, D.Y. High sensitivity sanger sequencing detection of BRAF mutations in metastatic melanoma FFPE tissue specimens. Sci. Rep. 2021, 11, 9043. [Google Scholar] [CrossRef]
Yu, H.; Han, X.; Wang, W.; Zhang, Y.; Xiang, L.; Bai, D.; Zhang, L.; Weng, Z.; Lv, K.; Song, L.; et al. Modified Unit-Mediated Strand Displacement Reactions for Direct Detection of Single Nucleotide Variants in Active Double-Stranded DNA. ACS Nano 2024, 18, 12401–12411. [Google Scholar] [CrossRef]
Bae, J.H.; Liu, R.; Roberts, E.; Nguyen, E.; Tabrizi, S.; Rhoades, J.; Blewett, T.; Xiong, K.; Gydush, G.; Shea, D.; et al. Single duplex DNA sequencing with CODEC detects mutations with high sensitivity. Nat. Genet. 2023, 55, 871–879. [Google Scholar] [CrossRef] [PubMed]
Yaacov, A.; Lazarian, G.; Pandzic, T.; Weström, S.; Baliakas, P.; Imache, S.; Lefebvre, V.; Cymbalista, F.; Baran-Marszak, F.; Rosenberg, S.; et al. Cancer associated variant enrichment CAVE, a gene agnostic approach to identify low burden variants in chronic lymphocytic leukemia. Sci. Rep. 2024, 14, 21962. [Google Scholar] [CrossRef]
Clark, T.A.; Chung, J.H.; Kennedy, M.; Hughes, J.D.; Chennagiri, N.; Lieber, D.S.; Fendler, B.; Young, L.; Zhao, M.; Coyne, M.; et al. Analytical Validation of a Hybrid Capture-Based Next-Generation Sequencing Clinical Assay for Genomic Profiling of Cell-Free Circulating Tumor DNA. J. Mol. Diagn. 2018, 20, 686–702. [Google Scholar] [CrossRef]
Jones, W.; Gong, B.; Novoradovskaya, N.; Li, D.; Kusko, R.; Richmond, T.A.; Johann, D.J., Jr.; Bisgin, H.; Sahraeian, S.M.E.; Bushel, P.R.; et al. A verified genomic reference sample for assessing performance of cancer panels detecting small variants of low allele frequency. Genome Biol. 2021, 22, 111. [Google Scholar] [CrossRef]
Olsen, T.R.; Talla, P.; Sagatelian, R.K.; Furnari, J.; Bruce, J.N.; Canoll, P.; Zha, S.; Sims, P.A. Scalable co-sequencing of RNA and DNA from individual nuclei. Nat. Methods 2025, 22, 477–487. [Google Scholar] [CrossRef]
Izydorczyk, M.B.; Kalef-Ezra, E.; Horner, D.W.; Zheng, X.; Holmes, N.; Toffoli, M.; Sahin, Z.; Han, Y.; Mehta, H.H.; Scholz, S.W.; et al. Single cell long read whole genome sequencing reveals somatic transposon activity in human brain. Commun. Biol. 2025, 8, 1627. [Google Scholar] [CrossRef]
Gonzalez-Pena, V.; Natarajan, S.; Xia, Y.; Klein, D.; Carter, R.; Pang, Y.; Shaner, B.; Annu, K.; Putnam, D.; Chen, W.; et al. Accurate genomic variant detection in single cells with primary template-directed amplification. Proc. Natl. Acad. Sci. USA 2021, 118, e2024176118. [Google Scholar] [CrossRef] [PubMed]
Lindenhofer, D.; Bauman, J.R.; Hawkins, J.A.; Fitzgerald, D.; Yildiz, U.; Jung, H.; Korosteleva, A.; Marttinen, M.; Kueblbeck, M.; Zaugg, J.B.; et al. Functional phenotyping of genomic variants using joint multiomic single-cell DNA-RNA sequencing. Nat. Methods 2025, 22, 2032–2041. [Google Scholar] [CrossRef]
Dietz, S.; Harms, A.; Endris, V.; Eichhorn, F.; Kriegsmann, M.; Longuespée, R.; Stenzinger, A.; Sültmann, H.; Warth, A.; Kazdal, D. Spatial distribution of EGFR and KRAS mutation frequencies correlates with histological growth patterns of lung adenocarcinomas. Int. J. Cancer 2017, 141, 1841–1848. [Google Scholar] [CrossRef]
Liu, Y.; Zhu, F.; Li, X.; Guan, X.; Hou, Y.; Feng, Y.; Dong, X.; Li, Y. SpatialSNV: A novel method for identifying and analyzing spatially resolved SNVs in tumor microenvironments. Gigascience 2025, 14, giaf065. [Google Scholar] [CrossRef] [PubMed]
Muyas, F.; Sauer, C.M.; Valle-Inclán, J.E.; Li, R.; Rahbari, R.; Mitchell, T.J.; Hormoz, S.; Cortés-Ciriano, I. De novo detection of somatic mutations in high-throughput single-cell profiling data sets. Nat. Biotechnol. 2024, 42, 758–767. [Google Scholar] [CrossRef] [PubMed]
Song, H.; Weinstein, H.N.W.; Allegakoen, P.; Wadsworth, M.H., 2nd; Xie, J.; Yang, H.; Castro, E.A.; Lu, K.L.; Stohr, B.A.; Feng, F.Y.; et al. Single-cell analysis of human primary prostate cancer reveals the heterogeneity of tumor-associated epithelial cell states. Nat. Commun. 2022, 13, 141. [Google Scholar] [CrossRef] [PubMed]

Figure 1. Schematic of cancer heterogeneity and emergence of variant allele frequencies. Red and blue cells represent cancer cell clones that persist at low frequency after initial treatment and can lead to cancer recurrence. Created in BioRender. Christensen, H. (2026) https://BioRender.com/002rs2m and based on concepts from [12,13].

Figure 2. UMI-based error correction workflow for accurate VAF detection. This schematic illustrates how unique molecular identifiers (UMIs) are used to distinguish true variants from sequencing artifacts. (A) Individual DNA molecules are tagged with unique barcodes (UMIs) prior to amplification. (B) During PCR amplification and sequencing, multiple reads are generated from each original molecule, and errors (e.g., base substitutions) may be introduced. (C) Reads are grouped according to their shared UMI, allowing all sequences derived from the same original DNA molecule to be clustered. (D) Within each UMI group, a consensus sequence is generated using majority voting, which eliminates stochastic sequencing errors that are not consistently observed across reads. (E) The resulting error-corrected consensus sequences are then used for variant calling, significantly improving accuracy for low-frequency variant detection. Created based on concepts from [50,52].
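
The grouping and majority-voting steps in panels (C) to (E) can be illustrated with a minimal Python sketch. It is a simplified illustration, not the workflow of any specific tool: the function names, the `min_family_size` parameter, and the toy reads are all hypothetical, and the reads are assumed to be equal-length and already aligned. Production pipelines additionally handle alignment, base qualities, and errors within the UMI itself.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_family_size=2):
    """Collapse (umi, sequence) pairs into one consensus sequence per UMI.

    Assumption: sequences sharing a UMI have equal length and are aligned
    to the same genomic position.
    """
    # Panel C: group reads by their shared UMI.
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    consensus = {}
    for umi, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few reads to vote, so the family is dropped
        bases = []
        # Panel D: majority vote at each position within the family.
        for column in zip(*seqs):
            base, count = Counter(column).most_common(1)[0]
            # Require a strict majority; otherwise mask the position as N.
            bases.append(base if count * 2 > len(column) else "N")
        consensus[umi] = "".join(bases)
    return consensus


def variant_allele_fraction(consensus, position, alt_base):
    """Panel E: VAF counted over consensus molecules, not raw reads."""
    called = [seq[position] for seq in consensus.values() if seq[position] != "N"]
    if not called:
        return 0.0
    return sum(base == alt_base for base in called) / len(called)


# Toy data (hypothetical): reference ACGT, candidate G>T variant at index 2.
reads = [
    ("AAT", "ACGT"), ("AAT", "ACGT"), ("AAT", "ACTT"),  # one read carries an error
    ("CCG", "ACTT"), ("CCG", "ACTT"), ("CCG", "ACTT"),  # variant in every read
    ("GGA", "ACGT"), ("GGA", "ACGT"),
    ("TTC", "ACGT"),                                     # singleton family
]
consensus = umi_consensus(reads)
vaf = variant_allele_fraction(consensus, position=2, alt_base="T")
print(consensus)
print(vaf)
```

In this toy input, the single discordant read in family `AAT` is outvoted, so the variant is supported by one of three consensus molecules rather than by four of nine raw reads.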

Table 3. Emerging Technologies in Low-VAF Detection.

Technology/Approach	Category	Principle	Strengths	Limitations	Applications
Primary Template-Directed Amplification (PTA)	Amplification	Linear amplification from original templates	Reduces allele dropout and bias	Residual amplification artifacts	Single-cell sequencing, low DNA input
Single-cell DNA sequencing	Single-cell genomics	Sequencing individual cell genomes	Eliminates bulk signal dilution	Coverage variability, amplification bias	Clonal architecture, rare subclones
Multiomic single-cell sequencing (DEFND-seq)	Single-cell multiomics	Joint DNA and RNA profiling per cell	Links variants to transcriptional state	High cost, technical complexity	Functional subclone mapping
Targeted single-cell DNA-RNA co-sequencing (SDR-seq)	Targeted single-cell	Focused loci sequencing with RNA profiling	High sensitivity at selected loci	Limited genome-wide scope	Targeted mutation validation
Breakpoint-aware/amplicon-based sequencing	Structural variant detection	Targeted enrichment of rearrangements	Improves detection of complex SVs	Requires known breakpoints	Structural variant characterization
Spatial mutation profiling	Spatial genomics	Maps variants within tissue architecture	Reveals spatial heterogeneity	Lower sensitivity, technical complexity	Tumor microenvironment analysis
Modality-aware variant callers	Bioinformatics	Data-specific variant calling models	Improves signal-to-noise discrimination	Model-dependent performance	Single-cell variant calling
Statistical noise-modeling frameworks	Bioinformatics	Error modeling across datasets	Enhances detection at low allele fractions	Computationally intensive	Rare variant detection
Integrative multiomic analysis platforms	Computational integration	Links genomic and phenotypic data	Enables biological interpretation of variants	Requires multi-layer datasets	Clonal evolution, functional genomics
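
The idea behind the statistical noise-modeling row can be sketched with a simple binomial background model: given a sequencing depth and an assumed per-base error rate, how many alternate-allele reads are needed before a call is unlikely to be noise? The sketch below is illustrative only. The function names, the error rates, and the `alpha` threshold are hypothetical, and a uniform error rate is a strong simplification, since real error rates vary by site, sequence context, and library preparation.

```python
def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), using an iterative pmf update."""
    pmf = (1.0 - p) ** n          # P(X = 0)
    below_k = 0.0
    for i in range(k):
        below_k += pmf
        # Recurrence: P(X = i + 1) = P(X = i) * (n - i) / (i + 1) * p / (1 - p)
        pmf *= (n - i) / (i + 1) * p / (1.0 - p)
    return max(0.0, 1.0 - below_k)


def min_alt_reads(depth, error_rate, alpha=1e-6):
    """Smallest alt-read count whose chance of arising from errors alone is <= alpha."""
    k = 0
    while binomial_tail(k, depth, error_rate) > alpha:
        k += 1
    return k


depth = 1000
# Hypothetical error rates: raw reads vs. error-corrected consensus reads.
for label, error_rate in [("raw", 1e-3), ("consensus", 1e-4)]:
    k = min_alt_reads(depth, error_rate)
    print(f"{label}: need >= {k} alt reads at depth {depth} (VAF floor ~ {k / depth:.4f})")
```

Under this model, lowering the assumed background error rate lowers the number of supporting reads required at a fixed depth, which is one way to see why error suppression can matter as much as additional depth. The threshold applies to a single site and does not account for testing many sites at once.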

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Knebel, P.; Harris, J.; Steveson, I.; Kearns, B.; Todeschini, A.S.; Perrett, L.; Anderson, D.; Beltran, E.; Leary, B.; Settle, J.; et al. Hidden in the Noise: Low-Variant Allele Frequency Mutations and Their Impact on Precision Oncology. J. Genome Biotechnol. Genet. 2026, 1, 4. https://doi.org/10.3390/jgbg1010004

