This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
Article

Targeted Genomic Region Masking Supports Accurate Variant Calling While Suppressing Low-Complexity Sequencing Artifacts

by Chrysoula Kaligerou, Athina Tsagkalidou, Vasiliki Pogka, Dimitrios Christos Tremoulis and Timokratis Karamitros *
Bioinformatics and Applied Genomics Unit, Department of Microbiology, Hellenic Pasteur Institute, 11521 Athens, Greece
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Genes 2026, 17(7), 772; https://doi.org/10.3390/genes17070772
Submission received: 2 June 2026 / Revised: 26 June 2026 / Accepted: 28 June 2026 / Published: 30 June 2026
(This article belongs to the Section Bioinformatics)

Abstract

Background: False-positive variant calls generated within low-complexity regions (LCRs) remain a persistent bottleneck in clinical genomics, complicating downstream analysis. This study evaluates a targeted spatial masking strategy designed to suppress deterministic artifacts in short-read sequencing data while preserving clinically actionable variants residing outside LCRs. We implemented a selective masking protocol prior to variant calling across analytical reference standards (EQA, NA12878) and two independent breast cancer whole-exome sequencing cohorts (n = 25). Methods: Callsets were evaluated for diagnostic sensitivity, precision gains, mutational signatures, VAF behavior, pseudo-multiallelic noise, and ClinVar/dbSNP annotation. Results: The protocol removed thousands of sequencing and alignment artifacts while maintaining the retained biological callset, with a negligible number of disease-associated diagnostic variants detected in the excluded artifact fraction. LCR masking preserved physiological Ti/Tv and Ins/Del profiles in retained calls, resolved pseudo-multiallelic noise, and distinguished excluded artifact calls by their distorted mutational and VAF signatures. dbSNP profiling showed cohort-dependent behavior: TCGA-BRCA reproduced a counterintuitive pattern, with excluded calls showing higher dbSNP annotation than retained calls, whereas AURORA showed the opposite direction. Conclusions: These findings indicate that one-dimensional database annotation is a potentially vulnerable basis for variant authentication and highlight targeted spatial filtration as a critical, early pipeline intervention for high-fidelity clinical genomics of non-LCR-associated germline variants using short reads.
Keywords: variant calling; clinical genomics; low-complexity regions; spatial masking; analytical reference standards; sequencing artifacts; dbSNP paradox
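To make the retained/excluded comparison in the abstract concrete, the following is a minimal Python sketch, not the authors' protocol. The study applies masking before variant calling; this sketch instead partitions an existing list of calls after the fact by whether each falls inside an LCR interval, then computes Ti/Tv for each fraction. All function names, the tuple layout `(chrom, pos, ref, alt)`, and the toy coordinates are assumptions made for illustration; intervals are taken to be 0-based and half-open (BED-style), with call positions already converted to 0-based.

```python
from bisect import bisect_right

# Transitions are purine<->purine or pyrimidine<->pyrimidine substitutions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def build_index(intervals):
    """Group half-open (chrom, start, end) intervals by chromosome, sorted and merged."""
    by_chrom = {}
    for chrom, start, end in sorted(intervals):
        merged = by_chrom.setdefault(chrom, [])
        if merged and start <= merged[-1][1]:
            # Overlapping or adjacent interval: extend the previous one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return by_chrom


def in_masked_region(index, chrom, pos):
    """Return True if 0-based position `pos` lies inside a masked interval."""
    regions = index.get(chrom, [])
    # Locate the last interval whose start is <= pos.
    i = bisect_right(regions, (pos, float("inf"))) - 1
    return i >= 0 and regions[i][0] <= pos < regions[i][1]


def split_calls(calls, index):
    """Partition (chrom, pos, ref, alt) calls into (retained, excluded)."""
    retained, excluded = [], []
    for call in calls:
        chrom, pos, _ref, _alt = call
        target = excluded if in_masked_region(index, chrom, pos) else retained
        target.append(call)
    return retained, excluded


def titv(calls):
    """Ti/Tv over single-nucleotide substitutions; None if there are no transversions."""
    ti = tv = 0
    for _chrom, _pos, ref, alt in calls:
        if len(ref) == 1 and len(alt) == 1 and ref != alt:
            if (ref, alt) in TRANSITIONS:
                ti += 1
            else:
                tv += 1
    return ti / tv if tv else None


# Toy data, invented purely for illustration (not from the study).
lcr_intervals = [("chr1", 100, 120), ("chr1", 115, 130), ("chr2", 50, 60)]
calls = [
    ("chr1", 99, "A", "G"),
    ("chr1", 100, "C", "A"),
    ("chr1", 129, "G", "T"),
    ("chr1", 130, "C", "T"),
    ("chr2", 55, "A", "AT"),  # insertion: ignored by titv
    ("chr2", 60, "G", "C"),
]

index = build_index(lcr_intervals)
retained, excluded = split_calls(calls, index)
print("retained Ti/Tv:", titv(retained))
print("excluded Ti/Tv:", titv(excluded))
```

Merging the intervals first keeps the lookup a single binary search per call. A real pipeline would more likely rely on established interval tooling and a proper VCF parser than on hand-rolled tuples.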
