Host-Filtered Blood Nucleic Acids for Pathogen Detection: Shared Background, Sparse Signal, and Methodological Limits
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Design
2.2. cfRNA Sequencing Read Processing and Host Read Filtering
2.3. Bacterial Taxonomic Classification and Species-Level Abundance Profiling
2.4. Construction of Phylogeny-Aware Community Trees
2.5. Definition of Disease-Associated Taxa and Signature Scores
2.6. Quantification of M. tuberculosis-Derived Reads
2.7. Statistical Analysis and Data Visualization
3. Results
3.1. Host-Filtering Stringency Shapes the Apparent Non-Host cfRNA Fraction More Strongly in TB than in CAD
3.2. Plasma Non-Host cfRNA Communities Are Dominated by a Shared, Low-Complexity Background Across Cohorts
3.3. Background-Derived Signature Scores Show Limited Separation Between Disease and Control Groups
4. Discussion
4.1. Host-Filtering Stringency Has an Asymmetric Impact on Apparent Non-Host cfRNA Between TB and CAD
4.2. Shared Low-Complexity Background Constrains Disease Discrimination in Plasma cfRNA
4.3. Mycobacterial cfRNA Signal Is Extremely Low and Overlaps Between TB-Positive and TB-Negative Plasma
4.4. Methodological, Reporting, and Translational Implications for Blood-Based Metagenomic Pathogen Detection
4.5. Limitations
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
References



| Cohort | CAD (GSE58150) | TB * (GSE255073) |
| --- | --- | --- |
| Sample type | Whole blood RNA | Plasma cfRNA |
| Size (n) | 16 | 51 |
| Disease cases/controls (n) | 8/8 | 30/21 |
| Total reads (million) | 119.7 (96.4–131.4) | 31.5 (14.6–57.0) |
| Host alignment rate (%) | 87.3 (81.09–90.83) | 27.7 (9.08–65.21) |
| Non-host reads, relaxed (%) | 6.14 (4.39–9.60) | 8.20 (2.30–13.44) |
| Non-host reads, strict (%) | 7.25 (4.99–11.98) | 21.76 (5.44–31.53) |
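
The gap between the relaxed and strict non-host percentages in the table can be made explicit with a small calculation. The following is a minimal sketch, not the authors' analysis: it takes the central values from the two non-host rows as given, and the function name `stringency_shift` and the dictionary layout are hypothetical. Because the inputs are cohort-level summary values, the resulting ratio is only a rough indication and is not equivalent to a per-sample comparison.

```python
# Central values copied from the cohort table (percent of reads labelled non-host).
# Assumption of this sketch: the summary values are compared directly,
# without access to the per-sample data behind them.
NON_HOST_PCT = {
    "CAD (GSE58150)": {"relaxed": 6.14, "strict": 7.25},
    "TB (GSE255073)": {"relaxed": 8.20, "strict": 21.76},
}


def stringency_shift(relaxed, strict):
    """Return (difference in percentage points, strict/relaxed ratio)."""
    return strict - relaxed, strict / relaxed


if __name__ == "__main__":
    for cohort, pct in NON_HOST_PCT.items():
        diff, ratio = stringency_shift(pct["relaxed"], pct["strict"])
        print(f"{cohort}: {diff:+.2f} percentage points, {ratio:.2f}x")
```

With the table values, the shift is about 1.11 percentage points (1.18x) for CAD and about 13.56 percentage points (2.65x) for TB, which matches the direction stated in the heading of Section 3.1.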
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Wang, Z.; Chen, G.; Yang, M.; Wang, S.; Fang, J.; Shi, C.; Gu, Y.; Ning, Z. Host-Filtered Blood Nucleic Acids for Pathogen Detection: Shared Background, Sparse Signal, and Methodological Limits. Pathogens 2026, 15, 55. https://doi.org/10.3390/pathogens15010055
