Cell-Free Total Nucleic Acid-Based Genotyping of Aggressive Lymphoma: Comprehensive Analysis of Gene Fusions and Nucleotide Variants by Next-Generation Sequencing

Simple Summary
This study aimed to simultaneously demonstrate pathogenic chromosomal translocations and point mutations from both tissue biopsy and peripheral blood (PB) liquid biopsy (LB) samples of aggressive lymphoma patients. Matched samples were analyzed by next-generation sequencing for the same 125 genes. Eight different gene fusions, including the classical BCL2, BCL6, and MYC genes were detected in the corresponding samples with generally good agreement. Besides, mutations of 29 commonly affected genes, such as BCL2, MYD88, NOTCH2, EZH2, and CD79B could be identified in the matched samples at a rate of 16/24 (66.7%). Our prospective study demonstrates a non-invasive approach to identify frequent gene fusions and variants in aggressive lymphomas. In conclusion, PB LB sampling substantially supports the oncogenetic diagnostics of lymphomas, especially at anatomically critical sites (such as the central nervous system).

Abstract
Chromosomal translocations and pathogenic nucleotide variants both gained special clinical importance in lymphoma diagnostics. Non-invasive genotyping from peripheral blood (PB) circulating free nucleic acid has been effectively used to demonstrate cancer-related nucleotide variants, while gene fusions were not covered in the past. Our prospective study aimed to isolate and quantify PB cell-free total nucleic acid (cfTNA) from patients diagnosed with aggressive lymphoma and to compare with tumor-derived RNA (tdRNA) from the tissue sample of the same patients for both gene fusion and nucleotide variant testing. Matched samples from 24 patients were analyzed by next-generation sequencing following anchored multiplexed polymerase chain reaction (AMP) for 125 gene regions. Eight different gene fusions, including the classical BCL2, BCL6, and MYC genes, were detected in the corresponding tissue biopsy and cfTNA specimens with generally good agreement. Synchronous BCL2 and MYC translocations in double-hit high-grade B-cell lymphomas were obvious from cfTNA. Besides, mutations of 29 commonly affected genes, such as BCL2, MYD88, NOTCH2, EZH2, and CD79B, could be identified in matched cfTNA, and previously described pathogenic variants were detected in 16/24 cases (66.7%). In 3/24 cases (12.5%), only the PB sample was informative. Our prospective study demonstrates a non-invasive approach to identify frequent gene fusions and variants in aggressive lymphomas. cfTNA was found to be a high-value representative reflecting the complexity of the lymphoma aberration landscape.


Introduction
The peripheral blood (PB) of cancer patients contains variable amounts of tumor-derived components, including circulating tumor cells (CTCs), cell-free DNA (cfDNA) and cell-free RNAs (cfRNAs) released from tumor foci of any anatomical location. While the frequency of viable tumor cells proved to be limited and highly inconsistent, plasma cfDNA and cfRNA fractions are now considered potential resources for the real-time genetic assessment of the malignant processes. The principle of non-invasive liquid biopsy (LB) has been successfully transferred to clinical diagnostics and the disease monitoring of solid tumors and aggressive lymphomas [1-8]. These studies provided two main insights: variant allele frequencies often correlate with disease status, and surveilling cfDNA to quantify minimal residual disease might outperform positron emission tomography/computed tomography (PET/CT) scans in terms of sensitivity, a finding that holds great potential for relapse risk assessment. Cell-free nucleic acids circulate at low quantities in the PB; consequently, identification of genetic aberrations requires high-sensitivity and high-throughput techniques, such as next-generation sequencing (NGS). cfDNA analysis has tremendous clinical potential, especially for patients with lesions that are difficult to access for biopsy sampling (e.g., brain; deep thoracic or abdominal localisations).
Nevertheless, the LB approach has not been investigated in every detail [9]. Traditionally, structural variants such as chromosomal rearrangements resulting in gene fusions are detected by chromosome karyotyping or fluorescence in situ hybridization (FISH), or more recently, by PCR-based methods (IGH-CCND1, IGH-BCL2) following reverse transcription from tumor-derived RNA. As the number of actionable targets grows, the demand for rapid testing from any available specimen, such as the PB, also increases.
Recent studies on the utility of an RNA NGS-based assay for lymphoma genotyping using cell-free total nucleic acid (cfTNA) and matched tumor-derived RNA (tdRNA) substrate reported the potential to detect multiple clinically relevant fusion transcripts simultaneously to identify genomic translocations [10,11]. Anchored multiplexed PCR (AMP) was found to be particularly effective for gene fusion detection, and fusions can be identified even without prior knowledge of fusion partners or breakpoints. This technique applies primers, specific for the important genes involved in lymphoma progression, which connect upstream or downstream of an exon-intron boundary and which hybridize to the sequencing adapter [10,12]. Primers are designed for the proximal region of exon-exon junctions involved in the fusions; thus, rearrangements that fall outside the transcribed region of the genes will not be covered. Additional coverage is provided for some targets using supplemental primers.
Our prospective study aimed to demonstrate the utility of the AMP-based NGS technology for cfTNA genotyping in aggressive lymphoma. PB samples were collected and cfTNA was isolated and quantified from newly diagnosed patients. For comparison, tdRNA from tissue samples of the same patients was isolated. Gene fusions and RNA variants from the two matched sample types (cfRNA and tdRNA) were identified and correlated. For this purpose, a gene panel targeting 125 genes commonly involved in lymphoid malignancies (Archer FusionPlex on the MiSeq platform) was used.

Study Cases and Samples
Lymphoma patients were diagnosed and treated at the Department of Hematology, University of Debrecen, between November 2019 and November 2020. The major selection criteria were (1) clinically/histologically aggressive features and (2) parallel samples available from both neoplastic tissue and peripheral blood for nucleic acid isolation. Formalin-fixed paraffin-embedded (FFPE) tissue samples were collected at the Department of Pathology, University of Debrecen, from a total of 24 patients diagnosed with nodal diffuse large B-cell lymphoma (DLBCL, 7 cases), non-nodal DLBCL (9 cases), primary central nervous system lymphoma (PCNSL, 3 cases), follicular lymphoma (FL, 3 cases), Burkitt lymphoma (BL, one case) and high-grade peripheral T-cell lymphoma (PTCL, one case). Follicular lymphomas were all diagnosed as grade 3A, which was considered an aggressive-type B-cell lymphoma with an increased cell proliferation rate. Neither peripheral blood nor leukemic involvement was observed in any of the cases. All tissue samples were taken from the primary lymphoma sites at initial diagnosis. PB samples from the same patients were collected right after diagnosis for genetic analysis. Sampling was agreed upon and supported by written consent. All protocols were approved by the authors' respective Institutional Review Board for human subjects (IRB reference number: 4941/2018). This study was conducted according to the Declaration of Helsinki.

Fluorescence In Situ Hybridization
Fluorescence in situ hybridization (FISH) was performed using MYC, BCL2 and BCL6 break apart probes to detect gene translocation on FFPE samples according to the manufacturer's protocol (Metasystems, Altlussheim, Germany).

Tumor and Cell-Free Nucleic Acid Isolation
H&E-stained slides were selected for molecular analysis with a >20% tumor percentage. Genomic tdRNA was extracted from FFPE tissues using ReliaPrep FFPE Total RNA Miniprep System (Promega, Madison, WI, USA) according to the manufacturer's instructions.
Blood samples were taken in EDTA anticoagulant tubes and were centrifuged at 3000 × g for 10 min. Then, 5 ± 0.1 mL plasma was spun down (16,000 × g, 10 min) to eliminate cell residues. cfTNA was extracted from PB plasma into 30 µL elution buffer using the QIAamp Circulating Nucleic Acid Kit (Qiagen, Hilden, Germany). cfDNA, tdRNA, and cfRNA concentrations were measured by the Qubit dsDNA HS Assay Kit using a Qubit 4.0 Fluorometer (Thermo Fisher Scientific, Waltham, MA, USA).
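As a minimal sketch of how an eluate reading could be normalized to the input plasma volume, the following Python function converts a concentration in ng/µL to ng per mL plasma. The normalization formula itself is an assumption for illustration (the text reports only the 30 µL elution volume and the roughly 5 mL plasma input), and the reading used in the example is hypothetical.

```python
def yield_per_ml_plasma(conc_ng_per_ul: float,
                        elution_volume_ul: float = 30.0,
                        plasma_volume_ml: float = 5.0) -> float:
    """Convert an eluate concentration (ng/uL) to ng per mL of input plasma.

    Defaults follow the 30 uL elution and 5 mL plasma stated in the text;
    the normalization itself is an assumed, illustrative formula.
    """
    if plasma_volume_ml <= 0:
        raise ValueError("plasma volume must be positive")
    total_ng = conc_ng_per_ul * elution_volume_ul  # total mass in the eluate
    return total_ng / plasma_volume_ml             # mass per mL of plasma


# Hypothetical fluorometer reading of 0.5 ng/uL (not a value from the study)
example_yield = yield_per_ml_plasma(0.5)
print(f"{example_yield:.1f} ng/mL plasma")
```

The same function can be applied to pg-scale RNA readings by keeping the mass unit consistent on both sides.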

Next-Generation Sequencing (NGS)
For NGS library preparation, the Archer FusionPlex Lymphoma gene panel (Archer DX, Boulder, CO, USA) was used. Anchored primers were applied for the known translocation partners and reverse primers to hybridize with the sequencing adapters to identify breakpoints and partners [10,11]. A total of 100-250 ng of tdRNA or the matched cfTNA was loaded into the assay. After first-strand cDNA synthesis, a quantitative RT-PCR Pre-seq QC was performed to define the yield of intact RNA in the samples [11]. The final libraries were quantified using a KAPA library quantification kit (Roche, Basel, Switzerland), diluted to a final concentration of 4 nM, and pooled by equal molarity.
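The dilution of a quantified library to the 4 nM working concentration follows the standard relation C1 × V1 = C2 × V2. The sketch below illustrates this arithmetic; the stock concentration and final volume are hypothetical values chosen for the example and are not taken from the study.

```python
def dilution_volumes(stock_nm: float, target_nm: float, final_volume_ul: float):
    """Return (library_ul, diluent_ul) needed to reach target_nm, using C1*V1 = C2*V2."""
    if stock_nm <= 0 or target_nm <= 0 or final_volume_ul <= 0:
        raise ValueError("concentrations and volume must be positive")
    if stock_nm < target_nm:
        raise ValueError("stock is already below the target concentration")
    library_ul = target_nm * final_volume_ul / stock_nm
    return library_ul, final_volume_ul - library_ul


# Hypothetical library quantified at 20 nM, diluted to 4 nM in 10 uL
lib_ul, diluent_ul = dilution_volumes(stock_nm=20.0, target_nm=4.0, final_volume_ul=10.0)
print(f"library: {lib_ul:.1f} uL, diluent: {diluent_ul:.1f} uL")
```

Once every library is at the same molar concentration, combining equal volumes yields an equimolar pool.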
For sequencing on the MiSeq System (MiSeq Reagent kit v3, 600 cycles), all libraries were denatured by adding 0.2 N NaOH and diluted to 40 pM with hybridization buffer from Illumina (San Diego, CA, USA). The final loading concentration was 10 pM libraries and 1% PhiX. Sequencing was conducted according to the MiSeq instruction manual. Captured libraries were sequenced in a multiplexed fashion with a paired-end run to obtain 2 × 150 bp reads with at least 250× depth of coverage. Trimmed fastq files were generated using MiSeq Reporter (Illumina, San Diego, CA, USA) and were analyzed with Archer Analysis software (version 6.2; Archer DX, Boulder, CO, USA). The human reference genome GRCh37 (equivalent to UCSC version hg19) was used for the alignment. Molecular barcode (MBC) adapters were used to count unique molecules and to characterize sequencer noise, revealing mutations below standard NGS-based detection thresholds. The sequence quality of each sample was assessed, and the variant allele frequency (VAF) cutoff was set to 5% (2% in cfRNA samples). Translocations were called when more than five reads supported the fusion sequence and these reads comprised at least 10% of the total reads from gene-specific primers. Gene fusion frequency was calculated as the ratio of fusion transcript reads to total reads.
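The calling thresholds described above can be sketched as simple filters. This is an illustrative reading of the stated cutoffs and not the logic of the Archer software: the function names, the sample-type labels, and the treatment of the boundaries (VAF at or above the cutoff passes; strictly more than five fusion reads are required) are assumptions.

```python
# Cutoffs as stated in the text; dictionary keys are hypothetical labels.
VAF_CUTOFF = {"tissue": 0.05, "plasma": 0.02}
MIN_FUSION_READS = 5        # a fusion needs strictly more than this many reads (assumed)
MIN_FUSION_FRACTION = 0.10  # share of total reads from the gene-specific primers


def vaf(alt_reads: int, total_reads: int) -> float:
    """Variant allele frequency: variant-supporting reads over total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def passes_vaf(alt_reads: int, total_reads: int, sample_type: str) -> bool:
    """Apply the sample-type-specific VAF cutoff (5% tissue, 2% plasma)."""
    return vaf(alt_reads, total_reads) >= VAF_CUTOFF[sample_type]


def fusion_fraction(fusion_reads: int, gsp_total_reads: int) -> float:
    """Fusion frequency: fusion transcript reads over total reads."""
    if gsp_total_reads <= 0:
        raise ValueError("gsp_total_reads must be positive")
    return fusion_reads / gsp_total_reads


def call_fusion(fusion_reads: int, gsp_total_reads: int) -> bool:
    """Call a fusion only if both the read-count and fraction criteria hold."""
    return (fusion_reads > MIN_FUSION_READS
            and fusion_fraction(fusion_reads, gsp_total_reads) >= MIN_FUSION_FRACTION)


# Hypothetical read counts, for illustration only
print(passes_vaf(3, 100, "plasma"), passes_vaf(3, 100, "tissue"))
print(call_fusion(6, 40), call_fusion(6, 100))
```

With these hypothetical counts, a 3% variant passes only the plasma cutoff, and six fusion reads are called only when they make up at least 10% of the primer-derived reads.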
The results were described using the latest version of the Human Genome Variation Society nomenclature for either the nucleotide or protein level. Individual gene variants were cross-checked in the COSMIC (Catalogue of Somatic Mutations in Cancer) and ClinVar databases for clinical relevance. We used the gnomAD v.2.1.1 population database to compare the significance of each gene alteration that is included in our Archer NGS analysis system.

Patients and Samples
Samples of a total of 24 patients (male/female ratio 13/11) were included in this prospective study. The average age was 57 years, ranging from 31 to 87 years. In total, 45 samples were analyzed, as 21 patients had matched tissue and LB specimens, while in three cases only LB was studied because of insufficient tdRNA yield from tissue needle aspiration (Cases 2, 8, and 11). The study workflow is illustrated in a flow chart (Figure 1).

Histological Features Including Immunohistochemistry and FISH
Histopathological, IHC, and FISH features are summarized in Table 1. DLBCL with non-nodal origin was overrepresented in our series (12 cases), including lung and central nervous system manifestations. The cell-of-origin classification resulted in 4 germinal center B cell-like (GCB) and 15 non-GCB phenotypes. Two double-hit high-grade B-cell lymphoma cases were also included, characterized by simultaneous MYC and BCL2 translocations that were verified using FISH (Cases 18 and 19). H&E-stained slides, MYC and BCL2 IHC, and FISH records of one of the double-hit cases (Case 18) are presented in Figure 2. Further, two cases with BCL2 and three cases with MYC alterations were included (in Case 3, the MYC translocation was detected by NGS only).

cfTNA Concentrations and Pre-Seq QC Assay
The processing of the LB samples resulted in good nucleic acid qualities with a mean cfDNA concentration of 9 ng/mL plasma (range: 1.5-24.6) and mean total cfRNA concentration of 541.8 pg/mL plasma with high yield variability (range: 3.75-1836).
Samples with a Pre-Seq QC value higher than 31 Cq were considered failed. Both fusion transcripts and gene variants detected in tdRNA and matched cfTNA were carefully evaluated and compared.

Gene Fusions Detected by NGS
Gene fusions identified in the tissue biopsy-derived tdRNA were generally in good agreement with the results obtained from the matched LB-derived cfTNA samples (Figure 3). Detected gene fusions are presented in Table 1. BCL2/IGH translocations were detected in all four LB samples derived from FISH and tissue NGS positive cases, and, similarly, the MYC translocation was identified in both sample types in six cases. In a further two cases, MYC fusions could be demonstrated from the cfTNA samples, while tissue biopsies failed due to technical reasons (Cases 8 and 11). A classical BCL6/IGH fusion was identified from another cfTNA in Case 9. Other gene fusions, such as CCND3/CCND1 and DLEU1/DLEU2, were recovered with 100% concordance from plasma-derived cfTNA. The translocations B2M/TNFR (Case 3), NSL1/BATF3, and P2RY8/ASMTL (both in Case 14) were not represented in the plasma at the time of the sampling, and the frequency of these alterations in tissue was 19%, 31%, and 12%, respectively.
Table 1. Gene fusion frequency was calculated as the ratio of fusion transcript reads to total reads. In cases with MYC translocation, the applied NGS panel could not exactly distinguish immunoglobulin genes; therefore, the fusion partner (encoding one of the Ig chains) was not given.

NGS-Based Mutation Profiling
In parallel with gene fusion detection, the RNA-based technology allowed us to capture single nucleotide variants (SNVs) from the same samples (Table 2). SNVs were identified in 23/24 patients (95.8%), and only one case (Case 24) remained free of nucleotide aberration by our method. In three cases, the LB sample was the only informative source for genetic analysis (tissue biopsy insufficient for molecular analysis; Cases 2, 8, and 11).
The variant allele frequencies (VAF) in tissue biopsy and LB samples were highly variable, with means of 40.0% (range: 2.0-88.8) and 24.6% (range: 2.0-96.0), respectively. Pathogenic mutations were detected at a rate of 16/24 (66.7%). Pathogenic variants were obvious in some of the most commonly affected genes in lymphomas, such as BCL2, MYD88, NOTCH2, EZH2, and CD79B, and most of them could be identified in matched LB-originated cfRNA. Some of the SNVs found in tdRNA were not found in cfRNA, and conversely, some gene variants were detected only in cfRNA (Cases 13, 20, and 23).
The number, type, and allele frequencies of SNVs detected were variable. In general, two or three nucleotide changes were detected per case. The highest number of nucleotide aberrations demonstrated was 5 (Case 15), while only one case remained negative for SNVs (Case 24). Variants were further categorized according to the clinical significance defined by the COSMIC database. Pathogenic SNVs were referred to as mutations (n = 22 in the 24 patients), while benign alterations were considered neutral (n = 5). Uncertain nucleotide changes (n = 34) were also frequently found. Exact genotypes (SNVs), VAF, and clinical significance from tissue biopsy and matched liquid biopsy samples are compared in Table 2.
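The comparison of variants between matched tissue and plasma samples amounts to partitioning two variant sets into shared, tissue-only, and plasma-only groups. A minimal sketch follows; the function name and the variant labels are hypothetical placeholders, not calls from Table 2.

```python
from typing import Dict, Set


def compare_variants(tissue: Set[str], plasma: Set[str]) -> Dict[str, Set[str]]:
    """Partition variant identifiers into shared, tissue-only and plasma-only sets."""
    return {
        "shared": tissue & plasma,       # detected in both sample types
        "tissue_only": tissue - plasma,  # missed in the liquid biopsy
        "plasma_only": plasma - tissue,  # seen only in the liquid biopsy
    }


# Hypothetical variant labels for one patient (placeholders, not study data)
tissue_calls = {"GENE_A:c.100A>T", "GENE_B:c.200C>G"}
plasma_calls = {"GENE_A:c.100A>T", "GENE_C:c.300G>A"}

result = compare_variants(tissue_calls, plasma_calls)
for group, variants in result.items():
    print(group, sorted(variants))
```

Variants are matched here by exact string identity, so both sample types would need to be annotated with the same nomenclature before comparison.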

Discussion
Due to the continuous diversification of lymphoma classes and the growing therapeutic opportunities, precise pathological and molecular testing is required. Lymphoma genotyping is currently mainly possible following invasive tissue biopsy sampling. However, tissue procurement and subtyping might be complicated or inconclusive due to limitations in sample size, partial involvement, or anatomically difficult sites. Moreover, repeated sampling is increasingly required to follow signs of progression.
In the era of precision oncology, therapies based on molecular genetic findings are increasingly applied. The common NGS platforms usually focus on the detection of SNVs and small insertions and deletions (indels) [16,17]. However, the efficient analysis of larger indels and structural variants such as gene fusions is also evolving. Recurring translocations are disease-specific, and the identification of gene fusions is an increasingly important component of lymphoma diagnostics as well [18]. Chromosomal breakpoints or aberrant protein expression may be covered by diverse, widely used clinical approaches, including fluorescence in situ hybridization (FISH), immunohistochemistry (IHC), or reverse-transcription PCR (RT-PCR). Neither FISH nor IHC identifies the fusion partner or the breakpoint precisely, and RT-PCR also requires knowledge of potential fusion partners.
The urgent clinical need motivated the release of fast and focused gene rearrangement NGS assays. Different platforms including target enrichment for NGS have been published and revised for this purpose [19]. One of the methods of target enrichment is hybridization capture, which is highly versatile, covering from hundreds of genes up to the entire human genome [20], but requires long hybridization times, a high yield of starting nucleic acids, and specialized design, synthesis, and optimization. A major disadvantage of this method is the lack of unique sequencing start sites, which may result in systematic errors at multiple levels [10].
The proportion of cell-free nucleic acid may be extremely low in the PB-derived plasma sample, which is a unique challenge for application workflows and analysis tools, especially for gene fusion detection. Sufficient read coverage is essential to detect structural variations in PB plasma.
Several bioinformatics approaches were reported to identify gene rearrangements following sequencing, using discordant reads and/or split reads. Existing tools such as Breakdancer [21] use discordant mappings, while others such as Socrates [22] and SViCT [23] use split reads as well. The latter software combines these two approaches. The effectiveness of cell-free nucleic acid mappings was evaluated in detail for solid tumors [23] but not in lymphomas so far. In our study, we primarily used a commercially available analysis tool (Archer Analysis software, version 6.2) for fusion detection in lymphoma cfRNA.
According to our experience, the AMP target enrichment platform appeared to be a fast and effective way to simultaneously detect gene translocations and nucleotide variants [10,11]. This method provides increased confidence not only by determining the gene fusions but also by confirming that the fusion is present in transcribed mRNA. We have demonstrated its real-life utility for the detection of gene translocations and point mutations from low amounts of FFPE-derived tdRNA and also plasma-derived cfTNA samples. Genetic subtypes of aggressive lymphomas with distinct genotypic characteristics could be identified in generally good agreement with lymphoma tissue-based results. Moreover, plasma LB genotyping also allowed for the recovery of fused genes and/or nucleotide variants that were not detected in the biopsy sample for any reason. In addition to technical problems at the tissue level, spatial tumor heterogeneity may significantly contribute to differences in the genetic profile seen in cfTNA samples [3]. Conversely, plasma cfTNA may underrepresent focal aberrations due to limited release or to minor subclones.
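The two evidence types mentioned above can be illustrated with a simplified sketch: a read pair is treated as discordant when its mates map to different chromosomes or implausibly far apart, and a read is treated as a split-read candidate when its CIGAR string contains a long soft-clipped segment. This is not how Breakdancer, Socrates, SViCT, or the Archer software is implemented; the record type, function names, and both thresholds are assumptions chosen for illustration, and real tools additionally consider read orientation, mapping quality, and supplementary alignments.

```python
import re
from dataclasses import dataclass
from typing import List

MAX_INSERT = 1000  # assumed maximum plausible mate distance (bp)
MIN_CLIP = 20      # assumed minimum soft-clip length for a split-read candidate


@dataclass
class Alignment:
    """Simplified alignment record (hypothetical, not a real library type)."""
    chrom: str
    pos: int
    mate_chrom: str
    mate_pos: int
    cigar: str


def is_discordant(aln: Alignment) -> bool:
    """Mates on different chromosomes, or farther apart than MAX_INSERT."""
    return aln.chrom != aln.mate_chrom or abs(aln.mate_pos - aln.pos) > MAX_INSERT


def soft_clip_lengths(cigar: str) -> List[int]:
    """Lengths of all soft-clipped (S) segments in a CIGAR string."""
    return [int(n) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar) if op == "S"]


def is_split_candidate(aln: Alignment) -> bool:
    """A long soft clip suggests the read may span a breakpoint."""
    return any(n >= MIN_CLIP for n in soft_clip_lengths(aln.cigar))


# Hypothetical records: an inter-chromosomal pair and a soft-clipped read
pair = Alignment("chr8", 1000, "chr14", 5000, "150M")
clipped = Alignment("chr8", 1000, "chr8", 1200, "90M60S")
print(is_discordant(pair), is_split_candidate(pair))
print(is_discordant(clipped), is_split_candidate(clipped))
```

Combining both signals, as the text describes for SViCT, would mean requiring or weighting support from discordant pairs and split reads at the same candidate breakpoint.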
The pathogenic aberrations detected in this series of cases could generally be associated with aggressive phenotype and poor prognosis. According to the COSMIC database, the STAT6 pathogenic mutations (c.1249A > T; p.Asn417Tyr in cases 11, 15 and c.1256A > G; p.Asp419Gly in cases 1, 5, 21) correlate with DLBCL and FL progression, as well.
Aberrations in PAICS, which encodes an enzyme involved in nucleotide biosynthesis, were explored in correlation with poor prognosis in DLBCL patients [24]. In our study, DLBCL cases and one of the FL grade 3A cases presented with PAICS alterations, which were considered SNPs after comparison with the NCBI dbSNP database. Mutations of CD79B (Case 1), encoding the B lymphocyte antigen receptor Ig-β component, and of MYD88 (Cases 6, 16, and 17) are well-known alterations in B-lymphoid malignancies, including PCNSL and leg-type cutaneous DLBCL. Another significant gene is EZH2 (Cases 12, 19, and 22), which participates in histone methylation and transcriptional repression and which gained interest as an important therapeutic target in FL [25]. Aberrations of the proto-oncogene serine/threonine-protein kinase PIM1 (Cases 16 and 17) and of the transmembrane protein NOTCH2 (Case 8) are also characteristic of DLBCL (COSMIC). XPO1 (encoding the exportin protein) involvement (Case 16) was demonstrated in primary mediastinal B-cell lymphoma (PMBL) and classical Hodgkin lymphoma (cHL) [26]. In contrast, the JAK2 gene, best known as a driver in myeloproliferative neoplasms [27], presented with the alteration c.1177C > G; p.Leu393Val (Case 15), which was rather considered a non-pathogenic SNP according to NCBI dbSNP search results.
In the PTCL case, only a CCND3/CCND1 fusion was detected and no SNVs were found, although the most frequent PTCL-related genes (e.g., DNMT3A, and IDH2 as well as a new highly prevalent RHOA) were all covered by our NGS panel.
Representative PB samples are of special value in lymphomas developing at critical anatomical localizations, e.g., primary CNS lymphoma. In the present series, we were able to demonstrate lymphoma-related translocations and SNVs from the same LB samples of PCNSL patients (Cases 15-17). Mutation of the MYD88 gene has been reported in extranodal DLBCL with a high frequency, including PCNSL and leg-type cutaneous DLBCL (Cases 6, 16, and 17). Although these results appear to be promising, we also noted that the yields of cfTNA isolated from the plasma of these patients were generally low (mean cfDNA concentration: 1.5 ng/mL plasma, range: 1.1-1.9; mean total cfRNA concentration: 204 pg/mL plasma, range: 6.18-330). Future large-scale evaluation of LB results is required to demonstrate the exact clinical utility of the method for PCNS-DLBCL diagnostics.

Conclusions
Our prospective study demonstrates a novel non-invasive approach to analyze frequent gene fusions and variants in aggressive lymphomas in one session. Moreover, the approach provided new information in addition to the tissue-derived NGS data and reflected an extended landscape of gene aberrations. Standardized clinical applications using cell-free nucleic acids potentially reflect the spatial tumor heterogeneity and provide novel aspects for the precision treatment of aggressive lymphomas. PB LB sampling may substantially support the diagnostics of processes at anatomically critical sites (such as the CNS) at minimal procedural risk.
Author Contributions: A.M.: conceptualization, methodology, writing-original draft preparation, visualization; R.B.: investigation, data curation, project administration; L.G.: conceptualization, formal analysis, investigation, resources, data curation, writing-review and editing, supervision, funding acquisition; G.M.: conceptualization, histopathology, immunohistochemistry, resources, writing-review and editing, visualization, supervision, funding acquisition. All authors have read and agreed to the published version of the manuscript.