Identification of Somatic Structural Variants in Solid Tumors by Optical Genome Mapping

Genomic structural variants comprise a significant fraction of somatic mutations driving cancer onset and progression. However, such variants are not readily revealed by standard next-generation sequencing. Optical genome mapping (OGM) surpasses short-read sequencing in detecting large (>500 bp) and complex structural variants (SVs) but requires isolation of ultra-high-molecular-weight DNA from the tissue of interest. We have successfully applied a protocol involving a paramagnetic nanobind disc to a wide range of solid tumors. Using as little as 6.5 mg of input tumor tissue, we show successful extraction of high-molecular-weight genomic DNA that provides a high genomic map rate and effective coverage by optical mapping. We demonstrate the system’s utility in identifying somatic SVs affecting functional and cancer-related genes for each sample. Duplicate/triplicate analysis of select samples shows intra-sample reliability but also intra-sample heterogeneity. We also demonstrate that simply filtering SVs based on a GRCh38 human control database provides high positive and negative predictive values for true somatic variants. Our results indicate that the solid tissue DNA extraction protocol, OGM and SV analysis can be applied to a wide variety of solid tumors to capture SVs across the entire genome with functional importance in cancer prognosis and treatment.


Introduction
One of the hallmarks of cancer is genomic instability, which often affects genes controlling cell division and genome integrity. The resulting alterations include single-nucleotide variant (SNV) point mutations as well as structural variants (SVs), in which larger DNA segments undergo chromosomal perturbations such as deletions, insertions, duplications, inversions, and translocations. For instance, recurrent translocations, such as the Philadelphia chromosome, can activate oncogenes but at the same time reveal avenues for implementing or developing effective targeted drug therapies [1][2][3][4]. Likewise, SV identification plays an increasingly important role in cancer diagnosis and prognosis [5,6], and SVs have been shown to play a crucial role in intra-tumoral genetic heterogeneity [7]. Therefore, SV identification and analysis are important to understanding oncogenesis and tumor behavior.
Short-read sequencing can readily detect many SNVs but is less successful in detecting SVs, by either alignment-based or assembly-based methods [8]. Since alignment-based approaches rely on mapping reads to unique positions, repetitive and low-complexity genomic regions can lead to misalignment and false-positive SV calls. Additionally, in assembly-based approaches, homologous alleles may be incorrectly combined, yielding haploid assemblies that represent only a single allele or chimeric assemblies that mix alleles. Whole-genome and cytogenetic approaches such as whole-genome sequencing (WGS), karyotyping, fluorescence in situ hybridization (FISH) and copy number variant (CNV) microarrays also have significant limitations. Karyotyping provides a comprehensive view of the entire genome but has a limited resolution of 5 Mb and in most cases requires culturing cells before preparing chromosomes. FISH has a higher resolution but requires prior knowledge of which loci to test and has limited throughput. CNV microarrays offer a resolution down to several kb but are insensitive to balanced chromosomal aberrations such as translocations and inversions, are unable to detect low-frequency allelic changes, and cannot distinguish tandem duplications from insertions in trans. Finally, WGS has difficulty with de novo genome assembly and with resolving duplications and repeated sequences [8][9][10]. Therefore, alternative methods are required to preserve long-range genomic structural information.
Optical genome mapping (OGM) has emerged as a viable option for analyzing large genomes for SVs. OGM preserves long-range information by imaging entire intact molecules of DNA in their native state and, as a result, has contributed to constructing reference genome assemblies, including those for maize, mouse, goat, and humans [11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27][28]. OGM can detect large (>500 bp) and complex SVs, such as chromothripsis, that are difficult to detect using traditional short-read sequencing alone. The OGM preparation and analysis workflow has been successfully applied to SV analyses of liquid-phase tumors and cell cultures. For instance, investigators have analyzed primary leukemic cells with OGM to identify previously unrecognized SVs implicated in oncogenesis and patients' survival and have combined OGM with chromosome conformation capture to demonstrate enhancer hijacking resulting from SVs [5,29,30]. Similarly, investigators used OGM to visualize complex gene fusions and novel somatic SVs in liposarcoma, melanoma and other well-studied cancer cell lines [31,32].
Despite its success in visualizing SVs in liquid tumors and cell lines, OGM has not yet seen widespread application in solid tissue tumors, due primarily to the difficulty of obtaining high-quality, high-molecular-weight DNA from solid tumor samples. Nonetheless, previous work has shown the feasibility of high-quality high-molecular-weight DNA isolation and analysis using earlier workflow iterations [33], and recent feasibility studies have shown the importance of OGM application to solid tumor analysis [7,34,35]. Peng et al. demonstrated large SVs not detected by WGS implicated in metastatic lung squamous cell carcinoma [7], and Jaratlerdiri et al. and Crumbaker et al. similarly found SVs impacting oncogenic and tumor-suppressing genes not identified by NGS or WGS alone in prostate cancer [34,35]. However, these previous methods for extracting gDNA from solid tissue were either prohibitively expensive or yielded low quantities of DNA [36]. We demonstrate here the successful implementation of a workflow to generate ultra-high-molecular-weight gDNA and subsequent SV analysis for 20 solid tumor samples comprising a wide variety of solid tissue organ systems.

Tumor Samples
Solid tissue was collected following surgical resection for 10 tumors: four squamous cell carcinomas of the tongue, three anaplastic carcinomas of the thyroid, one liver hepatocellular carcinoma, one lung pleomorphic carcinoma, and one bladder tumor. Patients consented under protocols approved by the Penn State Health Institutional Review Board, and tissue was flash frozen and stored at −80 °C in the Penn State Institute for Personalized Medicine (IPM). Ten additional fresh frozen solid tumor samples were acquired from BioIVT for the following tumor types: lung adenosquamous carcinoma, liver hepatocellular carcinoma, bladder papillary urothelial carcinoma, kidney renal cell carcinoma, breast ductal carcinoma in situ, prostate invasive adenocarcinoma, brain anaplastic astrocytoma, ovarian serous carcinoma, colon adenocarcinoma, and papillary thyroid carcinoma. For some of the samples, two or three separate sections of the tumor were excised and processed independently to provide duplicate or triplicate biological replicates.

Bionano Optical Genome Mapping
Ultra-High-Molecular-Weight gDNA Isolation from Solid Tissue. The following protocol is diagrammed in Figure 1 and described in greater detail in a support document from Bionano Genomics (https://bionanogenomics.com/support-page/sp-tissue-and-tumordna-isolation-kit/). Briefly, tissue sections with a target mass of 10 mg were sliced from a frozen parent piece on a sterilized aluminum block over dry ice. The tissues were minced briefly and placed into a 15 mL conical tube on ice containing homogenization buffer (HB) for subsequent blending with a TissueRuptor II (Qiagen). Following tissue disruption, samples were washed in additional HB, poured through a 40 µm filter, and centrifuged to pellets, from which the supernatants were decanted. Pellets were resuspended in Wash Buffer A (Bionano, San Diego, CA, USA) and transferred to microcentrifuge tubes for additional washing. Supernatants were then decanted, and pellets resuspended in residual volume. Proteinase K (Bionano Genomics, San Diego, CA, USA) was added to samples, followed by Lysis and Binding Buffer (LBB, Bionano Genomics, San Diego, CA, USA) and mixed to produce a lysate containing high-molecular-weight DNA. Phenylmethylsulfonyl Fluoride Solution (PMSF, Millipore Sigma) was added to inactivate Proteinase K, followed by Salting Buffer (SB, Bionano Genomics, San Diego, CA, USA).
A single paramagnetic Nanobind Disc (Bionano Genomics, San Diego, CA, USA) was added to the lysate with 100% isopropanol, to facilitate binding and washing of gDNA strands. With gDNA captured on the disc, the supernatants were carefully removed and discs were washed with rounds of ethanol-based wash buffer. Discs were then transferred to clean tubes, where gDNA was eluted in buffer and homogenized at room temperature.
Ultra-High-Molecular-Weight gDNA Isolation from Blood. Previously frozen EDTA-stabilized blood aliquots were thawed, inverted to mix, and measured for white blood cell counts (HemoCue WBC, Brea, CA, USA). Blood volumes corresponding to 1.5 × 10^6 cells were transferred to microcentrifuge tubes, then spun to obtain cell pellets. After removing supernatants, pellets were resuspended in 40 µL Stabilizing Buffer and 50 µL Proteinase K (Bionano Genomics, San Diego, CA, USA). Lysis and Binding Buffer (LBB, Bionano Genomics, San Diego, CA, USA) was then added and mixed to produce a lysate, after which isolation of DNA was performed essentially as described above for tumor tissue.
Direct Label and Staining (DLS). For both tumor- and blood-derived samples, gDNA was labeled in Direct Label and Stain reactions, in which fluorescent labels are enzymatically conjugated to a six-base pair recognition sequence followed by DNA counterstaining. Briefly, 750 ng gDNA was diluted and mixed with a labeling master mix containing DLE-1 Enzyme and DL-Green (Bionano Genomics, San Diego, CA, USA). Reactions were shielded from light and incubated at 37 °C for 2 h. A Proteinase K solution then inactivated the enzyme, and successive membrane adsorption steps were used for cleanup. A portion of each sample was then carried forward into a staining master mix addition, slowly homogenized, and incubated overnight at room temperature.
The DNA concentration of each labeled sample was confirmed to be within 4-12 ng/µL by High-Sensitivity dsDNA Qubit Assay. Samples were then loaded onto a Bionano Saphyr® Chip (Bionano Genomics, San Diego, CA, USA, Part #20366) and run on the Bionano Saphyr® instrument, targeting approximately 300× human genome coverage.

Bionano Access and Solve Pipeline
Genome analysis was performed using Rare Variant Analysis in Bionano Access 1.6 and Bionano Solve 3.6, which captures somatic SVs occurring at low allelic fractions. Briefly, molecules of a given sample dataset were first aligned against the public Genome Reference Consortium GRCh38 human assembly. SVs were identified based on discrepant alignment between sample molecules and GRCh38, with no assumptions about ploidy. Consensus genome maps (*.cmaps) were then assembled from clustered sets of at least three molecules that identify the same variant. Finally, the genome maps were realigned to GRCh38, and SVs confirmed by the consensus maps formed the final SV calls. SVs were then annotated with the known canonical gene set present in GRCh38, as well as with the estimated population frequency of each detected structural variant, obtained by comparison to a custom control database (n = 297) from Bionano Genomics.

Data Comparison
Whole-genome imaging data were compared to the human reference genome GRCh38 (hg38) to retain only those SVs not present in the reference genome. SVs were further filtered to eliminate any variant observed in any of the Bionano control samples or, if available, patient-matched blood. Bionano Access-created csv files containing filtered SVs were analyzed to compare SV content across samples. For tissue samples with associated blood samples, control database filtration efficacy was compared to blood-filtering efficacy at identification of somatic mutations. For duplicate/triplicate samples, filtered SVs were compared to determine intra-sample reliability. For identification of cancer-related genes, the set of genes affected by SVs in each of the samples was compared to the list of genes causally implicated in cancer available in the Cosmic Cancer Gene Census database (v92) [37] (https://cancer.sanger.ac.uk/census).
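The two filtering strategies described above can be illustrated with a minimal Python sketch. It is not the Bionano Access implementation: the `SVCall` record, its field names (`control_carriers`, `in_matched_blood`) and the toy calls are hypothetical stand-ins for whatever columns the exported csv files actually contain.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SVCall:
    """Hypothetical minimal record for one SV call (field names are assumptions)."""
    sv_id: str
    sv_type: str                 # e.g. "deletion", "insertion", "translocation"
    control_carriers: int        # number of control samples carrying the variant
    in_matched_blood: Optional[bool] = None  # None when no blood sample exists


def filter_by_controls(calls: List[SVCall]) -> List[SVCall]:
    """Keep only SVs that were not observed in any control sample."""
    return [c for c in calls if c.control_carriers == 0]


def filter_by_blood(calls: List[SVCall]) -> List[SVCall]:
    """Keep only SVs that are absent from the patient-matched blood sample."""
    if any(c.in_matched_blood is None for c in calls):
        raise ValueError("matched blood data missing for at least one call")
    return [c for c in calls if not c.in_matched_blood]


# Toy calls, invented for illustration only.
calls = [
    SVCall("sv1", "deletion", control_carriers=37, in_matched_blood=True),
    SVCall("sv2", "translocation", control_carriers=0, in_matched_blood=False),
    SVCall("sv3", "insertion", control_carriers=0, in_matched_blood=True),
]

db_filtered_ids = [c.sv_id for c in filter_by_controls(calls)]
blood_filtered_ids = [c.sv_id for c in filter_by_blood(calls)]
print(db_filtered_ids)     # ['sv2', 'sv3']
print(blood_filtered_ids)  # ['sv2']
```

In this toy input, `sv3` is absent from the controls but present in blood, so it survives the database filter even though it is germline; this is the kind of discrepancy the comparison between the two filters is designed to expose.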

Results
Patient Clinical Characteristics. Clinical data for the patients from whom tumor samples were acquired are shown in Table 1. A total of 60% (12/20) of patients were male, with a mean age of 73.5 years at sample acquisition. A total of 45% (9/20) of patients identified as Caucasian, 40% (8/20) as Asian, and 5% (1/20) as Hispanic, with 10% (2/20) not identifying. The majority of IPM-sourced tumor samples were obtained from Caucasian patients (7/10), while the majority of the BioIVT-sourced tumor samples were obtained from patients of Asian ethnicity (8/10). In terms of overall risk factors, 55% (11/20) of patients were self-described current or former tobacco users and 45% (9/20) endorsed some history of alcohol use. The tumor samples represented a variety of stages (Table 1). A total of 75% (3/4) of tongue cancer samples and 100% (3/3) of anaplastic thyroid cancers were stage IV, while 100% of lung (2/2) and bladder (2/2) cancers were stage II. Limited tumor data were available for the commercially available BioIVT-sourced tumor samples.
DNA Quality Metrics. All 20 solid tumors yielded high-molecular-weight gDNA (Table 2). The average concentration across all samples following gDNA isolation was 120 ng/µL by Broad Range dsDNA Qubit Assay. All eluted gDNA samples were well above the minimal concentration required for DLS labeling (35 ng/µL) and the average final DNA yields for each tumor ranged from 1. [...]
To determine the efficacy of identifying somatic SVs by filtering against Bionano's database of known polymorphisms, we used as a gold standard the blood samples from four patients from whom we had obtained tongue tumors. That is, we determined the true somatic mutations in each of these four tumors by eliminating those SVs identified in each of the tumors that were also present in the corresponding blood sample. We could then compare those true somatic variants to the list of somatic variants predicted by filtering against the database of polymorphisms. For these four tongue tumor samples, we identified an average of 1474 total SVs per sample. [...] (Figure 3, left upper panel).
Comparing the residual SV sets obtained by filtering against Bionano's control database to the sets of true somatic SVs for each sample demonstrated that the control database filtration exhibited strong statistical accuracy (Figure 3, lower panel). Across the four separate samples, the control database exhibited an average sensitivity of 92% (range 83-96%) and specificity of 98% (range 97-99%). That is, filtering with the control database retained most of the true somatic mutations while eliminating almost all of the polymorphic SVs. Similarly, the average negative predictive value of the filter was 99.6%, indicating that an SV identified as germline was almost always a germline variant, while the positive predictive value of 74% (range 60-81%) indicates that most, but not all, of the variants identified as somatic are in fact somatic.
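The four accuracy measures reported above follow from a standard 2 × 2 comparison of the database-filtered calls against the blood-filtered gold standard. The Python sketch below shows one way to compute them from sets of SV identifiers; the function name `filter_metrics` and the toy counts are illustrative assumptions, not the study's data or code.

```python
from typing import Dict, Set


def filter_metrics(all_svs: Set[int],
                   predicted_somatic: Set[int],
                   true_somatic: Set[int]) -> Dict[str, float]:
    """Score a somatic-SV filter against a gold standard.

    predicted_somatic: SVs retained after control-database filtering.
    true_somatic:      SVs retained after matched-blood filtering.
    Assumes every denominator below is non-zero.
    """
    tp = len(predicted_somatic & true_somatic)            # somatic, called somatic
    fp = len(predicted_somatic - true_somatic)            # germline, called somatic
    fn = len(true_somatic - predicted_somatic)            # somatic, called germline
    tn = len(all_svs - predicted_somatic - true_somatic)  # germline, called germline
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Toy example: 1000 SVs, 50 truly somatic, 62 retained by the database filter.
all_svs = set(range(1000))
true_somatic = set(range(50))
predicted_somatic = set(range(4, 66))

metrics = filter_metrics(all_svs, predicted_somatic, true_somatic)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The toy counts illustrate why a filter can combine high sensitivity and specificity with a more modest positive predictive value: true somatic SVs are a small minority of all calls, so even a few retained germline variants make up a noticeable share of the calls labeled somatic.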
Overall, filtering SVs against Bionano's control database retained almost all the true somatic mutations. However, several of the SVs identified as somatic were actually germline. Those SVs inaccurately identified as somatic were rare germline variants, predominantly insertions or deletions, essentially private to the patient's genome. As above, we noted that the filtering process did not affect all SV types equally: while most deletions and insertions were flagged as polymorphic and eliminated from the list of somatic mutations, very few duplications and essentially no translocations were identified as polymorphic. This is consistent with the observation that few translocations or duplications are stable through meiosis.
Duplicate Sample Analysis. We compared SV calls from separate isolates of the same sample to assess consistency and reproducibility of the method, albeit without knowing the extent of tumor heterogeneity of the individual samples. Six samples underwent triplicate analysis, and four samples underwent duplicate analysis (Table 3). After identifying SVs using the Rare Variant Analysis pipeline and filtering them against the Bionano control database of known polymorphisms, we recovered an average of 116 somatic SVs shared among the separate isolates of the same tumor. These comprised an average of 23 insertions, 29 deletions, 10 inversions, 11 duplications and 43 translocations (Table 3). As noted above, the number of SVs identified in a tumor varied widely across the different tumors examined, with lung, breast, brain and ovarian tumors showing a high level of somatic SVs while the others contained a relatively low number of SVs. Moreover, the percentage of SVs shared among different isolates of the same tumor also varied among the different tumor types. However, the percentage of shared SVs and the total number of SVs were uncorrelated.
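As an illustration of how a shared-SV percentage across replicates might be computed, the Python sketch below matches calls by type, chromosome and approximate position. The matching rule, the 10 kb tolerance and the choice of the mean replicate size as denominator are assumptions made for this example; the text does not specify how shared SVs were defined.

```python
from typing import List, NamedTuple


class Call(NamedTuple):
    """Hypothetical minimal SV call: type, chromosome and start coordinate."""
    sv_type: str
    chrom: str
    start: int


def matches(a: Call, b: Call, tol: int = 10_000) -> bool:
    """Treat two calls as the same SV if type and chromosome agree and the
    start coordinates differ by at most tol base pairs (assumed tolerance)."""
    return (a.sv_type == b.sv_type and a.chrom == b.chrom
            and abs(a.start - b.start) <= tol)


def shared_calls(replicates: List[List[Call]], tol: int = 10_000) -> List[Call]:
    """Calls from the first replicate that have a match in every other replicate."""
    first, *rest = replicates
    return [c for c in first
            if all(any(matches(c, other, tol) for other in rep) for rep in rest)]


def percent_shared(replicates: List[List[Call]], tol: int = 10_000) -> float:
    """Shared calls as a percentage of the mean number of calls per replicate."""
    mean_size = sum(len(rep) for rep in replicates) / len(replicates)
    if mean_size == 0:
        return 0.0
    return 100.0 * len(shared_calls(replicates, tol)) / mean_size


# Toy duplicate analysis, invented for illustration only.
rep_a = [Call("deletion", "chr5", 1_000_000),
         Call("translocation", "chr9", 5_000_000),
         Call("insertion", "chr2", 300_000)]
rep_b = [Call("deletion", "chr5", 1_004_000),
         Call("translocation", "chr9", 5_000_500)]

pct = percent_shared([rep_a, rep_b])
print(f"{pct:.1f}% shared")  # 80.0% shared
```

A different denominator (for example, the union of all calls) would give a lower percentage for the same data, so the definition should be stated alongside any reported value.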
Assuming that the higher values for shared SVs reflect the reproducibility of the method, we might postulate that the lower shared values represent both the reproducibility and the tumor heterogeneity. That is, we would suggest that the reproducibility of the method across multiple biological replicates is 85-95%, corresponding to the values obtained from those samples with the least variability. Thus, we would suggest that the residual variability in those samples with lower reproducibility (50-75%) reflects heterogeneity of SVs in the tumors. This would suggest that these brain, liver, lung and prostate tumors had a relatively high level of tumor heterogeneity.
Figure 3 (caption fragment). ... specificity (SP), and positive (PPV) and negative (NPV) predictive values for identification of somatic structural variants obtained by filtering total identified SVs to remove those present in a control database of known human polymorphisms. Data obtained by filtering against the control database were compared to those obtained by filtering total SVs to remove those present in the genomes obtained from peripheral blood from each of the patients from whom the tumors were removed.
The number and types of somatic variants in a tumor varied substantially across the collection of samples (Figure 4). Several tumor samples, including those from colon, bladder, kidney and all four from thyroid, contained relatively few somatic SVs whereas others, including those from prostate, ovaries, lung and brain, carried a large number of somatic SVs. Since these samples for the most part serve as single representatives of each tumor type, we cannot extrapolate to the tumor types as a whole the contribution of SVs to cancer onset and development for each class of tumor. However, it is noteworthy that the SNV mutational burden in thyroid cancers is among the lowest of all tumor types and that this measure of genome instability is mirrored in the low number of somatic SVs in all four of the samples examined [39]. Similarly, the SNV mutational burden in lung cancers is among the highest across all tumor types, and both of the lung tumors examined here also carry a high level of somatic SVs.
Finally, the extent of somatic SVs observed in our collection of tumors does not correlate with either cancer stage nor with obvious lifestyle characteristics (Table 1). For instance, neither smoking nor drinking history has a stronger influence on SV mutation burden than does site of origin of the tumor. However, further data examining the correlation of lifestyle characteristics and tumor stages with SV mutational burden are warranted to assess the impact of these behaviors on SV formation and persistence.
Identification of Cancer Gene Mutations. While, as noted above, we cannot generalize regarding the role of structural variants in onset and progression of different tumor types, our results indicate that we can extract from the structural variant list clinically relevant data on individual tumors that might inform prognosis or treatment options. We examined the somatic structural variants in each tumor sample for those that affected genes previously associated with cancer. In particular, we annotated those genes altered by a structural variant, either by disruption, duplication, deletion or fusion, and intersected that list with the set of cancer-related genes in the Cosmic database (v92) [37]. The resultant list by tumor type is provided in Table 4 and subdivided into oncogenes, tumor suppressor genes and gene fusions. We included only those oncogenes that were potentially activated by duplication or gene fusion and only those tumor suppressor genes that were potentially inactivated by deletion, insertion or fusion. As evident, every tumor sample carried at least one such cancer gene mutation and most contained multiple hits. Several of these genes offer the opportunity for targeted therapies, focused either directly on the oncogene, as would be the case for CDK6 and ERBB2, or at the pathway downstream of the affected gene, as would be the case for BRAF and CDKN2A. Other affected genes, such as MSH2, RAD51B, RAD21 and RAD18, suggest the potential of therapy based on possible ensuing genome instability, such as immunotherapy or PARP inhibitors. Many of these variants would not be readily identified by targeted gene panels generally used for clinical assessment of tumor genomes. Moreover, in many cases, the cancer genes altered by SVs were not previously associated with the cancer type in which we observed it. For instance, we observed a fusion of CDK6 in one of the tongue tumors while it has previously been associated predominantly only with ALL. 
Similarly, LRP1B is often inactivated in CLL or ovarian cancer, while we find it inactivated by deletion in one of the lung tumors. Thus, the identification of somatic structural variants by OGM could provide useful clinical insights not readily available through standard next-generation sequencing or targeted panels.
Figure 4. Diagrams of somatic structural variants in all the solid tumor genomes, filtered to remove known polymorphisms, showing translocations and inversions in the center, copy number on the inner ring, and insertions (green), deletions (orange), inversions (light blue) and duplications (violet) on the next-to-outermost ring. Chromosomes are ordered sequentially in a clockwise orientation in the outer ring, on which are indicated cytological banding patterns and the centromere (red bar).
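The gene-level selection rule described in this section (oncogenes counted only when hit by a duplication or fusion, tumor suppressor genes only when hit by a deletion, insertion or fusion) can be sketched in Python as follows. The `census_roles` table is a small hand-written stand-in rather than an export of the Cosmic Cancer Gene Census, and `GENE_X` is a made-up placeholder; function and variable names are likewise illustrative.

```python
from typing import Dict, List, Tuple

# SV types treated as potentially activating or inactivating, per the rule in the text.
ACTIVATING = {"duplication", "fusion"}
INACTIVATING = {"deletion", "insertion", "fusion"}


def cancer_gene_hits(affected: List[Tuple[str, str]],
                     census_roles: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return (gene, role, sv_type) for SV-affected genes that pass the rule.

    affected:     (gene symbol, SV type) pairs for one tumor sample.
    census_roles: gene symbol -> "oncogene" or "TSG" (stand-in for the census).
    """
    hits = []
    for gene, sv_type in affected:
        role = census_roles.get(gene)
        if role == "oncogene" and sv_type in ACTIVATING:
            hits.append((gene, role, sv_type))
        elif role == "TSG" and sv_type in INACTIVATING:
            hits.append((gene, role, sv_type))
    return hits


# Hand-written stand-in for a census role table (illustration only).
census_roles = {"CDK6": "oncogene", "ERBB2": "oncogene",
                "CDKN2A": "TSG", "LRP1B": "TSG"}

# Toy SV-affected genes for one sample; GENE_X is a made-up placeholder.
affected = [("CDK6", "fusion"), ("LRP1B", "deletion"),
            ("ERBB2", "deletion"), ("GENE_X", "duplication")]

hits = cancer_gene_hits(affected, census_roles)
print(hits)  # [('CDK6', 'oncogene', 'fusion'), ('LRP1B', 'TSG', 'deletion')]
```

In this toy input the ERBB2 deletion is dropped because a deletion is not expected to activate an oncogene, and `GENE_X` is dropped because it is not in the role table.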
In addition to identifying individual cancer-related genes in tumor types, our results provide a panoramic view of the entire tumor genome and reveal large-scale genomic features not readily available from standard sequencing techniques. As evident in the results in Figure 4, our data provide a rapid snapshot of the extent of genomic instability in each of the tumors. Such images present an integrated picture of the aneuploidies, translocations, inversions, deletions and insertions, which offers a readily digestible impression of the extent of genetic instability underlying a tumor. Moreover, several large-scale features are evident in these data. For instance, chromothripsis is a massive cluster of chromosomal rearrangements localized to a restricted region of a chromosome, which often results from a single catastrophic event [40]. Figure 5 details a chromothripsis event on a portion of chromosome 5 in one of the lung tumor samples. In fact, such events are readily evident in four of the Circos plots in Figure 4, consistent with previous estimates of 2-3% prevalence across all cancers, albeit with different frequency in different cancers [41]. The detection and mapping of such a feature are difficult to achieve by short-read sequencing [41] but can indicate poor prognosis and the corresponding need for aggressive therapy.

Discussion
In this report, we described the application of optical genome mapping to solid tumors, which we suggest can significantly augment the genomic analysis of such tumors obtained by next-generation sequencing. Genomic analysis of tumors has stimulated major advances in cancer diagnosis, prognosis and treatment, shifting the focus from morphological and histochemical characterization to consideration of the landscape of driver mutations in the tumor [42][43][44]. Somatic driver events in a tumor, namely point mutations and structural variants (SVs) including insertions, deletions, inversions, translocations and copy number changes, are currently identified in solid tumors by some combination of RNA sequencing and genome sequencing of targeted gene panels, whole exomes or whole genomes. As noted in this report, OGM can provide a comprehensive view of the structural variants in a tumor and the cancer-related genes on which they impinge, thus identifying affected genes agnostically, without prior bias imposed by gene panels.
Some prior studies have begun to demonstrate the utility of Bionano DNA isolation protocols in solid tissue tumor analysis. These include studies of lung squamous cell carcinoma and metastatic prostate carcinoma [7,34,35]. This current report demonstrates the utility of the DNA isolation protocol and SV analysis in a wide variety of solid tissue types, and expands the feasibility of such analysis for previously unused human tissue types. The high DNA yield, high effective coverage, map rate and other molecular quality metrics shown across tumor types confirm how our extraction and analysis workflow can be effectively applied to many solid tissue tumors.
This current DNA isolation protocol carries a number of advantages. Tissue handling can be performed at room temperature. The current protocol showed successful DNA isolation in solid tissue samples of <20 mg, and even as low as 6 mg. The low tissue input requirement carries important applications for rare cancer samples, human tissue biopsy testing and other low-quantity specimen acquisition. Additionally, utilizing the novel paramagnetic Nanobind discs rather than prior agarose gel plugs greatly decreases the time needed to complete DNA isolation, to only 5 h. The ability to isolate DNA from up to eight simultaneous samples using the current protocol greatly increases throughput and reduces tissue-to-data processing time, improving laboratory convenience and expanding the potential for clinical utility where rapid data turnaround is paramount. Furthermore, the strong SV correspondence between replicate isolates shown by most tissue types in duplicate/triplicate sample analysis demonstrates the reproducibility of this technique; intra-sample heterogeneity of select samples may be attributed to non-tumor normal tissue within some tissue fragments, or to specific cancer subtype, and merits further investigation. Although the isolation protocol described here affords many advantages, it has some limitations. While high-quality DNA isolation and OGM SV analysis were obtained for the wide variety of tumor types that were tested, the protocol may not be generalizable to every additional untested solid tumor type. Future directions include continuing to validate this protocol in additional tissue types, and analyzing additional tumor samples to assess broader trends in the role of specific OGM-identified SVs in individual cancer subtypes.
In clinical evaluation of liquid tumors such as leukemia, genomic analysis is augmented by karyotyping, which gives a panoramic, albeit low-resolution, view of the entire genome. Despite the low resolution, the genome-wide view of the structural changes afforded by karyotyping reveals diagnostic features of the tumor that have strong prognostic value. Given the consistent correlation of clinical outcomes with specific mutation classes, the World Health Organization (WHO), National Comprehensive Cancer Network (NCCN) and European LeukemiaNet (ELN) agencies developed recommendations for diagnosis and management of acute myeloid leukemia in adults based on the spectrum of somatic point mutations and SVs generally revealed by karyotyping [45]. SVs, particularly translocations and inversions, are major considerations in this diagnosis. Since karyotyping is a very challenging technique to apply to solid tumors, the clinician does not have access to a comparable global view of a solid tumor's genome, and the role of SVs in prognosis has likely been underappreciated. Applying OGM broadly to cancer types and correlating SVs revealed by that analysis with clinical outcomes could provide new genomic markers for prognosis and treatment selection.

Conclusions
We demonstrate the utility of a DNA isolation protocol for high-molecular-weight DNA extraction and OGM SV analysis of a wide variety of solid human tumor types on the Bionano Saphyr system, including breast, colon, liver, brain, bladder, kidney, lung, ovary, prostate and thyroid cancer tissue. The system can be used to accurately detect genetic mutation hallmarks in cancer tissue samples, including rearrangements such as translocations, gene fusions and copy number alterations. Somatic SVs can be determined by comparison filtering with the Bionano control sample database, or against a matched pair sample. Importantly, Bionano SV pipelines can detect SVs with complex breakpoint structures that are difficult to detect with other technologies. Our results indicate that the solid tissue DNA extraction protocol can be applied to a wide variety of solid tumors, and that the Saphyr system can capture, in a streamlined workflow, a broad spectrum of SVs. These SVs have functional importance and provide great utility in cancer prognosis and treatment.