An Integrated Approach Including CRISPR/Cas9-Mediated Nanopore Sequencing, Mate Pair Sequencing, and Cytogenomic Methods to Characterize Complex Structural Rearrangements in Acute Myeloid Leukemia

Complex structural chromosome abnormalities such as chromoanagenesis have been reported in acute myeloid leukemia (AML). They are usually not well characterized by conventional genetic methods, and the characterization of chromoanagenesis structural abnormalities from short-read sequencing still presents challenges. Here, we characterized complex structural abnormalities involving chromosomes 2, 3, and 7 in an AML patient using an integrated approach including CRISPR/Cas9-mediated nanopore sequencing, mate pair sequencing (MPseq), and SNP microarray analysis along with cytogenetic methods. SNP microarray analysis revealed chromoanagenesis involving chromosomes 3 and 7, and a pseudotricentric chromosome 7 was revealed by cytogenetic methods. MPseq revealed 138 structural variants (SVs) as putative junctions of complex rearrangements involving chromosomes 2, 3, and 7, which led to 16 novel gene fusions and 33 truncated genes. Thirty CRISPR RNA (crRNA) sequences were designed to map 29 SVs, of which 27 (93.1%) were on-target based on CRISPR/Cas9 crRNA nanopore sequencing. In addition to simple SVs, complex SVs involving over two breakpoints were also revealed. Twenty-one SVs (77.8% of the on-target SVs) were also revealed by MPseq with shared SV breakpoints. Approximately three-quarters of breakpoints were located within genes, especially intronic regions, and one-quarter of breakpoints were intergenic. Alu and LINE repeat elements were frequent among breakpoints. Amplification of the chromosome 7 centromere was also detected by nanopore sequencing. Given the high amplification of the chromosome 7 centromere, extra chromosome 7 centromere sequences (tricentric), and more gains than losses of genomic material, chromoanasynthesis and chromothripsis may be responsible for forming this highly complex structural abnormality. We showed this combination approach’s value in characterizing complex structural abnormalities for clinical and research applications. 
Characterization of these complex structural chromosome abnormalities not only will help understand the molecular mechanisms responsible for the process of chromoanagenesis, but also may identify specific molecular targets and their impact on therapy and overall survival.


Introduction
Complex structural chromosome abnormalities have been reported in myeloid malignancies such as acute myeloid leukemia (AML) and myelodysplastic syndrome (MDS). Complex and massive chromosomal and genomic rearrangements can be generated by a chromoanagenesis event, which is characterized by the simultaneous occurrence of multiple structural alterations through a single catastrophic cellular event at one or more loci [1]. Chromoanagenesis comprises three distinct types of genomic rearrangement: chromothripsis, chromoanasynthesis, and chromoplexy, each with its own mechanism of formation and etiology [2]. In chromothripsis, the driving force is multiple double-strand breaks (DSBs) with deletions in a single catastrophic event, after which chromosomal fragments are reassembled at random to form complex derivative chromosomes [3][4][5]. These chromosomes may include additional gain or loss of genetic material from single or multiple chromosomes, leading to alterations in the genomic structure. Analysis of the breakpoint sequences indicates that the rejoining of DNA fragments likely occurs through non-homologous end joining (NHEJ) or alternative end joining (alt-EJ) [6][7][8]. Random rearrangements in these events often disrupt tumor suppressors and amplify oncogenes [1]. In chromoanasynthesis, chaotic and complex rearrangements lead to an increase in the copy number (CN) of chromosomes due to impaired stability and stress at the replication forks during DNA replication, resulting in replication errors [9][10][11][12]. Commonly observed replication errors involve serial fork stalling and template switching (FoSTeS) or microhomology-mediated break-induced replication (MMBIR) mechanisms that lead to region-focused duplications or triplications at the breakpoint junctions [10,13]. In chromoplexy, genomic rearrangement is driven by multiple inter- and intrachromosomal translocations and deletions at fusion junctions [14]. Unlike chromoanasynthesis or chromothripsis, this phenomenon shows little to no copy number alteration. Evidence to date suggests that chromothripsis is the most probable mechanism underlying most genomic rearrangements in cancers [1].
Chromoanagenesis has been seen across many different forms of cancer with a prevalence of 2-3% [3,15-21]. The frequency is elevated, however, when specific tumors are considered. The frequency of the event has been seen to reach 25% in bone cancers [3] or 18% during the late stages of neuroblastomas [21]. In rare instances, chromoanagenesis has been responsible for creating one or many cancer-inducing lesions that promote cellular growth through three key routes. The first is the formation of circular DNA fragments that lack centromeres or telomeres but harbor oncogenes (double minute chromosomes) through chromothripsis and NHEJ, facilitating the amplification of oncogenes and cell proliferation [22]. The second is the loss or disruption of tumor suppressor genes through chromothripsis and NHEJ rearrangement. The third is the fusion of oncogenes by joining coding portions of two oncogenes in the same orientation [1]. Previous studies have also indicated a strong relationship between chromoanagenesis and TP53 mutations in AML. In cases with chromoanagenesis, whole-genome sequencing (WGS) and microarray analysis revealed germline and somatic inactivation of the TP53 tumor suppressor gene [16]. In patients with newly diagnosed multiple myeloma, those with genomic rearrangement through chromothripsis had an aggressive disease course and poor prognosis, indicating that chromothripsis may define a rare entity of high-risk patients [20].
Chromosomal abnormalities are important for tumor formation and development. They are responsible for changes in the expression or function of RNA and proteins, for promoting tumor proliferation that affects the immune system, and for amplifications or deletions that reshape the genome and influence tumor progression. Chromosomal abnormalities are a shared characteristic among cancers and are categorized as numerical or structural abnormalities [23]. Numerical abnormalities mainly consist of aneuploidy (loss or gain of a region or chromatid) or chromosome instability (CIN) caused by segregation errors during mitosis [24]. Aneuploidy can affect segmental parts of the genome or the genome as a whole. CIN is one of the leading causes of tumor evolution, leading to a poor survival rate in various malignancies. CIN resulting in tri- or tetraploidy has been known to promote oncogenesis and, in most cases, leads to copy number alterations resulting in aneuploidy, with tetraploidy as a common temporary state of aneuploidy. Approximately 90% of human solid tumors and approximately 75% of hematopoietic cancers exhibit aneuploidy [25].
Structural abnormalities consist of DNA damage in addition to the gain or loss of genomic material, forming derivative chromosomes. Commonly observed abnormalities range from deletions of chromosomal arms and amplification of genomic regions to alterations of multiple chromosomes [26,27]. The most frequent changes noted have been deletions, followed by amplifications and then unbalanced translocations [28]. Amplification or deletion along the genome has been observed in 88% of cancer samples [29]. Other common structural abnormalities include the gain of genetic material on the q arm of chromosome 8 (33% of cancer samples) and the deletion of genetic material on the p arm of chromosomes 8 and 17 (33% and 35% of cancer samples, respectively). Evidence to date revealed chromosome 2 as the least altered, with aberrations of the p and q arms observed in 18% and 16% of cancer samples, respectively [23]. Previous data have also shown structural abnormalities associated with immune signatures, with 3p, 8p, 13q, and 17p deletions having a positive correlation and 4q, 5q, and 14q deletions having a negative correlation [23]. Structural abnormalities vary in different types of malignancies, with some aberrations seen more frequently and consistently in specific cancers, e.g., the relationship between AML and abnormalities of chromosomes 5 and 7.
Despite improvements in NGS-based genomics technology, the detection of complex structural chromosome abnormalities from short-read sequencing still poses challenges. The challenge of short-read sequencing lies in the read length, as a short read length does not allow full representation of the human genome. Short read lengths make it impossible to read certain regions in the human genome, such as centromere regions, telomeres, and acrocentric genomic regions with tandem repeats [35]. As a result, a higher mutation rate cannot be read, leaving an incomplete understanding of the human genome [36][37][38][39]. Furthermore, short-read sequencing limits our understanding of complex relationships that occur in chromoanagenesis.
Long-read sequencing approaches, such as those from Oxford Nanopore Technologies (ONT, Oxford, UK), are promising for characterizing chromosomal abnormalities. Although long-read sequencing poses new challenges, many of the shortcomings of short-read sequencing are resolved. Long-read sequencing allows the proper identification of simple structural abnormalities along with a better understanding of long-range structural abnormalities that occur in chromoanagenesis. Long-read sequencing can sequence stretches of DNA of up to hundreds of kilobases in length [40,41]. Furthermore, clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9-mediated nanopore sequencing is used for enriching targeted sequences containing our desired genomic region of interest. CRISPR/Cas9 is a novel gene-editing technique that can efficiently induce targeted genetic modifications. Compared to polymerase chain reaction (PCR), it is more cost-efficient and allows a higher mapping quality [42]. CRISPR/Cas9 with nanopore sequencing provides greater sensitivity, allowing for real-time sequencing of the DNA, compared to nanopore sequencing alone, which has high error rates [42]. This method allows structural variants (SVs) within our sequence of interest to be detected. For the CRISPR/Cas9 ribonucleoprotein complexes, the sequence of the guide ribonucleic acid (guideRNA) is custom-designed. The guideRNA serves to recognize specific sequences of the DNA, where the ends of the cut site are ligated to a sequencing adaptor, which then allows the region of interest to be sequenced [42].
Mate pair sequencing (MPseq) allows better detection of chromosomal abnormalities. After the genomic DNA is first fragmented, biotin is added to the fragment ends, which then allows the fragmented DNA to circularize. The circularization step allows for the detection of SVs [43,44]. The resulting MPseq coverage data can also be used to identify copy number alterations, that is, the gain or loss of a copy number variant (CNV) within the genome. Through this technique, we can better identify the relationship among genetic materials, even from different chromosomes.
In this study, we used an integrated approach including CRISPR/Cas9-mediated nanopore sequencing, MPseq, and SNP microarray analysis, along with conventional cytogenetic methods (chromosome analysis and FISH), to characterize complex structural chromosome abnormalities (chromoanagenesis involving chromosomes 2, 3, and 7) in AML. We have demonstrated the value of this combination approach in characterizing complex structural abnormalities for clinical and research applications.

Patient Data and Diagnosis of Acute Myeloid Leukemia/Myelodysplastic Syndrome
A male patient presented with shortness of breath and was found to have pancytopenia with circulating blasts on a smear. Acute myeloid leukemia/myelodysplastic syndrome was diagnosed by bone marrow morphology, immunostaining, and flow cytometry. Flow cytometry of the peripheral blood showed 5% phenotypically abnormal cells and an unusual myeloid blast population expressing CD13, CD24, and CD117 with dim partial CD33 and aberrant CD7. A bone marrow biopsy from his right iliac crest showed 20% atypical cells with an unusual myeloid phenotype, expressing CD34, bright CD117, variable HLA-DR, CD38, CD13, and dim partial CD33, along with partial aberrant CD7.
The surgical pathology results from the bone marrow biopsy classified the disease as AML. The marrow cellularity was 80-90% and showed an increased population of immature cells. There was residual hematopoiesis with prominent developing erythroid forms and left-shifted granulopoiesis. The ratio of myeloid to erythroid precursors was about 1-2:1. Megakaryocytes were decreased in number, with some small, hypolobate forms identified. The aspirate contained sheets of blasts with scant to moderate cytoplasm and distinct nucleoli. There were a few immature myeloid elements with a maturation arrest, and erythropoiesis showed dysplastic maturation with nuclear budding and irregular nuclear membranes. An iron stain performed on the clot section showed rare ring sideroblasts, accounting for <5% of cells. CD61 stains identified megakaryocytes and highlighted several micromegakaryocytes not clearly seen on routine stains. CD34 stained about 20% of the blasts, although these were more numerous in some areas. Peripheral blood showed circulating blasts and occasional nucleated erythroid precursors.
All procedures followed were in accordance with the ethical standards of the Institutional Committee on Human Experimentation and with the Helsinki Declaration of 1975. The study was approved by the Local Ethics Committee of the Johns Hopkins Hospital (Baltimore, MD, USA).

Cytogenetics Data: Conventional Chromosome Analysis, FISH, and SNP Microarray
Conventional G-banded chromosome studies were performed using standard techniques. At least 20 metaphase cells were analyzed from unstimulated bone marrow aspirate. The abnormal karyotypes were described using the International System for Human Cytogenomic Nomenclature (2020).
FISH was performed on interphase nuclei from cultured bone marrow cells using disease-specific probes, according to the manufacturer's instructions (Abbott Molecular Inc., Des Plaines, IL, USA). The specimen was considered abnormal if the results exceeded the laboratory-established cutoff for each probe set.
Whole-genome single-nucleotide polymorphism (SNP) microarray analysis was performed with DNA extracted from bone marrow specimens by conventional methods (QIAcube). The DNA concentration was assessed using a Qubit fluorometer (Thermo Fisher Scientific, Waltham, MA, USA). The high-resolution microarray platform utilized was the Illumina Infinium CytoSNP-850K v1.2 BeadChip containing >850,000 markers (mean spacing, 3.5 kb; Illumina, Inc., San Diego, CA, USA). BeadChips were processed per the manufacturer's guidelines and imaged with the Illumina iScan system. Data were analyzed with the CNV Partition 2.4.4.0 algorithm in GenomeStudio version 2010.3 (Illumina) and KaryoStudio version 1.4.3.0 (Illumina). B allele frequency and logR signal intensities were used to examine and identify potential pathogenic regions of genomic imbalance. All analyses were performed using human reference genome assembly hg19 (GRCh37).

Mate Pair Sequencing
DNA extraction and mate pair library preparation methods were performed as previously described [45,46]. MPseq data were mapped to the reference genome GRCh38 using BIMA V3 [47], and SVAtools [46] was used to reveal SVs. Detection of SVs by SVAtools combines three algorithmic approaches: read-pair, split-read, and read depth/count. Clustering of the discordant and split-read fragments was performed by SVAtools to reveal SVs. Only a cluster with more than three fragments that passed the mask/filter criteria and was called by SVAtools was considered a putative junction.

CRISPR/Cas9-Mediated Nanopore Sequencing
The crRNAs for these SVs were designed using Integrated DNA Technologies (IDT)'s design tool and selected for the highest predicted on-target performance with minimal off-target activity (IDT, Inc., Coralville, IA, USA). GuideRNA was assembled as a duplex from synthetic CRISPR RNAs (crRNAs) (custom designed, IDT, Inc., Coralville, IA, USA) and tracrRNAs (IDT#1072532). The guideRNA duplex was designed to introduce cuts and to target flanking areas of the region of interest. CRISPR/Cas9-mediated nanopore sequencing and data analysis have been described previously [42]. Briefly, the guideRNA sequence recognizes DNA sequences around the region of interest, where the Cas9 protein's endonuclease activity then cuts the 3′ end of the recognized sequence. The now free region of interest has its ends ligated to a sequencing adaptor, which then allows the region of interest to be sequenced. Using this method, thirty crRNAs were designed to detect twenty-nine chromosomal abnormalities in chromosomes 3 and 7 (Supplemental Table S1). crRNAs #28 and #29 targeted the same genomic region. crRNAs were designed using Custom Alt-R CRISPR-Cas9 guide RNA (https://www.idtdna.com/site/order/designtool/index/CRISPR_CUSTOM, accessed on 15 February 2020) and Chopchop (https://chopchop.cbu.uib.no/, accessed on 15 February 2020) with CRISPR-Cas9. All analyses were performed using human reference genome assembly GRCh37/hg19, and SVs were reviewed independently by multiple genetic analysts via the Integrative Genomics Viewer (IGV, Broad Institute, Cambridge, MA, USA). Only clusters with more than ten reads at each potential SV breakpoint, with each read having bidirectionality, were considered a putative SV junction. SVs found within 20 kilobases (kb) of a crRNA sequence were defined as on-target SVs.
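The two acceptance rules stated above (more than ten bidirectional reads per breakpoint, and an on-target window of 20 kb around a crRNA) can be expressed as a small filter. The following is only a minimal sketch and not the analysis pipeline used in this study: the `Breakpoint` record, the function names, and the example coordinates are hypothetical, "bidirectionality" is interpreted here as read support on both strands, and the 20 kb window is assumed to be inclusive.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

ON_TARGET_WINDOW_BP = 20_000    # "within 20 kilobases of a crRNA sequence"
MIN_READS_PER_BREAKPOINT = 10   # a junction needs MORE than ten reads


@dataclass(frozen=True)
class Breakpoint:
    """Hypothetical record for one candidate SV breakpoint (hg19)."""
    chrom: str
    pos: int
    forward_reads: int
    reverse_reads: int


def is_putative_junction(bp: Breakpoint) -> bool:
    """More than ten supporting reads, with support in both directions.

    Assumption: 'bidirectionality' means reads on both strands.
    """
    total = bp.forward_reads + bp.reverse_reads
    bidirectional = bp.forward_reads > 0 and bp.reverse_reads > 0
    return total > MIN_READS_PER_BREAKPOINT and bidirectional


def is_on_target(bp: Breakpoint, crrna_sites: Iterable[Tuple[str, int]]) -> bool:
    """True if the breakpoint lies within 20 kb of any crRNA site."""
    return any(
        chrom == bp.chrom and abs(pos - bp.pos) <= ON_TARGET_WINDOW_BP
        for chrom, pos in crrna_sites
    )


# Illustrative coordinates only; they are not taken from the study.
example = Breakpoint("chr3", 1_000_000, forward_reads=8, reverse_reads=5)
print(is_putative_junction(example), is_on_target(example, [("chr3", 1_015_000)]))
```

In practice each call was also reviewed manually in IGV, so a filter of this kind would only pre-screen candidates.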

Data Comparison among MPseq, Nanopore Sequencing, and SNP Microarray
A comparison of CNV calls from nanopore sequencing and SNP microarray analysis was performed using VIA software version 7.0 (Bionano company, San Diego, CA, USA). SV data by MPseq were converted to the reference genome hg19 (GRCh37) before being compared with SVs detected by nanopore sequencing. All SVs involving chromosomes 2, 3, and 7 by nanopore sequencing were manually reviewed using IGV.

Gene Mutation Panel by Next-Generation Sequencing (NGS)
DNA was extracted by conventional methods per the manufacturer's instructions (QIAcube; Qiagen, Hilden, Germany). The DNA concentration was assessed using a Qubit fluorometer (Thermo Fisher Scientific, Waltham, MA, USA). NGS was performed on extracted genomic DNA, as outlined previously [48,49]. Briefly, library preparation was performed using Kapa Roche (Wilmington, MA, USA) reagents, hybrid capture was performed using IDT probes (Coralville, IA, USA), libraries were sequenced using an Illumina NovaSeq (paired-end technology; Illumina, San Diego, CA, USA), and sequences were aligned to GRCh37/hg19. The targeted NGS assay used 40,670 IDT probes to cover a panel of 642 pan-cancer genes [48]. The mean read depth was 765× (range 341-1289), and 99.99% of target regions were captured at a level higher than 150×. Sequencing reads were visualized using IGV. As previously described [34], oncogenic somatic variants were considered candidate somatic mutations if (1) variants were present with a minimum variant allele frequency of ≥1%, in at least two alternate reads in both directions, and had an alternate allele base with a mean Qscore of ≥11; (2) variants were described in COSMIC and/or ClinVar as being known cancer-associated mutations or mutational hotspots; and (3) variants were classified as deleterious and/or probably damaging by PolyPhen-2 [50] and/or SIFT [51] servers.
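The three candidate-mutation criteria can be read as a conjunction of simple predicates. The sketch below illustrates that logic under stated assumptions; it is not the laboratory's variant-calling code. The `Variant` fields and function name are hypothetical, the variant allele frequency is stored as a fraction, and "at least two alternate reads in both directions" is interpreted as at least two alternate reads on each strand.

```python
from dataclasses import dataclass

MIN_VAF = 0.01                  # criterion (1): VAF of at least 1%
MIN_ALT_READS_PER_STRAND = 2    # assumption: two alternate reads on EACH strand
MIN_MEAN_ALT_QSCORE = 11        # criterion (1): mean Qscore of at least 11


@dataclass(frozen=True)
class Variant:
    """Hypothetical summary of one called variant."""
    vaf: float                  # variant allele frequency as a fraction (0-1)
    alt_forward: int            # alternate-allele reads on the forward strand
    alt_reverse: int            # alternate-allele reads on the reverse strand
    mean_alt_qscore: float
    in_cosmic: bool
    in_clinvar: bool
    polyphen2_damaging: bool
    sift_deleterious: bool


def is_candidate_somatic(v: Variant) -> bool:
    """Apply criteria (1), (2), and (3); all three must hold."""
    read_support = (
        v.vaf >= MIN_VAF
        and v.alt_forward >= MIN_ALT_READS_PER_STRAND
        and v.alt_reverse >= MIN_ALT_READS_PER_STRAND
        and v.mean_alt_qscore >= MIN_MEAN_ALT_QSCORE
    )
    known_cancer_variant = v.in_cosmic or v.in_clinvar
    predicted_damaging = v.polyphen2_damaging or v.sift_deleterious
    return read_support and known_cancer_variant and predicted_damaging


# Illustrative values only; they are not taken from the study.
print(is_candidate_somatic(Variant(0.45, 120, 110, 32.0, True, False, True, False)))
```

Keeping each criterion as a named intermediate makes it easy to report which rule excluded a given variant.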

Mate Pair Sequencing
MPseq revealed 138 SV breakpoints as putative junctions involving chromosomes 2, 3, and 7, which included 69 regions for chromosome 3, 41 for chromosome 7, and 18 for chromosome 2 (Figure 3, Supplementary Table S3). SVs occurred on the p arm of chromosome 2, on the q arm of chromosome 3, and on the surrounding regions of the centromere of chromosome 7. These SVs led to 16 novel gene fusions and 33 truncated genes (Supplementary Table S3).

SVs by Nanopore Sequencing
SVs were analyzed based on the CRISPR/Cas9 crRNA sequences (Supplemental Table S1) in this study, as well as all SVs involving breakpoints on chromosomes 2, 3, and 7. We designed thirty CRISPR/Cas9 crRNA sequences to characterize three chromoanagenesis regions detected by SNP microarray including two losses and ten gains of two chromoanagenesis regions on chromosome 3 as well as two losses and four gains of chromoanagenesis regions on chromosome 7 (Supplementary Table S1).

SVs Based on CRISPR/Cas9 crRNA Sequences
Of the 29 SVs on chromosomes 3 and 7 targeted by CRISPR/Cas9 crRNA sequences, 27 (93.1%) were on-target by nanopore sequencing (Table 1, Figure 4, Supplementary Table S4). Of these 27 SVs, 8 mapped to chromosome 7, and 19 mapped to chromosome 3. Of the two crRNA sequences that failed to detect their respective SVs, one targeted chromosome 3 and one targeted chromosome 7. Chromosome-wide SV analysis of chromosomes 2, 3, and 7 revealed an additional 14 SVs besides these on-target SVs, for a total of 41 SVs (Supplementary Table S4). Of the 41 SVs involving chromosomes 2, 3, and 7, 28 SVs (68.3%) were from chromosome 3, 10 SVs (24.4%) were from chromosome 7, and 3 (7.3%) were from chromosome 2. In addition to simple SVs, complex SVs involving over two breakpoints were also revealed (Figure 4A,B). Twenty-five SVs (61.0%) involved sequences at introns/exons of genes, twelve (29.2%) were at intergenic regions, and four SVs (9.8%) involved sequences at both introns and intergenic regions (Table 1, Supplementary Table S4). The forty-one SVs involved 121 breakpoints (Supplementary Table S4). Furthermore, amplification of centromeric/pericentric regions of chromosome 7 was detected.
Figure caption (partial): complex SVs including amplification, gain, and loss detected by nanopore sequencing, which were consistent with SNP microarray findings. Nanopore reads and SNP microarray data were analyzed by the VIA software to generate copy number variants. Chr: chromosome, CNVs: copy number variants, ROIs: regions of interest by CRISPR/Cas9 guideRNAs, SVs: structural variants.

Comparison between MPseq and Nanopore Sequencing
Of the 27 on-target SVs involving chromosomes 3 and 7 detected by CRISPR/Cas9-mediated nanopore sequencing, 21 SVs (77.8%) were also detected by MPseq with shared SV breakpoints (Table 2, Figure 4, Supplementary Table S4). The remaining 6 on-target SVs detected by CRISPR/Cas9-mediated nanopore sequencing were located at distal/proximal locations of MPseq breakpoints or had low coverage reads. Genomic locations are based on the hg19 (GRCh37) genome assembly.
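A comparison of this kind amounts to asking, for each nanopore SV, whether any MPseq junction has both breakends at matching positions, and then reporting the fraction. The sketch below shows one way to do this; it is illustrative only. The text does not state how close two breakpoints must be to count as shared, so `TOLERANCE_BP` is a hypothetical value, as are the type aliases, function names, and example coordinates. The last two lines reproduce the reported percentages from the counts given in the text.

```python
from typing import List, Tuple

Breakend = Tuple[str, int]        # (chromosome, hg19 position)
SV = Tuple[Breakend, Breakend]    # an SV junction joins two breakends

TOLERANCE_BP = 1_000  # hypothetical matching window; not stated in the text


def breakends_match(a: Breakend, b: Breakend, tol: int = TOLERANCE_BP) -> bool:
    """Same chromosome and positions within the tolerance."""
    return a[0] == b[0] and abs(a[1] - b[1]) <= tol


def shares_breakpoints(x: SV, y: SV, tol: int = TOLERANCE_BP) -> bool:
    """Both breakends match, in either order of the pair."""
    same = breakends_match(x[0], y[0], tol) and breakends_match(x[1], y[1], tol)
    swapped = breakends_match(x[0], y[1], tol) and breakends_match(x[1], y[0], tol)
    return same or swapped


def count_shared(nanopore_svs: List[SV], mpseq_svs: List[SV],
                 tol: int = TOLERANCE_BP) -> int:
    """Number of nanopore SVs with at least one matching MPseq junction."""
    return sum(
        any(shares_breakpoints(n, m, tol) for m in mpseq_svs)
        for n in nanopore_svs
    )


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(percent(27, 29))  # on-target rate from the reported counts
print(percent(21, 27))  # on-target SVs also detected by MPseq
```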

Copy Number Variant Analysis
CNVs for the 41 SVs involving chromosomes 2, 3, and 7 by nanopore sequencing were analyzed and compared with SNP microarray results using VIA software (Bionano company, San Diego, CA, USA). These SVs had a total of 62 CNVs, including 35 gains, 12 amplifications, and 15 losses (Table 2, Supplementary Table S4). Complex CNVs were common (Figure 4C). Gains/amplifications were more frequent than losses. Chromosome 3 had more gains/amplifications (31) than losses (8). Chromosome 7 had 13 gains/amplifications and 7 losses. Chromosome 2 had two gains and one amplification. High amplification of the chromosome 7 centromere was found.
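The per-chromosome counts can be cross-checked against the overall totals with a short tally. This is a worked arithmetic check using only the numbers reported above; the dictionary layout and function name are illustrative, and the zero losses for chromosome 2 is an inference from the totals, since no chromosome 2 losses are mentioned.

```python
# Per-chromosome CNV counts as reported in the text.
cnv_by_chromosome = {
    "chr3": {"gain_or_amp": 31, "loss": 8},
    "chr7": {"gain_or_amp": 13, "loss": 7},
    "chr2": {"gain_or_amp": 3, "loss": 0},  # two gains + one amplification; 0 losses inferred
}

# Overall totals as reported in the text.
reported = {"gain": 35, "amplification": 12, "loss": 15, "total": 62}


def tally(table):
    """Sum gains/amplifications and losses across chromosomes."""
    gains_amps = sum(row["gain_or_amp"] for row in table.values())
    losses = sum(row["loss"] for row in table.values())
    return gains_amps, losses


gains_amps, losses = tally(cnv_by_chromosome)
# The two ways of counting should agree with each other.
print(gains_amps == reported["gain"] + reported["amplification"])
print(losses == reported["loss"])
print(gains_amps + losses == reported["total"])
```

The per-chromosome figures are consistent with the totals: 47 gains/amplifications and 15 losses, 62 CNVs in all.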

DNA Sequences Flanking the SV Breakpoints
To understand the genome architecture at SV breakpoints and the role of unusual DNA sequences such as low-copy repeats or tandem repeats [52,53] in chromoanagenesis, we checked for all repeat elements at the SV breakpoints using RepeatMasker (http://www.repeatmasker.org, accessed on 15 August 2023) and Repbase update programs [54]. Of the 55 SV breakpoints that were detected by MPseq and had sequencing reads by nanopore sequencing, 19 were intergenic, 35 were at intronic regions, and 1 was at an exon (Supplemental Table S5). A variety of repeats were detected in 32 out of 55 breakpoints (58.2%), including short interspersed nuclear elements (SINEs, a total of 19), long interspersed nuclear elements (LINEs, a total of 12), and long terminal repeat elements (LTRs, a total of 1) (Figure 5). Of the 19 breakpoints that were intergenic (9 for chromosome 3 and 10 for chromosome 7), 16 had no significant motifs to note. The remaining three intergenic breakpoints had repeat motifs (L3, L1ME1, and L1MEg/Charlie1a, with the former being on chromosome 3 and the latter two being on chromosome 7). Of the 35 breakpoints mapped to an intronic region, 6 lie in gene regions that have repeat motifs. Two breakpoints came from chromosome 2 in an intron of the gene ATAD2B and had L1MEa repeat motifs; two breakpoints on chromosome 3 in introns of the genes CFAP91 and SEMA5B had repeat motifs of L1ME4a and HAL1, respectively. The remaining two breakpoints lie on chromosome 7 in an intron of the gene POM121 and have L1MEc and L1MB4 repeat motifs. Of the 35 intronic breakpoints, 17 (14 from chromosome 3 and 3 from chromosome 7) lie in non-repetitive regions of the genes ATP6V1A (4), LSAMP (4), MGLL (3), CBLB (1), CASR (1), ROPN1B (1), and AUTS2 (3). Of the 35 intronic breakpoints, 11 (9 from chromosome 3 and 2 from chromosome 7) lie in gene regions with and without repeat motifs: SLC49A4 (4), ALDH1L1 (2), CHCHD6 (3), and GALNT17 (2). One of three regions in the intron of the gene SLC49A4 had a repeat motif (L2a), one of two regions in the intron of the gene ALDH1L1 had a repeat motif (L1MB1), two of three regions in the intron of the gene CHCHD6 had a repeat motif (L1MB7), and one of two regions in the intron of the gene GALNT17 had a repeat motif (MER5A). Two breakpoints in the gene CHST13 lie in an intron and an exon, with neither having repeat motifs.

NGS Gene Mutation Panel
The NGS gene mutation panel revealed a homozygous mutation in the TP53 gene (chr17:7574034 C>T; c.994-1G>A) and a DNMT3A mutation (chr2:25464543 A>C; p.V657G). The DNMT3A p.V657G mutation is in the protein's DNA methylase domain. In vitro studies showed that the p.V657G mutation led to DNMT3A inactivation by reduced methyltransferase function and protein stability [55].

Discussion
Although complex structural chromosome abnormalities (chromoanagenesis) have been reported in AML/MDS, this is the first study using CRISPR/Cas9-mediated nanopore sequencing, MPseq, and SNP microarray analysis along with classic cytogenetic methods (conventional chromosome analysis and FISH) to characterize chromoanagenesis events involving chromosomes 2, 3, and 7. The complex chromoanagenesis events in this study not only have multiple gains and a few losses involving chromosomes 2, 3, and 7, but also have amplification of the chromosome 7 centromere and a pseudotricentric chromosome 7. A pseudotricentric chromosome is a tricentric structure in which only one centromere is active. Chromoanagenesis events along with amplification of the chromosome 7 centromere and a pseudotricentric chromosome 7 have not been reported previously. Furthermore, the presence of centromeric repetitive sequences among chromoanagenesis events adds an extra challenge in characterizing and mapping these SV breakpoints.
Unlike NGS and WGS, CRISPR/Cas9-mediated nanopore sequencing allows the enrichment of genomic regions of interest without PCR amplification, which eliminates potential strand biases due to PCR amplification [42]. Furthermore, when short-read NGS and WGS are used, it is usually hard to obtain good coverage of repeat sequences, especially centromeric regions [35]. In this study, the amplification of chromosome 7 centromeric regions was detected by long-read nanopore sequencing. Centromeres are vital for genetic stability and inheritance [56]. Research on centromeres remains limited [57][58][59]. Although complex involvement of chromosome 7 centromeric regions in chromoanagenesis has not been reported in AML/MDS, studies in Cryptococcus species demonstrated that multiple DNA double-strand breaks (DSBs) at centromere-specific retrotransposons can lead to the formation of multiple interchromosomal rearrangements (chromothripsis-like events) [60].
Our CRISPR/Cas9 crRNAs were designed for the detection of chromoanagenesis events as revealed by SNP microarray analysis. The sequence of the guideRNA recognizes the sequence adjacent to our genomic region of interest, where endonuclease activity occurs on the recognized sequence. The resulting genomic regions of interest are then examined using nanopore sequencing. Besides the high percentage of on-target SVs (93.1%), additional SVs involving chromosomes 2, 3, and 7 were detected, some of which are consistent with MPseq data. Because MPseq is a whole-genome approach, it is not surprising that it revealed more SVs than the targeted approach of CRISPR/Cas9-mediated nanopore sequencing.
Although multiple mechanisms were previously proposed for rearrangements of the complex genomic structure (chromoanagenesis), chromothripsis followed by NHEJ repair may have implications in this study.The chromoanagenesis event in our patient involves chromosome 3 as revealed by SNP microarray analysis.MPseq and targeted nanopore sequencing using a CRISPR/Cas9 approach further characterize this chromoanagenesis event involving multiple SVs and CNVs of chromosomes 2, 3, and 7, which leads us to speculate that our patient's chromoanagenesis event involves rearrangement of the genomic structure driven by chromothripsis and repaired through NHEJ following extensive DSBs.During NHEJ, small amounts of DNA are removed during the processing phase before being ligated together randomly through DNA ligase.Several DSBs rejoined randomly would result in improper DNA repair and cause translocation of genetic material or rearrangement of the genomic structure that could lead to disruptions of tumor suppressors or amplification of oncogenes.This could explain the patient's observed translocations/rearrangements among chromosomes 2, 3, and 7, and the loss of 7q genetic material forming a derivative chromosome 7.
Given the high amplification of the chromosome 7 centromere, the gain of extra chromosome 7 centromere sequences (tricentric), and multiple gains of genomic material (mainly involving chromosomes 3 and 7), as found in this case, chromoanasynthesis via FoSTeS/MMBIR joining [1,2,63] could be another potential mechanism responsible for the formation of the complex derivative chromosome 7. Chromoanasynthesis occurs through DNA replication error, and template switching through FoSTeS or MMBIR occurs at a replication fork, forcing replication to use a template of a nearby sequence or chromosome in the nucleus [2]. Frequent template switches re-start replication forks and result in complex rearrangements.
Recurrent genetic abnormalities including SVs, especially gene fusions, gene rearrangements, and CNVs, are important in aiding AML diagnosis and classification, as well as providing information about the prognosis [30]. In general, current clinical genetic diagnostic methods (such as karyotype, FISH, SNP microarray, and short-read-based NGS assays) are incapable of providing high-resolution characterization of SVs and CNVs. Chromoanagenesis contributes to the formation and development of cancer via massive SVs and CNVs, which may disrupt the activity of tumor suppressor genes, activate oncogenes, and/or generate fusion proteins with oncogenic potential. Oncogenic SVs and CNVs along with mutations (such as TP53) promote the survival of cancer cells with massive genetic abnormalities. The identification of SVs and CNVs of chromoanagenesis may be useful for further classifying distinct subtypes in myeloid malignancies. We postulate that these subtypes in the future may be defined by the genomic composition of chromoanagenesis, SVs, CNVs, and mutational status. Comprehensive characterization of SVs and CNVs not only provides insights into the underlying molecular mechanisms of cancer development and may advance further classification of AML subtypes, but also may contribute to the identification of new therapeutic targets and the development of innovative treatment approaches. Frequently, targeted therapies designed to inhibit the activity of specific genes or fusion proteins can be more specific and less toxic than traditional chemotherapy.
Breakpoints of complex chromosomal rearrangements in this study are more frequent in genes than in intergenic regions. Breakpoints in introns seem to be more frequent than breakpoints in exons. Over half of these breakpoints are associated with known repeat elements. It is well known that points of genomic instability can be generated by these repetitive sequences, and these repeat elements may serve as substrates for complex structural rearrangements [70,71]. Both LINE sequences and Alu repeats at SV breakpoints are frequent in this case. LINE-1 (L1) is a well-known endogenous mutagen with both DNA endonuclease [72] and reverse transcriptase activities [73]. L1 can mobilize not only itself [74,75], but also other retrotransposons such as Alu [76,77]. Somatic (tissue-specific) non-allelic recombination between homologous repetitive elements contributes to human diseases. Centromeres and cancer-associated genes are enriched for retroelements that may act as recombination hotspots [78]. Retroelement recombination may lead to genomic instability, structural variants, and segmental duplications [35,78–82]. Widespread somatic recombination of L1 and Alu elements may serve as a potential mutagen in the genome [78,83]. The abundance of L1 and Alu elements at SV breakpoints in our patient may suggest the involvement of active and inactive retrotransposons in the chromoanagenesis event. Non-allelic recombination between homologous repetitive elements involving the chromosome 7 centromere and cancer-associated genes may play a role in the formation of this complex derivative chromosome 7. Further studies of SV breakpoint junctions involved in AML chromoanagenesis cases will be necessary to elucidate the role of these endogenous mutagens in chromoanagenesis formation.
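Classifying breakpoints as exonic, intronic, or intergenic, and flagging those that fall within repeat elements, amounts to an interval-overlap lookup. The Python sketch below illustrates the idea under stated assumptions: all coordinates, the gene name, and the choice of repeat labels are hypothetical placeholders, intervals are 0-based and half-open, and this is not the annotation workflow used in the study.

```python
from collections import Counter
from typing import NamedTuple

class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    name: str

def overlapping(chrom, pos, intervals):
    """Return the first interval containing (chrom, pos), or None."""
    for iv in intervals:
        if iv.chrom == chrom and iv.start <= pos < iv.end:
            return iv
    return None

def classify_breakpoint(chrom, pos, genes, exons, repeats):
    """Return (region, repeat_name) for one breakpoint position."""
    if overlapping(chrom, pos, genes) is None:
        region = "intergenic"
    elif overlapping(chrom, pos, exons) is not None:
        region = "exonic"
    else:
        region = "intronic"  # inside a gene but outside every exon
    repeat = overlapping(chrom, pos, repeats)
    return region, (repeat.name if repeat else None)

# Hypothetical toy annotation (placeholder values, not patient data).
genes = [Interval("chr7", 1000, 5000, "GENE_A")]
exons = [Interval("chr7", 1000, 1200, "GENE_A:exon1"),
         Interval("chr7", 4800, 5000, "GENE_A:exon2")]
repeats = [Interval("chr7", 2000, 2300, "AluY"),
           Interval("chr3", 100, 6000, "L1PA2")]
breakpoints = [("chr7", 2100), ("chr7", 1100), ("chr3", 500), ("chr7", 9000)]

results = [classify_breakpoint(c, p, genes, exons, repeats)
           for c, p in breakpoints]
region_counts = Counter(region for region, _ in results)
print(results)
print(dict(region_counts))
```

A linear scan is sufficient for a few hundred breakpoints; for genome-scale annotation, an interval tree or a dedicated tool would be the more usual choice.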
Our combination approach serves to characterize the mechanism of this chromoanagenesis event. While this approach is beneficial, it has some drawbacks, one of which is the accuracy of long-read sequencing. Long reads improve our understanding of the human genome and give access to regions that are unreadable by short-read NGS, but their accuracy, cost, and efficiency limit their use. Compared to short-read NGS, the accuracy of long reads is low and variable in certain situations [84–87]. This inconsistency in accuracy creates challenges for gene annotation and the complete understanding of a genome [88]. Polishing tools improve the accuracy, although most of them require a reference and, in some cases, depend on short-read sequences from the same individual [89,90]. This leads to another challenge of long reads: their cost and efficiency. Compared to WGS through short-read sequencing, WGS through long-read sequencing is overall expensive and time-consuming, and in some cases it could take weeks to obtain results [84,85,91]. Therefore, more advanced analysis software for long-read sequencing is needed to provide fast personalized oncogenomics in a single sequencing assay (such as nanopore sequencing of native DNA without PCR amplification) to detect large, complex SVs, CNVs, and potential epigenetic modifications via genomic phasing using haplotype-specific methylation calls [92].
In this proof-of-principle study, we demonstrated the feasibility of this integrated approach in an AML patient carrying three chromoanagenesis events. Given the rarity of chromoanagenesis in hematological malignancies and the lack of well-characterized chromoanagenesis events in commonly available cell lines or accessible specimens of cancer patients, the major limitation of this study is that it includes a single AML case. While this study identified SVs and CNVs of chromoanagenesis that may be useful for further classifying distinct subtypes in myeloid malignancies, comprehensive studies of the different myeloid subtypes' chromoanagenesis, SVs, CNVs, and molecular profiles and their impact on disease outcomes are needed to inform clinical decision making. Future studies that accumulate more well-characterized chromoanagenesis cases from multiple centers, obtain comprehensive clinical data, and follow various treatment strategies in patients with myeloid malignancies will shed light on treatment response rates, survival rates, and overall prognosis of these patients.

Conclusions
To our knowledge, our case is the first with complex chromoanagenesis involving chromosomes 2, 3, and 7 along with a pseudotricentric chromosome 7 and amplification of the chromosome 7 centromere. This report emphasizes the value of performing an integrated approach including long-read nanopore sequencing, MPseq, and cytogenomic methods to characterize complex structural rearrangements in AML. The long nanopore reads not only determined simple structural abnormalities but also enabled us to resolve the long-range structure of the complex chromoanagenesis. CRISPR/Cas9-mediated targeted nanopore sequencing of our patient's cancer genome detected breakpoints of complex structural chromosome abnormalities with high sensitivity. The characterization of these complex structural chromosome abnormalities not only will help understand the molecular mechanisms responsible for the formation and development of chromoanagenesis, but also may identify specific molecular targets and their impact on therapy and overall survival. This combination approach in the characterization of chromoanagenesis and other structural abnormalities may be useful for both clinical and research applications.
Author Contributions: Y.S.Z., V.S. and L.J. conceptualized the study and designed the research. L.M., Y.S.Z., V.S., N.L.H. and K.E.P. performed the research. M.P., B.P., K.S. and M.A.G. analyzed the data. M.P., B.P. and M.A.G. wrote the paper. M.P., M.A.G., V.S., L.M., N.L.H., K.E.P., K.S., B.P., L.J. and Y.S.Z. contributed to the scientific discussion, data interpretation, and paper revision. All authors have read and agreed to the published version of the manuscript.

Funding:
The Johns Hopkins Cytogenomics Laboratory is an academic laboratory supported by the Johns Hopkins School of Medicine Department of Pathology.

Institutional Review Board Statement:
The study was conducted according to the guidelines of the Declaration of Helsinki. The institutional review board of Johns Hopkins Medicine (protocol code IRB #NA_00002948 on 14 September 2013) approved this study.

Informed Consent Statement:
The Johns Hopkins Medicine institutional review board (IRB) approved this HIPAA-compliant study with a waiver of consent. All data (including bone marrow specimens) were provided to us without patient identification.

Figure 1. Cytogenetic data. (A) Karyogram. Red arrows point to an abnormal derivative chromosome 7, and white arrows point to other numerical and structural abnormalities. (B) Interphase FISH revealed amplification of chromosome 7 centromeres (in green, indicated by red arrows) and deletion of 7q31 (in red). (C) Metaphase FISH. The derivative chromosome 7 (red arrow) shows multiple signals and amplification of the green centromere signal. The green arrow points to a normal chromosome 7. The inset on the right shows the make-up of the derivative chromosome 7 based on conventional chromosome analysis and FISH data. The derivative chromosome 7 was pseudotricentric and showed centromere amplification.

Figure 2. Genome-wide SNP microarray revealed chromoanagenesis regions on chromosomes 3 and 7 (red circles) and loss of 7q (7q21.11−7q26.3). Chromosome 2 is normal except for two small losses and one gain. Blue dots show genotype (B allele frequency), and red lines show copy number based on probe intensities.

Figure 3. Mate pair sequencing revealed complex rearrangement involving chromosomes 2, 3, and 7. Only structural variants involving these three chromosomes are shown in the genome plot (black lines), and breakpoints are shown by solid light green circles. Red arrows point to chromosomes 2, 3, and 7.

Figure 4. Summary of various structural variant breakpoints revealed by CRISPR/Cas9-mediated nanopore sequencing, MPseq, and SNP microarray. From left to right, SNP microarray revealed three chromoanagenesis regions: two on chromosome 3q and one on 7q (shown by red circles). Mate pair sequencing data is in a solid red circle, CRISPR/Cas9 nanopore sequencing data