Peculiar k-mer Spectra Are Correlated with 3D Contact Frequencies and Breakpoint Regions in the Human Genome
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Sources
2.2. k-mer Analysis
2.3. Local k-mer Analysis
2.4. Deviation from Average Spectra for Word Sets
2.5. Regions Deviating from Average Spectra (ReDFAS)
2.6. Significance of Correlations
2.7. Principal Component Analysis (PCA)
3. Results
3.1. Translocated Regions on Chromosome 9 Are Visible in Hi-C Data
3.2. Local Deviations from the Average k-mer Spectrum
3.3. Classification of ReDFAS
3.4. Relationship between k-mer Spectrum Deviation and 3D Chromatin Organization
3.5. Relationship between ReDFAS, Breakpoint Regions, and NPCs, PCs, CDSs, ALUs, and L1s
4. Discussion
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

Figure legend (marker symbols lost in extraction): the markers distinguish the A-rich, C-rich, and GC-rich word groups.

| Word Group | Corresponding 5-mers | 
|---|---|
| A-rich | AAAAN, AAANA, AANAA, ANAAA, NAAAA | 
| C-rich | CCCCN, CCCNC, CCNCC, CNCCC, NCCCC | 
| G-rich | GGGGN, GGGNG, GGNGG, GNGGG, NGGGG | 
| T-rich | TTTTN, TTTNT, TTNTT, TNTTT, NTTTT | 
| AT-rich | WWWWW | 
| GC-rich | SSSSS | 
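
The patterns in the word-group table use IUPAC ambiguity codes (N = any base, W = A or T, S = C or G). As a minimal sketch of how such patterns expand into concrete 5-mers, the following Python snippet enumerates each group. The names `IUPAC`, `WORD_GROUPS`, `expand`, and `expand_group` are illustrative assumptions and are not taken from the authors' code.

```python
from itertools import product

# IUPAC nucleotide codes needed for the patterns in the table.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT",    # weak: A or T
    "S": "CG",    # strong: C or G
    "N": "ACGT",  # any base
}

# Patterns copied from the word-group table.
WORD_GROUPS = {
    "A-rich": ["AAAAN", "AAANA", "AANAA", "ANAAA", "NAAAA"],
    "C-rich": ["CCCCN", "CCCNC", "CCNCC", "CNCCC", "NCCCC"],
    "G-rich": ["GGGGN", "GGGNG", "GGNGG", "GNGGG", "NGGGG"],
    "T-rich": ["TTTTN", "TTTNT", "TTNTT", "TNTTT", "NTTTT"],
    "AT-rich": ["WWWWW"],
    "GC-rich": ["SSSSS"],
}


def expand(pattern):
    """Return the set of concrete words matched by one IUPAC pattern."""
    choices = (IUPAC[symbol] for symbol in pattern)
    return {"".join(word) for word in product(*choices)}


def expand_group(patterns):
    """Return the union of all words matched by a list of patterns."""
    words = set()
    for pattern in patterns:
        words |= expand(pattern)
    return words


# Map each group name to its set of concrete 5-mers.
word_sets = {name: expand_group(pats) for name, pats in WORD_GROUPS.items()}

for name, words in word_sets.items():
    print(name, len(words))
```

Using sets means a word matched by several patterns of the same group (for example `AAAAA`) is counted once.
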

| Chromosome | ReDFAS coverage | BP in ReDFAS | NPC coverage | BP in NPC | CDS coverage | BP in CDS | Alu coverage | BP in Alu | L1 coverage | BP in L1 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 4% | 9% | 12% | 6% | 1% | 38% | 11% | 3% | 14% | 1% | 
| 2 | 4% | 8% | 17% | 8% | 1% | 39% | 9% | 4% | 18% | 1% | 
| 3 | 4% | 12% | 15% | 7% | 1% | 37% | 9% | 3% | 18% | 1% | 
| 4 | 4% | 14% | 16% | 7% | 1% | 35% | 7% | 3% | 19% | 1% | 
| 5 | 4% | 11% | 16% | 7% | 1% | 36% | 8% | 3% | 19% | 1% | 
| 6 | 4% | 11% | 16% | 8% | 1% | 35% | 9% | 4% | 18% | 1% | 
| 7 | 4% | 7% | 13% | 7% | 1% | 38% | 11% | 3% | 17% | 1% | 
| 8 | 4% | 9% | 18% | 7% | 1% | 39% | 9% | 2% | 18% | 0% | 
| 9 | 4% | 10% | 13% | 4% | 1% | 40% | 9% | 3% | 15% | 1% | 
| 10 | 4% | 7% | 13% | 7% | 1% | 40% | 11% | 3% | 16% | 1% | 
| 11 | 4% | 9% | 13% | 10% | 2% | 37% | 9% | 3% | 17% | 1% | 
| 12 | 4% | 7% | 14% | 6% | 1% | 38% | 11% | 3% | 16% | 0% | 
| 13 | 4% | 12% | 14% | 8% | 1% | 38% | 7% | 2% | 15% | 1% | 
| 14 | 4% | 9% | 15% | 11% | 1% | 41% | 9% | 4% | 14% | 0% | 
| 15 | 3% | 6% | 13% | 8% | 1% | 38% | 10% | 4% | 14% | 0% | 
| 16 | 4% | 6% | 11% | 4% | 2% | 42% | 14% | 3% | 11% | 0% | 
| 17 | 2% | 0% | 11% | 6% | 2% | 40% | 18% | 2% | 10% | 0% | 
| 18 | 2% | 0% | 12% | 4% | 1% | 37% | 8% | 2% | 16% | 1% | 
| 19 | 3% | 0% | 9% | 5% | 4% | 40% | 25% | 2% | 10% | 0% | 
| 20 | 2% | 1% | 12% | 6% | 1% | 38% | 12% | 5% | 14% | 1% | 
| 21 | 2% | 1% | 15% | 4% | 1% | 32% | 7% | 1% | 13% | 0% | 
| 22 | 2% | 0% | 10% | 7% | 2% | 36% | 22% | 2% | 8% | 0% | 
| X | 3% | 12% | 8% | 5% | 1% | 35% | 8% | 3% | 29% | 1% | 
| Y | 4% | 33% | 3% | 13% | 0% | 33% | 4% | 10% | 11% | 0% | 

| Enrichment Tested | Difference [σ] |
|---|---|
| BP in ReDFAS | 18 | 
| BP in PC | 84 | 
| BP in NPC | −15 | 
| BP in CDS | 261 | 
| BP in Alu | −17 | 
| BP in L1 | −32 | 
| PC in ReDFAS | 13 | 
| NPC in ReDFAS | 5.1 | 
| CDS in ReDFAS | 76 | 
| Alu in ReDFAS | −2.2 | 
| L1 in ReDFAS | −23 | 
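
The table above reports enrichment as a difference in units of σ, but this excerpt does not state the underlying null model. As a hedged illustration only, the sketch below computes such a difference under the assumption of a simple binomial null, in which each breakpoint falls into a region class with probability equal to that class's genomic coverage. The function name `enrichment_sigma` and the example counts are hypothetical and are not taken from the source.

```python
import math


def enrichment_sigma(n_total, n_inside, coverage):
    """Difference between observed and expected counts in units of sigma.

    Assumes a binomial null model (an assumption, not the source's
    stated method): each of n_total items lands inside the region
    class independently with probability `coverage`.
    """
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must lie strictly between 0 and 1")
    expected = n_total * coverage
    sigma = math.sqrt(n_total * coverage * (1.0 - coverage))
    return (n_inside - expected) / sigma


# Hypothetical numbers: 1000 breakpoints, 4% coverage, 90 observed inside.
z = enrichment_sigma(n_total=1000, n_inside=90, coverage=0.04)
print(f"difference: {z:.2f} sigma")
```

A positive value indicates enrichment relative to the null expectation and a negative value indicates depletion. A permutation-based null (for example, randomly relocating regions) would be an alternative when the independence assumption is doubtful.
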

| Correlated Map | Empirical Correlation | Reference Correlation |
|---|---|---|
| ALU | 0.17 ± 0.11 | 0.00 ± 0.09 | 
| L1 | −0.11 ± 0.11 | 0.01 ± 0.09 | 

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Hikmat, W.M.; Sievers, A.; Hausmann, M.; Hildenbrand, G. Peculiar k-mer Spectra Are Correlated with 3D Contact Frequencies and Breakpoint Regions in the Human Genome. Genes 2024, 15, 1247. https://doi.org/10.3390/genes15101247