Content of and interactions between repetitive elements in programmatically eliminated chromosomes of the sea lamprey (Petromyzon marinus)

The sea lamprey (Petromyzon marinus) is one of few vertebrate species known to reproducibly eliminate large fractions of its genome during normal embryonic development. In lamprey, elimination events are initiated at the 6th embryonic cleavage and result in the loss of ∼20% of an embryo’s genomic DNA from essentially all somatic cell lineages (these same sequences are retained in the germline). This germline-specific DNA is lost in the form of large fragments, including entire chromosomes, and available evidence suggests that DNA elimination acts as a permanent silencing mechanism that prevents the somatic expression of a specific subset of “germline” genes. However, reconstruction of eliminated regions has proven challenging due to the complexity of the lamprey karyotype (84 small pairs of somatic chromosomes and ∼100 pairs of germline chromosomes), the exceedingly high repeat content of the genome, and the even higher repeat content of eliminated fragments. We applied an integrative approach aimed at further characterizing the large-scale structure of eliminated segments, including: 1) in silico identification of germline-enriched repeats; 2) mapping the chromosomal location of specific repetitive sequences in germline metaphases; and 3) verification of repeat specificity to eliminated chromosomes by 3D DNA/DNA hybridization to embryonic lagging anaphases. This approach resulted in the discovery of multiple highly abundant repetitive elements that are found exclusively on the eliminated (germline-specific) chromosomes, which in turn permitted the identification of 12 individual chromosomes that are programmatically eliminated during early embryogenesis.
The fidelity of germline-specific repetitive elements and their distinctive patterning in elimination anaphases are taken as evidence that these sequences might contribute to the specific targeting of chromosomes for elimination and possibly participate in molecular interactions that mediate their decelerated poleward movement in chromosome elimination anaphases, their isolation from the primary nuclei, and their eventual degradation.

AUTHOR SUMMARY
Epigenetic silencing methods provide a means of precisely restricting gene expression while maintaining the integrity of the genomic template that encodes this information, and are employed by diverse species throughout the tree of life. Programmed genome rearrangement (PGR) represents a parallel approach that maintains genome integrity across generations but alters the genomes of cells within an organism. To better resolve elimination events that take place during PGR in the sea lamprey (one of few vertebrate species known to undergo large-scale PGR), we sought to identify sequences that define specific eliminated chromosomes. Using computational predictions and cytogenetic validation, we identified six new repetitive elements that are restricted to the eliminated chromosomes and permit the identification of twelve distinct eliminated chromosomes. Analysis of these repeats in meiotic testes and in embryos sampled during the process of elimination shows that these repeats localize to specific subcellular regions, and suggests a potential role of these repetitive elements in targeting chromosomes for silencing via elimination.

protein coding genes with presumptively critical functions and numerous high copy elements raises the question of whether repetitive elements themselves are truly junk targeted for elimination, passive passengers that are simply being carried along for the ride, or perhaps functionally relevant sequences that actively participate in the process of elimination.

In the present work, we focused on cytogenetically recognizable aspects of PGR in the sea lamprey. However, the complex morphology of germline metaphase spreads and the presence of numerous small chromosomes have thus far prevented a precise description of the germline karyotype. Moreover, the degree to which differences in chromosome number arise from wholesale loss of chromosomes vs. breakage and joining of remodeled segments has remained unclear.

To further characterize the content and distribution of germline-specific repetitive DNA in the sea lamprey, we performed computational analyses to identify several candidate repetitive elements that appeared to be highly enriched in the germline, and cytogenetic analyses to more precisely define their locations both within the germline-specific chromosomes during meiosis and within structured lagging chromatin that arises during the execution of PGR. These analyses provide a means of individually identifying chromosomes that are targeted by PGR, provide strong evidence that most (if not all) elimination events are achieved by the wholesale elimination of chromosomes, and identify elements that mark specific subregions of elimination anaphases. Notably, these repetitive elements include one highly specific and abundant element (Germ2) that appears to mark a persistent zone of interaction between lagging anaphase chromatids that spans their original metaphase plane.

Comparative hybridization indicates the presence of germline-specific repeats
To assess whether eliminated chromatin was likely to be enriched with germline-specific repeats, we performed comparative hybridization of repetitive DNAs (C0t2 fractions) from somatic (liver) and germline (testes) genomic DNA within intact embryos [19,35]. Hybridizations were carried out on embryos that were actively undergoing elimination at the time of fixation (1.5-2 days post fertilization, dpf). Fluorescence intensity was measured using unprocessed images of cells containing micronuclei (Fig 1, A), and the mean integrated fluorescence density of primary nuclei was used as background fluorescence relative to micronuclei. Micronuclei were found to be highly enriched in germline-derived repetitive DNA (p<0.0001, DF = 32) (Fig 1, B). A similar enrichment was also observed in the analysis of ~30 eliminating anaphases: Cy3-labeled germline repeats exhibited substantially higher fluorescence intensity within lagging chromatin than FITC-labeled somatic repeats (Fig 1, C and S1 File). In eliminated chromatin, somatically retained repeats were observed to hybridize primarily with pericentromeric and peritelomeric regions, forming dot-shaped signals, and were largely absent from the internal regions of lagging chromosomes that hybridize with germline repetitive DNA (S1 File). This is interpreted as evidence that germline-specific and somatically retained chromosomes share a similar complement of pericentromeric repeats and, in conjunction with the observation that the centromeres of eliminated chromosomes exhibit poleward motion during elimination anaphases, indicates that both eliminated and retained centromeres retain a capacity to form functional kinetochores.

Previous studies have shown that a majority of micronuclei in 1-3 dpf sea lamprey embryos contain the germline-enriched repeat Germ1 [2]. In order to determine whether germline C0t2 hybridization patterns could be explained simply by the presence of Germ1, we co-hybridized a Germ1 specific probe with labeled C0t1 DNA to elimination anaphases (Fig 1, C). Owing to its highly repetitive   Table).

All predicted high-copy elements with enrichment scores [log2(standardized sperm coverage/blood coverage)] exceeding 5 and an estimated span exceeding 40 kb, when summed across all copies, were extracted for downstream analysis (S2 Table). Subsequent inspection of these 171 predicted elements revealed similarities among subgroups of repeats, and semi-automated clustering revealed that these high-copy repeats could be grouped into 20 distinct clusters, i.e. repeat families. The representatives of 6 clusters, with a combined span size of more than 500 kb, were designated Germ2-7 (Fig 2 and S2 Table).
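The selection criteria described above can be illustrated with a minimal sketch. It assumes that each predicted element already has a standardized sperm coverage, a blood coverage, and a summed span; the `Repeat` record, its field names, and the handling of zero coverage are illustrative assumptions and are not taken from the analysis pipeline itself.

```python
import math
from dataclasses import dataclass


@dataclass
class Repeat:
    # Hypothetical record; field names are assumptions for illustration.
    name: str
    sperm_cov: float  # standardized sperm (germline) coverage
    blood_cov: float  # blood (somatic) coverage
    span_bp: int      # estimated span summed across all copies, in bp


def enrichment_score(r: Repeat) -> float:
    """log2(standardized sperm coverage / blood coverage)."""
    if r.sperm_cov == 0:
        return -math.inf  # assumption: no germline signal means no enrichment
    if r.blood_cov == 0:
        return math.inf   # assumption: no somatic coverage counts as maximal enrichment
    return math.log2(r.sperm_cov / r.blood_cov)


def select_candidates(repeats, min_score=5.0, min_span_bp=40_000):
    """Keep elements whose score exceeds 5 and whose summed span exceeds 40 kb."""
    return [
        r for r in repeats
        if enrichment_score(r) > min_score and r.span_bp > min_span_bp
    ]


if __name__ == "__main__":
    # Made-up values, used only to exercise the filter.
    demo = [
        Repeat("elem_a", 64.0, 1.0, 50_000),
        Repeat("elem_b", 16.0, 1.0, 50_000),
    ]
    print([r.name for r in select_candidates(demo)])
```

An enrichment score above 5 corresponds to more than a 32-fold excess of standardized sperm coverage over blood coverage, so the filter retains only elements that are nearly absent from somatic DNA.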

Examination of the sequences of Germ2-7 and genomic scaffolds (PIZI00000000.1) containing these repeats revealed that all of these high-copy germline-specific repeats occur as tandem arrays. Each repetitive element appears to contain a short (13-57 bp), somewhat conserved core sequence (S3 Table). Tandem arrays of these core sequences are frequently disrupted by small insertions or deletions and, at larger scales, cassettes of tandem repeats are further duplicated as inverted repeats.
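As a rough illustration of what a tandem array of a core sequence looks like computationally, the sketch below finds the longest uninterrupted run of exact copies of a core motif in a scaffold sequence. The function name and the example motif are hypothetical and are not the cores reported in S3 Table. Because only exact copies are matched, a small insertion or deletion ends a run, which mirrors the disrupted arrays described above.

```python
import re


def longest_tandem_run(sequence: str, core: str):
    """Return (start, copies) for the longest run of exact, back-to-back
    copies of `core` in `sequence`; (-1, 0) if the core never occurs."""
    pattern = re.compile(f"(?:{re.escape(core)})+")
    best = (-1, 0)
    for match in pattern.finditer(sequence):
        copies = (match.end() - match.start()) // len(core)
        if copies > best[1]:
            best = (match.start(), copies)
    return best


if __name__ == "__main__":
    core = "ACGTG"  # hypothetical core motif, for illustration only
    # Two arrays of the core separated by a single inserted base.
    scaffold = "TT" + core * 3 + "A" + core * 5 + "CC"
    print(longest_tandem_run(scaffold, core))
```

A tolerant search (for example, allowing a fixed number of mismatches per copy) would be needed to trace arrays across indels; that extension is left out here to keep the sketch short.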

Arbitrarily chosen representatives of each cluster were selected for primer design for PCR validation. Amplicons generated from these primers yielded a continuous range of fragment sizes (smear), as would be expected for primers designed within tandem repeats, and showed relative specificity to the germline (S2 File). These same amplicons were used to generate probes (Germ2-7) for subsequent FISH analyses.

The lamprey karyotype is characterized by a large number of small chromosomes, which presents significant challenges in the identification and characterization of individual homologs. Because repeats were predicted to be highly enriched in germline, we performed hybridizations

In an attempt to uniquely identify individual germline-specific chromosomes, our six new computationally predicted germline-specific repetitive elements (Germ2-7) and Germ1 were hybridized to metaphase chromosomes in three successive rounds of hybridization, which allowed us to localize all seven elements and the germline LC probe on the same set of meiotic metaphases. We analyzed at least 40 meiotic metaphase spreads and found that all 6 of the predicted germline-specific repeats hybridized exclusively to chromosomes that were marked by the germline LC probe (Fig 4). These hybridization patterns allowed us to verify that these elements are restricted to eliminated chromosomes and provided a means of individually identifying all 12 eliminated chromosomes.

These hybridizations were used, in conjunction with reverse DAPI staining patterns, to develop an idiogram for all germline-restricted chromosomes and a map of the locations of Germ1-7 repeats on each chromosome (Fig 5). Remarkably, three repeats (Germ1, 2, and 6) were found to be present on each of the 12 eliminated chromosomes, albeit with distinct distributions. Germ1 is present as dense signals that are located adjacent to the pericentromeric regions of all 12 eliminated chromosomes. This pattern contrasts with those of Germ2, which is typically located closer to the telomere, and Germ6, which generally shows a more diffuse pattern across the length of chromosomal arms. Unlike Germ1, the elements Germ2 and Germ6 do not cross-hybridize with somatic chromosomes. The other four germline-specific repeats vary more broadly in their distributions across chromosomes, and individual elements appear to be completely absent from one or more chromosomes (Fig 4).

Over the course of examining metaphase spreads it was noted that germline-specific chromosomes were generally clustered within meiotic spreads. Closer examination of these clusters revealed that groups of germline-specific chromosomes frequently contacted one another near their telomeres, forming structures reminiscent of meiotic chains (Fig 6).

Chromosomal Localization of Germline-Restricted Repeats in Elimination Anaphases
In order to directly resolve the spatial organization of germline-specific repeats during elimination, we performed 3D in situ hybridizations within fixed and PACT-cleared embryos (Fig 7). As expected from analyses of meiotic spreads, Germ2-7 signals were visible only in lagging (eliminated) chromatin and were absent from retained somatic chromosomes (Figs 4 and 7). Each of these repeats marks a distinct subregion of eliminated chromatin that corresponds to its relative position

Germ2 and Germ4 elements were frequently found localized to the midline of lagging anaphases (Fig 7 A, B), suggesting the possibility that these repeats may mark a domain of interaction between the telomeres or subtelomeric regions of several chromosomes that possess dense Germ2 and Germ4 domains near their distal telomeres (Fig 5). To examine whether stretched lagging chromosomes retain intact telomeres on both arms, we carried out FISH using PNA (Peptide Nucleic Acid) probes to the vertebrate telomere repeat consensus sequence. These hybridizations reveal that stretched lagging chromatin possesses telomere signals on both ends, consistent with the idea that eliminated DNA is composed largely of entire germline-specific chromosomes (Fig 8 and S5 File). Moreover, it appears that poleward-oriented (centromeric) signals are significantly larger and brighter than their equatorially oriented counterparts. It is tempting to speculate that lagging chromosomes are establishing telomeric contacts due to telomere shortening and the formation of adhesive ends. However, two observations speak against this simplistic interpretation. First, similarly variable hybridization intensities are also observed in meiotic and mitotic metaphases (i.e. non-eliminating divisions; S3 File, S4 File E-I). Second, medially oriented telomere signals often appear as distinct pairs of signals within regions that are overlain by broader hybridization signals from Germ2 (Fig 8 and S5 File F). These observations suggest that interactions between lagging sister chromosomes may involve repetitive (e.g. Germ2, 4) or other sequences near the telomere ends, and perhaps not telomere fusion per se.

The identification of germline-specific repetitive elements and localization of these elements to

Clustering of 171 highly abundant and germline-specific sequences was performed using CD-  Table).

A total of 1.5 µg testes C0t2 DNA was labeled with Cy3-dUTP (Enzo) by nick-translation in a final volume of 50 µl; the same amount of liver C0t2 was labeled with fluorescein-12-dUTP (ThermoFisher). After labeling, probes were precipitated by adjusting the solution to 70% ethanol in the presence of 20 µg single-stranded sheared salmon sperm DNA (Sigma-Aldrich), followed by centrifugation at 14,100 × g. The resulting pellet was air dried and resuspended in 50 µl of