Non-Random Distribution of Reciprocal Translocation Breakpoints in the Pig Genome

Balanced chromosome rearrangements are one of the main etiological factors contributing to hypoprolificacy in the domestic pig. Amongst domestic animals, the pig is considered to have the highest prevalence of chromosome rearrangements. To date over 200 unique chromosome rearrangements have been identified. The factors predisposing pigs to chromosome rearrangements, however, remain poorly understood. Here we provide empirical evidence supporting the notion that there is a non-random distribution of chromosomal rearrangement breakpoints in the pig genome. We sought to establish whether there are structural chromosome factors near which rearrangement breakpoints preferentially occur. The distribution of rearrangement breakpoints was analyzed across three levels: chromosomes, chromosome arms, and cytogenetic GTG-bands (G-banding using trypsin and giemsa). The frequency of illegitimate exchanges (e.g., reciprocal translocations) between individual chromosomes and chromosome arms appeared to be independent of chromosome length and centromere position. Meanwhile, chromosome breakpoints were overrepresented on some specific G-bands, defining chromosome hotspots for ectopic exchanges. Cytogenetic band level factors, such as the length of bands, chromatin density, and presence of fragile sites, were associated with the presence of translocation breakpoints. The characteristics of these bands were largely similar to those of hotspots in the human genome. Therefore, those hotspots are proposed as a starting point for future molecular analyses into the genomic landscape of porcine chromosome rearrangements.


Introduction
Chromosome rearrangements are known to be one of the main etiological factors contributing to hypoprolificacy in domestic species, especially the domestic pig [1][2][3]. To date over 200 distinct chromosome rearrangements have been identified in the domestic pig. The vast majority are balanced reciprocal translocations, making up over 90% of described rearrangements [4]. It is estimated that chromosome rearrangements in domestic pigs occur spontaneously in one in 200 live births [5,6]. The prevalence of chromosome rearrangements among swine herds is thought to be between 0.5% and 1.5%, dependent on the intensity of cytogenetic screening within these populations [3,5,6]. Although reciprocal translocations are quite prevalent throughout swine herds, the genetic and genomic reasons for this are poorly understood.
Reciprocal translocations are rearrangements involving two non-homologous chromosomes which break simultaneously, and subsequently misrepair, resulting in an exchange of chromosome segments. Research on the human genome has suggested that translocation breakpoints occur nonrandomly for each individual chromosome pair, with specific chromosomes and cytogenetic landmarks being particularly susceptible to breakage. Various chromosomal features recognizable in a karyotype, such as the total length of chromosomes and chromosome arms, chromosome morphology, chromatin density (heterochromatic and euchromatic), and the presence of common fragile sites, have all been suggested to influence the frequency at which chromosome regions rearrange [7][8][9][10].
Despite the considerable number of chromosome rearrangements identified in the domestic pig, the characteristics of these rearrangements are largely unknown. Some work suggests that there is a non-random distribution of translocation breakpoints across chromosomes in the pig genome [1,11]. Amongst the first five reciprocal translocations identified in pigs, two breakpoints appeared on chromosome 6, and three breakpoints were found on chromosome 14 [11][12][13][14]. The presence of multiple breakpoints in close proximity on chromosome 14 led to the suggestion that regions of fragility promoting chromosome breakage may be present in the pig karyotype, and that there may be a non-random distribution of translocation breakpoints [14,15]. In addition, many breakpoints are known to overlap with common fragile sites, regions of the chromosomes which are susceptible to breakage under exposure to specific chemical stressors [16].
Although a handful of chromosomes have been examined in some detail, no comprehensive analysis of translocation breakpoints across the pig karyotype has ever been conducted. The identification of chromosome rearrangements in swine herds has increased in the past two decades with the continuation of large screening programs at the National Veterinary School of Toulouse in France, and at the University of Guelph in Canada. With over 190 unique reciprocal chromosome translocations identified in the domestic pig, it is now easier to observe patterns in the number of breakpoints on chromosomes, chromosome arms, and cytogenetic landmarks.
Using 195 reciprocal chromosome rearrangements identified in our lab and those reported in the literature, we performed a comprehensive analysis of the translocation breakpoints at the whole chromosome and cytogenetic band levels. Our observations of rearrangement breakpoints add empirical evidence that breakpoints are nonrandomly distributed across chromosomes and cytogenetic bands.

Cytogenetic Screening Analysis of Pig Populations
Peripheral blood samples were routinely collected from 5802 reproductively unproven young boars raised at various Canadian farms by experienced farm workers or Canadian Food Inspection Agency veterinarians according to the Canadian Council on Animal Care and the University of Guelph's Animal Care Committee guidelines. These animals were from commercial herds, in good general health. The samples were submitted to the Animal Health Laboratory of the University of Guelph for commercial genetic screening. Data from this analysis was provided for use in this study. Lymphocyte cultures were set up according to the standard cytogenetic protocols of our laboratory, as previously published [6,17]. Twenty-five metaphases were captured from each animal and a minimum of two optimal quality GTG (G-banding using trypsin and giemsa)-banded karyotypes were arranged at the level of 400 bands resolution [18]. Following this conventional procedure, we identified 29 constitutional reciprocal chromosome translocations (Table S1).

Selection and Analysis of Reciprocal Chromosome Translocations Published in the Literature
A comprehensive list of G-banded reciprocal chromosome translocations was compiled by adding the 29 reciprocal chromosome translocations identified in our own lab to an updated version of a previously published list of G-banded reciprocal chromosome translocations [4]. This resulted in a total of 195 unique rearrangements (Table S1). Breakpoints on sex chromosomes were not included in the analysis due to the rarity of such rearrangements relative to autosomal chromosomes.

Definition of Chromosome Parameters
The physical length in megabases (Mb) for each chromosome was obtained from the Sus scrofa 11.1 genome assembly (https://www.ncbi.nlm.nih.gov/genome/84). Using the physical lengths of chromosomes as a basis, the lengths of cytogenetic bands were estimated as follows. The standard GTG-banded ideogram and chromosome landmarks of the domestic pig karyotype were used as a reference to measure and calculate the fractional length of each cytogenetic band per chromosome [18] (Figure S1). The physical length of each band, and its start and stop points, were calculated by multiplying the fractional length by the physical length of the chromosome (Table S2). This resulted in a conversion map between cytogenetic bands and their physical lengths.
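The band-to-position conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function name `band_positions`, the band labels, the fractional lengths, and the 100 Mb chromosome length are all hypothetical values chosen for the example, and bands are assumed to be listed in order from the p-arm terminus to the q-arm terminus.

```python
def band_positions(chrom_length_mb, fractional_lengths):
    """Convert fractional band lengths into (band, start_mb, stop_mb) tuples.

    fractional_lengths: list of (band_name, fraction) pairs, ordered along
    the chromosome; fractions are assumed to sum to 1.
    """
    positions = []
    start = 0.0
    for band, fraction in fractional_lengths:
        length = chrom_length_mb * fraction  # fractional length x chromosome length
        positions.append((band, start, start + length))
        start += length
    return positions


# Hypothetical 100 Mb chromosome with four bands (values invented for illustration).
example = band_positions(
    100.0,
    [("p12", 0.10), ("p11", 0.15), ("q11", 0.30), ("q12", 0.45)],
)
for band, start, stop in example:
    print(f"{band}: {start:.1f}-{stop:.1f} Mb")
```

Each band's stop point becomes the next band's start point, so the map covers the chromosome without gaps as long as the fractions sum to one.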
These measurements were verified by selecting 25 bacterial artificial chromosome (BAC) clones from the literature that were FISH mapped to cytogenetic bands, as well as physically mapped to exact genetic loci in the genome (Table S3). These cytogenetic bands were then converted by our map to physical positions and compared to the established DNA positions. Of the 25 cytogenetically mapped probes, 20 fell within the estimated physical positions for their respective cytogenetic bands, and the remaining five fell within an adjacent band. Therefore, it was assumed that the method for estimating pig cytogenetic band lengths is sufficient for rough estimations (within approximately 3 million base pairs).
We defined the translocation frequency of each chromosome segment (i.e., whole chromosome, chromosome arm, cytogenetic band, or group of chromosomes) as the number of translocation breakpoints on that segment divided by the physical length of the segment, giving the number of translocation breakpoints per 1 Mb of chromosome material. The expected number of translocation breakpoints per chromosome segment was calculated by multiplying the total number of breakpoints by the segment's length as a fraction of the total length considered (Table S2).
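As a worked illustration of these definitions, the sketch below computes the translocation frequency, the length-proportional expected breakpoint count, and the chi-square statistic used to compare observed with expected counts. The segment names, lengths, and breakpoint counts are hypothetical, and the expected count assumes that breakpoints are distributed in proportion to each segment's share of the total length.

```python
def translocation_frequency(breakpoints, length_mb):
    """Breakpoints per Mb for one chromosome segment."""
    return breakpoints / length_mb


def expected_breakpoints(lengths_mb, total_breakpoints):
    """Expected breakpoints per segment if breakage were proportional to length."""
    total_length = sum(lengths_mb.values())
    return {
        segment: total_breakpoints * length / total_length
        for segment, length in lengths_mb.items()
    }


def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over segments."""
    return sum(
        (observed[segment] - expected[segment]) ** 2 / expected[segment]
        for segment in observed
    )


# Hypothetical segments (lengths in Mb and breakpoint counts invented for illustration).
lengths = {"A": 200.0, "B": 100.0, "C": 100.0}
observed = {"A": 30, "B": 25, "C": 5}

expected = expected_breakpoints(lengths, sum(observed.values()))
frequencies = {s: translocation_frequency(observed[s], lengths[s]) for s in lengths}
statistic = chi_square_statistic(observed, expected)
```

In this toy example segment A breaks exactly in proportion to its length, while B and C of equal length differ fivefold in translocation frequency, which is the kind of contrast the frequency measure is meant to expose.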
The standard GTG-banded karyotype of the domestic pig was used to define each cytogenetic band and its GTG-banding designation [18] (Figure S1). In total there were 267 distinct cytogenetic bands across the 18 autosomal chromosomes. The positions of cytogenetic bands on chromosome arms were defined as proximal, median, and distal according to the position of the band in the top third, middle third, and bottom third of bands, respectively, on each chromosome [18,19]. A list of common fragile sites in the pig genome was used to define which bands had a common fragile site [16] (Table S4).

Statistical Analysis
Statistical analysis was performed in R 3.5.1 [20]. Spearman's rank correlation coefficient was applied in order to determine the presence of an association between two variables. The Chi-square test was applied in order to determine a statistical difference between the observed and expected frequencies of variables such as translocation breakpoint number. The Student's t-test was used in order to determine if the means of two groups were significantly different from one another. One-way analysis of variance (ANOVA) was used to determine if the means of three or more groups were significantly different from one another. The Poisson distribution was used to determine if translocation breakpoints occurred on cytogenetic bands independently of one another.
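For readers who want to see what the rank correlation computes, the following is a minimal pure-Python sketch of Spearman's coefficient (Pearson's correlation applied to average ranks). It is an illustration only: the analysis reported here was run in R, and the helper names `average_ranks`, `pearson`, and `spearman` are hypothetical. It returns the coefficient only, not a p-value.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values starting at i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, the coefficient measures any monotonic association (for example, between segment length and breakpoint number) rather than a strictly linear one.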

The Translocation Frequency of Chromosomes is Independent of Physical Length
A total of 195 unique reciprocal chromosome translocations were considered for this study (Table S1). We mapped all translocation breakpoints to the pig karyotype and summed the number of breakpoints on each chromosome (Table 1). The number of breakpoints per chromosome ranged between 43 on chromosome 1 and 5 on chromosome 18, with longer chromosomes appearing to have more breakpoints than shorter chromosomes. We considered whether there was a correlation between physical chromosome length and breakpoint number. Unsurprisingly, we found that translocation breakpoints preferentially occur on longer chromosomes (r = 0.722, p = 0.0007, Spearman's correlation coefficient, Figure 1a). Chromosomes, however, generally did not break in proportion to their length, with many chromosomes having a significant difference between their observed and expected breakpoint number (χ² = 35.99, p = 0.0046, Chi-square test, Table 1). Many chromosomes had a greater than 10% difference between observed and expected breakpoints, which on average worked out to a difference of five breakpoints. This difference appeared to be independent of the length of the chromosomes.
To better compare chromosomes of different lengths we calculated the translocation frequency (breakpoints per Mb) for each chromosome (Table 1). The translocation frequencies of chromosomes appeared to complement the difference between observed and expected breakpoints, as those chromosomes with more breakpoints than expected had higher translocation frequencies than chromosomes with fewer breakpoints than expected. The translocation frequency of chromosomes appeared to be unrelated to physical length. Comparing the translocation frequency and physical chromosome length, we found no correlation between the two (r = −0.294, p = 0.236, Spearman's correlation test, Figure 1b).
Thus, the translocation frequencies show that shorter chromosomes may have high densities of breakpoints, and that this density does not appear to be influenced by chromosome length. As such, once length is controlled for, longer chromosomes do not appear more prone to breakage than shorter chromosomes.

The Translocation Frequency of Chromosome Arms is Independent of Physical Length
Sus scrofa chromosomes 1 through 12 present with distinct chromosome arms (i.e., they are biarmed chromosomes), and have a variable number of translocation breakpoints between their short (p) and long (q) arms. We compared the physical length of each arm of the biarmed chromosomes to its breakpoint number (Table 2), observing that translocation breakpoints preferentially occurred on longer chromosome arms (r = 0.632, p = 0.0009, Spearman's correlation test, Figure 2a). Chromosome arms were generally found to rearrange in proportion to their length, with an average difference of only 2.5 breakpoints between the observed and expected number (χ² = 34.797, df = 23, p = 0.055, Chi-square test, Table 2).
Controlling for length via the translocation frequency revealed that most chromosome arm pairs (p and q) had different translocation frequencies even though they were part of the same chromosome. The average difference in translocation frequency between p and q arms for each chromosome was 64.3%, or 0.08 translocations per Mb (Table 3). The difference in translocation frequency between p and q arms was independent of length (r = −0.198, p = 0.353, Spearman's correlation test, Figure 2b). The translocation frequency of chromosome arms is, therefore, independent of physical length, and chromosome arm pairs rearrange independently of one another.

Translocation Breakpoints Preferentially Occur on Longer Cytogenetic Bands
In total 352 defined autosomal cytogenetic breakpoints were considered for band level analysis. We mapped these breakpoints onto the standard GTG-banded pig karyotype, denoting the number of breakpoints per band (Figure 3; Figure S1). The distribution of breakpoints appeared uneven, with the number of breakpoints per band ranging between zero and ten. The number of breakpoints per band did not fit a Poisson distribution. Many more bands than expected had no observed breakpoints or had four or more breakpoints, while there was a deficiency of bands with one to three breakpoints (χ² = 388.33, p < 1 × 10⁻⁵, Chi-square test, Table 4). Given that length is a known factor that influences breakpoint number on chromosomes, we considered whether the length of bands may be related to this discrepancy. Comparing the estimated physical length of bands (Table S2) to their breakpoint number, we found a slight yet significant correlation between the two (r = 0.296, p = 1 × 10⁻⁵, Spearman's correlation test, Figure 4), indicating that longer bands tended to have more breakpoints.
Figure legend: Spearman's correlation coefficient was used to determine whether there is a relationship between the two variables; r is Spearman's correlation coefficient and p is the probability that the result was seen by chance.
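The Poisson comparison can be illustrated with a short sketch that computes how many bands would be expected to carry 0, 1, 2, 3, or at least 4 breakpoints if breakpoints fell on bands independently. It assumes that the Poisson mean is simply the 352 breakpoints divided by the 267 autosomal bands reported above and that counts of four or more are pooled; both are assumptions made for this example, and the function name `poisson_expected_bands` is hypothetical.

```python
import math


def poisson_expected_bands(n_bands, n_breakpoints, pool_from):
    """Expected number of bands with k = 0 .. pool_from-1 breakpoints,
    followed by one pooled class for pool_from or more breakpoints."""
    mean = n_breakpoints / n_bands  # assumed Poisson mean: breakpoints per band
    expected = [
        n_bands * math.exp(-mean) * mean ** k / math.factorial(k)
        for k in range(pool_from)
    ]
    expected.append(n_bands - sum(expected))  # remaining bands form the pooled tail
    return expected


# Band and breakpoint totals taken from the text; pooling at four or more is an assumption.
expected_counts = poisson_expected_bands(267, 352, 4)
labels = ["0", "1", "2", "3", "4+"]
for label, count in zip(labels, expected_counts):
    print(f"bands with {label} breakpoints: {count:.1f} expected")
```

These expected counts are what the observed number of bands in each class would be compared against in a chi-square test; an excess of empty bands together with an excess of heavily hit bands is the signature of clustering.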

Translocation Breakpoints Do Not Preferentially Occur on Specific Chromosomal Positions
We considered whether the relative position of a cytogenetic band on chromosome arms affected its translocation frequency. Cytogenetic bands were defined as proximal, median, and distal (see Methods). Each group of bands was found to rearrange in proportion to its total length (χ² = 0.843, p = 0.656, Chi-square test, Table 5). We then compared the translocation frequencies of the bands of each group, finding no relationship between the position of a band on a chromosome arm and the translocation frequency of bands (p = 0.707, one-way ANOVA, Table 5). Therefore, a band's position on the chromosome arm relative to the centromere had no apparent influence over translocation frequency.

Translocation Breakpoints Preferentially Occur on G-Negative Bands
Observing the distribution of breakpoints in the G-banded pig karyotype, it appeared that G-negative bands had a larger concentration of breakpoints. Although G-negative and G-positive bands make up a similar proportion of cytogenetic bands (55% to 44%), translocation breakpoints appeared to preferentially occur on G-negative bands, with 87% of breakpoints occurring in such bands (Table 6). The translocation frequencies of G-negative and G-positive bands were compared and showed that G-negative bands generated translocation breakpoints at five times the frequency of G-positive bands (t = 8.87, p < 1 × 10⁻⁵, Student's t-test). Thus, breakpoints appeared to preferentially occur on G-negative bands.

Translocation Breakpoints Preferentially Occur on Cytogenetic Bands with Fragile Sites
Common fragile sites are known to overlap with cytogenetic breakpoints in the pig genome. We considered 57 autosomal common fragile sites across the pig genome, and referred to any cytogenetic band with a fragile site as a fragile band (see Methods; Table S4). Fragile bands make up only 20.6% of bands, yet contain 33.2% of breakpoints (Table 7). Normalizing for length by calculating expected values for translocation breakpoints, we see that fragile bands have more translocation breakpoints than would otherwise be expected (χ² = 6.459, p = 0.011, Chi-square test, Table 7). The translocation frequency of fragile bands exemplified this difference, as fragile bands had a translocation frequency 35% higher than normal bands (t = −1.792, p = 0.037, Student's t-test). Notably, not all fragile bands appeared to translocate more frequently. Many fragile bands had no or very few breakpoints; however, 17 of the 57 bands had very high translocation frequencies, over twice the average of cytogenetic bands in general, and appeared to drive the relationship between fragile bands and higher translocation frequency.

Translocation Breakpoints Preferentially Occur on G-negative Bands with Fragile Sites
Given that some fragile bands appeared to translocate more often than others, we divided bands into groups based on the presence of a fragile site and the G-banding pattern. We compared the observed and expected breakpoint numbers of each group, finding that both G-negative groups had more breakpoints than expected, while both G-positive groups had fewer (χ² = 138.481, p < 1 × 10⁻⁵, Chi-square test, Table 8). The translocation frequencies of the cytogenetic bands within each group were compared and showed a significant difference between the groups (p < 1 × 10⁻⁵, one-way ANOVA, Table 8). G-negative-fragile bands had the highest translocation frequency, followed by G-negative-normal bands, while both G-positive groups had similarly low translocation frequencies.
The translocation frequencies of G-negative-fragile and G-negative-normal bands were compared and found to be significantly different, indicating that G-negative bands with fragile sites translocated the most frequently of all cytogenetic bands (t = 2.046, p = 0.021, Student's t-test). This suggests that the presence of a common fragile site may only influence the translocation frequency of G-negative bands. We performed multiple linear regression analysis to establish how the length of cytogenetic bands, G-banding, and fragility influence breakpoint number and translocation frequency. We found that all three variables significantly contributed to a linear model of breakpoint number on cytogenetic bands, with G-banding being the best predictor of breakpoint number (p < 2 × 10⁻¹⁶, Table 9). The adjusted R² value, however, showed that these three variables only explained 31.9% of the variation in breakpoint number amongst cytogenetic bands. Considering the translocation frequency of bands, G-banding was found to be highly associated with the translocation frequency of cytogenetic bands, while fragility was not. These two variables, however, explained only 22.7% of the variation in translocation frequency present on cytogenetic bands (adjusted R² = 0.221, Table 9). Overall, we observe that G-banding, physical length, and the presence of fragile sites only moderately explain the number of breakpoints on cytogenetic bands.

Discussion and Conclusions
Observing the distribution of translocation breakpoints across chromosomes, chromosome arms, and cytogenetic bands revealed the chromosome regions most susceptible to rearrangement, and physical features that are associated with higher breakpoint number and translocation frequency. Translocation breakpoints are nonrandomly distributed across chromosomes and chromosome arms, with particular chromosomes appearing to be far more susceptible to rearrangement than others. In addition, the length of cytogenetic bands, G-banding (heterochromatin and euchromatin), and the presence of fragile sites were all found to be associated with a higher number of breakpoints on cytogenetic bands. In particular we observed that G-negative bands had high translocation frequencies on average, with those G-negative bands with common fragile sites having the highest translocation frequencies of all bands on average.
Taking the chromosome as an individual unit, we found that chromosomes do not rearrange in direct proportion to their length. Although longer chromosomes tended to have more breakpoints in general, length appeared to have no relationship with whether chromosomes had a deficiency or a surplus of breakpoints. The physical length of chromosomes has long been suggested to influence the number of translocation breakpoints on each chromosome in the pig [4,21] and human [9,22] genomes. The rationale behind this is that longer chromosomes should have more opportunities for breakage, and therefore should break and translocate more often. Taken simply as a numerical value, this is generally true in the pig; however, once length is accounted for, we see no evidence that longer physical length is associated with higher translocation frequency. High translocation frequencies (chromosomes 12 and 14) and low translocation frequencies (chromosomes 2 and 18) are found among both long and short chromosomes. Those chromosomes with the highest translocation frequencies appear to have some feature that promotes more frequent rearrangement. This suggests that chromosome features beyond simple physical length and breakpoint number should be considered for potential roles in promoting translocation events.
Observing the number of breakpoints relative to the physical length of chromosome arms yielded slightly different results. Longer chromosome arms tended to have more translocation breakpoints, and chromosome arms typically rearranged in proportion to their length. As with whole chromosomes, longer chromosome arms have been previously suggested to be predisposed to having more translocation breakpoints [10]. Examining the translocation frequencies of the p and q arms of each chromosome, however, revealed a different trend. We observed that these chromosome arm pairs may have considerably different translocation frequencies from one another even though they are part of the same chromosome. For example, chromosome arm 12q has a translocation frequency 3.4 times higher than chromosome arm 12p. Differences in breakpoint number between chromosome arm pairs have been previously established [8]. This difference in translocation frequency is perhaps more unexpected, as length is taken into account, and it suggests that the factors that promote rearrangement on each chromosome are not necessarily present across the entire chromosome, and may be more localized to specific regions.
Breaking the chromosome down further into cytogenetic bands, we observed the appearance of a non-random distribution of breakpoints. Attempting to fit the number of breakpoints per band to a Poisson distribution yielded a poor fit, as a surplus of bands had no breakpoints at all, or four or more breakpoints, while there was a deficiency of bands with one to three breakpoints. As such, there tended to be a clustering of breakpoints on relatively few bands, with 12.4% of bands, those with four or more breakpoints, having 47.7% of all breakpoints between them. Length appeared to influence some of this difference, as longer bands tended to have more breakpoints; however, this influence appeared small overall. Translocation breakpoints in the pig genome have previously been suggested to be distributed nonrandomly based on empirical evidence [15]. In the human genome, breakpoints have previously been shown to be nonrandomly distributed across cytogenetic bands, with clusters of breakpoints on individual bands being referred to as hotspots for rearrangement [7,9,10,23]. Clustering of breakpoints in the pig and human genomes occurs in a variety of positions, with no apparent influence of the relative position to the centromere on the chromosome arm [24]. The clustering of breakpoints in a few cytogenetic bands in distinct chromosomal positions, such as 1q17 and 15q13, suggests that individual bands may have some feature that promotes rearrangement events.
We found that G-negative bands were strongly associated with higher breakpoint numbers and translocation frequency. G-negative bands as a whole had a translocation frequency five times greater than that of G-positive bands. These results are in agreement with several studies of translocation breakpoints in humans which consistently indicate that cytogenetic G-negative bands have a higher density of breakpoints than G-positive bands [7][8][9][10]. The more open chromatin composition of these bands is proposed to be more susceptible to breakage than more condensed regions of chromosomes, although it has been suggested that cytogeneticists may be biased towards placing rearrangements within G-negative bands due to the contrasting nature of light and dark bands. Nevertheless, studies of R-banded chromosomes indicate that most rearrangements are identified in R-positive (G-negative) bands [24,25]. Therefore, our findings provide further evidence supporting the notion that euchromatic chromosome regions are more susceptible to breakage and subsequent rearrangement.
The presence of common fragile sites within G-negative bands was found to be associated with higher translocation frequency. In contrast the presence of common fragile sites in G-positive bands appeared to have no influence on translocation frequency. Common fragile sites in the pig karyotype have been previously noted to overlap with the cytogenetic positions of translocation breakpoints [6,16,26,27]. Our results are generally consistent with those found in humans, as cytogenetic bands with fragile sites in the human genome were found to translocate more frequently than bands without fragile sites [9,[28][29][30]. These results suggest that the presence of common fragile sites on G-negative bands may influence rearrangement by promoting breakage on more open chromatin regions which already appear more susceptible to rearrangement by their nature.
Considering the features associated with breakpoint number, we determined that chromatin density (i.e., type of banding), followed by the length of bands and the presence of common fragile sites, significantly contributed to a regression model of translocation breakpoint number. Only chromatin density contributed to a model of translocation frequency, however, which is in line with the presence of common fragile sites only influencing translocation frequency when present in G-negative bands. Chromatin density and physical chromosome length have previously been shown to influence breakpoint number in the human genome [10,22]; however, little work has been done to demonstrate whether these factors influence translocation frequency in the human genome. Although it is apparent that length, chromatin density, and the presence of fragile sites play a role in influencing breakpoint number, together they explain at most approximately a third of the variation present amongst bands. We may speculate then that more specific chromosome features, unique to each band, may contribute more specifically to the promotion of translocation events.
We determined that translocation breakpoints appeared to be nonrandomly distributed across cytogenetic bands, and that factors such as length, G-banding, and the presence of fragile sites were all related to breakpoint number. We thus sought to propose hotspots for chromosome rearrangement in the pig genome based on these characteristics. Starting with the average number of breakpoints per band, 2.4, and the average translocation frequency, 0.29 per Mb, we proposed that any band with at least five breakpoints and/or a translocation frequency of 0.58 per Mb or higher be considered a hotspot for rearrangement in the pig genome. In total, nineteen bands based on number of breakpoints (Table 10), and fifteen bands based on translocation frequency (Table 11) were proposed as hotspots for rearrangement. Six bands were shared between the lists. These bands are derived from a variety of chromosomes and positions. All bands are in G-negative regions, and twelve had a common fragile site. Notably, many bands from shorter chromosomes with high translocation frequencies feature prominently amongst our proposed hotspot bands. For instance, chromosome 12, with just seventeen breakpoints, has three bands amongst our hotspots, indicating the specific bands that appear to drive the high translocation frequency of this chromosome. In total, thirty bands with varied characteristics were proposed as hotspots for rearrangement in the pig genome. For the first time, all known porcine reciprocal translocations were analyzed, revealing the chromosomes and cytogenetic bands with the highest number of breakpoints and highest translocation frequencies. The cytogenetic bands with the highest number of breakpoints and highest translocation frequencies were selected as potential hotspots for rearrangement in the pig karyotype.
This work can serve as a foundation for future endeavors to assist breeders in the selection of pigs that are less susceptible to chromosome rearrangement, ultimately increasing efficiency and profitability for breeders. The comparison of chromosome rearrangements in the pig and human genomes suggests that the pig may serve as an appropriate biomedical model to study chromosome rearrangements. Although fewer than 200 porcine translocations were considered, this is the most complete analysis of chromosome rearrangements in the pig and it provides a baseline for future considerations of porcine chromosome rearrangements. Given that these analyses concerned chromosome features easily mappable to the pig karyotype, future consideration should be given to the specific molecular and genomic landscape present on cytogenetic bands, in order to characterize which factors promote rearrangement, and to explain why specific cytogenetic bands may have more breakpoints or higher translocation frequencies than other bands.
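The hotspot criterion stated above (at least five breakpoints and/or at least 0.58 breakpoints per Mb) amounts to a simple filter, sketched below. The two thresholds come from the text; the band names, breakpoint counts, and band lengths are hypothetical values invented for the example, as is the function name `is_hotspot`.

```python
MIN_BREAKPOINTS = 5    # threshold on breakpoint number, from the text
MIN_FREQUENCY = 0.58   # threshold on breakpoints per Mb, from the text


def is_hotspot(breakpoints, length_mb):
    """A band qualifies if it meets either the count or the frequency threshold."""
    frequency = breakpoints / length_mb
    return breakpoints >= MIN_BREAKPOINTS or frequency >= MIN_FREQUENCY


# Hypothetical bands: name -> (breakpoint count, estimated length in Mb).
bands = {
    "bandA": (6, 20.0),   # many breakpoints, modest frequency (0.30 per Mb)
    "bandB": (3, 4.0),    # few breakpoints, high frequency (0.75 per Mb)
    "bandC": (2, 10.0),   # neither criterion met (0.20 per Mb)
}

hotspots = [name for name, (count, length) in bands.items() if is_hotspot(count, length)]
print(hotspots)
```

Using the two criteria together lets short bands with a high breakpoint density qualify alongside long bands that accumulate many breakpoints in absolute terms.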