Sequence Expression of Supernumerary B Chromosomes: Function or Fluff?

B chromosomes are enigmatic heritable elements found in the genomes of numerous plant and animal species. Despite their broad distribution, most B chromosomes are non-essential. For this reason, they are regarded as genome parasites. In order to be stably transmitted through generations, many B chromosomes exhibit the ability to “drive”, i.e., they transmit themselves at super-Mendelian frequencies to progeny through directed interactions with the cell division apparatus. To date, very little is understood mechanistically about how B chromosomes drive, although a likely scenario is that expression of B chromosome sequences plays a role. Here, we highlight a handful of previously identified B chromosome sequences, many of which are repetitive and non-coding in nature, that have been shown to be expressed at the transcriptional level. We speculate on how each type of expressed sequence could participate in B chromosome drive based on known functions of RNA in general chromatin- and chromosome-related processes. We also raise some challenges to functionally testing these possible roles, a goal that will be required to more fully understand whether and how B chromosomes interact with components of the cell for drive and transmission.


Introduction
Since the time when microscopy first allowed visualization of hereditary material, researchers have observed peculiar chromosome variants in the genomes of higher eukaryotes. For any given species, the core of the genome consists of a certain number of A chromosomes, which carry all genes needed collectively for the organism's development, metabolism, and reproduction. Thus, without the complete set of A chromosomes, the organism cannot survive. However, extra or supernumerary non-essential chromosomes, termed B chromosomes, have been detected in numerous plant and animal species [1][2][3][4]. The frequency of B chromosomes within a given population can range from very low to complete fixation among all individuals, and a carrying individual can contain one to as many as ten or more B chromosome copies in each nucleus [1][2][3]. Interestingly, with little exception, B chromosomes are not essential for the organism. Indeed, the fact that B chromosomes can persist without providing any measurable benefit has contributed to the view that they are genome parasites [5]. But if they do not help the organism, how then do B chromosomes persist, thereby defying the expectation that non-essential genetic elements are eventually lost?
Previous work in various B-carrying organisms has provided some insights regarding this question. For example, in several grasshopper species, certain B chromosomes are transmitted to progeny through both parents, and they segregate with very high efficiency to daughter cells during mitotic and meiotic divisions [6,7]. In these organisms, the B chromosomes appear to behave similarly to the A chromosomes. In contrast, B chromosomes in other organisms exhibit "drive"; that is, they behave in specific ways that defy normal Mendelian transmission patterns in order to ensure their inheritance in subsequent generations [8]. For example, one B chromosome in a rye species counters its tendency to be lost through a form of mitotic drive. Specifically, the sister B chromatids fail to separate (i.e., non-disjoin) in the division that produces the vegetative (non-gametic) cell and pollen (gametic) cell [9,10]. Moreover, the unseparated B chromatids tend to segregate toward the side of the spindle that will give rise to the pollen cell. This tropism of the B chromatid pair for the future pollen cell, which can lead to an accumulation of multiple B chromosome copies in offspring, is thought to occur by the B chromatids utilizing an inherent asymmetry in the makeup of the spindle apparatus [9]. Indeed, B chromosomes may tend to drive in this way in plants because of the universally asymmetric spindle at the pollen production stage [11].
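The logic of this mitotic drive can be made concrete with a small calculation. The sketch below is illustrative only: the function name and both probability parameters are assumptions introduced here, not values measured in rye. It computes the expected number of B copies that reach the generative (pollen) nucleus from a microspore carrying a single B chromosome.

```python
def expected_b_in_generative_nucleus(p_nondisjunction: float,
                                     p_to_generative: float) -> float:
    """Expected B copies in the generative nucleus, starting from one B.

    p_nondisjunction: assumed probability that the sister B chromatids
        fail to separate at the pollen mitosis.
    p_to_generative: assumed probability that an unseparated chromatid
        pair moves to the generative pole rather than the vegetative pole.
    """
    for p in (p_nondisjunction, p_to_generative):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    # Normal disjunction: one chromatid goes to each pole.
    normal = (1.0 - p_nondisjunction) * 1.0
    # Nondisjunction: both chromatids travel together to a single pole.
    nondisjoined = p_nondisjunction * p_to_generative * 2.0
    return normal + nondisjoined
```

Under these assumptions, the expectation exceeds the Mendelian value of one copy only when nondisjunction occurs at all and the unseparated chromatids move to the generative pole more than half of the time; nondisjunction with random segregation gives no net gain.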
As a remarkably different example of drive, the jewel wasp Nasonia vitripennis harbors a B chromosome known as PSR (for Paternal Sex Ratio) that is transmitted via the sperm (i.e., paternally) to progeny [12]. Interestingly, this B chromosome causes complete loss of the sperm's hereditary material during the first mitotic division of the embryo [13]. The PSR chromosome does not eliminate itself, however, but instead associates with the functional egg-derived chromosomes and successfully segregates with them. Because males in wasps and other hymenopteran insects normally develop as haploids from unfertilized eggs while females develop as diploids from fertilized eggs, this genome elimination event converts fertilized embryos, which should become female, into haploid B chromosome-carrying males, thereby ensuring B chromosome transmission.
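The consequence of this sex conversion for brood composition can be sketched as simple bookkeeping. The function name and both rates below are hypothetical parameters chosen for illustration; they are not estimates from Nasonia.

```python
def psr_brood_composition(fertilization_rate: float,
                          psr_transmission: float) -> dict:
    """Expected offspring fractions for a female mated to a PSR male.

    fertilization_rate: assumed fraction of eggs that are fertilized.
    psr_transmission: assumed fraction of fertilizing sperm carrying PSR.
    """
    for p in (fertilization_rate, psr_transmission):
        if not 0.0 <= p <= 1.0:
            raise ValueError("rates must lie in [0, 1]")
    # Fertilized eggs that receive PSR lose the paternal genome and
    # develop as haploid, PSR-carrying males.
    fertilized_with_psr = fertilization_rate * psr_transmission
    return {
        "psr_males": fertilized_with_psr,
        # Fertilized eggs without PSR develop as normal diploid females.
        "females": fertilization_rate - fertilized_with_psr,
        # Unfertilized eggs develop as haploid males lacking PSR.
        "non_psr_males": 1.0 - fertilization_rate,
    }
```

In this sketch, every egg that would have become a daughter and that receives PSR instead becomes a PSR-carrying son, which is why the chromosome is passed on despite destroying the rest of the paternal genome.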
Even though we know what happens at the descriptive level in these cases of drive, what remains to be determined is how, mechanistically, B chromosomes drive in their resident genomes. Two general possibilities exist: a B chromosome may either (i) act passively, being transmitted and/or driving due to the intrinsic properties of its own DNA sequences [14], or (ii) operate actively through expression of its DNA sequences [9,15]. Here, we focus on the second possibility by reviewing a number of different types of B chromosome-linked DNA sequences, many of which are repetitive and non-coding in nature, that have been identified through genetic and genomic analyses. We highlight a subset of these DNA sequences that are known to be expressed, and we propose possible roles for each. Several recent studies have demonstrated that B chromosomes can influence the expression of A-linked genes [16][17][18][19] and epigenetic marks of A chromatin (reviewed in [20,21]). However, given the non-essential nature of most B chromosomes, these effects may not substantially affect the biology of the organism. Thus, we limit our attention here to B-expressed sequences and their potential roles in B chromosome drive and transmission. Finally, we raise some challenges to functionally testing these possible roles, a goal that will be required to more fully understand whether and how B chromosomes interact with components of the cell.

B Chromosomes Are Mosaics of Protein-Coding and Repetitive, Non-Coding Sequences
Genetic and genomic studies have been performed in a number of different organisms, including but not limited to maize [22], rye [19,23], grasshoppers [24], wasps [25,26], cichlids [27,28], raccoon dogs [29] and fungi [30,31], in order to identify specific DNA sequences carried by B chromosomes (see Table 1). The repertoire of B-linked DNA sequences includes both protein-coding and non-coding sequences. The origin of any given B-linked sequence may date back to the beginning of the B chromosome itself, or the sequence may have arisen subsequently as a copy of another B-linked gene or of an ancestral gene located on an A chromosome that was moved via transposable element (TE) activity, inter-chromosomal meiotic recombination, or imperfect DNA repair [19].
The majority of known B-linked protein-coding genes match genes located on the A chromosomes and belong to nearly all protein function categories [19,29,32,33,34,35]. Most B-linked protein-coding genes are degenerate; they can be present as partial gene copies, such as truncated forms or forms missing exons, or they can show low sequence similarity across their entire lengths to the ancestral sequences [35]. For this reason, many B-linked protein-coding genes are considered to be pseudogenes [19]. The few B-linked protein-coding genes that do show high sequence similarity to their ancestral copies are likely not needed for the organism because the B chromosomes themselves are non-essential. Thus, given the lack of functional constraint on such B-linked protein-coding genes, high sequence similarity may indicate that the origin of the B-linked copy from its ancestral gene was a relatively recent event.
Despite the presence of protein-coding sequences, it appears that most B chromosomes consist primarily of non-coding sequences including TEs, simple satellites, and complex satellite-like repeats [3,36,37,38,39,40]. Such highly repetitive DNA sequences are known to be enriched in the heterochromatin that surrounds the centromeres of the A chromosomes [40]. For this reason, it has been proposed that B chromosome formation begins with duplication of an A chromosome followed by the loss of its euchromatic chromosome arms, thus producing a nascent B chromosome consisting mainly of a centromere and its pericentromeric regions. Over time, TE activity can move genes from the A chromosomes onto the B chromosome; these sequences may then undergo mutational decay, further duplication through replication slippage, or rearrangement events such as intra-chromosomal inversion, deletion, and translocation [40]. We should mention here that the PSR chromosome present in the jewel wasp contains transposon-like sequences that appear to be absent from the wasp genome but to match sequences present in another wasp species [41,42]. Such patterns open up the possibility that this B chromosome derives from a chromosome of another species that moved into the wasp genome through interspecific hybridization or by parasites or food sources [43][44][45][46].

Expression of B-Linked DNA Sequences
Despite our current knowledge of some DNA sequences carried by B chromosomes, only a handful of studies have addressed which of them are expressed. Certainly, transcription of any given DNA sequence alone does not guarantee that it is functional; work in different model organisms suggests that much of the non-coding part of the A genome may be transcribed without function [47]. Nevertheless, a reasonable (and perhaps obvious) assumption is that a locus that is functional will at least be expressed at the RNA level. To date, several studies have identified RNAs produced by B chromosomes, either by examination of individual B-linked sequences or through whole genome approaches such as RNA-Seq. To our knowledge, no individual B-linked sequence has yet been tested for functionality through genetic manipulation. However, all B-linked sequences producing RNA should be considered as potential candidates for involvement in B chromosome dynamics and drive. Here, we highlight several examples of B-linked sequences that are known to express RNAs, and we speculate on the possible functions of these sequences in light of the types of RNAs that they produce and the underlying B chromosome biology in each case.

Copies of Protein-Coding Genes
Of the previously identified B-linked DNA sequences that are copies of A-linked protein-coding genes, few are known to be expressed (Table 1) [19,23,24,27,28,48,49]. For example, a B chromosome in the cichlid fish Astatotilapia latifasciata was shown to express multiple different protein-coding genes. Three of these expressed genes derive from the ancestral A-linked genes encoding Separin, Tubulin B1 (TUBB1), and KIF11 [27,28]. Interestingly, these A-linked genes play important roles in chromosomal segregation during cell division: TUBB1 is involved in microtubule organization [50], KIF11 functions in centrosome behavior and spindle assembly [50], and Separin mediates the release of sister chromatids at the onset of anaphase [50]. It has been proposed that because these genes are implicated in different aspects of chromosome segregation, expression of the B-linked variants may somehow promote B chromosome transmission such that the B chromosome is inherited by more than 50% of the gametes [27,28]. Indeed, a number of B-linked genes that derive from protein-coding genes involved in various aspects of the cell cycle and cell division have been detected in other organisms [19,24,29,34]. However, in most of these other cases, it is not yet known which of the protein-coding gene variants are expressed.
We point out here an important consideration: that any expressed protein-coding gene that affects aspects of cell division would likely impact the A chromosomes in addition to the B chromosome, potentially having large costs to the organism. Thus, proteins that play roles in B chromosome drive may be expected to specifically affect the B chromosome. Certain chromatin-associated proteins, which could have affinity for specific DNA sequences found uniquely on B chromosomes, would fulfill such an expectation. This idea is consistent with previously proposed models invoking co-evolving centromere repeats and their chromatin proteins as agents of centromere/meiotic drive [51]. Copies of histones H3 and H4 are known to be expressed from B chromosomes in different grasshopper species [52,53]. However, it remains to be determined whether variants of these conventional histones or other non-histone chromatin proteins are carried and expressed by B chromosomes.

Transposable Elements
DNA TEs, retro-TEs, and TE-like elements are abundant in higher eukaryotes, making up anywhere from ~5% to as much as 50% of a given eukaryotic genome [54,55]. It is, therefore, no surprise that these elements have been found to be carried and expressed by B chromosomes in a number of organisms including rye [15,33], maize [56], fishes [39,57], and the jewel wasp [41,42]. It is difficult to imagine a scenario in which TE expression per se could enhance B chromosome drive. Moreover, a substantial amount of cellular energy is devoted to the silencing of TE expression, and unsilenced TEs can lead to severe genome instability [58]. However, TEs may play secondary but important roles in B chromosome drive. Given the potential of TEs to mobilize and amplify within single generations, it is expected that these genetic elements move frequently between the A and B chromosomes over short periods of evolutionary time; as mentioned above, such movement likely serves as a mechanism for the transfer of gene copies between the A and B chromosomes [39,40]. Additionally, TEs that have moved onto B chromosomes may themselves degenerate and become pseudogenes, they may fuse with non-TE sequence to form new genes [39,40], or they may decay over time and become tandemly copied to form arrays of complex satellite-like repeats [59]. Any such TE-derived sequence may itself be expressed through the transcriptional regulatory sequences of transposase or other TE-associated genes, or it may induce the expression of adjacent sequences that would otherwise be transcriptionally silent.

Long Non-Coding RNAs
An interesting class of candidates for involvement in B chromosome transmission and drive consists of long non-coding RNAs (lncRNAs). Bioinformatically, lncRNAs are challenging to identify from RNA expression datasets for a number of reasons. For one, it is difficult to identify secondary structural domains that suggest potential function of a putative lncRNA. Additionally, long RNAs that function as structural molecules may contain cryptic, unused open reading frames, leading to ambiguity in bioinformatically assigning such RNAs as coding or non-coding. Despite these challenges, previous work has led to the identification of putative lncRNAs expressed from B chromosomes in the jewel wasp and in cichlids [25,60].
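The ambiguity created by cryptic open reading frames can be illustrated with a minimal ORF scan. The sketch below is not the method used in the cited studies; the function name is introduced here for illustration, it searches only the three forward reading frames, and it counts only ORFs that begin with ATG and end at an in-frame stop codon of the standard genetic code.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def longest_orf_codons(seq: str) -> int:
    """Length, in codons, of the longest complete forward-strand ORF.

    The start codon is counted and the stop codon is not. Transcripts
    given as RNA (with U) are converted to DNA letters first.
    """
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None  # position of the ATG that opened the current ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, (i - start) // 3)
                start = None  # look for the next ORF in this frame
    return best
```

A transcript could then be flagged as ambiguous when its longest ORF exceeds some cutoff. Any such cutoff is an assumption of the analysis, and a long ORF does not by itself show that a transcript is translated.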
In the jewel wasp, comparison of testis transcriptomes between wild type and B chromosome-carrying (PSR+) males led to the identification of ten transcripts, ranging between ~500 and 1500 nucleotides in length, that are present only in the PSR+ genotype [25,26]. These transcripts represent the highest-expressed sequences from the PSR chromosome. Fluorescence in situ hybridization (FISH) and PCR of genomic DNA were used to demonstrate that the cognate DNA sequences of these transcripts are located exclusively on the PSR chromosome [25,26]. A couple of these transcripts contain potential, short open reading frames, but the majority of them were bioinformatically predicted to be non-coding [25]. A different study in the cichlid A. latifasciata identified a transcript corresponding to a non-coding DNA repeat represented in multiple copies on a B chromosome in this organism [60]. This transcript was shown to be expressed in multiple different fish tissues [60].
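The comparison between genotypes amounts to filtering for transcripts detected in B-carrying samples but absent from wild type. The following sketch shows that filter on a toy table; the function name, the expression threshold, and the input layout are all assumptions made for illustration and do not reproduce the pipeline of the cited studies.

```python
def b_specific_transcripts(expression: dict, min_b_level: float = 1.0) -> list:
    """Return transcript ids detected only in the B-carrying genotype.

    expression maps a transcript id to a pair of expression levels:
    (wild_type_level, b_carrying_level). A transcript is kept when it is
    undetected in wild type and reaches min_b_level in the B-carrying
    genotype. Results are ordered from highest to lowest expression.
    """
    hits = [
        (tid, b_level)
        for tid, (wt_level, b_level) in expression.items()
        if wt_level == 0 and b_level >= min_b_level
    ]
    hits.sort(key=lambda item: item[1], reverse=True)
    return [tid for tid, _ in hits]
```

Candidates that pass such a filter still need independent confirmation that their DNA resides on the B chromosome, as was done with FISH and genomic PCR in the work described above.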
A central question is whether such non-coding RNAs are functional, especially with regard to B chromosome drive. While no studies have yet demonstrated functionality of these RNAs, some interesting speculations stem from examples of lncRNA function in non-B systems. It has become apparent that lncRNAs span a wide range of cellular and developmental processes [61,62], but those of particular interest here pertain to chromatin and chromosome dynamics. In particular, two well-studied groups of lncRNA pertain to the X chromosome. In the fruit fly Drosophila melanogaster the roX1 and roX2 lncRNAs associate with the male-specific lethal (MSL) proteins to form the dosage compensation complex (DCC) in young male embryos [63]. This complex localizes to "entry" sites located along the male's single X chromosome. There, the DCC spreads to other regions on the X, where it ultimately induces transcriptional upregulation of most X-linked genes [64]. This effect involves remodeling of X chromatin through acetylation of Lysine residue 16 of histone H4 (H4K16ac) and phosphorylation of Serine residue 10 of histone H3 (H3S10p), each by a different enzymatic activity of DCC-associated components (reviewed in [65]). In this case, the roX lncRNAs play an indispensable role as a structural "glue" that scaffolds together the DCC proteins [66]. In mammals, a different lncRNA known as Xist is expressed initially from both X chromosomes during early embryogenesis but its expression is eventually turned off on one of the two X chromosomes (reviewed in [67]). Xist coats the X chromosome that continues to express it, an effect that leads to the association of the Polycomb Repressive Complex (PRC) and its trimethylation of Lysine residue 27 of histone H3 (H3K27me3). This histone mark leads to the facultative heterochromatinization of this Xist-expressing X chromosome, leaving the other X in a transcriptionally active state [68]. 
lncRNA function is not limited to the X chromosome; these molecules also facilitate chromatin remodeling elsewhere in the genome, and they function in other aspects of chromatin dynamics (reviewed in [69,70]).
Taken together, these examples demonstrate the potential for lncRNAs to associate with specific chromosome regions and not others, as well as their ability to facilitate specific alterations of chromatin. Thus, it is intriguing to speculate that B chromosome-expressed lncRNAs may play unique roles in B chromosome drive in certain cases through such chromatin interactions. For example, in both rye and maize, B chromosomes that are deleted for a small region of repetitive DNA lose their ability to drive by nondisjunction at the pollen mitosis stage [9,10]. Currently it is not known if these repeats are transcribed from the undeleted B chromosomes. However, if they are expressed, then their encoded RNAs could associate with the centromeric regions where they may recruit enzymes that interact with the cohesin machinery. Such an effect could, in turn, retard the separation of the sister B chromatids during anaphase so that both B chromatids end up in the gamete (Figure 1).
Figure 1. (a) B chromosome drive through asymmetrical segregation and non-disjunction could involve cis-acting B-specific products (proteins or ncRNAs) that retard release of the two sister chromatids at the kinetochore. The sister B chromatid pair then migrates preferentially toward the generative pole due to an intrinsic asymmetry of the spindle apparatus. As a result, multiple copies of the B chromosome accumulate in progeny over multiple generations. (b) B chromosome drive through genome elimination, such as occurs by the PSR chromosome in the jewel wasp Nasonia vitripennis, occurs during the first mitotic division of the newly fertilized embryo. In this model, B-chromosome-expressed protein or RNA could localize preferentially with the paternal chromatin, recruiting chromatin-remodeling enzymes that disrupt normal chromatin remodeling dynamics through abnormal histone modification. As a consequence, the paternal chromatin forms a condensed mass that is unable to resolve into chromosomes and segregate properly.
It has been proposed that the putative lncRNAs expressed by the PSR chromosome in the jewel wasp may underlie the elimination of the paternally-inherited half of the wasp genome [26]. Previous work demonstrated that certain histone marks (H3K9me3, H3K27me1, and H4K20me1) appeared in abnormal patterns on the paternal chromatin immediately before its elimination [71]. The abnormal placement of these histone marks may block subsequent chromatin remodeling events, such as histone phosphorylation, that are essential for normal condensation of chromatin into chromosomes during mitosis [71]. An intriguing possibility is that PSR induces the abnormal histone marks through one or more of the identified lncRNAs. For example, one or more of these molecules may associate with the paternal chromatin and recruit chromatin-remodeling enzymes that disrupt normal chromatin dynamics [71]. Regardless of the mechanism, PSR must possess some way of sparing itself from this abnormal chromatin remodeling [71] (Figure 1).

Small Non-Coding RNAs
So far, little is known about the potential for B chromosomes to express small RNAs, non-coding molecules that typically range between ~21 and 33 nucleotides in length (reviewed in [61]). A multitude of studies have characterized the functional roles of the three major classes of small RNAs and their corresponding pathways: micro-RNAs (miRNAs), which block translation of their cognate mRNA targets, endogenous small interfering RNAs (endo-siRNAs), which inhibit translation by inducing degradation of target mRNAs, and PIWI-associated RNAs (piRNAs), which facilitate the transcriptional silencing of chromatin through the association of certain chromatin-remodeling enzymes (reviewed in [72]). The functions of these different small RNA classes are not completely distinct from one another since there is evidence of some crossover between small RNA pathways [73]. To our knowledge, only one study, conducted in the jewel wasp, has detected small RNAs expressed from a B chromosome [26]. In this insect, several different small RNAs were found to be produced by PSR at expression levels matching those of the more abundant small RNAs expressed from the A chromosomes [26]. Interestingly, the most abundant PSR-specific small RNA exhibits peculiar properties, having a length (32-33 nt) and a 5' uracil similar to piRNAs while appearing to be processed from a hairpin precursor like endo-siRNAs [26]. More work will be required to better understand to which class this and other PSR-expressed small RNAs belong. However, this work demonstrates that B chromosomes can, indeed, express this type of non-coding RNA. Additionally, given the link of certain small RNAs to chromatin remodeling, it should be strongly considered that B chromosomes like PSR, whose drive involves chromatin remodeling, may drive at least in part through the actions of small RNAs.
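The two sequence features mentioned above, read length and the identity of the first nucleotide, are simple to tabulate for any candidate small RNA. The sketch below only reports these features and deliberately does not assign a class, since class assignment also depends on precursor structure and protein partners. The function name is an assumption, and the 21-33 nt window is taken from the approximate range stated in the text.

```python
def small_rna_features(seq: str) -> dict:
    """Report basic features of a candidate small RNA sequence.

    Accepts DNA or RNA letters; thymine is treated as uracil.
    """
    rna = seq.upper().replace("T", "U")
    length = len(rna)
    return {
        "length": length,
        # Approximate size window for small RNAs given in the text.
        "in_small_rna_range": 21 <= length <= 33,
        # A 5' uracil is one of the piRNA-like properties noted above.
        "starts_with_u": rna.startswith("U"),
    }
```

For example, a 32-nucleotide read beginning with U would be reported as within the small RNA size window and as having a 5' uracil, which matches the piRNA-like properties described for the most abundant PSR-specific small RNA but does not settle its class.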

Functional Testing of Expressed B Loci and Some Challenges
Previous studies aimed at identifying functional B-specific sequences have been restricted to deletion analysis in rye and maize [9,10], the jewel wasp [14], and grasshoppers [74]. Although certain deletions of B chromosomes elicited a loss of drive, it is still unclear in each of these cases which individual sequence(s) within the deleted regions underlie drive and transmission [9,10,14,74]. Only recently have studies begun to uncover individual RNAs that are expressed by B-linked sequences. Given that most known B chromosomes are not essential for the organism, much of B chromosome expression may be nothing more than noise. A fundamental question is whether any B-expressed loci are functional, and if so, which ones. The ideas presented here may serve as some basis for deciding which candidate loci to prioritize within each B chromosome system. But one thing is certain: fully understanding if and how a given locus is involved in B chromosome transmission or drive will ultimately require some form of genetic manipulation. Such a goal has been challenging due to the fact that most studied B chromosomes reside in non-model organisms that lack traditional genetic tools. However, the development of CRISPR/Cas9 genome editing has made genetic manipulation of individual loci possible in almost any organism, model or not. In principle, this method promises to allow "knock out" of target loci on B chromosomes or, alternatively, the transgenic expression of B chromosome-derived sequences in a non-B genotype, in order to test for functionality.
Just as there is strong promise for CRISPR/Cas9 in achieving these goals, there are some substantial obstacles that will need to be tackled. For example, unlike essential genes located on the A chromosomes, which provide lethal or semi-lethal phenotypes when altered, mutant alleles created by the editing of B-linked loci would not provide any overt phenotype to follow. Conversely, any such induced mutant allele that affects B chromosome drive would likely lead to quick loss of the B chromosome under study. Another difficulty would be mutagenesis of candidate sequences that are present in multiple copies, such as the complex repeats that express putative lncRNAs in the jewel wasp [26]. A less problematic goal may be the expression of candidate B-linked sequences from transgenes inserted through CRISPR/Cas9 and homology-directed repair (HDR). A consideration of this approach will be whether transgenic expression of multiple different B-linked sequences simultaneously is required to cause a phenotype of interest. Despite these obstacles, genome editing provides a very promising means for finally understanding how B chromosomes mediate their own transmission and drive at the mechanistic level.
Funding: This work was funded by the U. S. National Science Foundation, grant number NSF-1451839.

Conflicts of Interest:
The authors declare no conflict of interest.