The Polycomb Orthologues in Teleost Fishes and Their Expression in the Zebrafish Model.

The Polycomb Repressive Complex 1 (PRC1) is a chromatin-associated protein complex involved in transcriptional repression of hundreds of genes controlling development and differentiation processes, but also involved in cancer and stem cell biology. Within the canonical PRC1, members of Pc/CBX protein family are responsible for the targeting of the complex to specific gene loci. In mammals, the Pc/CBX protein family is composed of five members generating, through mutual exclusion, different PRC1 complexes with potentially distinct cellular functions. Here, we performed a global analysis of the cbx gene family in 68 teleost species and traced the distribution of the cbx genes through teleost evolution in six fish super-orders. We showed that after the teleost-specific whole genome duplication, cbx4, cbx7 and cbx8 are retained as pairs of ohnologues. In contrast, cbx2 and cbx6 are present as pairs of ohnologues in the genome of several teleost clades but as singletons in others. Furthermore, since zebrafish is a widely used vertebrate model for studying development, we report on the expression of the cbx family members during zebrafish development and in adult tissues. We showed that all cbx genes are ubiquitously expressed with some variations during early development.


Introduction
Teleosts, with over 29,000 species, represent the largest and most diverse infraclass (Teleostei) of vertebrates, accounting for about half of the living vertebrates and more than 96% of all fish species [1][2][3]. Early sequencing of genes and gene families revealed the existence of a whole-genome duplication event that occurred in the teleost lineage about 320 Myr ago, before its diversification [4][5][6], and that is known as the teleost genome duplication (TGD). Ohnologous genes produced by whole-genome duplication are presumably redundant, and one copy is subsequently lost randomly [7]. Less frequently, both ohnologues are maintained due to the accumulation of mutations leading to neofunctionalization, subfunctionalization or subneofunctionalization of one or both of the ohnologues [8]. Following the TGD in the teleost ancestor, about 15% to 20% of the ohnologues were retained as pairs [9]. Interestingly, an asymmetric acceleration of evolutionary rate is observed for one of the ohnologues, allowing a possible neofunctionalization and/or subfunctionalization. Furthermore, teleost evolution is associated with major interchromosomal rearrangements that occurred after the TGD [10,11] and with the fact that teleost genes evolved faster than their tetrapod orthologues [12]. These genomic characteristics (sub/neofunctionalization of the retained ohnologues, large interchromosomal rearrangements and rapid evolutionary rates) are thus believed to be, in part, responsible for the remarkable diversity of the teleost fishes in terms of morphology, physiology, behaviors and adaptations.
and JTT matrix-based model [42], the Neighbor-Joining (NJ) method [43] and the Maximum Parsimony (MP) using the Subtree-Pruning-Regrafting (SPR) algorithm [44] in MEGA X [41].
SMART and Pfam protein domains were identified using the Simple Modular Architecture Research Tool [45]. Consensus patterns were drawn in logo format using WebLogo [46].
Syntenic groups at the cbx gene loci and their neighboring genes in teleost genomes were identified by manual chromosome walking and reciprocal BLAST searches using the NCBI Gene database (National Center for Biotechnology Information, Bethesda, MD, USA).
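The reciprocal BLAST step can be illustrated with a minimal sketch. It assumes that hits have already been parsed into (query, subject, bit score) tuples; the gene labels and scores below are invented for illustration and this is not the pipeline used in this study, which relied on manual inspection.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bitscore) tuples, for
    example parsed from BLAST tabular output (assumed input format).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_hits, reverse_hits):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(forward_hits)
    reverse = best_hits(reverse_hits)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical hits between a zebrafish ("dre") and a medaka ("ola") gene set.
forward = [
    ("dre_cbx6a", "ola_cbx6", 250.0),
    ("dre_cbx6a", "ola_cbx8a", 90.0),
    ("dre_pla2g6", "ola_pla2g6", 800.0),
]
reverse = [
    ("ola_cbx6", "dre_cbx6a", 245.0),
    ("ola_pla2g6", "dre_pla2g6", 790.0),
    ("ola_cbx8a", "dre_cbx8a", 300.0),
]
pairs = reciprocal_best_hits(forward, reverse)
```

Requiring the best hit in both directions reduces the risk of pairing a gene with a paralogue rather than its orthologue, although it does not replace the phylogenetic and synteny evidence.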

Zebrafish Maintenance, Embryo Preparation and Animal Ethics Statements
Zebrafish from the TU strain were kept at 27.5 °C in a 14/10 h light/dark cycle. The evening before spawning, males and females were separated into individual breeding tanks (Tecniplast). Spontaneous spawning occurred the following morning when the plastic divider was removed. Embryos were collected and staged according to Kimmel et al. [47]. The chorions were removed from embryos by action of 1% pronase (Sigma, St. Louis, MO, USA) for 1 min. Zebrafish embryos were fixed overnight in 4% paraformaldehyde in PBS (phosphate-buffered saline, Euromedex, Souffelweyersheim, France), dehydrated gradually to 100% methanol and kept at −20 °C.
Zebrafish were maintained in compliance with the French and European Union guidelines for the handling of laboratory animals (Directive 2010/63/EU of the European Parliament and of the Council of 22 September 2010 on the protection of animals used for scientific purposes). The experimental procedures carried out on zebrafish were reviewed and approved by the local Ethics Committee, CEEA 75 Nord Pas-de-Calais and the French Ministry of Higher Education and Research (APAFiS approval number 13527-2018011722529804_v3).

Whole-Mount In Situ Hybridization
Antisense RNA probes were synthesized using the DIG RNA Labeling Kit (SP6/T7) (Roche, 11175025910) according to the manufacturer's instructions. The following IMAGE cDNA clones, purchased from imaGenes GmbH (Berlin, Germany), were used as templates: cbx2, cDNA clone MGC:103563 IMAGE:7241894; cbx7a, cDNA clone MGC:110152 IMAGE:7289412; cbx8a, cDNA clone MGC:153854 IMAGE:8001741; cbx8b, cDNA clone MGC:111978 IMAGE:7399500. The cbx4, cbx6a, cbx6b and cbx7b antisense probes were generated by RT-PCR from total RNA extracted from zebrafish larvae at 5 dpf using the RNeasy Mini Kit (Qiagen, Hilden, Germany). After reverse transcription, cDNAs were amplified by PCR using probe-specific primers, coupled to the T7 sequence for forward primers and the SP6 sequence for reverse primers.
RT-PCR experiments were performed from at least two independent RNA extraction samples.
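As an illustration of the primer design described above, the following sketch prepends promoter tails to gene-specific primers. The promoter sequences are the commonly used T7 and SP6 consensus sequences; the exact tails and the gene-specific sequences used in this study are not given here, so the example primers are placeholders.

```python
# Commonly used consensus promoter sequences (assumed, not taken from this study).
T7_PROMOTER = "TAATACGACTCACTATAGGG"
SP6_PROMOTER = "ATTTAGGTGACACTATAG"


def tail_primers(forward, reverse):
    """Return (T7-tailed forward primer, SP6-tailed reverse primer)."""
    for primer in (forward, reverse):
        if not primer or set(primer.upper()) - set("ACGT"):
            raise ValueError(f"invalid primer sequence: {primer!r}")
    return T7_PROMOTER + forward.upper(), SP6_PROMOTER + reverse.upper()


# Placeholder gene-specific sequences, for illustration only.
t7_forward, sp6_reverse = tail_primers("atggagctgtcagcagtg", "tcagctcttggcctcaac")
```

With this layout the SP6 site sits on the reverse-primer end of the amplicon, so transcription by SP6 polymerase would be expected to yield the antisense strand used as the probe.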

In Silico Identification of Teleost Pc Orthologues
Polycomb (Pc) and its mammalian CBX orthologues interact with H3K27me3 epigenetic marks via the chromodomain (SMART, SM00298; Pfam, PF00385) [34], whereas the Pc box at the C-terminus (Pfam, PF17218) is involved in transcriptional silencing and binding to other PRC1 components such as RNF2 [50]. Adjacent to the chromodomain, all CBX proteins have a DNA-binding motif, either the AT-hook (SMART, SM00384; Pfam, PF02178) (in CBX2) or an AT-hook-like motif (in the other four CBX proteins, CBX4, CBX6, CBX7 and CBX8) [51]. Less conserved sequences in the middle of the CBX proteins could play a role in specifically directing each CBX family member to distinct regions of the chromatin [51,52].
We applied TBLASTN searches in the NCBI and Ensembl databases, using the human and zebrafish Pc orthologues as queries, to identify the corresponding genes in 67 teleost species (Table A1). A total of 689 genes from 68 teleosts, including zebrafish, were identified. Based on sequence alignments and phylogenetic analyses (Figure 1, Supplementary File S1, Supplementary File S2), these Pc orthologues were assigned to one of the paralogous classes CBX2, CBX4, CBX6, CBX7 or CBX8 (Supplementary Table S1). In contrast to the Psc/PCGF and Sce/RING1 orthologues, which are not all present in fishes including zebrafish and medaka [37], the five cbx paralogues, cbx2, cbx4, cbx6, cbx7 and cbx8, are maintained in all sequenced teleost genomes. In many cases, but not systematically, the cbx genes are even retained as pairs of ohnologues (Supplementary Table S1). To investigate the evolution of the cbx gene content in teleosts, a cladogram of the teleost fishes [53][54][55][56] showing the cbx gene content in different fish orders was drawn (Figure 2). From this representation, it appears that cbx7 and cbx8 are present as pairs of ohnologues in teleosts. In contrast, cbx2 and cbx6 are retained as singletons in the genomes of particular clades of teleost fishes. Following the TGD, one of the two cbx2 ohnologues was lost in Osteoglossomorpha and Otomorpha, whereas one cbx6 ohnologue was lost in Euteleosteomorpha. Finally, the loss of a cbx4 ohnologue occurred later in teleost evolution, since it only concerns fishes from the order Cypriniformes, and zebrafish is so far the only teleost containing cbx4 as a singleton in its genome.
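The per-species gene content summarized in the cladogram can be tabulated with a short sketch such as the one below. The function names are hypothetical; the zebrafish example uses the eight cbx genes reported for this species (cbx2, cbx4, cbx6a, cbx6b, cbx7a, cbx7b, cbx8a and cbx8b), grouped by paralogous class.

```python
from collections import Counter, defaultdict


def copy_number_table(genes):
    """Count gene copies per species and paralogous class.

    `genes` is an iterable of (species, paralogue_class) tuples.
    """
    table = defaultdict(Counter)
    for species, paralogue_class in genes:
        table[species][paralogue_class] += 1
    return table


def retention_status(n_copies):
    """Translate a copy number into a descriptive label."""
    if n_copies == 0:
        return "absent"
    if n_copies == 1:
        return "singleton"
    if n_copies == 2:
        return "pair"
    return "more than two copies"


classes = ("cbx2", "cbx4", "cbx6", "cbx7", "cbx8")
zebrafish = [
    ("Danio rerio", c)
    for c in ("cbx2", "cbx4", "cbx6", "cbx6", "cbx7", "cbx7", "cbx8", "cbx8")
]
table = copy_number_table(zebrafish)
summary = {c: retention_status(table["Danio rerio"][c]) for c in classes}
```

Copy number alone does not establish that two copies are ohnologues arising from the TGD; that conclusion rests on the phylogenetic and synteny analyses.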

The cbx6 Paralogue in Teleost Fishes
Our phylogenetic analysis shows that in Clupeomorpha and Ostariophysi fishes, cbx6 is found as a pair of ohnologues, hereafter called cbx6a and cbx6b (Supplementary Figure S1). While Cbx6a and Cbx6b belong to distinct branches of the Cbx6 phylogenetic tree, protein sequence alignments showed that Cbx6a and Cbx6b share high levels of similarity within conserved motifs such as the chromodomain, the ATHL motif, the Cx6.1 motif and the Pc box (Figure 3, Supplementary Figure S2). In contrast, the Cx6.2 motif, as well as sequences outside of the conserved motifs, present lower levels of similarity between the Cbx6a and Cbx6b proteins.
Genes 2020, 11, x FOR PEER REVIEW 6 of 22
Figure 2. Simplified cladogram of the teleost fishes showing the content of the Pc/cbx genes in different orders. The cbx2, cbx4, cbx6, cbx7 and cbx8 genes are shown as solid blue, violet, yellow, green and red circles, respectively. Empty circles indicate that a given cbx gene is absent in at least one species in the order. Circles filled with light colors show duplicated genes in at least one species in the order. Stars indicate a recent whole-genome duplication in the Salmoniformes (red star: salmonid-specific 4th round of genome duplication, SsGD) and in the carp lineage of Cypriniformes (black star: carp-specific genome duplication, CsGD). The number of species studied in each order is in brackets.

Previous synteny analyses at the genomic level in zebrafish and Tetraodon showed that the chromosomal correspondence of duplicated gene pairs arising from the TGD has been extensively preserved in doubly conserved synteny blocks, while local gene order has been largely scrambled [57,58]. Examination of the genes surrounding the cbx6 ohnologues reveals that cbx6a is flanked at its 5′ end by pla2g6, coding for the phospholipase A2 group VI, in Ostariophysi fishes from the orders Cypriniformes (zebrafish, goldfish), Characiformes (Mexican cavefish), Siluriformes (channel catfish) and Gymnotiformes (electric eel), as well as in Clupeomorpha fishes (Atlantic herring) (Supplementary Figure S3). In contrast, the baiap2l2 gene, encoding the BAR/IMD domain containing adaptor protein 2 like 2, is located at the 5′ position of cbx6b, thus defining distinct doubly conserved synteny blocks for cbx6a and cbx6b.
Among the Osteoglossomorpha, the elephant fish (Paramormyrops kingsleyae) and the Asian bonytongue (Scleropages formosus) are so far the only two species that have fully sequenced genomes [59,60]. As in Ostariophysi and Clupeomorpha, the cbx6 gene exists as a pair of ohnologues in the elephant fish and Asian bonytongue genomes (Figure 2, Supplementary Table S1). Phylogenetic analyses revealed that the Osteoglossomorpha Cbx6 ohnologues are more closely related to each other and to the Ostariophysi and Clupeomorpha Cbx6b than to Cbx6a (Figure 4A, Supplementary Figure S1). However, syntenic studies show that the Osteoglossomorpha cbx6 ohnologues are located within the same distinct doubly conserved synteny blocks as those harboring cbx6a and cbx6b in Ostariophysi and Clupeomorpha (Supplementary Figure S3). Thus, consistent with the teleost cladogram, the two Osteoglossomorpha cbx6 ohnologues originate from the TGD but have diverged less from each other than cbx6a has from cbx6b.
The study of the cbx6 paralogue content in teleosts also revealed that cbx6 is present as a singleton in Euteleosteomorpha fishes, including Protacanthopterygii, Paracanthopterygii and Acanthopterygii (Figure 2, Supplementary Table S1), indicating that one cbx6 ohnologue arising from the TGD has been lost in this fish clade. The only Acanthopterygii species with a sequenced genome that possesses a pair of cbx6 genes is the blunt-snouted clingfish (Gouania willdenowi, order Gobiesociformes). However, both the phylogenetic analyses (Figure 4B) and the syntenic studies, which indicate that one cbx6 copy is located in a highly rearranged region of Chromosome 1 (Supplementary Figure S3), suggest that the second copy of cbx6 arose from a recent duplication event.
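The synteny-based assignment described in this section, in which pla2g6 marks the cbx6a block and baiap2l2 marks the cbx6b block, can be sketched as a lookup of diagnostic neighbors. The function name, the window size and the example gene orders (including the placeholder neighbors) are assumptions made for illustration only.

```python
# Diagnostic flanking genes for the two doubly conserved synteny blocks.
BLOCK_MARKERS = {"pla2g6": "cbx6a block", "baiap2l2": "cbx6b block"}


def assign_block(gene_order, target, window=3):
    """Assign `target` to a synteny block based on nearby marker genes.

    `gene_order` lists gene names along a chromosome segment; `window`
    (an arbitrary choice) is the number of neighbors inspected per side.
    """
    i = gene_order.index(target)
    neighbours = gene_order[max(0, i - window):i] + gene_order[i + 1:i + 1 + window]
    found = {BLOCK_MARKERS[g] for g in neighbours if g in BLOCK_MARKERS}
    if len(found) == 1:
        return found.pop()
    return "unassigned"  # no marker nearby, or conflicting markers


# Hypothetical local gene orders; geneA to geneE are placeholders.
locus_1 = ["geneA", "pla2g6", "cbx6_copy1", "geneB"]
locus_2 = ["baiap2l2", "cbx6_copy2", "geneC"]
locus_3 = ["geneD", "cbx6_copy3", "geneE"]

blocks = [
    assign_block(locus_1, "cbx6_copy1"),
    assign_block(locus_2, "cbx6_copy2"),
    assign_block(locus_3, "cbx6_copy3"),
]
```

A copy that falls outside both blocks, as in the third locus, would be a candidate for a more recent duplication rather than a TGD-derived ohnologue, which is the reasoning applied to the clingfish above.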

The cbx2 Paralogue in Teleost Fishes
A search for the cbx2 paralogues in the sequenced teleost genomes revealed that cbx2 is present as a singleton in Osteoglossomorpha and Otomorpha fishes, but exists as a pair of ohnologues in Euteleosteomorpha (Figure 2, Supplementary Table S1). This suggests that after the TGD leading to the duplication of cbx2, one of the ohnologues was lost at least twice independently, in the Osteoglossomorpha and in the Otomorpha fish lineages. Within the Acanthopterygii, the jewelled blenny (Salarias fasciatus, order Blenniiformes) possesses three copies of cbx2 in its genome (Supplementary Table S1). However, phylogenetic analyses revealed that two of these copies (LOC115385554 and LOC115385547) are highly similar and might have arisen from a recent gene duplication event (Figure 5A).

The synteny studies showed that the cbx2 ohnologues are associated with distinct doubly conserved synteny blocks in Acanthopterygii, including in the medaka fish model (Figure 5B, Supplementary Figure S4). In the jewelled blenny, one of these blocks was subjected to rearrangements generating a second cbx2 gene within the synteny block. Thus, it is likely that the additional cbx2 copy in the jewelled blenny genome results from a recent intrachromosomal rearrangement. Similar rearrangements are also found in species of the Cypriniformes and Salmoniformes orders.
Cypriniformes is the largest group of freshwater fishes, comprising about 4300 described species including zebrafish, goldfish, carps, barbels, minnows, loaches and suckers [61]. Among them, several species were subject to an additional whole-genome duplication event called the carp-specific genome duplication (CsGD). Indeed, while the zebrafish diploid genome is composed of 50 chromosomes, the diploid status of common carp (Cyprinus carpio), goldfish (Carassius auratus) and Chinese barbels (Sinocyclocheilus anshuiensis, S. grahami and S. rhinocerous) is supported by a double-size karyotype consisting of approximately 100-104 chromosomes [62][63][64]. In zebrafish, cbx2 is present as a singleton and, consistent with the CsGD event, there are two cbx2 gene copies in common carp and Chinese barbels (Supplementary Table S1). However, in goldfish, an additional cbx2 copy is identified, and genomic analyses showed that two goldfish cbx2 genes (LOC113067828 and LOC113067392) are located in the same linkage group (LG28B, position NC_039293.1) (Supplementary Figure S4) as the result of an intrachromosomal rearrangement. Similarly, within the Protacanthopterygii, the northern pike (Esox lucius, order Esociformes) has a karyotype composed of about 50 chromosomes [65]. An additional whole-genome duplication event, called the salmonid-specific genome duplication (SsGD), occurred in the common ancestor of Salmoniformes after their divergence from Esociformes [66,67]. Thus, while the northern pike genome harbors a pair of cbx2 ohnologues, most of the Salmonidae, including rainbow trout (Oncorhynchus mykiss), coho salmon (Oncorhynchus kisutch), Arctic char (Salvelinus alpinus) and huchen (Hucho hucho), host four cbx2 genes in their genome because of the SsGD (Supplementary Table S1). However, the Atlantic salmon (Salmo salar) possesses a fifth cbx2 copy in its genome. Remarkably, two of these Atlantic salmon cbx2 genes (LOC106606958 and LOC106606946) are located on the same chromosome (ssa06, position NC_027305.1) (Supplementary Figure S4), suggesting that the additional cbx2 copy appeared due to a recent chromosomal rearrangement.
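The copy-number reasoning used in this section can be made explicit with a small calculation: each additional lineage-specific whole-genome duplication is expected to double the cbx2 content inherited after the TGD, and any surplus points to a local duplication. The sketch below uses the cbx2 counts reported above and assumes, for simplicity, that no copy was lost after the CsGD or SsGD.

```python
def expected_copies(copies_after_tgd, extra_wgd_rounds):
    """Expected copy number if every extra duplication round is fully retained."""
    return copies_after_tgd * 2 ** extra_wgd_rounds


# (species, cbx2 copies retained after the TGD, extra WGD rounds, observed copies)
observations = [
    ("zebrafish", 1, 0, 1),
    ("common carp", 1, 1, 2),        # CsGD
    ("goldfish", 1, 1, 3),           # CsGD plus one extra copy
    ("northern pike", 2, 0, 2),
    ("rainbow trout", 2, 1, 4),      # SsGD
    ("Atlantic salmon", 2, 1, 5),    # SsGD plus one extra copy
]

excess = {
    name: observed - expected_copies(base, rounds)
    for name, base, rounds, observed in observations
}
unexpected = [name for name, surplus in excess.items() if surplus != 0]
```

Under these assumptions only goldfish and Atlantic salmon deviate from the expectation, each by one copy, in line with the intrachromosomal rearrangements proposed for these two species.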

The cbx4, cbx7 and cbx8 Paralogues in Teleost Fishes
Our TBLASTN searches in the sequenced teleost genomes showed that the three Pc orthologues cbx4, cbx7 and cbx8 are present as pairs of ohnologues in all teleost clades (Supplementary Table S1, Figure 2). One possible exception could concern the cbx7 paralogue in Paracanthopterygii. Within the Paracanthopterygii super-order, Atlantic cod (Gadus morhua, order Gadiformes) is the only species for which whole-genome sequencing data are available (Ensembl, assembly gadMor1), and in its genome cbx7 is found as a singleton. However, a cod-specific loss of one cbx7 ohnologue due to a particular chromosomal rearrangement in this species cannot be ruled out. Alternatively, the failure to identify the second cbx7 homologue could be the result of a technical artefact such as missing sequencing or gene prediction information. Therefore, a definitive conclusion about the presence of cbx7 as a singleton or as a pair of ohnologues in Paracanthopterygii can only be reached once genomic information becomes available for other species of the clade.
The cbx4 paralogue is even present as three copies in the Mexican cavefish (Astyanax mexicanus) and red-bellied piranha (Pygocentrus nattereri) genomes, two species of the teleost order Characiformes. Genomic analysis reveals a remarkable conservation of cbx4 chromosomal organization (Supplementary Figure S5). First, cbx4 is located immediately 5′ to cbx8. This feature is not specific to teleosts, since it is also the case in mammals, including humans [37]. Second, the cbx4-cbx8 ohnologous loci are associated with two conserved synteny blocks that are recognizable by their distinct signatures. The first cbx4-cbx8 locus is flanked by the genes card14 (caspase recruitment domain family member 14) and dgke (diacylglycerol kinase epsilon), whereas the other is flanked by the genes tbc1d16 (TBC1 domain family member 16) and arhgap17 (Rho GTPase activating protein 17). The phylogenetic analysis of the Mexican cavefish and red-bellied piranha cbx4 paralogues suggests that two copies of cbx4 arose from the duplication of the same cbx4 orthologue in the two Characiformes species (Figure 6A). In the Mexican cavefish, cbx4 (Gene ID: 10304589) and LOC111194212 derive from the same ancestral gene, while in red-bellied piranha, cbx4 (Gene ID: 108424718) and LOC108410419 also originate from the same ancestor.
Remarkably, in Mexican cavefish, cbx4 is associated with the card14-dgke synteny block, whereas the local chromosomal organization of LOC111194212 totally lacks the characteristics of the conserved cbx4-cbx8 synteny blocks (Figure 6B). Similarly, in red-bellied piranha, cbx4 is associated with the card14-dgke synteny block, whereas LOC108410419 lies in a gene region without homology to the conserved cbx4-cbx8 synteny blocks. The similar properties of the three cbx4 copies in Mexican cavefish and red-bellied piranha suggest that the additional cbx4 copy in these species originates from a single cbx4 duplication event that occurred in the Characiformes before their speciation.
In contrast to Characiformes, which have three cbx4 copies, zebrafish contains a single cbx4 copy in its haploid genome. The loss of one cbx4 ohnologue in zebrafish is specific and restricted, since it affects neither the neighboring cbx8 gene nor the other genes of the syntenic block (Supplementary Figure S5). Furthermore, this cbx4 ohnologue loss occurred in the zebrafish lineage after it diverged from the carp lineage. Indeed, the goldfish and the three Chinese barbels from the genus Sinocyclocheilus all contain four cbx4 gene copies, resulting from a pair of ohnologues subjected to the CsGD (Supplementary Table S1, Supplementary Figure S5).

The cbx3 Genes in Teleost Fishes
The canonical PRC1 (cPRC1) complex is the functional homologue of Drosophila PRC1, composed of Pc, Psc, Ph and Sce. However, in mammals, a heterogeneous group of non-canonical PRC1 (ncPRC1) complexes has also been described [68][69][70]. These ncPRC1 complexes are characterized by the absence of the Pc orthologues CBX2, CBX4, CBX6, CBX7 and CBX8, but the presence of RING1 and YY1-binding protein (RYBP), or its homolog YAF2, associated with RING1/RNF2, one of the PCGF proteins and various other subunits. Among the different ncPRC1 complexes, PRC1.6 (also named E2F6.com, [71,72]) (Figure 7A) is composed of the transcriptional repressor E2F6 in association with RNF2-PCGF6-RYBP/YAF2. In addition, the complex contains WDR5, the oncoprotein L3MBTL2, the transcription factors MAX and MGA and the chromodomain-containing protein CBX3. In contrast to CBX2, CBX4, CBX6, CBX7 and CBX8, which are orthologous to the Drosophila Pc protein, CBX3 is an orthologue of the Drosophila protein HP1 / Su(var)205. In addition, Pc orthologues contain a chromodomain, an AT-hook/ATHL motif and a Pc box, whereas CBX3 is composed of a chromodomain associated with a chromo shadow domain (Pfam ID: PF01393) (Figure 7B). Finally, while the chromodomain of Pc and its orthologues recognizes H3K27me3, the CBX3 chromodomain preferentially binds to the H3K9me3 epigenetic mark.
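The domain-based distinction between Pc orthologues and the HP1-type CBX3 can be summarized as a simple rule. The sketch below is illustrative only: the domain labels and the function name are hypothetical, and in practice domain calls would come from a tool such as SMART or Pfam.

```python
def classify_cbx(domains):
    """Classify a chromodomain protein from its set of annotated domains.

    Domain labels ("chromodomain", "pc_box", "chromo_shadow", "at_hook",
    "athl") are hypothetical names chosen for this sketch.
    """
    found = set(domains)
    if "chromodomain" not in found:
        return "no chromodomain"
    if "chromo_shadow" in found:
        return "HP1-like (CBX3 type)"
    if "pc_box" in found:
        return "Pc orthologue (CBX2/4/6/7/8 type)"
    return "unclassified chromodomain protein"


examples = {
    "Pc-type": ["chromodomain", "athl", "pc_box"],
    "HP1-type": ["chromodomain", "chromo_shadow"],
    "partial": ["chromodomain"],
}
calls = {name: classify_cbx(domains) for name, domains in examples.items()}
```

The chromo shadow domain is checked first because it is the feature that separates HP1-type proteins from the Pc family, both of which carry a chromodomain.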
Although the function of PRC1.6 is not known and there is no evidence showing that PRC1.6 exists in teleost fishes, we conducted a TBLASTN search to identify CBX3 orthologues in fishes. From the NCBI and Ensembl databases, we identified 151 cbx3 genes in the 68 teleost species covered by this study (Supplementary Table S2, Supplementary File S3). Phylogenetic analyses show that cbx3 is present as a pair of ohnologues, cbx3a and cbx3b, in all the teleost clades studied: Osteoglossomorpha, Clupeomorpha, Ostariophysi, Protacanthopterygii, Paracanthopterygii and Acanthopterygii (Figure 7C, Supplementary File S3, Supplementary File S4). Thus, like cbx4, cbx7 and cbx8, cbx3 has been retained as two gene copies in the teleost genome after the TGD.

Pc Gene Expression in the Zebrafish Model
The zebrafish genome encodes eight Pc orthologues. These cbx paralogues are cbx2, cbx4, cbx6a, cbx6b, cbx7a, cbx7b, cbx8a and cbx8b. Since zebrafish serves as a powerful vertebrate model for studying gene expression during early development and for modeling human diseases, we investigated the gene expression profiles of the eight cbx paralogues in this organism. In zebrafish, zygotic transcription starts at about cell cycle 10-13 (around 3.5 hours post-fertilization (hpf)). Before this midblastula transition (MBT) stage, all developmental events rely on maternally deposited gene products [73,74]. The study of cbx expression patterns before the MBT at the 1-cell stage, as well as after the MBT at 24 and 48 hpf, using whole-mount in situ hybridization revealed that all the cbx family members are globally ubiquitously expressed (Figure 8A, Supplementary File S5). In situ hybridization showed that the cbx transcripts are maternally loaded into the embryos, since a signal could be detected before the MBT at the 1-cell stage, even if the labelling remains quite low for cbx6a, cbx6b and cbx7a. At 24 hpf, cbx mRNAs are ubiquitously present in the embryo. The expression becomes more restricted to the developing brain, the gut and the pectoral fin buds at 48 hpf. However, some differences in cbx expression could be observed. At 48 hpf, the expression pattern of cbx2 is similar to the ezh2 expression profile, with a marked signal at the midbrain-hindbrain boundary and in the pectoral fin buds, whereas cbx6b and cbx7b show a more diffuse labelling in the brain and a weaker signal in the pectoral fin buds (Figure 8B).

Analyses of mRNA abundance measured by RT-PCR showed that cbx mRNA levels vary during zebrafish development from 1 hpf to 5 days post-fertilization (dpf) (Figure 8C). In particular, cbx4, cbx6a, cbx6b and cbx7a expression is reduced at early stages and increases after 6 hpf, whereas for cbx7b, cbx8a and cbx8b, a decrease in mRNA levels is found between 6 and 48 hpf. A delay between the degradation of maternal mRNAs occurring at the MBT and the start of zygotic expression of these genes might account for the reduction in mRNA abundance between 6 and 48 hpf. It is worth noting that these variations in cbx mRNA abundance parallel those reported using large-scale RNA-Seq experiments during zebrafish development [75]. Finally, in adult zebrafish, our RT-PCR experiments showed that all cbx family members are expressed ubiquitously (Figure 8D).

Discussion
The Pc/CBX family proteins are components of the canonical PRC1 protein complex, which maintains transcriptional repression of hundreds of genes involved in development, differentiation, signaling or cancer. Since these proteins directly bind to the epigenetic mark H3K27me3, they are key elements targeting the PRC1 complex to its chromatin sites. While Drosophila melanogaster contains a single Pc protein, mammalian genomes code for five CBX paralogues, CBX2, CBX4, CBX6, CBX7 and CBX8. This diversity might be crucial for the control of gene expression programs, since it is believed that changes in the CBX protein content within the PRC1 complexes could relocalize PRC1 to different target genes during differentiation processes.
Here, we performed a global analysis of the cbx gene family in teleost fishes and traced the distribution of the cbx genes through teleost evolution. Teleost fish experienced at least three rounds of whole-genome duplication: the first two before the divergence of lamprey from the jawed vertebrates, and a third, the teleost-specific TGD, at the base of the teleosts [58,76,77]. The TGD was followed by a rediploidization process associated with a massive loss of duplicated genes. However, a number of genes are still maintained as pairs of ohnologues in teleost genomes. Consequently, the number of cbx family members is expected to be higher in teleost fishes than in mammals. Database searches identified 689 cbx genes in 68 teleost species that belong to 21 different orders and 6 different super-orders (Osteoglossomorpha, Clupeomorpha, Ostariophysi, Protacanthopterygii, Paracanthopterygii and Acanthopterygii). The five cbx paralogues, cbx2, cbx4, cbx6, cbx7 and cbx8, were found in all teleost fish species. This contrasts sharply with the situation observed for other components of the PRC1 protein complex [37]. Indeed, mammalian genomes contain six Psc/PCGF paralogues, PCGF1, PCGF2, PCGF3, BMI1, PCGF5 and PCGF6, whereas both pcgf2 and pcgf3 are absent in zebrafish and pcgf1 is absent in medaka. Similarly, there are two Sce/RING1 paralogues in mammals, RING1 and RNF2, while rnf2 is the only paralogue present in zebrafish. The maintenance of the five cbx paralogues in teleosts could suggest the absence of functional redundancy between the Cbx family members, whereas Pcgf members could be redundant. The hypothesis of Pcgf redundancy in teleost fishes is in agreement with the absence of certain pcgf gene members in the genomes of several fish species, including zebrafish and medaka [37], and with the fact that pcgf1 zebrafish mutants are viable and fertile [78]. However, the situation might be different in mammals, where PCGF functions are definitely not redundant [79,80].
Surprisingly, although only about 15% to 20% of the ohnologues were retained as pairs after the TGD in teleosts [9], the maintenance of two ohnologues is, overall, a general characteristic of the cbx family members. Cbx8 is present as a pair of ohnologues (cbx8a and cbx8b) in all the teleost clades examined, as is the case for cbx4 and probably also for cbx7. Cbx7 is identified as a pair of ohnologues in all clades except Paracanthopterygii. However, the Atlantic cod is the only species from this super-order whose genome sequence is available. It is thus difficult to conclude whether the loss of one of the two cbx7 ohnologues is specific to the Atlantic cod or whether it reflects a feature common to all Paracanthopterygii. In contrast, cbx2 and cbx6 are present as pairs of ohnologues in the genomes of several teleost clades but as singletons in others. Following the TGD, one cbx6 ohnologue was lost in the common ancestor of Euteleosteomorpha. Concerning cbx2, one of the two ohnologues was lost in Osteoglossomorpha and Otomorpha, whereas both copies were retained in Euteleosteomorpha. This suggests that the loss of a cbx2 ohnologue occurred at least twice during teleost evolution, in Osteoglossomorpha and in Otomorpha. The reason why cbx4, cbx7 and cbx8, but also cbx3, are retained as pairs of ohnologues in the teleost genomes, whereas cbx2 and cbx6 remain as singletons in different teleost clades, is not clear. One possibility could be linked to neofunctionalization and/or subfunctionalization of one of the cbx3, cbx4, cbx7 and cbx8 ohnologues, but it could also be due to constraints imposed by the presence or absence of other genes in the blocks of synteny.
Our analysis of the teleost cbx gene family also sheds light on other rearrangements that occurred later in particular teleost lineages. For instance, a third copy of cbx2, probably arising from an intrachromosomal rearrangement, is identified in the jewelled blenny (Salarias fasciatus, order Blenniiformes). Notably, an additional cbx4 copy originates from a single cbx4 duplication event that occurred in the Characiformes before their speciation, since three cbx4 copies are found in the genomes of the two Characiformes for which genomic sequences are available, the Mexican cavefish (Astyanax mexicanus) and the red-bellied piranha (Pygocentrus nattereri).
In this landscape, the genomic cbx gene content in zebrafish, with cbx2 and cbx4 present as singletons but cbx6, cbx7 and cbx8 retained as pairs of ohnologues, appears to be an original combination, unique among the teleost fishes whose genomes have been sequenced.
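The singleton/pair bookkeeping used above can be illustrated with a minimal sketch in Python. It takes the eight zebrafish cbx gene symbols listed in the Results and counts the copies per cbx family. The function name `classify_retention` and the symbol-parsing rule (a trailing letter marks a duplicate copy) are illustrative assumptions, not part of the analysis reported here. Copy number is only a proxy: establishing true ohnology requires the phylogenetic and synteny evidence described in this study.

```python
import re
from collections import defaultdict

# Zebrafish cbx paralogues as listed in the Results section of this article.
ZEBRAFISH_CBX = ["cbx2", "cbx4", "cbx6a", "cbx6b", "cbx7a", "cbx7b", "cbx8a", "cbx8b"]


def classify_retention(gene_symbols):
    """Label each cbx family by copy number (hypothetical helper).

    Assumption: a symbol is 'cbx' + digits, optionally followed by one
    lowercase letter that distinguishes duplicated copies (e.g. cbx6a, cbx6b).
    """
    copies = defaultdict(list)
    for symbol in gene_symbols:
        match = re.fullmatch(r"(cbx\d+)([a-z]?)", symbol)
        if match is None:
            raise ValueError(f"unexpected gene symbol: {symbol!r}")
        # Group by family name with the copy suffix stripped.
        copies[match.group(1)].append(symbol)

    labels = {}
    for family, members in copies.items():
        if len(members) == 1:
            labels[family] = "singleton"
        elif len(members) == 2:
            labels[family] = "pair"
        else:
            # More than two copies, e.g. after a lineage-specific duplication.
            labels[family] = "expanded"
    return labels


if __name__ == "__main__":
    for family, label in classify_retention(ZEBRAFISH_CBX).items():
        print(family, label)
```

Applied to the zebrafish list, the sketch labels cbx2 and cbx4 as singletons and cbx6, cbx7 and cbx8 as pairs, which matches the combination described above.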
Zebrafish has proven to be a unique vertebrate model for studying Polycomb group (PcG) genes during early development [81][82][83][84]. In this context, the description of PcG gene expression during development is of particular interest. The expression of the Psc/PCGF family as well as several other PcG genes has already been reported [78,85], but very little is known about Pc/CBX gene expression during zebrafish development. Whole-mount in situ hybridization and RT-PCR experiments showed that zebrafish cbx genes are maternally expressed in the embryo at diverse levels. In particular, cbx4, cbx6a and cbx7a mRNAs appear less abundant at the 1-cell stage than at later developmental stages, which is not the case for cbx2, cbx7b, cbx8a or cbx8b. At 24 hpf, cbx mRNAs are present throughout the embryo, but at 48 hpf the expression becomes enriched in anterior regions such as the brain, the pectoral fin buds and the gut. Although all the cbx genes are expressed in the brain, differences in their expression patterns could be observed using in situ hybridization. Finally, all the cbx family members are expressed in adult zebrafish tissues.
In conclusion, our observations contribute to the understanding of Polycomb orthologue evolution in fish, and the characterization of cbx expression during zebrafish development will be useful for future studies aimed at understanding the functional role of each cbx family member.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4425/11/4/362/s1, Table S1: The teleost Pc orthologues, Table S2: The teleost cbx3 genes, File S1: Rooted version including gene names of the phylogenetic tree of Figure 1, File S2: List of the protein sequences encoded by the teleost Pc orthologues identified in this study, File S3: List of Cbx3 protein sequences encoded by the teleost genomes and identified in this study, File S4: Version of the Cbx3 phylogenetic tree of Figure 7 including the gene symbols, File S5: in situ hybridization on groups of zebrafish embryos at the 1-cell stage, 24 hpf and 48 hpf for cbx2, cbx4, cbx6a, cbx6b, cbx7a, cbx7b, cbx8a and cbx8b, File S6: Original RT-PCR gels, Figure S1: Phylogenetic tree of the evolutionary relationships between teleost Cbx6 paralogues, Figure S2: Sequence alignment for Cbx6 proteins from Ostariophysi and Clupeomorpha fishes, Figure S3: Blocks of synteny at the cbx6 locus in teleost fishes, Figure S4: Blocks of synteny at the cbx2 locus in teleost fishes, Figure S5: Blocks of synteny at the cbx4-cbx8 locus in teleost fishes, Figure