Evidence for Dosage Compensation in Coccinia grandis, a Plant with a Highly Heteromorphic XY System

About 15,000 angiosperms are dioecious, but the mechanisms of sex determination in plants remain poorly understood. In particular, how Y chromosomes evolve and degenerate, and whether dosage compensation evolves as a response, are matters of debate. Here, we focus on Coccinia grandis, a dioecious cucurbit with the highest level of X/Y heteromorphy recorded so far. We identified sex-linked genes using RNA sequences from a cross and a model-based method termed SEX-DETector. Parents and F1 individuals were genotyped, and the transmission patterns of SNPs were then analyzed. In the >1,300 sex-linked genes studied, maximum X-Y divergence was 0.13–0.17, and substantial Y degeneration is implied by an average Y/X expression ratio of 0.63 and an inferred gene loss on the Y of ~40%. We also found that reduced Y gene expression is compensated by elevated expression of the corresponding genes on the X, and that sex-biased genes are in excess on the sex chromosomes. Molecular evolution of sex-linked genes in C. grandis is thus comparable to that in Silene latifolia, another dioecious plant with a strongly heteromorphic XY system, and cucurbits are the fourth plant family in which dosage compensation is described, suggesting it might be common in plants.


Introduction
Some 5 or 6% of the angiosperms, depending on the assumed total species number, have male and female sporophytes, a sexual system termed dioecy (Renner, 2014). Transitions from other sexual systems towards dioecy are estimated to have occurred between 871 and 5,000 times independently (Renner, 2014). Chromosomes and sex determination have been studied in few dioecious plants, however, and microscopically distinguishable (heteromorphic) sex chromosomes have been reported in about 50 species only (Ming et al., 2011). An important question is whether sex chromosomes evolve similarly in plants and animals. For example, the evolution towards heteromorphy might be common between both lineages, with an autosomal origin of the sex chromosomes, gradual recombination suppression between X and

identified as an X/Y SNP (with the male-specific allele being the Y allele). The information of all SNPs in a gene is then combined into a probability for the gene to be sex-linked.

seq-based segregation analysis is both relatively cheap and efficient, and has been applied in several plant systems in which sex-linked genes have been identified successfully, initially

The flower buds were sent to IPS2 Paris, France using RNAlater-ICE kits by Thermo Fisher. Total RNA was extracted from 12 flower bud samples using Agilent's spin column purification method, mRNA was isolated with Oligo-dT Beads from NEB and RNA-seq libraries were constructed with the Directional Kit from NEB. Sequencing was performed at IPS2 Paris, France, with Illumina NextSeq500 following a paired-end protocol of library preparation (fragment lengths 100-150 bp, 75 bp sequenced from each end). RNA samples were checked for quality, individually tagged and sequenced (see Supplementary Table 1

Estimating gene loss
X-hemizygous genes (X-linked genes without detectable Y copies) have been used to infer the extent of gene loss on Y chromosomes.
This only gives a rough idea of gene loss, as X-hemizygous genes inferred by SEX-DETector comprise both genes with deleted or silenced Y copies (true lost Y genes) and genes with Y copies that are expressed in some tissues but not in the one used for RNA-seq (false lost Y genes). Also, X-hemizygous contigs are inferred by SEX-DETector from X polymorphism, as explained in Muyle et al., 2016, whereas X/Y contig inference relies on fixed mutations. X-hemizygous contigs can therefore only be detected in contigs with X polymorphism, resulting in their underestimation (Bergero & Charlesworth, 2011). We corrected for this by using the number of X-hemizygous contigs (168) relative to X/Y contigs with X polymorphism (424) that were listed in the output of SEX-DETector. Premature stop codons were detected using a custom script on X and Y alleles.
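The correction described above can be illustrated with a short calculation. This is a minimal sketch, not the authors' script: it assumes that "relative to" means a simple ratio of the two contig counts, a reading that matches the ~40% gene loss quoted in the abstract. The function name `y_gene_loss_estimate` is hypothetical.

```python
def y_gene_loss_estimate(n_x_hemizygous: int, n_xy_with_x_polymorphism: int) -> float:
    """Rough Y gene-loss estimate: X-hemizygous contigs relative to X/Y contigs
    that carry X polymorphism (assumed here to be a plain ratio)."""
    if n_xy_with_x_polymorphism <= 0:
        raise ValueError("need at least one X/Y contig with X polymorphism")
    return n_x_hemizygous / n_xy_with_x_polymorphism


# Counts reported in the text: 168 X-hemizygous contigs, 424 X/Y contigs with X polymorphism.
gene_loss = y_gene_loss_estimate(168, 424)
print(f"Estimated Y gene loss: {gene_loss:.1%}")
```

Restricting the denominator to X/Y contigs with X polymorphism puts both classes on the same detection footing, since X-hemizygous contigs can only be found where the X is polymorphic.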

Sex-linked genes identified by SEX-DETector
We assembled a de novo transcriptome for C. grandis with male and female reads (82,699 contigs, see Table 1). We used RNA-seq data from a C. grandis F1 cross mapped to our reference transcriptome to identify genes located on the sex chromosomes (Supplementary Table 1). The raw reads were mapped onto open reading frames (ORFs). A total of 45.76% of reads were mapped with standard mapping and 49.24% with SNP-tolerant mapping (see Supplementary Table 3). We divided the contigs expressed in buds into autosomal, sex-linked X/Y (defined as contigs having both X- and Y-linked alleles), and X-hemizygous contigs (sex-linked, but with no Y-copy expression). These categories were inferred from single nucleotide polymorphisms (SNPs) segregating in a family, using a probabilistic model (Muyle et al., 2016). Out of the 82,699 contigs, 5,070 had enough informative SNPs to be assigned to a segregation type: 3,706 were inferred as autosomal (73.10% of contigs with enough informative SNPs), 1,196 as X/Y (23.59%), and 168 as X-hemizygous (3.31%) (see Table 2).
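The proportions above follow directly from the reported counts. The sketch below recomputes them as a consistency check. The variable names are illustrative and not taken from the SEX-DETector output.

```python
# Contig counts per segregation type, as reported in the text.
counts = {"autosomal": 3706, "X/Y": 1196, "X-hemizygous": 168}

# Contigs with enough informative SNPs to be assigned a segregation type.
total_assigned = sum(counts.values())

# Share of each segregation type among assigned contigs, in percent.
shares = {name: round(100 * n / total_assigned, 2) for name, n in counts.items()}

# Sex-linked contigs are the X/Y and X-hemizygous classes combined.
n_sex_linked = counts["X/Y"] + counts["X-hemizygous"]

for name, pct in shares.items():
    print(f"{name}: {counts[name]} contigs ({pct}%)")
print(f"sex-linked total: {n_sex_linked}")
```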

Age of the C. grandis XY system
Age estimates are based on the divergence between X and Y copies and three Brassicaceae molecular clocks (see Table 3 maximum value of 10 years), yields much higher age estimates for XY divergence (see Table 3).

Patterns of Y degeneration
We looked for patterns of degeneration in our data. Males showed lower gene expression than females for most of the genes, because a small subset of genes are very highly expressed in developing male flower buds resulting in apparent lower mean expression after normalization for total library size (Supplementary Figure 2). After correcting for this expression bias between males and females (see Methods), we found that sex-linked genes are less expressed in males than in females (see Supplementary Figures 2 and 4; Wilcoxon ranked test p-value = 2.57 × 10⁻⁸). To refine our analysis of Y chromosome degeneration in C. grandis, we analysed the allelic expression of genes inferred as sex-linked, which showed that Y-linked alleles were significantly less expressed than X-linked alleles in males (see Figure 1, Wilcoxon ranked test p-value = 3.06 × 10⁻¹³). Lost Y genes can be detected by SEX-DETector when the Y copy is absent or unexpressed and are assigned as X-hemizygous. But given that X

To determine whether some genes are dosage-compensated, we first studied the log2 fold change between male and female expressions. In the absence of dosage compensation, the Xmale/2Xfemale expression ratio is expected to be 0.5, so the log2 of the ratio is expected to be -1, because males (XY) have one X-linked copy and females (XX) have two. This is what we observed for contigs that do not show reduced expression of the Y-linked allele relative to the X-linked allele, i.e., that have a Y/X expression ratio close to 1 (median of log2 Xmale/2Xfemale ratio is -1.29 for contigs with Y/X > 1; see Figure 2). For contigs with reduced Y expression (low Y/X ratios), we observed a higher Xmale/2Xfemale expression ratio, which suggests that dosage compensation occurs for some genes (median of contigs with Y/X ≤ 0.5 is -0.85; see Figure 2).
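The expectation described above can be checked numerically. The sketch below is illustrative only: it computes the log2 value expected without dosage compensation and converts the two reported medians back to linear Xmale/2Xfemale ratios. The helper `ratio_from_log2` and the dictionary labels are assumptions made for this example.

```python
from math import log2

# Without dosage compensation, one male X copy is compared with two female X copies.
expected_ratio_no_compensation = 1 / 2
expected_log2_no_compensation = log2(expected_ratio_no_compensation)  # equals -1

# Full compensation would give equal X-linked expression in both sexes (ratio 1, log2 = 0).
expected_log2_full_compensation = log2(1.0)


def ratio_from_log2(log2_value: float) -> float:
    """Convert a log2 Xmale/2Xfemale value back to a linear expression ratio."""
    return 2.0 ** log2_value


# Median log2 Xmale/2Xfemale values reported in the text for two Y/X categories.
reported_medians = {"Y/X > 1": -1.29, "Y/X <= 0.5": -0.85}
linear_ratios = {label: ratio_from_log2(v) for label, v in reported_medians.items()}

for label, ratio in linear_ratios.items():
    print(f"{label}: Xmale/2Xfemale = {ratio:.2f}")
```

On the linear scale, the category with reduced Y expression sits above the uncompensated expectation of 0.5 but well below 1, which is consistent with partial rather than complete compensation for those contigs.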
Finally, in X-hemizygous contigs, the distribution of

compensation and total compensation resulting in equal expression in males and females. This trend is also present in a less visible pattern for X/Y contigs with low Y expression (see Figure 2). To investigate dosage compensation further, we compared expression of X-linked and Y-linked alleles in males and females for different Y/X expression ratio categories (Figure 3), using female expression as a reference. We excluded 1% of the sex-linked contigs that showed either an elevated Y expression (high Y/X ratios) or sex-biased X expression (very high or very low Xmale/2Xfemale ratios, see Methods). The Y/X ratio was computed in C. grandis males and averaged between individuals, and used as a proxy for Y degeneration. In the absence of dosage compensation, the Xmale/2Xfemale expression ratio is expected to be 0.5. Instead, we found that X expression in males increases with decreasing Y expression,

retained genes that were identified as sex-biased with at least two methods (3,273 genes). Among these, 2,682 (81.94%) were male-biased and 591 (18.06%) female-biased. Genes with sex-biased expression were significantly over-represented among sex-linked genes (see Supplementary Table 4, Fisher's exact test p-value < 2.2 × 10⁻¹⁶), with 241 out of 1,364 sex-linked genes being sex-biased (17.67% of sex-linked genes, with respectively 13.42% and 4.25% having male and female-biased expression), and 206 out of 3,706 autosomal genes being sex-biased (5.56% of them, 4.45% with male and 1.11% with female-biased expression). Out of the sex-biased genes that were localized on the sex chromosomes, 228 (94.61%) had an X/Y segregation type (181 male-biased and 47 female-biased), and only 11 male-biased genes and 2 female-biased were X-hemizygous.
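The over-representation of sex-biased genes among sex-linked genes can be reproduced approximately from the counts given above. The sketch below is not the authors' analysis: it builds the 2×2 table from the reported numbers and computes a one-sided Fisher's exact test (upper hypergeometric tail) in log space with the standard library. Whether the original test was one- or two-sided is not stated, so the one-sided choice is an assumption, and the function names are hypothetical.

```python
from math import exp, lgamma


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_upper_tail(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the table [[a, b], [c, d]]:
    probability of seeing a or more in the top-left cell, margins fixed."""
    row1, row2, col1 = a + b, c + d, a + c
    log_denominator = log_comb(row1 + row2, col1)
    k_max = min(row1, col1)
    return sum(
        exp(log_comb(row1, k) + log_comb(row2, col1 - k) - log_denominator)
        for k in range(a, k_max + 1)
    )


# Counts from the text: 241 of 1,364 sex-linked and 206 of 3,706 autosomal genes are sex-biased.
sex_linked_biased, sex_linked_total = 241, 1364
autosomal_biased, autosomal_total = 206, 3706

a, b = sex_linked_biased, sex_linked_total - sex_linked_biased
c, d = autosomal_biased, autosomal_total - autosomal_biased

odds_ratio = (a * d) / (b * c)
p_value = fisher_upper_tail(a, b, c, d)

print(f"sex-linked: {100 * a / sex_linked_total:.2f}% sex-biased")
print(f"autosomal:  {100 * c / autosomal_total:.2f}% sex-biased")
print(f"odds ratio: {odds_ratio:.2f}, one-sided p-value: {p_value:.3g}")
```

Working in log space avoids overflow in the binomial coefficients, which are far too large for direct floating-point evaluation at these sample sizes.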
X-hemizygous contigs were not enriched for differentially expressed genes when compared to autosomal contigs (see

C. grandis Y chromosome degeneration is moderate, with an unusually reduced Y expression
Our results suggest that not only is the Y chromosome of C. grandis accumulating repeats as shown previously, but it is also losing genes and becoming silent.
