Genetic architecture of male fertility restoration in a hybrid breeding system of rye (Secale cereale L.)

The 'Gülzow' (G) type cytoplasmic male sterility (CMS) system in hybrid rye (Secale cereale L.) breeding exhibits a strong and environmentally stable restoration of male fertility (Rf). Although the system has received little scientific attention, three G-type Rf genes have been identified: one major gene on 4RL (Rfg1) and two minor genes on 3R (Rfg2) and 6R (Rfg3). Here, we report a comprehensive investigation of the genetics underlying restoration of male fertility in a large G-type CMS breeding system using a palette of complementary forward and reverse genetic analyses. This includes (i) genome-wide association studies (GWAS) on G-type germplasm, (ii) GWAS on a biparental mapping population, (iii) in silico identification of Rf-like pentatricopeptide repeat (RFL-PPR) genes and their expression in G-type rye hybrids, and (iv) mining of patterns in linkage disequilibrium. Our findings provide compelling evidence of a novel major G-type non-PPR Rf gene on chromosome 3RL. In the in silico analysis, we identified 22 RFL-PPR genes, of which 15 were expressed in the transcriptome of G-type hybrids. Our findings provide novel insight into the genetics underlying male fertility restoration in a G-type CMS system in rye. The discovery made in this study is distinct from the known P- and C-type systems in rye, as well as from known CMS systems in barley and wheat. This study constitutes a stepping stone towards understanding the restoration of male fertility in the G-type CMS system and a potential resource for addressing the inherent issues of the P-type system.

[...] the wheat 6DS 6H. Rfp3 4RL an orthologous Rf6 wheat 6BS and Rfm1 on barley [...] These provide further evidence of a conserved synteny between rye 4RL and wheat explained by the later divergence of these two species from an ancestral Triticeae [...]



In recent years, hybrids have become the predominant class of cultivated winter rye (Secale cereale L.) in Northern Europe [1]. Outperforming population-based cultivars, rye hybrids demonstrate strong heterotic effects on all developmental and yield characteristics [2,3]. Breeding of hybrids relies on the existence of cytoplasmic male sterility (CMS) and restoration of male fertility (Rf) genes that reside in genetically distinct parental populations [3,4]. This system enables efficient control of the parental crossing in the field, a prerequisite for large-scale hybrid seed production [5].

In hybrid rye, numerous CMS systems exist, of which the most predominant is the 'Pampa' (P) type [6]. In this system, five major P-type Rf genes have been identified on chromosomes 1RS, 4RL (Rfp1, Rfp2, Rfp3) and 6R (dominant modifier), and three minor genes on chromosomes 3RL, 4RL and 5R [7-10]. Less prevalent CMS systems include the 'Gülzow' (G) type originating from the Austrian population of the rye variety 'Schlägler alt' [11], the R-type originating from a Russian population [12], and the C- [13] and S- [14] types originating from an old Polish cultivar, 'Smolickie'. In the G-type CMS system, one major gene has been identified on 4RL (Rfg1) and two minor genes on 3R (Rfg2) and 6R (Rfg3) [15]. In the C-type CMS system, two major Rf genes have been identified on 4RL (Rfc1) and 6RS (Rfc2) [16,17]. Intriguingly, Stojalowski et al. [18] observed a linkage between major [...] unsatisfactory restoration [19,20].

In 1991, several non-adapted Argentinian and Iranian rye populations with a high frequency of restorer gametes were identified [25]. Crossing an elite maternal line with one of these non-adapted exotics led to observations of significantly higher restoration levels and environmental stability [26,27]. To steer the introgression of novel superior exotic Rf genes through marker-assisted selection, molecular markers were developed for Rfp1, Rfp2 and Rfp3 [8,9]. Hybrids carrying an exotic Rf gene were, however, found to exhibit a significant reduction in grain [...] there is little available information on RFL-PPRs in rye.

In this paper, we report a comprehensive study of the genetics underlying male-fertility restoration in a G-type CMS based hybrid rye breeding system. The objective of this study was to identify major and minor G-type Rf genes. This was approached through (i) genome-wide association studies (GWAS) [...]

The genetic structure of the germplasm has been thoroughly characterized in a recent study by Vendelbo et al. [51]. A biparental mapping population was developed from the hybrid rye cv. Stannos, which derives from the cross of a cytoplasmic male-sterile (CMS) line, msG214135, and a restorer line, R3966.

Biparental mapping population
To investigate the inheritance of male-fertility restoration in the G-type CMS based Nordic Seed [...]

Nordic Seed hybrid rye cv. Helltop and cv. Stannos belong to the G-type hybrid breeding system of rye.

For the identification of causative Rf genes in the G-type breeding system, expression of the candidate RFL-PPR genes was investigated in de novo transcriptome assemblies of these two hybrids. The [...]

A genome-wide association study (GWAS) was conducted using population origin as phenotypic input [...]

The population was phenotyped for six traits, comprising restoration of male fertility and traits related to restoration, in order to obtain a comprehensive dataset on the inheritance of 'Gülzow' (G) type Rf genes.

Seed number and pollen production were found, on the basis of our observations, to be the most representative Rf-associated traits (Fig. 2A-B).

The observed segregation ratio of sterile and fertile F2 plants was tested for goodness of fit to the expected Mendelian ratio under the scenarios of one, two, and three major Rf genes using a χ² test. [...]

[...] for both populations using the 600K SNP array genotype data as a comparative tool to visualize the LD landscape at the Rf annotated regions (Fig. 7).
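The goodness-of-fit test described above for the F2 segregation ratio can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the plant counts are hypothetical, and the expected ratios for two and three genes (15:1 and 63:1) assume dominant genes with duplicate function, which the text does not specify. Only the 3:1 ratio for a single dominant gene follows directly from standard Mendelian expectations. With two phenotype classes the test has one degree of freedom, so the p-value can be computed from the complementary error function without external libraries.

```python
import math


def chi_square_gof(observed, expected_ratio):
    """Chi-square goodness of fit for two classes (1 degree of freedom).

    observed: (fertile, sterile) plant counts.
    expected_ratio: Mendelian ratio, e.g. (3, 1).
    Returns (chi-square statistic, p-value).
    """
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, the chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts (NOT data from the study): fertile vs. sterile F2 plants.
observed_counts = (136, 45)

# Assumed ratios; only the single-gene 3:1 case is implied by the text.
hypotheses = {
    "one gene (3:1)": (3, 1),
    "two genes (15:1, assumed)": (15, 1),
    "three genes (63:1, assumed)": (63, 1),
}

results = {
    name: chi_square_gof(observed_counts, ratio)
    for name, ratio in hypotheses.items()
}

for name, (stat, p_value) in results.items():
    print(f"{name}: chi2 = {stat:.3f}, p = {p_value:.3g}")
```

A hypothesis is retained when its p-value stays above the chosen significance level; with these illustrative counts only the single-gene ratio is not rejected.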

In the 600K case-control GWAS, a unique strong peak was identified on 1RS (Fig. 1B, C). While the [...] 1RS in the mapping population GWAS (Fig. 3, Fig. 4). This can either be due to the absence of the [...]

In order to identify the complement of major and minor Rf genes in the G-type CMS system, GWAS [...] Mbp with a mean LOD of 4.7 (Fig. 3, Supplementary material 3). This region was furthermore [...] show any evidence of an Rf gene under selection in the germplasm (Fig. 7C,D). Contrary to the C- and P-type CMS systems, this suggests a negligible role of 4R in the G-type system. Furthermore, in addition to a major Rf gene on 3RL exclusive to the G-type CMS system, our findings suggest the potential presence of an additional minor gene in the distal region of 3RL.

Decisive role of 3R in the G-type CMS breeding system
In our GWAS, we found a strong coinciding peak on 3RL in both the 20K and 600K case-control analyses, suggesting that 3R houses a population-differentiating trait (Fig. 1A-B). This finding was consistent with the discovery of a distinctly higher interchromosomal fixation index (Fst) on 3R (Fig. 6). Furthermore, Vendelbo et al. [51] reported a singular enrichment of interchromosomal LD for both parental populations on 3R. In conjunction, these findings accentuate the pivotal role of chromosome 3R in the assayed G-type hybrid rye elite breeding germplasm.
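To make the fixation index mentioned above concrete, the sketch below computes Wright's Fst for a single biallelic SNP from the allele frequencies in two parental populations, using Fst = (H_T - H_S) / H_T. The estimator, the equal weighting of the two populations, and the allele frequencies are assumptions for illustration; the text does not state which Fst estimator was used.

```python
def fst_two_populations(p1, p2):
    """Wright's Fst for one biallelic SNP in two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    """
    # Mean expected heterozygosity within populations (H_S).
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled population (H_T).
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # Allele fixed identically in both populations: no differentiation.
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies (not taken from the study).
differentiated = fst_two_populations(0.9, 0.1)  # strongly diverged SNP
shared = fst_two_populations(0.5, 0.5)          # identical frequencies

print(f"diverged SNP: Fst = {differentiated:.2f}")
print(f"shared SNP:   Fst = {shared:.2f}")
```

A SNP near a locus under divergent selection between the maternal and restorer pools would be expected to resemble the first case, while most of the genome would sit closer to the second.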

To investigate whether the population-differentiating region on 3RL harbored a G-type Rf gene, a biparental mapping population was developed. In contrast to the case-control GWAS, the biparental mapping population is not subject to confounding issues related to population structure (Supplementary Fig. 1). The segregation ratio of Rf-associated traits in the mapping population was found by χ² test to be in accordance with monogenic dominant inheritance of an Rf gene, consistent with the singular peak identified in the case-control GWAS (Fig. 1A-B). GWAS on the phenotypic dataset from the mapping population confirmed that the major Rf gene co-localized with the region identified on 3RL in the case-control GWAS (Fig. 3A-B). The precise position of the causative Rf gene was, however, initially obscured by the finding that the four most strongly associated SNP markers, deriving from the 90K wheat (Triticum aestivum L.) array, could not be mapped to the rye reference genome 'Lo7' [53,57]. This was resolved by chromosome-wise LD mapping of each of the wheat markers, with the most strongly associated marker mapping to 747 Mbp (Fig. 7, Supplementary Fig. 2).
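The idea of placing an unmapped marker by linkage disequilibrium can be illustrated with a small sketch. It assumes that LD is measured as the squared Pearson correlation (r²) between genotype dosages (0, 1, 2) and that the unmapped marker is assigned to the position of the mapped marker with the highest r². Both the measure and the toy genotypes and positions are assumptions for illustration, not the authors' procedure or data.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A monomorphic marker carries no LD information.
        return 0.0
    return cov * cov / (var_x * var_y)


def place_by_ld(unmapped, mapped):
    """Return (position, r2) of the mapped marker in highest LD.

    mapped: dict of physical position -> genotype dosage vector.
    """
    best_position = max(mapped, key=lambda pos: r_squared(unmapped, mapped[pos]))
    return best_position, r_squared(unmapped, mapped[best_position])


# Toy data: one unmapped marker and three mapped markers on one chromosome.
# Positions (in Mbp) and genotypes are hypothetical.
unmapped_marker = [0, 1, 2, 2, 0, 1, 1, 0]
mapped_markers = {
    100: [0, 0, 1, 2, 1, 2, 0, 1],
    500: [0, 1, 2, 2, 0, 1, 1, 0],
    900: [0, 1, 2, 1, 0, 1, 1, 0],
}

best_position, best_r2 = place_by_ld(unmapped_marker, mapped_markers)
print(f"best position: {best_position} Mbp (r2 = {best_r2:.2f})")
```

In practice the LD profile along the chromosome would be inspected rather than a single maximum, since extended LD blocks can make several neighbouring positions nearly equivalent.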

Novel major Rf gene unique to the G-type CMS breeding system
While no major Rf gene on 3RL has to our knowledge been identified in either the G-type or P-type [...] PPR on 3R to cause restoration in the G-type breeding system. In the in silico analysis, we successfully [...] (ryePPR_3R_26) (Fig. 3, Fig. 4) [...]

The data that support the findings of this study are presented in the supplementary materials and [...]

Manhattan plot for genome-wide association study (GWAS) on two restoration of male fertility related phenotypic scores, A) Seed Number and B) Pollen Production (1-9), in an F2 biparental population (n = 181) derived from the hybrid rye cv. Stannos. Significant association was identified using the criterion -log10(p) > 5.2, depicted as a red line.
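The significance criterion in the caption above, -log10(p) > 5.2, corresponds to a p-value cutoff of roughly 6.3 × 10⁻⁶. The short sketch below shows the conversion and how such a criterion would be applied to a set of markers; the marker names and p-values are hypothetical and serve only to illustrate the threshold.

```python
import math

# Criterion stated in the figure caption.
THRESHOLD = 5.2


def is_significant(p_value, threshold=THRESHOLD):
    """True if a marker's p-value passes the -log10(p) criterion.

    p_value must be strictly positive.
    """
    return -math.log10(p_value) > threshold


# Equivalent cutoff on the p-value scale.
p_cutoff = 10 ** -THRESHOLD

# Hypothetical marker p-values (not results from the study).
hypothetical_p_values = {"snp_a": 1e-7, "snp_b": 1e-4, "snp_c": 5e-6}
significant = sorted(
    name for name, p in hypothetical_p_values.items() if is_significant(p)
)

print(f"p-value cutoff: {p_cutoff:.2e}")
print(f"significant markers: {significant}")
```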
