Segregation Distortion for Male Parents in High Density Genetic Maps from Reciprocal Crosses between Two Self-Incompatible Cultivars Confirms a Gametophytic System for Self-Incompatibility in Citrus

Abstract: Self-incompatibility is an important evolutionary feature in angiosperms and has major implications for breeding strategies in horticultural crops. In citrus, when coupled with parthenocarpy, it enables the production of seedless fruits in a mono-varietal orchard. A gametophytic incompatibility system with one S locus was proposed for citrus, but its molecular mechanisms remain the subject of debate. The objective of this work was to locate the S locus by analyzing segregation distortion in reciprocal crosses of two self-incompatible citrus cultivars sharing one self-incompatibility allele and to compare this location with previously published models. High density genetic maps of ‘Fortune’ mandarin and ‘Ellendale’ tangor with, respectively, 2164 SNP and 1467 SNP markers, were constructed using genotyping by sequencing data. They are highly syntenic and collinear with the clementine genome. Complete rejection of one allele was only observed in male segregation in the two parents and in only one genomic area, at the beginning of chromosome 7 of the clementine reference genome. Haplotype data in the area surrounding the theoretical S locus were in agreement with previously proposed S genotypes. Overall, our results are in full agreement with the recently proposed gametophytic S-RNase system with the S locus at the beginning of chromosome 7 of the clementine reference genome.


Introduction
Seedlessness is a major citrus breeding objective for the fresh fruit market. Strategies for breeding seedless varieties are based on the association of parthenocarpy and mechanisms that prevent fertilization of the ovules by pollen or result in embryo degeneration. Gametic sterility can result from sterility genes, such as the nucleocytoplasmic male sterility in Satsuma [1], or from ploidy manipulation to create triploid hybrids with unbalanced meiosis [2,3]. Self-incompatibility (SI), the inability of a male- and female-fertile plant to produce seeds from self-fertilization, is also an efficient way to select for seedless cultivars. In citrus, SI was first described in pummelos [4], but the seedlessness of some small citrus cultivars such as 'Ellendale', 'Fortune', 'Nadorcott', 'Nova', and most clementine varieties, if grown in solid blocks, results from the association of parthenocarpy and SI [5][6][7][8].
tube growth [43]. The proposed S-RNase system for GSI in citrus was in agreement with the previous transcriptomic studies of Miao et al. [44] in mandarin and of Zhang et al. [45] in lemon. Recently, Liang et al. [46] identified several polymorphic pistil-expressed S-RNases in pummelo and showed that they segregate with S haplotypes. These authors provided strong evidence that the S-RNase-based SI system is prevalent in citrus and that S-RNases function as pistil S determinants, inhibiting pollen in an S-specific manner. They located the corresponding SI locus, which also includes F-box genes, at the beginning of pseudo-chromosome 7 of the clementine reference genome. The involvement of T2/S-RNase in self-incompatibility of citrus was also proposed by Honsho et al. [47] on the basis of transcriptomic, phylogenetic and genetic approaches. Both studies [46,47] analyzed the segregation of markers of the S-RNase genes in controlled progenies and found, for some of them, segregations in agreement with the gametophytic model for self-incompatibility. However, no genetic study based on whole-genome segregation analysis has definitively validated the location of the SI locus and its uniqueness. Under the GSI system, in compatible crosses between parents sharing one incompatibility allele, fully skewed segregation is expected in male gametes at the SI locus, with rejection of the shared haplotype, and segregation distortion is expected to decrease with the genetic distance of the markers from the SI locus. Such a skewed segregation pattern associated with an SI locus has been described in cocoa [26]. In the present work, we provide additional evidence for a gametophytic SI system and its location, based on the analysis of male and female segregation distortions all along the genome in reciprocal crosses between two self-incompatible small citrus cultivars that share one incompatibility allele: 'Fortune' mandarin and 'Ellendale' tangor.
Previous studies of pollination of 'Ellendale' and clementine with homozygous lines for the S locus have shown that clementine and 'Ellendale' share the same S3-S11 genotype [38]. As 'Fortune' is a self-incompatible direct hybrid of clementine [48] it is expected to share one SI allele with 'Ellendale'. The present study was based on high density genetic mapping established from genotyping by sequencing data (GBS [49]). Local and chromosome haplotyping of the two parents was performed using phase markers and provides new insights into the origin of these two varieties and S haplotype diversity.

Plant Material
Two populations of diploid hybrids were created and grown at the Inra-Cirad San Giuliano research station (Corsica, France) from two reciprocal crosses between the diploids 'Fortune' mandarin and 'Ellendale' tangor, each used as both female and male parent. For a while, 'Fortune' mandarin was considered to be a hybrid between clementine and 'Dancy' mandarin [50,51]. However, molecular studies [48] suggested that it instead resulted from a cross between clementine and 'Orlando' tangelo, 'Orlando' being itself a hybrid of 'Duncan' grapefruit and 'Dancy' mandarin. Based on its phenotype, 'Ellendale' is considered to be a tangor (C. reticulata Ten. × C. sinensis L. hybrid), but its precise origin remains unknown. Flow cytometry analysis was performed, as described in Aleza et al. [2], to discard triploids resulting from 2n gametes. In the following, ForEl stands for 'Fortune' mandarin × 'Ellendale' tangor hybrids, and ElFor stands for 'Ellendale' tangor × 'Fortune' mandarin hybrids.

Plant Genotyping
A total of 167 diploid mandarin hybrids (79 ForEl and 88 ElFor) and replicates of the two parents were subjected to genotyping by sequencing (GBS). Genomic DNA was isolated using the Plant DNAeasy® kit (Qiagen, Hilden, Germany), according to the manufacturer's instructions. The concentration of genomic DNA was adjusted to 20 ng/µL, and ApeKI GBS libraries were prepared following the protocol described by Elshire et al. [49]. DNA of each sample (200 ng) was digested with the ApeKI enzyme (New England Biolabs, Hitchin, UK). Digestion took place at 75 °C for 2 h and then at 65 °C for 20 min to inactivate the enzyme. The ligation reaction was completed in the same plate as the digestion, using T4 DNA ligase enzyme (New England Biolabs, Hitchin, UK) at 22 °C for 1 h, and the ligase was inactivated prior to pooling the samples by holding the plate at 65 °C for 20 min. For each library, ligated samples were pooled (i.e., two multiplex libraries of 96 samples) and PCR-amplified in a single tube. Complexity was further reduced using PCR primers with one selective base (A) as described by Sonah et al. [52]. Single-end sequencing was performed on a single lane of an Illumina HiSeq4000 at the Genoscope facilities (Paris, France) with two runs for each library. Keygene N.V. (Keygene, Wageningen, The Netherlands) owns the patents and patent applications protecting its Sequence Based Genotyping technologies. SNP genotype calling was performed from the DNA sequence reads with the Tassel 4.0 pipeline [53] to identify good quality, unique sequence reads with barcodes. These sequences were aligned with the C. clementina 1.0 reference genome (available at https://phytozome.jgi.doe.gov, accessed on 21 April 2021) using Bowtie2 v2.2.672. For genotype calling, positions with fewer than five reads were considered as missing data. Next, polymorphic positions were filtered for diallelic SNPs and a minor allele frequency (MAF) over 0.05.
Agriculture 2021, 11, 379

Linkage Analysis and Genetic Mapping
The two-way pseudo-testcross mapping strategy implemented for genetic mapping from progenies resulting from crosses between two heterozygous parents (Ritter et al., 1990) and used in previous high density mapping studies in citrus [54][55][56][57] was applied to establish 'Fortune' and 'Ellendale' genetic maps. For each map, SNP markers were selected according to their respective heterozygosity for the mapped parent and homozygosity for the other one. Each set of data for the 167 hybrids was filtered to retain markers and hybrids with less than 15% of missing data.
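The marker selection step of the pseudo-testcross strategy can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the genotype encoding ("0/0", "0/1", "1/1", None for missing), the dictionary layout and the function name `select_testcross_markers` are hypothetical and are not taken from the Tassel or JoinMap workflows used in the study.

```python
from typing import Dict, List, Optional

Genotype = Optional[str]  # "0/0", "0/1", "1/1", or None for missing (assumed encoding)


def select_testcross_markers(
    mapped_parent: Dict[str, Genotype],
    other_parent: Dict[str, Genotype],
    progeny: Dict[str, List[Genotype]],
    max_missing: float = 0.15,
) -> List[str]:
    """Keep markers heterozygous in the mapped parent, homozygous in the
    other parent, and with less than `max_missing` missing progeny calls."""
    selected = []
    for marker, calls in progeny.items():
        if mapped_parent.get(marker) != "0/1":
            continue  # mapped parent must be heterozygous
        if other_parent.get(marker) not in ("0/0", "1/1"):
            continue  # other parent must be homozygous
        if not calls:
            continue  # no progeny data for this marker
        missing_rate = sum(c is None for c in calls) / len(calls)
        if missing_rate < max_missing:
            selected.append(marker)
    return selected
```

Calling the function twice, once with each cultivar as `mapped_parent`, would give the two marker sets used for the two parental maps.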
Linkage analysis and genetic mapping were then performed using JoinMap5 (https://www.kyazma.nl/index.php/JoinMap/; accessed on 21 April 2021). Linkage mapping was performed with the "Hap" option for both 'Fortune' mandarin and 'Ellendale' tangor. Markers were grouped using the independence LOD score. Phases (coupling and repulsion) of the linked marker loci were automatically detected by the software. Map distances were estimated in centiMorgans (cM) using the regression mapping algorithm. After a first mapping round, singletons, i.e., individual genotypes that suggest recombination with both flanking markers, were identified. On a high-density map, the probability of having two successive cross-overs within a small genomic area is very low, whereas genotyping errors strongly inflate the estimated genetic distances and erroneously expand the linkage groups. Therefore, as recommended by Van Os et al. [58], we replaced singletons with missing data using an in-house Excel routine and performed a second mapping round. At the same time, a few individuals displaying an aberrant number of recombinations, identified by examining the global recombination distribution, were removed as we considered their genotype calling quality to be insufficient. The synteny and collinearity of both 'Fortune' and 'Ellendale' genetic maps with the reference clementine genome were visualized using Circos [59] (http://circos.ca; accessed on 21 April 2021) in Galaxy [60]. Marey maps were drawn using Excel to visualize changes in the recombination rate along the genome.
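The singleton-masking step described above can be sketched in Python as follows. This is not the Excel routine used in the study; it is a minimal illustration that assumes phased calls for one individual, ordered along a linkage group and coded "a", "b" or None (missing). The function name `mask_singletons` and the choice of the nearest non-missing neighbours as flanking markers are assumptions.

```python
from typing import List, Optional

Call = Optional[str]  # "a" or "b" for the two parental phases, None for missing


def mask_singletons(calls: List[Call]) -> List[Call]:
    """Replace by None any call that differs from both of its nearest
    non-missing flanking calls when those two flanking calls agree."""
    masked = list(calls)
    for i, call in enumerate(calls):
        if call is None:
            continue
        # Nearest informative neighbours, taken from the unmodified input
        left = next((c for c in reversed(calls[:i]) if c is not None), None)
        right = next((c for c in calls[i + 1:] if c is not None), None)
        if left is not None and right is not None and left == right and call != left:
            masked[i] = None  # apparent double recombinant: treat as missing
    return masked
```

A genuine recombination (for example "a", "a", "b", "b") is left untouched because the flanking calls differ, whereas an isolated "b" inside a run of "a" is set to missing before the second mapping round.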

Analysis of Segregation Distortion
The matrix of phased data resulting from each previous genetic map analysis was used to study skewed segregation all along the genome for each parent, globally and when used as male or female parent. The p-values of the chi-square (χ²) test against a theoretical frequency of 0.5 for each allele were computed with Excel, and we used the approach proposed by Benjamini and Hochberg [61] to limit the false discovery rate (FDR) in multiple testing; the approach was performed according to the method of Storey [62] with a q value threshold of 0.05. The results were visualized in a Circos plot.
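As an illustration of this test, the following Python sketch computes the χ² p-value of a marker against a 1:1 segregation and then applies a Benjamini-Hochberg adjustment. It is a simplified stand-in for the Excel computation: the function names are hypothetical, and the adjustment shown is the plain Benjamini-Hochberg step-up procedure, without the estimate of the proportion of true null hypotheses that Storey's q value adds.

```python
import math
from typing import List


def chi2_pvalue_1to1(n_a: int, n_b: int) -> float:
    """P-value of the chi-square test (1 degree of freedom) for two allele
    counts against an expected 0.5 frequency for each allele."""
    n = n_a + n_b
    if n == 0:
        return 1.0
    stat = (n_a - n_b) ** 2 / n  # equals the sum of (obs - exp)^2 / exp
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(stat / 2))


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # from the largest p-value down to the smallest
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_markers(counts, threshold: float = 0.05) -> List[bool]:
    """Flag markers whose adjusted p-value is below the threshold."""
    pvals = [chi2_pvalue_1to1(a, b) for a, b in counts]
    return [q < threshold for q in benjamini_hochberg(pvals)]
```

Running `significant_markers` separately on the allele counts of the progeny in which a parent acted as male and as female would reproduce the male/female contrast examined in this study.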

Haplotype Analysis
SNP phase markers in 'Fortune' and 'Ellendale' were identified with the CP option of JoinMap 5 using all segregating markers in the ForEl and ElFor progenies and allowing 15% of missing data. We then selected the set of SNP markers shared with those used in the GBS-based diversity study published by Oueslati et al. [63], which used the same methodology. The parentage of 'Fortune' mandarin was analyzed by examining the compatibility of its haplotypes with different potential parents: clementine, 'Orlando' tangelo and 'Dancy' mandarin. 'Ellendale' tangor haplotypes were also analyzed in relation to those of sweet orange. We took advantage of the haploid sequence of clementine and of its parenthood network with the other accessions to infer their haplotypes, as described in Amaral et al. [64].
Local haplotyping in the genomic region surrounding the S locus was performed using the same approach for accessions included in the clementine parenthood network. The relationships between haplotypes were then analyzed by neighbor-joining using DARwin software version 6.0 (https://darwin.cirad.fr/; accessed on 21 April 2021). The analyses were based on the Manhattan index:

$$d_{ij} = \frac{1}{K}\sum_{k=1}^{K}\left|x_{ik} - x_{jk}\right|$$

where i and j are the two individuals, k is the locus, K is the total number of loci and $x_{ik}$ is the frequency of the alternative allele at locus k for individual i.
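For illustration, the Manhattan index can be computed as in the short Python sketch below, assuming the usual form of the index, i.e., the mean over the K loci of the absolute difference between the alternative allele frequencies of individuals i and j. The function name and the handling of missing data (loci missing in either individual are skipped) are assumptions and do not necessarily match the DARwin implementation.

```python
from typing import List, Optional


def manhattan_index(x_i: List[Optional[float]], x_j: List[Optional[float]]) -> float:
    """Mean absolute difference in alternative allele frequency between two
    individuals (or haplotypes); loci missing in either one are ignored."""
    if len(x_i) != len(x_j):
        raise ValueError("both individuals must be scored at the same loci")
    pairs = [(a, b) for a, b in zip(x_i, x_j) if a is not None and b is not None]
    if not pairs:
        raise ValueError("no locus scored in both individuals")
    return sum(abs(a - b) for a, b in pairs) / len(pairs)
```

For haplotypes, the frequency at each locus is 0 or 1, so the index reduces to the proportion of loci at which the two haplotypes differ.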

SNP Calling
Tassel software identified 23,875 polymorphic positions. Among them, we retained positions where all replicates of each parent were identical, with at least one of the parents heterozygous, and with less than 15% of missing data. This resulted in the selection of 8458 SNPs.

Genetic Linkage Maps of 'Fortune' Mandarin and 'Ellendale' Tangor
The SNP matrix containing 167 individuals was filtered for markers heterozygous for 'Fortune' mandarin and homozygous for 'Ellendale' tangor that had less than 15% of missing data and segregations in agreement with the parental genotypes. Linkage mapping of the 'Fortune' mandarin was then performed using a matrix of 2184 segregating SNPs and 167 individuals. Five individuals displaying an abnormal number of recombinations during the first mapping round were discarded before the final mapping. A total of 2164 out of the 2184 SNPs were assigned to one of the nine resulting linkage groups (LG), which corresponds to the haploid chromosome number in citrus (Table 1; Supplementary Table S1). The markers were unequally distributed among the linkage groups. LG8 included only 55 SNPs while 370 SNPs were attributed to LG7. The small number of markers in LG8 is due to the high homozygosity of 'Fortune' mandarin in a large part of the corresponding chromosome. LG8 displayed the lowest genetic size (75.937 cM). LG3, gathering 352 SNPs, displayed the largest genetic size (276.43 cM) (Table 1). The whole map spanned 1508.4 cM, with an average inter-locus distance of 0.7 cM. In total, 95.1% of SNPs had an inter-locus gap of less than 3 cM and only 0.51% had a gap of more than 10 cM. A total of 1577 markers were located in genes, with 1523 genes marked (Table 1; Table S1).
Table 1 legend: Sc: pseudochromosomes of the clementine reference genome (Wu et al. [65]); N: markers not assigned to one of the 9 pseudochromosomes; LG: linkage group; Mks: number of markers; Genes: number of genes that contain at least one of the mapped SNPs.
After filtering for markers heterozygous for 'Ellendale' and homozygous for 'Fortune', with less than 15% of missing data and with segregation in agreement with the parental genotypes, 1503 segregating SNPs genotyped in 167 hybrids were used to construct the genetic map of the 'Ellendale' tangor. Nine linkage groups including 1467 markers were generated. The number of markers ranged from 59 for LG6 to 268 for LG5 (Table 2; Supplementary Table S2). The total size of the genetic map was 1034.3 cM with an average inter-locus distance of 0.71 cM. The smallest linkage group was LG6 with 90.18 cM, while LG3 was the largest (164.85 cM). The inter-locus gap was less than 3 cM for 94.68% of the SNPs and more than 10 cM for only 0.68% of them. A total of 1040 markers were located in genes and 1000 genes had at least one mapped marker (Table 2; Supplementary Table S2).
Table 2 legend: Sc: pseudochromosomes of the clementine reference genome (Wu et al. [65]); N: markers not assigned to one of the 9 pseudochromosomes; LG: linkage group; Mks: number of markers; Genes: number of genes that contain at least one of the mapped SNPs.

Synteny and Collinearity with the Reference Genome of Clementine
In 'Fortune' mandarin, the majority of the linkage groups were composed of SNPs mapped onto syntenic pseudo-chromosomes of the clementine reference genome (Table 1) and the genetic map displayed high global synteny (95.6%). LG7 stood out, with linkage mapping of 12 markers physically located on chromosome 4 and 63 on chromosome 5. The map also included five unassigned markers that were not previously positioned on the nine pseudo-chromosomes of the clementine reference genome. These five SNPs belong to the same scaffold (Scaff 10), indicating that Scaff 10 may be joined to pseudo-chromosome 7. The Circos representations (Figure 1) and the Marey map (Supplementary Figure S1A) comparing the genetic positions with the physical locations on the clementine genome allowed us to identify a cluster of 78 markers displaying clear incongruency between the genetic map and the physical positions on chromosome 3. This cluster encompasses a genomic region located between 29 Mb and 34 Mb. The other markers display high global collinearity between the 'Fortune' genetic map and the clementine reference genome. The variation of the recombination rate along the chromosomes (Supplementary Figure S1A) is very similar to that observed in clementine (Wu et al., 2014), which is directly linked with the density of genes and repeat elements along the genome.
Overall synteny was also high (96.8%) between the 'Ellendale' genetic map and the clementine reference genome (Table 2 and Figure 2). LG4 and LG6 displayed full synteny with the reference genome. As already observed in the 'Fortune' mandarin, some SNPs located on pseudo-chromosomes 4 and 5 were linked to LG7 (six markers for each pseudo-chromosome). LG8 had more SNPs that were not mapped on the corresponding pseudo-chromosome, with respectively eight and seven SNPs (out of a total of 120) located on the physical assembly of pseudo-chromosomes 3 and 9.

Segregation Distortion
Among the 2164 SNPs assigned on the 'Fortune' mandarin genetic map, 202 showed significant segregation distortions according to the χ² test adjusted to the q value for multiple hypothesis testing (Table 3; Supplementary File S1). No significant distortion was found in LG1, LG4, LG5, and LG8. Distortion concerned only one marker in LG2 (0.3%) and one in LG9 (0.9%), and 10 markers (2.8%) in LG3. LG6 and LG7 displayed the highest numbers and rates of skewed markers (69 markers, 22.6%, and 121 markers, 32.7%, respectively). Segregation distortion was also investigated in efficient male and female 'Fortune' gametes (the ones that contributed to the progenies) (Table 3; Figure 3; Supplementary Table S1). No significant distortion was found in efficient female 'Fortune' mandarin gametes while 313 markers displayed significant distortion in efficient male gametes. The skewed markers were concentrated in LG6 (255) and LG7 (54). Although the number of significantly skewed markers was higher in LG6, the level of distortion was much higher in a cluster of markers at the beginning of LG7. Indeed, in LG7, it reached the maximum segregation distortion value, with complete elimination of one of the male alleles for two markers located at 11.4 cM, corresponding to positions 1,296,255 and 1,310,473 on chromosome 7 of the reference clementine genome. The level of segregation distortion decreased with genetic distance on both sides of this position, to which it was highly correlated, and remained significant from 0 to 55.8 cM.

In 'Ellendale', 103 out of the 1467 mapped markers displayed a significant deviation from the expected genotypic proportions (Table 3; Figure 4; Supplementary Table S2). No significant distortion was observed in LGs 1, 2, 4, 5, 8, and 9. Only two (3.4%) and eight (3.5%) markers were skewed in LG6 and LG3, respectively. Ninety-three markers (58.9%) were significantly skewed in LG7. No significant distortion was observed in efficient female 'Ellendale' tangor gametes while 96 markers displayed skewed segregation in efficient male gametes. Three skewed markers were located in LG3 and the remaining 93 were located in the first part of LG7. The distortions in LG7 reached complete elimination of one allele in 20 markers located between 1.3 and 5.5 cM on the genetic map and between 0.083 and 1.549 Mb on pseudo-chromosome 7. The level of segregation distortion decreased with genetic distance, to which it was highly correlated, and remained significant from 0 to 34.0 cM (there was then a gap in the genetic map between the marker at 34.0 cM and the following one at 50.9 cM).
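The decrease of distortion with genetic distance can be made explicit with a small worked model in Python. If pollen carrying the shared S haplotype is fully rejected, a marker allele in coupling with that haplotype reaches the progeny only through recombinant gametes, so its expected frequency among efficient male gametes equals the recombination fraction r between the marker and the S locus. The sketch below converts map distance to r with the Haldane function; this choice (no crossover interference) is an assumption made for illustration, since the mapping function is not specified here, and the function names are hypothetical.

```python
import math


def haldane_recombination_fraction(distance_cm: float) -> float:
    """Recombination fraction for a map distance in cM (Haldane function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * abs(distance_cm) / 100.0))


def expected_rejected_allele_freq(marker_cm: float, s_locus_cm: float) -> float:
    """Expected frequency, among efficient male gametes, of the marker allele
    linked in coupling to a fully rejected S haplotype."""
    return haldane_recombination_fraction(marker_cm - s_locus_cm)


def expected_chi2(marker_cm: float, s_locus_cm: float, n_gametes: int) -> float:
    """Expected chi-square statistic against 1:1 for n_gametes observations:
    n * (1 - 2r)^2, which falls to 0 as r approaches 0.5."""
    r = expected_rejected_allele_freq(marker_cm, s_locus_cm)
    return n_gametes * (1.0 - 2.0 * r) ** 2
```

Under this model, a marker at the S locus shows complete elimination of the linked allele, and the expected frequency rises toward 0.5 on both sides, for example taking the 11.4 cM position of the fully distorted 'Fortune' LG7 markers as the S locus position.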

Gene Annotation in the Fully Skewed Region of Chromosome 7 in Male Parents
Gene annotation of the clementine reference genome (Wu et al. [65]) in the genomic region where the counter-selected haplotype frequency was less than 0.04 in the two parents (0.35-2.2 Mb) revealed 252 genes, 26 of which belong to families reported to be involved in SI: 17 F-box, 5 histidine kinase, 2 leucine-rich repeat (LRR), 1 MAP kinase and 1 ribonuclease (Figure 5). Six of these F-box genes and the ribonuclease (Ciclev10027322m.g.) form the S11 locus (position 0.98-1.20 Mb on chromosome 7) identified as being responsible for SI in clementine by Liang et al. [46]. The S11 locus is located where skewed segregation for the two parents is most marked.

Haplotype Structure and Origin of 'Fortune' Mandarin and 'Ellendale' Tangor
We established the phase between markers, and hence the chromosome haplotypes of 'Fortune' and 'Ellendale', with JoinMap 5 using the CP scheme for two-way pseudo-testcross mapping. We used the 8458 SNPs of the initial VCF file filtered at a rate of 15% for missing data. A total of 8415 SNPs were assigned to nine linkage groups. Among the 8044 syntenic markers, 6693 phased markers were shared with the GBS diversity analysis performed by Oueslati et al. [63] and were used to study the origin of the 'Fortune' and 'Ellendale' varieties. Considering the high collinearity between the genetic maps of 'Fortune' and 'Ellendale' and the clementine reference genome, we used the positions on the clementine reference genome to analyze the haplotypic similarities between individuals along the genome using windows of 10 successive markers.
In the first step, we identified the two haplotypes of clementine: one inherited from the 'Commune' mandarin (mother of clementine) gamete (ClHMc) and the other from the sweet orange (father of clementine) gamete (ClHOr) that generated the clementine [65,66]. For this purpose, taking advantage of the fact that the clementine reference genome was established using a haploid clementine, we analyzed the compatibility of the haploid sequence and its complementary sequence (to obtain the diploid clementine genotype), using the 'Commune' mandarin and the sweet orange genotypic data all along the genome. The two clementine haplotypes were reconstructed from this information. We then compared the two clementine haplotypes with the two haplotypes deduced from the linkage analysis based on the hybrid progenies between 'Fortune' and 'Ellendale'. The comparison revealed that for each chromosome, haplotype 1 from 'Fortune' (FH1) was the one inherited from clementine (Table 4; Supplementary Figure S2A). The average similarity values of the 10-marker windows along each chromosome ranged from 98.6% for chromosome 6 to 100% for chromosomes 2 and 8 (with a total average similarity of 99.5%), while the average similarity for the 'Fortune' haplotype 2 (FH2) ranged between 37.6% for chromosome 7 and 86.4% for chromosome 8 (global average: 67.9%). The analysis of similarities between FH1 and the two clementine haplotypes based on sliding windows of 10 markers (Supplementary Figure S2A) revealed nine recombination events between the ClHMc and ClHOr chromosomes (Table 4) during the formation of the gamete that generated the 'Fortune' mandarin.
Given that for each chromosome, FH1 was inherited from clementine (the female parent of 'Fortune'), we tested the compatibility of FH2 with three potential male parents: 'Orlando' tangelo, grapefruit, and 'Dancy' mandarin (Table 5). For 'Orlando' tangelo, the average compatibility over the whole genome was very high (98.43%) with little variation between chromosomes (97.1% on chromosome 7 to 99.0% on chromosome 4). The average values were lower for grapefruit and 'Dancy' mandarin (87.4% and 75.8%, respectively) with very low values for some chromosomes (55.6% for grapefruit chromosome 9 and 49.4% for 'Dancy' chromosome 7). Considering 'Orlando' as the male parent of 'Fortune', we identified 14 recombination events between the 'Dancy' mandarin and grapefruit genomes that constituted 'Orlando' tangelo during the genesis of the male gamete that generated 'Fortune'.
Supplementary Figure S3 gives a schematic diagram of the parentage of 'Fortune' mandarin based on this analysis and locates the different recombination points that took place in the clementine and tangelo gametes.
We also tested 'Ellendale' haplotype inheritance from sweet orange (Table 4). One of the haplotypes considered for sweet orange (OrHCl) was the one deduced previously from the identification of the clementine haplotype (ClHOr) originating from sweet orange, and the second one (OrH2) was its complement in the diploid sweet orange genotype. We then analyzed the similarity between the two haplotypes of sweet orange and the two haplotypes of 'Ellendale' (EH1 and EH2) all along the genome (Supplementary Figure S2B). On each chromosome, one of the 'Ellendale' haplotypes displayed high local similarity with at least one sweet orange haplotype. Indeed, the average similarity value of the set of 10-marker windows along each chromosome ranged between 97.2 and 99.7% for EH2 on chromosomes 1-6 and 9 and between 97.5 and 99.5% for EH1 on chromosomes 7 and 8, respectively. It is therefore highly probable that 'Ellendale' is a direct hybrid of sweet orange. Considering the clear difference in similarity for EH1 and EH2 haplotypes with sweet orange on each chromosome, we can assign the 'Ellendale' haplotype inherited from sweet orange: EH2 for chromosomes 1-6 and 9 and EH1 for chromosomes 7 and 8.
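The window-based comparison of haplotypes used in this section can be sketched as follows. This Python example is only an illustration under stated assumptions: haplotypes are lists of alleles (0, 1, or None for missing) ordered by physical position, windows are slid one marker at a time, and a change in the better-matching reference haplotype is counted as a putative recombination. The function names and the tie and missing-data rules are hypothetical and are not taken from the analysis pipeline of the study.

```python
from typing import List, Optional

Allele = Optional[int]  # 0 = reference allele, 1 = alternative allele, None = missing


def window_similarity(hap: List[Allele], ref: List[Allele],
                      size: int = 10, step: int = 1) -> List[Optional[float]]:
    """Percentage of identical alleles between two haplotypes in successive
    windows of `size` markers; None when a window has no informative marker."""
    sims: List[Optional[float]] = []
    for start in range(0, len(hap) - size + 1, step):
        pairs = [(a, b) for a, b in zip(hap[start:start + size], ref[start:start + size])
                 if a is not None and b is not None]
        sims.append(100.0 * sum(a == b for a, b in pairs) / len(pairs) if pairs else None)
    return sims


def count_switches(sim_ref1: List[Optional[float]],
                   sim_ref2: List[Optional[float]]) -> int:
    """Number of changes in the better-matching reference along the windows;
    windows with ties or missing similarity are skipped."""
    switches, last_best = 0, None
    for s1, s2 in zip(sim_ref1, sim_ref2):
        if s1 is None or s2 is None or s1 == s2:
            continue
        best = 1 if s1 > s2 else 2
        if last_best is not None and best != last_best:
            switches += 1
        last_best = best
    return switches
```

Applied to FH1 against ClHMc and ClHOr, `count_switches` would give a rough count of the recombination events along a chromosome; in practice, genotyping errors would call for a minimum similarity difference before a switch is accepted.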

Haplotypic Structure around the SI Locus
A more detailed analysis of the area surrounding the SI region (as defined by Liang et al. 2020) revealed that a recombination occurred during clementine meiosis in the 'Fortune' haplotype originating from clementine (FH1). Indeed (Figure 6), FH1 conforms well with the clementine haplotype inherited from 'Commune' mandarin (ClHMc) from the start of the chromosome up to position 871,872. Then, from 1,180,894 to at least 2.5 Mb, it conforms with the sweet orange haplotype of clementine (ClHOr). Data for the S07_1180894 marker indicate that, at this position, the FH1 haplotype was inherited from its sweet-orange grandfather. Unfortunately, only the S07_1180894 marker provided information within the SI region. Accordingly, the recombination should have occurred before or within the SI locus. Analysis of the genotype of the ElFor segregating progeny for five loci heterozygous in 'Fortune' and homozygous in 'Ellendale' surrounding the SI locus (from position 869,201 to 1,369,079) revealed that the FH1 haplotype inherited from clementine was the one strongly counter-selected (Supplementary Table S3A) in this region. For 'Ellendale', 14 markers between positions 601,330 and 1,381,530 were available for a similar analysis in ForEl progenies and revealed that the EH1 haplotype inherited from sweet orange was the one strongly counter-selected in the SI area (Supplementary Table S3B).
According to Kim et al. [38], clementine and 'Ellendale' share the same S3-S11 genotype at the SI locus and we therefore expected that 'Ellendale' and 'Fortune' would share one of the clementine alleles and that this common allele would be counter-selected. We had previously identified the counter-selected haplotypes in 'Fortune' and 'Ellendale' pollen as respectively FH1 (inherited from clementine) and EH1 (inherited from sweet orange). These results confirmed our hypothesis and suggest that 'Fortune' possesses the functional SI alleles of sweet orange, inherited through clementine, and shared with 'Ellendale'. This in turn implies that the recombination in the clementine gamete that produced 'Fortune' occurred before the functional SI genes with a switch from 'Commune' mandarin haplotype, at the beginning of the chromosome, to the sweet orange haplotype.
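The expectation behind this reasoning can be made explicit with a minimal sketch of a one-locus gametophytic SI model, in which pollen carrying an S allele that is also present in the style is rejected. The S3-S11 genotype of 'Ellendale' follows Kim et al. [38]; the second 'Fortune' allele is written `Sx` as a purely hypothetical placeholder, and the function names are illustrative.

```python
from itertools import product
from typing import List, Tuple

Genotype = Tuple[str, str]


def compatible_pollen(style: Genotype, pollen_parent: Genotype) -> List[str]:
    """S alleles the pollen parent can transmit: those absent from the style."""
    return [allele for allele in pollen_parent if allele not in style]


def progeny_genotypes(female: Genotype, male: Genotype) -> List[Genotype]:
    """Distinct S genotypes expected in the progeny of female x male."""
    transmitted = compatible_pollen(female, male)
    return sorted({tuple(sorted((f, m))) for f, m in product(female, transmitted)})


ellendale: Genotype = ("S3", "S11")  # genotype proposed by Kim et al. [38]
fortune: Genotype = ("S3", "Sx")     # "Sx" is a hypothetical placeholder allele

# 'Ellendale' (female) x 'Fortune' (male): only Sx pollen is transmitted,
# so the 'Fortune' haplotype carrying S3 is fully counter-selected.
elfor_pollen = compatible_pollen(ellendale, fortune)

# 'Fortune' (female) x 'Ellendale' (male): only S11 pollen is transmitted.
forel_pollen = compatible_pollen(fortune, ellendale)

# Selfing: both alleles are rejected, so no compatible pollen remains.
self_pollen = compatible_pollen(fortune, fortune)
```

Under this model the female alleles segregate normally in both crosses, while one male haplotype disappears completely near the S locus, which is the pattern reported for FH1 and EH1.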
To check this hypothesis, we performed neighbor-joining analyses using two sets of markers located in the vicinity of the SI locus: one with markers in the area where clementine's contribution to 'Fortune' comes from the 'Commune' mandarin (position 601,330-869,997; with 13 markers; Supplementary Figure S4) and a second with markers where the clementine haplotype of 'Fortune' (FH1) was inherited from sweet orange (position 1,180,894-1,381,530; with eight markers; Figure 7A). We included in the analysis the inferred haplotypes of 'Fortune', 'Ellendale', 'Nules' clementine, 'Commune' mandarin and sweet orange. We expanded the inference of haplotype genotypes in this genomic region to grapefruit, 'Orlando' tangelo, and 'Dancy' mandarin, using the previously inferred haplotypes for sweet orange and the known parental relationships: grapefruit = sweet orange x pummelo and 'Orlando' tangelo = 'Dancy' mandarin x grapefruit. We also added the haplotypes of 'Hupang' citron, 'Chandler' and 'Timor' pummelos, taking advantage of the full homozygosity of these varieties for the markers concerned. If we consider the flanking haplotypes to be informative (due to linkage disequilibrium) for the SI locus haplotypes, it appears that the similarity of the FH1 haplotype with the 'Ellendale' haplotypes before the cross-over (inherited from ClHMc) is not compatible with the observed rejection of FH1 and EH1 in the SI region for the reciprocal crosses (implying identity of FH1 and EH1 for the SI locus).
Indeed, before the cross-over (CO) involved in the genesis of the FH1 gamete, FH1 is identical to EH2 (and identical to ClHMc = McHCl), while EH1 is shared with OrHCl = ClHOr, the two 'Chandler' pummelo haplotypes and one haplotype of 'Orlando' (ToHGr) (Supplementary Figure S4).
Conversely, in the region after the recombination identified in the FH1 haplotype (Figure 7A), the identities of the 'Fortune' and 'Ellendale' haplotypes agree with what would be expected for the SI locus according to EH1 and FH1 rejection. EH1 and FH1 are identical and logically identical to the haplotype of clementine inherited from sweet orange (ClHOr = OrHCl). The same genotype for the markers concerned is also shared with the 'Chandler' and 'Timor' pummelo haplotypes and one grapefruit haplotype (GrH2). The FH2 haplotype inherited from 'Orlando' tangelo (ToH2) is the haplotype coming from the 'Dancy' mandarin (MdHTo), while EH2 is identical to McHCl (= ClHMc) for the markers considered. Considering that the haplotype information in this part of the genome is indicative of that at the SI locus for related varieties, we schematized the SI allele inheritance among related varieties (Figure 7B) according to the nomenclature of Kim et al. [38] for clementine and 'Ellendale'. We added the information on the phylogenomic structure in the area surrounding the SI locus resulting from the analysis performed by Oueslati et al. [63]. The SI allele shared by EH1 and FH1 should then be the S3 allele identified by Kim et al. [38]. Furthermore, the marker haplotypes suggest that the origin of the compatibility allele Sf' found in 'Commune' and 'Dancy' mandarin differs from that of the Sf allele in sweet orange. Both Sf and Sf' should have originated in the C. reticulata gene pool.
'Fortune' mandarin and 'Ellendale' tangor are important progenitors for mandarin breeding. It was therefore essential to establish molecular resources such as saturated genetic maps to optimize their exploitation in breeding schemes. Two genetic maps of 'Fortune' mandarin were previously published. The first one, constructed using a reciprocal cross between 'Fortune' and 'Chandler' pummelo, spanned 577 cM with 95 markers, mostly SSRs, defined in 13 linkage groups [67].
More recently, another 'Fortune' genetic map was constructed from an F1 population derived from 'Fortune' x 'Murcott' mandarin [68]. The map spanned 681.07 cM and consisted of 189 SNP markers distributed along nine linkage groups. In the present study, a high-density genetic map of 'Fortune' mandarin was built for the first time. It consists of 2164 SNP markers spread among nine linkage groups, corresponding to the nine citrus chromosomes, with a total size of 1508.4 cM. All chromosomes except chromosome 8 were regularly covered. The very partial coverage of chromosome 8 is due to high homozygosity resulting from the inheritance of the same haplotype region of the sweet orange genome from the two parents of 'Fortune' mandarin. Indeed, both parents share parentage with sweet orange: clementine = mandarin × sweet orange and 'Orlando' tangelo = mandarin × (pummelo × sweet orange). No 'Ellendale' tangor genetic map has been published to date. The one we constructed includes 1467 SNPs defined in nine linkage groups. It spans a total of 1034.3 cM. Most chromosomes display good and homogeneous coverage. However, markers are lacking in the first part of chromosome 2 and the middle of chromosome 4. These gaps result from high homozygosity, probably due to inbreeding in the origin of 'Ellendale'.
Synteny is high and the linear orders of the markers are highly conserved between the two genetic maps and the clementine reference genome [65]. This is consistent with previous studies that reported high synteny and collinearity between Citrus species [54,67,68] and even between Citrus species and Poncirus trifoliata [57,69,70]. However, a few discrepancies were observed between our two genetic maps and the clementine reference genome. On LG7, both 'Fortune' and 'Ellendale' displayed two sets of SNP markers located on chromosomes 4 and 5 of the clementine reference genome. Similar results were previously reported in the 'Fortune' genetic map [68] and also in sweet orange and trifoliate orange genetic maps [57,70]. The analysis of collinearity between LG3 and chromosome 3 revealed a misplaced and probably inverted genomic region, particularly visible in 'Fortune', located between 29 and 34 Mb. The same genomic area was also identified as misplaced in the high-density genetic maps of sweet orange and trifoliate orange [57] and even in the reference genetic map of clementine [54]. It is therefore probable that most of the apparent non-syntenic or non-collinear markers are due to minor errors in the clementine genome assembly rather than to real structural variations between 'Fortune' or 'Ellendale' and clementine.

The Origins of 'Fortune' Mandarin and 'Ellendale' Tangor Were Assessed through Analysis of Chromosome Haplotypes
'Fortune' mandarin is a late high-quality mandarin widely used as the female parent in mandarin breeding programs [2] due to its self-incompatibility and non-apomictic reproductive behavior. It was presented by its breeders as a hybrid between clementine and 'Dancy' mandarin [50,51]. However, genotyping data with 17 SSR markers ruled out 'Dancy' as a direct parent and suggested that 'Orlando' tangelo, a hybrid between 'Duncan' grapefruit and 'Dancy' mandarin, was the male parent of 'Fortune' [48]. In our study, we established the chromosome haplotypes of 'Fortune' from phased data of 6693 markers, and for each chromosome, we identified the 'Fortune' haplotype inherited from clementine and checked the compatibility of the remaining haplotypes with three potential male parents: 'Orlando' tangelo, grapefruit, and 'Dancy' mandarin. 'Orlando' tangelo was validated as the male parent with 98.5% compatibility over the whole genome. The contributions of the genomes of the four grandparents of 'Fortune' ('Commune' mandarin and sweet orange inherited from clementine; grapefruit and 'Dancy' mandarin inherited from 'Orlando' tangelo) were analyzed all along the genome and revealed, respectively, nine and 14 recombination events in the clementine and the 'Orlando' tangelo gametes that generated 'Fortune' mandarin. Interestingly, one of the cross-overs in the clementine gamete occurred near the SI locus identified by Liang et al. [46] at the beginning of chromosome 7.
The 'Ellendale' variety originated in Queensland as a chance seedling at the end of the 19th century and became an important variety in Australia and a standard parent for mandarin breeding due to its self-incompatibility and non-apomictic reproductive behavior. Bowman [71] considered it a natural tangor (mandarin x sweet orange hybrid) based on its fruit attributes. However, until now no concrete proof has been provided for this origin. Comparison of the chromosome haplotypes of 'Ellendale' tangor and sweet orange allowed us to identify, for each chromosome, one of the two 'Ellendale' haplotypes that fit the sweet orange ones and the potential breaking points between the two sweet orange haplotypes needed to reconstitute the 'Ellendale' haplotype. Overall compatibility was 98.2% and we can therefore assume that sweet orange is one of the direct parents of 'Ellendale'. The identification of the second parent will need additional studies based on high throughput genotyping of mandarins and mandarin hybrid germplasm.

Segregation Distortion in the Male Parent Revealed a Genomic Region Involved in Self-Incompatibility
Significant segregation distortions were observed in 9.3% and 7.0% of the markers on the 'Fortune' and 'Ellendale' genetic maps, respectively. LG6 and LG7 had the highest percentages of distorted molecular markers in 'Fortune', while in 'Ellendale', distortion mostly concerned LG7. Interestingly, no skewed segregation was found to be significant in female gametes, whereas skewed segregation reached 14.4% and 6.5% in male gametes of 'Fortune' and 'Ellendale', respectively. In citrus, segregation distortion has been described in many previous mapping studies and male parent markers have often been reported to display higher distortion than female ones [54,67,68,72], probably due to pollen competition [54]. However, none of the previous studies showed complete counter-selection of one male gamete allele, as would be expected for markers located near the self-incompatibility locus in the GSI system. In the present study, such a situation was observed in both 'Fortune' and 'Ellendale' for markers located at the beginning of LG7, whereas no significant distortion was observed in female gametes. Both 'Fortune' and 'Ellendale' are self-incompatible with one assumed SI allele in common, and this area of LG7 is the only area in the whole genome where such a distortion pattern was observed on the saturated genetic maps. Therefore, it is highly probable that the distortion at the beginning of LG7 reveals the presence of a major pollen gene for self-incompatibility. The annotation of the clementine genome revealed the presence of 17 F-box genes and one ribonuclease gene (Ciclev10027322m.g.), both gene types classically involved in gametophytic SI systems. Our observations are consistent with previous descriptions, for several SI varieties, of pollen tube rejection after growth through the top one-third of the style, indicative of gametophytic rather than sporophytic control [7,33,43].
Our results are also in agreement with the differential expression of several F-box genes between 'Wuzishatangju' (SI) and 'Shatangju' (SC) mandarin pollens observed by Miao et al. [44] and the identification of an S-RNase gene involved in pummelo SI [43]. Above all, our results concerning skewed segregation are in full agreement with the conclusions of Liang et al. [46]. Indeed, these authors identified an SI locus for the GSI system (including six of the F-box genes and the ribonuclease we identified in the clementine annotation) located between 0.98 and 1.20 Mb of chromosome 7 of the clementine reference genome, where the skewed segregations were most marked for the two male parents of our progenies.
Moreover, the SI genotypes inferred from the haplotypes of the region surrounding the SI locus for 'Fortune', 'Ellendale' and their progenitors provided a logical pattern, in agreement with SI phenotypes (sweet orange, grapefruit, 'Commune' and 'Dancy' mandarin being self-compatible and clementine, 'Orlando' tangelo, 'Fortune' and 'Ellendale' self-incompatible) as well as with the genotypes at the SI locus for 'Ellendale' and clementine proposed by Kim et al. [38]. Considering the flanking sequence, the S11 haplotype appears to have a C. reticulata origin according to the phylogenomic study of Oueslati et al. [63], and the Sf' self-compatible allele shared by 'Commune' and 'Dancy' mandarin could have a different origin from the one identified by Liang et al. [46] in sweet oranges.
The conclusion of Liang et al. [46] that citrus has a gametophytic SI system with the SI locus located at the beginning of chromosome 7 of the clementine reference genome is therefore strongly supported by our study. However, the influence of environmental conditions on pollen-pistil interactions has already been documented [41,73,74], and different transglutaminase features and polyamine patterns were recently described depending on the prevailing temperature during pollination [41]. Additional studies are needed to gain a full understanding of the pollen-pistil interaction in different citrus species under different environments.
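The contrast between moderate distortion and complete counter-selection of one male allele can be illustrated with a minimal sketch of a goodness-of-fit test against the 1:1 segregation expected for a heterozygous marker. This assumes a chi-square test with one degree of freedom, which this section does not specify; the function name and the allele counts are hypothetical.

```python
import math
from typing import Tuple


def chi2_one_to_one(n_allele_a: int, n_allele_b: int) -> Tuple[float, float]:
    """Chi-square statistic and p-value for a 1:1 segregation (1 degree of freedom)."""
    total = n_allele_a + n_allele_b
    if total == 0:
        raise ValueError("no gametes observed")
    expected = total / 2
    chi2 = ((n_allele_a - expected) ** 2 + (n_allele_b - expected) ** 2) / expected
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts of the two alleles transmitted by one parent at one marker.
balanced = chi2_one_to_one(50, 50)   # no distortion
moderate = chi2_one_to_one(60, 40)   # mild skew, as with pollen competition
complete = chi2_one_to_one(0, 100)   # one allele never transmitted, as expected near the S locus
```

With many markers tested per map, a correction for multiple testing would normally be applied before calling a marker distorted.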

Conclusions
Two high-density genetic maps of 'Fortune' mandarin and 'Ellendale' tangor were constructed, for the first time, thanks to GBS analysis of two populations resulting from reciprocal crosses. These two maps consisted, respectively, of 2164 and 1467 markers, grouped in nine linkage groups corresponding to the nine pseudo-chromosomes of the clementine reference genome. These two genetic maps were characterized by high synteny and collinearity with the clementine reference genome. The inference of 'Fortune' and 'Ellendale' chromosomal haplotypes based on phase marker information, and their comparison with genotypic and haplotypic data of potential parents, allowed us to decipher their origins. 'Fortune' mandarin results from clementine x 'Orlando' tangelo hybridization, while 'Ellendale' tangor has sweet orange as a direct parent. The analysis of skewed segregation of male and female parents revealed a complete counter-selection of one haplotype for each male parent in the same region at the beginning of chromosome 7. These skewed segregations concerned a shared haplotypic region that includes an SI candidate locus for a gametophytic S-RNase system, recently identified by deep genomic analysis. The S alleles deduced for 'Fortune' mandarin, 'Ellendale' and their progenitors from flanking haplotypic sequences are consistent with their phenotypes for self-incompatibility. The new high-density genetic maps for two non-apomictic and self-incompatible varieties and the confirmation of a gametophytic S-RNase system, with the SI locus located at the beginning of chromosome 7 of the clementine reference genome, pave the way for more efficient use of self-incompatibility in breeding projects aimed at creating new seedless mandarin cultivars at the diploid level.
Figure S3: Parentage of 'Fortune' mandarin and contribution of its grandparent genomes ('Commune' mandarin, sweet orange, 'Dancy' mandarin and grapefruit) to its genomic structure, Figure S4: Analysis of potential SI haplotypes according to the flanking sequences of the SI locus; Neighbor-joining analysis of haplotypes for markers located before the recombination event in the 'Fortune' FH1 maternal gamete, Table S1: Detail of the 'Fortune' genetic map including information on physical position of the markers (clementine reference genome), reference and alternative alleles, genetic position, segregation distortions and test of significance for all gametes, male gametes and female gametes, and the gene on which the marker is located (if any), Table S2: Detail of the 'Ellendale' genetic map including information on physical position of the markers (clementine reference genome), reference and alternative alleles, genetic position, segregation distortions, and test of significance for all gametes, male gametes and female gametes, and the gene on which the marker is located (if any), Table S3
Funding: This work received financial support from the European Regional Development Fund under the framework PO FEDER-FSE Corse 2014-2020 number 247SAEUFEDER1A, project called Innov'Agrumes (ARR-18/517 CE, synergie number: CO 0009083). GBS analyses were funded by the project France Genomique "Dynamo". We also thank the Collectivité de Corse for the grant of DA (number ARR-15.036680.SR).