Deviations from Mendelian Inheritance on Bovine X-Chromosome Revealing Recombination, Sex-of-Offspring Effects and Fertility-Related Candidate Genes

Transmission ratio distortion (TRD), or significant deviations from Mendelian inheritance, is a well-studied phenomenon on autosomal chromosomes, but has not yet received attention on sex chromosomes. TRD was analyzed on 3832 heterosomal single nucleotide polymorphisms (SNPs) and 400 pseudoautosomal SNPs spanning the length of the X-chromosome using 436,651 genotyped Holstein cattle. On the pseudoautosomal region, an opposite sire-TRD pattern between male and female offspring was identified for 149 SNPs. This finding revealed unique SNPs linked to a specific-sex (Y- or X-) chromosome and describes the accumulation of recombination events across the pseudoautosomal region. On the heterosomal region, 13 SNPs and 69 haplotype windows were identified with dam-TRD. Functional analyses for TRD regions highlighted relevant biological functions responsible to regulate spermatogenesis, development of Sertoli cells, homeostasis of endometrium tissue and embryonic development. This study uncovered the prevalence of different TRD patterns across both heterosomal and pseudoautosomal regions of the X-chromosome and revealed functional candidate genes for bovine reproduction.


Introduction
Significant deviations from Mendelian inheritance, well known as transmission ratio distortion (TRD), have been observed in a number of diploid organisms [1][2][3][4][5]. It is widely recognized that the TRD phenomenon is linked to a broad range of biological mechanisms associated with basic processes of life that underlie meiosis, gametogenesis, sperm and ova fertility and viability at developmental stages of the reproductive cycle (e.g., embryos, postnatal metabolism and growth, etc.) [6,7]. The importance of the above-mentioned events coupled with the current widespread use of DNA technologies, such as genotyping of breeding stock on a routine basis in some livestock industries (e.g., beef and dairy cattle and swine), emphasizes the relevance and potential usefulness of assessing TRD and opening new opportunities for research and extension initiatives. Although TRD remains an unclear and ambiguous phenomenon [8,9], it can be viewed as the ultimate consequence of several genetic factors arising at different stages of the reproductive process and early neonatal life. Indeed, some cases of TRD, such as the absence of homozygosity, are already used to target recessive phenotypes that affect reproduction, for example, decreased fertility and embryo loss in different livestock species [5,[10][11][12][13].
Despite the recent interest in targeting TRD regions across the whole genome or, alternatively, evaluating for the absence of homozygous haplotypes in livestock species [4,5,10,11,[13][14][15], TRD on sex chromosomes has not yet received attention. In

Imputation
In order to carry out haplotype analyses, 974 SNPs were selected from the heterosomal part of the X-chromosome (Supplementary Table S2). The number of raw genotypes ranged from 23,704 to 402,267 with an average of 141,821 genotypes across 974 SNPs. The length of the X-chromosome covered by these markers was between 29,245 and 137,012,896 bp. Data was phased and imputed using a Fortran program using familial information (i.e., parentoffspring trios and sibling pairs) with variable sliding windows ranging from 2 to 200 SNPs.

Male-and Female-Offspring Specific-TRD
Taking an autosomal (or pseudo-autosomal) chromosome region as a starting point, the analysis of TRD for male ( ♂ ) and female offspring (♀) can be carried out by modeling the probability of transmission of alleles from heterozygous parents [4,14]. Considering male offspring as an example, the probability of transmission of the A allele from heterozygous parents (A/B) can be stated as: P(A) = 1 − P(B) = 0.5 + α ♂ and P(B) = 1 − P(A) = 0.5 − α ♂ , where: α ♂ is the TRD parameter ranging between −0.5 and 0.5. As described by Casellas et al. (2017) [4], this model can be easily expanded to account for sire-(α s♂ ) and dam-specific (α d♂ ) TRD. The same applies to female offspring to obtain α♀, α s♀ and α d♀ .
The previous models can be generalized to the analysis of TRD phenomena in the non-autosomal (i.e., heterosomal) region of the X chromosome, but restricted to damspecific TRD.

Statistical Analyses
Two different approaches were used for analyzing TRD across the bovine X-chromosome. Transmission ratio distortion was evaluated using both SNP-by-SNP (a total of 6577 SNPs)  [13]. For sliding windows, the biallelic-haplotype procedure was applied. All the analyses were performed within a Bayesian framework using an adapted version of TRDscan software [13] with a unique Monte Carlo Markov chain of 110,000 iterations, where the first 10,000 iterations were discarded as burn-in. The statistical significance of the TRD was tested using a BF [19].

Pseudoautosomal Characterization
Mendelian inconsistency was estimated under a pseudoautosomal and heterosomal inheritance models using trios of genotypes (from dam, sire and offspring) in order to determine the nature of SNPs across the X-chromosome. Both analyses of LD and recombination events were performed. For recombination events, the sum of recombination probabilities was calculated for markers in pseudoautosomal parts of the X-chromosome using the package hsphase in R [20]. In addition, for better characterization of the pseudoautosomal regions on X-and Y-chromosomes, the regions with pseudoautosomal inheritance models were initially selected to perform a "nucleotide BLAST" analysis on Basic Local Alignment Search Tool (BLAST; https://blast.ncbi.nlm.nih.gov/Blast.cgi accessed on 10 December 2020) against the bovine reference genome. The sequence was retrieved using the bovine UMD3.1 genome assembly from NCBI database. After this step, a short sequence of 20 bp around the genomic position of the markers was extracted to obtain the nucleotide context around the TRD markers. Using the genome data viewer, regions with homology between the short sequence around the markers and the Y-chromosome were investigated. The whole assembly of the bovine Y-chromosome is completely available for visualization only for the bovine genome assembly Btau_4.6.1 and, therefore, this assembly was used for the data visualization of the Y-chromosome. It is important to reinforce that neither for UMD 3.1 or ARS-UCD 1.2 an assemble of bovine Y-chromosome is available, which results in the impossibility to use these assemblies for the abovementioned analysis.

Functional Analyses
The genes mapped around the TRD regions identified by both SNP-and haplotypebased methods were annotated using R software (Version 3.2.3) and the R package: Genomic functional Annotation in Livestock for positional candidate Loci (GALLO) [21]. The genes mapped around the TRD regions identified using the SNP approach were annotated using an interval of 100 Kilobase (Kb) upstream and downstream from the SNP position. For the haplotype approach, the genes were annotated using the start and end coordinates from each haplotype based on the UMD 3.1 assembly from the bovine reference genome. Subsequently, in addition to the bovine Ensembl ID and official associated Gene Symbol, the respective orthologous genes in human and mouse were also annotated for each positional candidate (Ensembl ID and Gene Symbol) using the R package biomaRt [22]. In this step, only the human ortholog showing a similarity higher than 75% with the bovine annotated gene were retained for subsequent analyses. The highest |α| and the respective log 10 (BF) of the associated marker (SNPs or haplotypes with TRD) were assigned for each positional candidate gene.
Functional analysis were performed using the IPA (Ingenuity Pathways Analysis) software (QIAGEN Redwood City, www.qiagen.com/ingenuity 15 December 2020) to identify the canonical metabolic pathways, diseases and functions associated with the list of candidate genes harboring deviations from Mendelian inheritance on the bovine X-chromosome (including both SNPs and haplotypes). In addition, functional analysis was performed to explore the existence of signaling networks connecting those genes [23]. The significantly enriched metabolic pathways were identified using a significance threshold of false-discovery rate (FDR) < 0.05.

Characterization of X-Chromosome Using Parent-Offspring Genotyped Trios
The mammalian sex chromosomes are characterized by two main types of regions: a short region of sequence homology between the X-and the Y-chromosomes, known as the pseudoautosomal region and a longer heterosomal (sex-linked) region specific for each sex chromosome. The pseudoautosomal region in cattle is located in the long arm of the X-chromosome and the short arm of the Y-chromosome. Its physical length is estimated to be between 5-9 Mb [24]. Other studies specified a size of 8 Mbp [25] and 5.7 Mb on the ARS-UCD1.2 reference genome assembly [26]. Within the pseudoautosomal region, pairing and recombination occurs between X-and Y-chromosomes in males as it does for autosomal chromosomes. On the other hand, the heterosomal regions do not pair and recombine during meiosis in males, due to the reduced homology between the chromosomes. In this study, the patterns of Mendelian incompatibilities, estimated under pseudoautosomal and heterosomal inheritance models for each individual SNP together with the patterns of their adjacent SNPs, were used to confirm their nature (i.e., pseudoautosomal or heterosomal). The short arm of the X-chromosome and part of the long arm showed high incompatibility under the pseudoautosomal inheritance model whereas null Mendelian inconsistencies under the heterosomal inheritance model (29,034,696 bp; Figure 1). Alternatively, the extreme part of the long arm of the X-chromosome (143,865,210-148,816,634 bp) displayed high and null Mendelian inconsistencies under heterosomal and pseudoautosomal inheritance models, respectively. This opposite behavior supported its pseudoautosomal nature described in the bovine genome [24,25,27]. However, unique individual SNPs showed Mendelian incompatibilities discrepant with their adjacent SNPs, suggesting possible genotyping artifacts or inaccurate annotation. These discrepant individual SNPs were removed from further TRD analyses. On the other hand, small regions of adjacent SNPs between these two clear extremes (heterosomal and pseudoautosomal parts) fitted either a heterosomal or a pseudoautosomal inheritance model. From the preliminary scan, 5935 and 780 SNPs were shown as exhibiting heterosomal and pseudoautosomal inheritance, respectively. In order to guarantee a reasonable statistical power, only SNPs with at least 100 trios (sire-dam-offspring) of genotypes and a Mendelian inconsistency < 1% were maintained for TRD analyses (Supplementary Table  S1). After this editing, the number of SNPs was reduced to 3832 and 400 with heterosomal and pseudoautosomal nature, respectively. The two main parts of the X-chromosome included 3680 heterosomal SNPs (BTX:29,245-137,034,696 bp) and 322 pseudoautosomal From the preliminary scan, 5935 and 780 SNPs were shown as exhibiting heterosomal and pseudoautosomal inheritance, respectively. In order to guarantee a reasonable statistical power, only SNPs with at least 100 trios (sire-dam-offspring) of genotypes and a Mendelian inconsistency < 1% were maintained for TRD analyses (Supplementary Table S1). After this editing, the number of SNPs was reduced to 3832 and 400 with heterosomal and pseudoautosomal nature, respectively. The two main parts of the X-chromosome included 3680 heterosomal SNPs (BTX: 29 [24], i.e., between 5-9 Mb. It must be noted that the pseudoautosomal part of the cattle X-chromosome is demarcated by the pseudoautosomal boundary, a border where the sequence similarity between the X-and the Y-chromosomes decreases (to 80-50%) and regions specific to each sex chromosome begin [28].  [29]. Specifically, these 10 SNPs displayed ≥ 9% and ≤0.31% Mendelian inconsistencies under pseudoautosomal and heterosomal inheritance models, respectively, and their heterosomal nature was supported by the similar pattern of the corresponding adjacent SNPs. These SNPs are also reported with null male heterozygosity (≤0.02) and moderate-to-high female heterozygosity (≥0.29) by McClure et al. (2018) [29].

Deviations from Mendelian Inheritance on the Pseudoautosomal Part of the Sex Chromosomes
Decisive evidence with Bayes factor (BF; ≥100) according to Jeffreys' scale [30] was identified for TRD across the Holstein sex chromosomes. The pseudoautosomal SNPs on the sex chromosomes revealed an important pattern of sire-TRD with specific sex-offspring components. A total of 180 SNPs was characterized by significantly displaying opposite sire-TRD between offspring sexes ( Figure 2A; Supplementary Table S4). SNPs that exhibited specific sire-TRD via one single offspring sex (only male or female) were also observed, being 26 SNPs with sire-TRD via male and 19 SNPs with sire-TRD via female (Tables 1 and S4). In addition, 3 SNPs displayed an opposite sire-TRD between offspring sexes and also a significant dam-TRD pattern only via one single offspring sex (Supplementary Table S4). On the other hand, additional 5, 2 and 1 SNPs were also identified with BF ≥ 100 for overall TRD (specific offspring sex), dam-TRD and sire-TRD (unspecific offspring sex), respectively. Among them, only one individual SNP showed a probability of random TRD  (C) Density of SNP available for analyses (pseudoautosomal with red colo heterosomal with blue color). The pink and blue circles represent the first two markers with α (BTX:143,865,210 and BTX:143,870,595), respectively. The red circle represents the last marker an opposite sire-TRD effect between male and female offspring (BTX:148,789,832). The orange green rectangles correspond to the markers with opposite sire-TRD effects between male and fe offspring and dam-TRD for male (orange) and female (green) offspring. The blue diamonds r sent the region with highest homology between the X-and Y-chromosomes (BTX:144,251 144,399,458). It is important to highlight that this is a schematic image and some genes where pressed. (C) Density of SNP available for analyses (pseudoautosomal with red color and heterosomal with blue color). The pink and blue circles represent the first two markers with α = 0.5 (BTX:143,865,210 and BTX:143,870,595), respectively. The red circle represents the last marker with an opposite sire-TRD effect between male and female offspring (BTX:148,789,832). The orange and green rectangles correspond to the markers with opposite sire-TRD effects between male and female offspring and dam-TRD for male (orange) and female (green) offspring. The blue diamonds represent the region with highest homology between the X-and Y-chromosomes (BTX:144,251,086-144,399,458). It is important to highlight that this is a schematic image and some genes where suppressed.

Opposite Sire-TRD between Male-and Female-Offspring
Opposite sire-TRD between male-and female-offspring were observed across the whole pseudoautosomal part of sex chromosomes with a specific pattern ( Figure 2). This TRD pattern was observed with a preferential transmission of one specific allele to maleoffspring and the opposite allele to the female-offspring. Figure 3 shows how strong the sire-TRD was between male-and female-offspring among different matings for the SNP (BTX:144,617,673) with the highest BF. Taking into account the approximate empirical null distribution of TRD [13] at ≤0.001% margin error for random TRD, the number of regions with this pattern reduced from 180 to 149 SNPs spanning the length of the pseudoautosomal part of the sex chromosome. Here, the important discovery resides, as shown in Figure 2A, in the strong sire-TRD observed for SNPs at the beginning of the pseudoautosomal region and its marginal reduction with the distance until being null on the extreme of the X-chromosome. It is important to highlight that this pattern is primarily attributed to the main pseudoautosomal part of the chromosomes (i.e., 143,865,210-148,816,634 bp), where the first 8 SNPs (143,865,210-144,085,178) displayed TRD ranging from 0.47 to 0.50 and with full linkage to one sex (i.e., null recombination). This result indicates a strong linkage of some alleles to X-or Y-chromosomes at the beginning of pseudoautosomal region. Addition, the decay of opposite sire-TRD between male-and female-offspring with the distance describe the accumulated recombination events across the pseudoautosomal part of the cattle genome and suggest a consistent level of these events. The results of recombination analyses supported these patterns showing an absence of recombination events at the beginning of this pseudoautosomal genomic region and a more consistent level of recombination events started as TRD magnitude decreased ( Figure 4). Therefore, specific sex-offspring TRD revealed another way to study recombination and linkage disequilibrium (LD) in sex chromosomes. Thus, whereas marker alleles in strong LD to specific sex chromosome (X or Y) were observed at the beginning of the pseudoautosomal region, this LD decreases at longer distances from the beginning of the pseudoautosomal region until being null on the extreme of the X-chromosome. This expected partial sex linkage on pseudoautosomal regions was previously reported in humans [31]. Moreover, this result suggested that novel mutations generated at the beginning of the pseudoautosomal part will be linked to a specific sex for long time than potential mutation occurred on the extreme of the X-chromosome.

Opposite Sire-TRD between Male-and Female-Offspring
Opposite sire-TRD between male-and female-offspring were observed across the whole pseudoautosomal part of sex chromosomes with a specific pattern ( Figure 2). This TRD pattern was observed with a preferential transmission of one specific allele to maleoffspring and the opposite allele to the female-offspring. Figure 3 shows how strong the sire-TRD was between male-and female-offspring among different matings for the SNP (BTX:144,617,673) with the highest BF. Taking into account the approximate empirical null distribution of TRD [13] at ≤0.001% margin error for random TRD, the number of regions with this pattern reduced from 180 to 149 SNPs spanning the length of the pseudoautosomal part of the sex chromosome. Here, the important discovery resides, as shown in Figure 2A, in the strong sire-TRD observed for SNPs at the beginning of the pseudoautosomal region and its marginal reduction with the distance until being null on the extreme of the X-chromosome. It is important to highlight that this pattern is primarily attributed to the main pseudoautosomal part of the chromosomes (i.e., 143,865,210-148,816,634 bp), where the first 8 SNPs (143,865,210-144,085,178) displayed TRD ranging from 0.47 to 0.50 and with full linkage to one sex (i.e., null recombination). This result indicates a strong linkage of some alleles to X-or Y-chromosomes at the beginning of pseudoautosomal region. Addition, the decay of opposite sire-TRD between male-and female-offspring with the distance describe the accumulated recombination events across the pseudoautosomal part of the cattle genome and suggest a consistent level of these events. The results of recombination analyses supported these patterns showing an absence of recombination events at the beginning of this pseudoautosomal genomic region and a more consistent level of recombination events started as TRD magnitude decreased ( Figure 4). Therefore, specific sexoffspring TRD revealed another way to study recombination and linkage disequilibrium (LD) in sex chromosomes. Thus, whereas marker alleles in strong LD to specific sex chromosome (X or Y) were observed at the beginning of the pseudoautosomal region, this LD decreases at longer distances from the beginning of the pseudoautosomal region until being null on the extreme of the X-chromosome. This expected partial sex linkage on pseudoautosomal regions was previously reported in humans [31]. Moreover, this result suggested that novel mutations generated at the beginning of the pseudoautosomal part will be linked to a specific sex for long time than potential mutation occurred on the extreme of the X-chromosome.   This opposite sire-TRD pattern could reinforce the high recombination rate in the cattle pseudoautosomal region, as it was also described on other mammalian species, which exceed the genome average by 10-20 times [32]. Notice that the recombination events on the sex chromosomes in males are expected to be extraordinarily restricted to a short genomic segment (i.e., pseudoautosomal regions), appearing to be a recombinational hotspot in male meiosis [31]. In addition, the crossover in the pseudoautosomal region was described to be essential for the proper disjunction of X-and Y-chromosomes in male meiosis and a deletion of pseudoautosomal region was reported to result in male sterility [33]. However, biologically, the mechanisms by which this mandatory crossover is achieved remains unknown [33]. Moreover, the results of estimated LD using squared correlation between alleles also supported the high recombination rate in the cattle pseudoautosomal region. A decay of LD in the main pseudoautosomal region in comparison to the heterosomal and autosomal regions was observed (Supplementary Figure S1). Furthermore, a relatively small decay of LD on heterosomal region in relation to autosomal region was also observed (Supplementary Figure S1).
On the other hand, 15 pseudoautosomal SNPs (from 180) with opposite sire-TRD patterns were in the undetermined regions (i.e., BTX:137,181,696-137,476,239 bp, BTX:137,951,394-139,042,034 bp and BTX:140,116,340-140,380,961 bp). In contrast to the main pseudoautosomal part, we did not observe a similar pattern (i.e., the decreased magnitude of TRD with distance). It is important to emphasize that opposite sire-TRD between male-and female-offspring not necessarily imply recombination between X-and Y-chromosomes. Therefore, this pattern could also be explained by different linkage of alleles to X-and Y-chromosome. This opposite sire-TRD pattern could reinforce the high recombination rate in the cattle pseudoautosomal region, as it was also described on other mammalian species, which exceed the genome average by 10-20 times [32]. Notice that the recombination events on the sex chromosomes in males are expected to be extraordinarily restricted to a short genomic segment (i.e., pseudoautosomal regions), appearing to be a recombinational hotspot in male meiosis [31]. In addition, the crossover in the pseudoautosomal region was described to be essential for the proper disjunction of X-and Y-chromosomes in male meiosis and a deletion of pseudoautosomal region was reported to result in male sterility [33]. However, biologically, the mechanisms by which this mandatory crossover is achieved remains unknown [33]. Moreover, the results of estimated LD using squared correlation between alleles also supported the high recombination rate in the cattle pseudoautosomal region. A decay of LD in the main pseudoautosomal region in comparison to the heterosomal and autosomal regions was observed (Supplementary Figure S1). Furthermore, a relatively small decay of LD on heterosomal region in relation to autosomal region was also observed (Supplementary Figure S1).
On the other hand, 15 pseudoautosomal SNPs (from 180) with opposite sire-TRD patterns were in the undetermined regions (i.e., BTX:137,181,696-137,476,239 bp, BTX: 137,951,394-139,042,034 bp and BTX:140,116,340-140,380,961 bp). In contrast to the main pseudoautosomal part, we did not observe a similar pattern (i.e., the decreased magnitude of TRD with distance). It is important to emphasize that opposite sire-TRD between male-and female-offspring not necessarily imply recombination between X-and Y-chromosomes. Therefore, this pattern could also be explained by different linkage of alleles to X-and Y-chromosome.

Specific Sex-Offspring Sire-TRD
Across the X-chromosome, we identified 45 initial SNPs that are exhibiting sire-TRD via one single sex whereas null TRD in the opposite sex. After discarding possible random TRD at ≤0.001%, according to the approximate empirical null distribution of TRD [13], 8 and 7 SNPs showed male-and female-offspring sire-TRD, respectively (Table 1). As introduced previously, similar patterns of TRD on only one sex (specific sex-offspring TRD) were already identified in the human X-chromosome using microsatellite loci [16]. In fact, related patterns were also described for some disease (e.g., retinoblastoma) and were hypothesized to be associated with defective imprinting [34]. In addition, the sire-TRD pattern could be due to the fact that DNA sequences remain methylated for a much longer period in the male germline than in the female germline [35]. Within this context, an adequate imprinting may be required for viability of male or female embryos and the identified SNPs could be mapped in potential genes with differential imprinting patterns. It is notably important to mention that some of these SNPs may be in the same genes as distance between most of them was small (9 SNPs were at < 500 kb from the adjacent significant SNP).
The different magnitudes of TRD could also support imprinting for some of them, as TRD ranged from 0.04 to 0.36 in male-offspring and between 0.03 to 0.29 in female-offspring. For these SNPs, the number of under-represented offspring ranges from 47 to 1014 (1 SNP with ≥1000) for males and from 54 to 2183 (3 SNP with ≥1000) for females. The number of heterozygous sires ranges from 7 to 302 and from 13 to 1281 for male-and female-offspring, respectively. The minor allele frequencies of the detected SNPs ranges from 0.08 to 0.48. It is important to mention that moderate-to-high TRD magnitude signals were observed in regions with low or extremely low frequencies, suggesting the effects of TRD selection as observed in autosomal chromosomes as well [13]. Notably, the moderate frequencies of some SNPs in this case could be explained by the null TRD observed in one parent and/or offspring (male or female). Nevertheless, mechanisms such as imprinting could have the same consequences since the not preferentially transmitted allele is still observable in the offspring generation.

SNP-by-SNP Analysis
Among 3832 heterosomal SNPs, 12 SNPs were significantly detected with decisive evidence (BF ≥ 100) for dam-TRD (Table 2). Among them, 7 SNPs were significantly observed via male, 2 via female and 3 via both (i.e., non-sex-offspring TRD). However, these SNPs were reduced to 9, 8 and 2 considering a chance of random TRD of ≤0.1, ≤0.01 and ≤0.001%, respectively. The SNP with the highest BF displayed 3675 under-represented offspring with α d = 0.02. The maximum magnitude observed for dam-TRD was |0.12| and displaying 287 under-represented offspring. In general, the prevalence of TRD across the heterosomal regions was small in comparison to pseudoautosomal and autosomal regions. This observation could be related to the biological differences between sexes, such as the processes of germ cell formation in males versus females (e.g., cell divisions rates in spermatogenesis versus oogenesis) [36].

Haplotype Analysis
Across 974 SNPs of the heterosomal part of the X-chromosome, haplotypes displaying TRD were detected with decisive evidence of BF (≥100). After discarding regions with <10 heterozygous dams and <50 informative offspring, the number of regions was 944, being 44, 71, 282 and 597 detected for 2-, 4-, 10-and 20-SNP haplotypes, respectively. The number of regions at <0.001% margin of error reduced to 14, 26, 138 and 319, respectively. When correcting for multiple tests with kernel smoothing, the number of regions reduced to 69 haplotype windows explaining the observed TRD among the corresponding linked regions. The TRD was observed in both male-and female-offspring and with strong magnitudes ranging up to |0.48|. The top 20 regions according to BF are summarized in Table 3 and the complete list in the Supplementary Table S5. It must be emphasized that, even though 69 haplotypes were characterized, the maximum number of under-represented offspring observed was 193, showing a reduced relevance in comparison to autosomal chromosomes.

Recessive TRD Pattern
Recessive TRD pattern (i.e., lethality only in homozygous state) was targeted on pseudoautosomal SNPs (in both male-and female-offspring) and heterosomal SNPs/haplotypes (only female-offspring) on sex chromosomes. Whereas no recessive TRD patterns were found in SNP-by-SNP analyses, haplotype analyses revealed eight haplotypes (Tables 4 and S6). These regions were detected with the genotypic TRD model [3,37] exhibiting additive-TRD ranging from −0.34 to −0.56. The unfavorable effects of the recessive alleles were compensated by positive dominance-TRD effects (ranging from 0.21 to 0.28) in heterozygous offspring, allowing for their survival. The corresponding number of non-observed homozygous female-offspring was between 5 to 8 ( Table 4). In comparation to autosomal chromosomes, Holstein haplotype 3 was initially identified by 7 non-observed homozygous offspring from heterozygous carrier sire by heterozygous carrier maternal grandsire matings [10].

Functional Impact around the Pseudoautosomal Boundaries Identified Using Opposite TRD Patterns between Male and Female Offspring
Among the SNPs with highest opposite sire-TRD between male-and female-offspring, the 20 bp sequence around BTX:143,928,317 SNP was identified within the LOC790278 pseudogene on the Y-chromosome, which has a corresponding version on the X-chromosome (SHROOM2 gene). The pseudogene LOC790278 is mapped on the heterosomal region of Y-chromosome (based on the Btau 4.6.1 annotation), before the SINE element responsible to reduce the homology between the X-and Y-chromosomes in the boundary between the heterosomal and pseudoautosomal [38]. Additionally, the region with the highest homology between the X-chromosome region and the Y-chromosome used in the BLAST analysis is also located in the predicted heterosomal part of the Y-chromosome. This region comprised 148.3 Kb (BTX:144,251,086-144,399,468) and has more than 99% similarity with the Y-chromosome (Supplementary Table S3). It is important to highlight that, as shown in Figure 2, even the sequential structure (regarding the gene order) of the homologous region and the BTX:143,928,317 SNP remains in the predicted heterosomal region of Y-chromosome. Within this region of homology between the pseudoautosomal region of the X-chromosome and predicted heterosomal region of the Y-chromosome, no TRD was detected due to the absence of markers exactly in this genome range, as shown in Figure 2C. As described by Raudsepp et al. (2012) [27], the SHROOM2 is a X-specific gene in cattle. However, following the gene annotation from NCBI genome data viewer, this gene is mapped to the boundary between the pseudoautosomal and the heterosomal regions, due to the location of GPR143, the gene responsible to define the beginning of the pseudoautosomal region [38]. The results obtained here indicate a possible miss-assembly of the pseudoautosomal region of Y chromosome in Btau 4.6.1 or a complex rearrangement of the pseudoautosomal Y region, which must be better studied due to the functional importance of these regions. Additionally, the functional interpretation of the candidate markers in these regions can be affected since the causal variant could be mapped to both chromosomes.

Enriched Metabolic Pathways, Diseases and Biological Functions Associated with the Positional Candidate Genes Displaying Deviations from Mendelian Inheritance on the X-Chromosome
Functional annotation of positional candidate genes displaying deviations from Mendelian inheritance on the bovine X-chromosome was performed for the 55 detected regions: 15 SNPs with specific sex-offspring sire TRD (pseudoautosomal region), 12 SNPs with dam-TRD, 20 haplotypes with dam-TRD and 8 haplotypes with recessive TRD (heterosomal region). A total of 408 genes were annotated in an interval of 200 Kb (100 Kb up-and downstream) for the individual SNPs or within the haplotype coordinates. A total of 113 genes were annotated in the interval harboring the individual SNPs (among them, 60 genes had TRD signals within the gene coordinates) and 296 within the haplotype coordinates. It is important to highlight that some genes were annotated in both SNP and haplotype approaches. The human and mouse orthologs were used for those genes without an assigned gene symbol in the bovine reference genome. A total of 295 genes on TRD regions (out of the initial 408) were annotated. This list of 295 genes was used to perform the functional analysis using the IPA (Ingenuity Pathways Analysis) software to identify the enriched canonical metabolic pathways (Supplementary Table S7), diseases, and biological functions (Supplementary Table S8). The top 10 metabolic pathways, diseases, and functions identified are presented in Table 5. Despite no metabolic pathways were significant after adjustment for multiple testing (FDR < 0.05), the following canonical metabolic pathways were in the list of the top 10 enriched pathways at p-values < 0.05: citrulline degradation, spermine biosynthesis, phosphoribosyl diphosphate biosynthesis I, arginine biosynthesis IV, proline biosynthesis II (from arginine), urea cycle, pentose phosphate pathway (non-oxidative branch), glycine cleavage complex, phosphatidylcholine biosynthesis I, and acetyl-CoA biosynthesis I (pyruvate dehydrogenase complex). Among these metabolic pathways, the spermine biosynthesis plays a crucial role in male fertility. The gene associated with this pathway, in the current study, was the SMS (Spermine Synthetase) gene.
Deletions in the X-chromosomal loci harboring the mice orthologous (SpmS) are associated with several impaired development phenotypes, such as sterility and profound postnatal growth retardation [39]. Another enriched pathway was the phosphatidylcholine biosynthesis, which was associated with the PCYT1B gene in this study, which had a haplotype exhibiting TRD. The disruption of function of the first enzyme in the phosphatidylcholine biosynthesis, the choline kinase α, causes embryonic lethality in mice [40]. Female knockout mice for the PCYT1B gene exhibited ovarian tissue disorganization with progressive loss of follicular formation and oocyte maturation [41].  Using the list of 295 positional candidate genes, 500 diseases and biological functions were enriched (FDR < 0.05). Among the top 10 enriched diseases and biological functions identified, a strong association between the genes and X-linked syndromic phenotypes was observed. It is important to highlight that several X-linked syndromes caused by structural alterations (copy number variations, translocations, inversions, breakpoints, etc.; Table 5) display a combination of neurological and reproductive alterations, such as X-fragile, Turner, Klinefelter, Kallmann, and Rett syndromes [42][43][44].
In addition, among the other significant enriched diseases and biological functions, the enrichment analysis indicated that the candidate genes are associated with relevant biological processes associated with embryo and cellular development, such as cell viability, angiogenesis, development of cardiovascular tissue, migration of endothelial cells, growth of connective tissue, abnormal morphology of body cavity and neuritogenesis ( Figure 5). The genes associated with these processes are mapped in regions where different TRD patterns were identified, all in heterosomal regions. The relationship between the TRD markers and the positional candidate genes is shown in Supplementary Table S9. Among these positional and functional candidate genes, only PLP1, DRP2 and BTK did not show a direct association with embryonic development or reproductive processes base on the literature review performed for each gene. different TRD patterns were identified, all in heterosomal regions. The relationship between the TRD markers and the positional candidate genes is shown in Supplementary  Table S9. Among these positional and functional candidate genes, only PLP1, DRP2 and BTK did not show a direct association with embryonic development or reproductive processes base on the literature review performed for each gene. Figure 5. Interaction network between the positional candidate genes around TRD regions on bovine X-chromosome and biological functions and diseases identified. For the most relevant biological functions, the edges between the genes and the processes were colored in order to highlight the connection. These biological functions were: cell viability (green), angiogenesis (pink), development of cardiovascular tissue (red), migration of endothelial cells (blue), growth of connective tissue (cyan), abnormal morphology of body cavity (orange) and neuritogenesis (purple). The positional candidate genes were colored in function of the predominant (highest |α|) TRD observed for the associated marker, where: pink represents dam-TRD among in heterosomal regions (RNF128, PIGA, PLP1, BMX, KDM6A, CSNK1A1, VEGFD, DRP2 and bta-mir-221); yellow represents male-and female-dam-TRD in heterosomal regions (FLNA, FGF13 and MECP2); blue represents recessive TRD pattern in heterosomal regions (TNMD, SLITRK4 and BTK). The gene TSC22D3 was colored in red because two different TRD patterns were observed for the markers mapped in the interval used to annotate the gene. These TRD patterns were dam-TRD and recessive-TRD, both in heterosomal regions.

Functional Candidate Genes Identified in the Candidate Regions with Deviations from Mendelian Expectations
The genes associated with regions with dam-TRD were: RNF128, PIGA, PLP1, BMX, KDM6A, CSNK1A1, VEGFD, DRP2 and bta-mir-221. The RNF128 gene was identified as differentially expressed [45] between high and low fertility heifers during mid-luteal phase of the estrous cycle. Mutations in the PIGA gene are associated with nocturnal hemoglobinuria [46]. Additionally, high rates of early embryonic lethality and low chimerism (in the surviving animals) are observed in mice knockedout for PIGA [47]. The BMX gene has a crucial role during the endothelial cell migration and angiogenesis [48] acting as a downstream RAP1 effector in VEGF-induced endothelial cell activation [49]. An Figure 5. Interaction network between the positional candidate genes around TRD regions on bovine X-chromosome and biological functions and diseases identified. For the most relevant biological functions, the edges between the genes and the processes were colored in order to highlight the connection. These biological functions were: cell viability (green), angiogenesis (pink), development of cardiovascular tissue (red), migration of endothelial cells (blue), growth of connective tissue (cyan), abnormal morphology of body cavity (orange) and neuritogenesis (purple). The positional candidate genes were colored in function of the predominant (highest |α|) TRD observed for the associated marker, where: pink represents dam-TRD among in heterosomal regions (RNF128, PIGA, PLP1, BMX, KDM6A, CSNK1A1, VEGFD, DRP2 and bta-mir-221); yellow represents male-and female-dam-TRD in heterosomal regions (FLNA, FGF13 and MECP2); blue represents recessive TRD pattern in heterosomal regions (TNMD, SLITRK4 and BTK). The gene TSC22D3 was colored in red because two different TRD patterns were observed for the markers mapped in the interval used to annotate the gene. These TRD patterns were dam-TRD and recessive-TRD, both in heterosomal regions.

Functional Candidate Genes Identified in the Candidate Regions with Deviations from Mendelian Expectations
The genes associated with regions with dam-TRD were: RNF128, PIGA, PLP1, BMX, KDM6A, CSNK1A1, VEGFD, DRP2 and bta-mir-221. The RNF128 gene was identified as differentially expressed [45] between high and low fertility heifers during mid-luteal phase of the estrous cycle. Mutations in the PIGA gene are associated with nocturnal hemoglobinuria [46]. Additionally, high rates of early embryonic lethality and low chimerism (in the surviving animals) are observed in mice knockedout for PIGA [47]. The BMX gene has a crucial role during the endothelial cell migration and angiogenesis [48] acting as a downstream RAP1 effector in VEGF-induced endothelial cell activation [49]. An insertion/deletion polymorphism in KDM6A is significantly associated with litter size and growth traits in goats, indicating a possible association between this gene and the survival ratio during embryonic development and post-natal development [50]. The CSNK1A1 acts as a regulator of DNA damage response in embryonic stem cells [51]. Additionally, the expression of CSNK1A1 is increased by the injection of miR-320 in mice oocytes [52].
Interestingly, knockout mice for mmu-miR-320 show a reduced proportion of oocytes that develop into two-cell and blastocyst stage embryos [52]. The VEGFD gene has a critical role during embryonic vasculogenesis and angiogenesis in zebrafish, reinforced by the microinjection of zVEGFD mRNA into one-cell-stage embryos, which results in severe misguidance of intersegmental vessels and abnormal connection between dorsal aorta and caudal vein [53]. It is important to highlight that the previously described gene, BMX, acts in the same VEGF-induced endothelial cell activation pathway. In addition, regarding the VEGF-induced endothelial cell activation pathway, the zebrafish ortholog of bta-mir-221 is required for the tip cell proliferation and migration in mosaic blood vessels during angiogenesis, a process which is regulated by VEGF [54].
Among the genes corresponding to male-and female-dam-TRD regions, 3 genes were highlighted: FLNA, FGF13 and MECP2. Mutations in FLNA gene cause total gene disruption, resulting in embryonic lethality, with severe cardiac structural defects, revealing the important role of FLNA during cardiac development [55]. In mice, the targeted ablation of FGF13 results in embryonic lethality [56]. The MECP2 gene is essential for embryonic development [57], acting in the differentiation of neuronal progenitors, which, in the absence of MECP2, do not show regular increase rates of the neuronal nuclei during development [58].
The positional and functional candidate genes associated with markers with significant recessive TRD patterns were: TNMD, SLITRK4 and BTK. Despite the association between TNMD and connective and cardiovascular tissue growth and differentiation, knockout mice show normal overall morphology at birth [59]. The knockout of Sirh7/Ldoc1 in mice exhibit an overproduction of placental progesterone and placental lactogen, resulting from abnormal placental cell differentiation and maturation, subsequently, causing delayed parturition [60].
The TSC22D3 gene was associated with markers displaying two different TRD patterns (dam-TRD and recessive-TRD). Deficient mice for TSC22D3 show severe testis dysplasia followed by infertility [61], reinforcing the association of this gene with male fertility-related biological processes.

Validation of Functional Candidate Genes Previously Identified in Independent Populations Using Genome-Wide Association Studies' Meta-Analysis for Fertility Traits in Cattle
In addition to the previous reported results, 15 out of the 408 positional candidate genes were previously identified as functional candidate genes for cattle male fertility traits in a systematic review performed by our group [62]. These genes were: MECP2, L1CAM, IRAK1, KIF4A, ATP6AP1, G6PD, POLA1, FLNA, IKBKG, HCFC1, EMD, IL13RA2, PAK3, DCX, and TEX11. Some of these genes are responsible for regulating important metabolic pathways or biological processes associated with fertility. Among these processes, it is important to highlight the control of the progression of spermatogenesis, control of ciliary activity, development of Sertoli cells, DNA integrity in spermatozoa, and homeostasis of testicular cells. Additionally, the impact on fertility traits of these 15 functional and positional candidate genes was also evaluated in other species. The MECP2 gene is associated with the development of Rett syndrome in humans [63]. In human and mice, variants located in the MECP2 gene were previously associated with precocious puberty, undescended testes, and micropenis [64,65]. During the endometriosis development, L1CAM is considered a pathogenic factor in humans due to its overexpression in ovarian and endometrial carcinomas [66]. Interestingly, IRAK1 gene is differentially expressed in impaired spermatogenesis [67]. The function of KIF4A transcript is crucial for the chromosome positioning and spindle formation, playing an important role during chromosome segregation in mouse oocytes [68,69]. Additionally, KIF4A variants were already associated with uterine capacity in beef cattle [70]. The ATP6AP1 gene encodes the main accessory protein of V-ATPase, which is a crucial regulator of intra-organellar acidification [71]. The deficiency of ATP6AP1 is associated with immunodeficiency, hepatopathy, and abnormal protein glycosylation in prenatal and postnatal stages [72]. In addition, ATP6AP1 gene was described, in zebrafish, as responsible to mediate the dorsal forerunner cell proliferation and the left-right asymmetry [73]. The GDP6 deficiency was suggested to be associated with reduced female fertility in humans [74]. The FLNA gene is involved with the regulation of the embryonic development of several tissues, such as the cardiac, neural and muscular tissues [55,75]. Consequently, FLNA gene plays a crucial role for embryo survival. The HCFC1 gene codifies a transcriptional co-regulator responsible to control cell proliferation, which loss-of-function variants were associated with disrupted neuronal and neural progenitor cells with a deleterious effect in humans and mice [76]. In humans, copy number variants at the PAK3 gene were found in men with Sertoli-cell-only syndrome, XY gonadal dysgenesis and premature ovarian failure [74]. The TEX11 gene was already broadly investigated in genetic association studies regarding male fertility in cattle and other species. This gene was associated with male infertility in humans [75] and in cattle [76]. Indeed, variants in TEX11 gene are responsible to explain up to 13% of the total genetic variance for scrotal circumference at 12 months in beef cattle [76].

Conclusions
Transmission ratio distortion analyses showing deviations from Mendelian inheritance revealed 149 SNPs displaying opposite sire-TRD between male and female offspring with strong magnitude at the beginning of the pseudoautosomal region and gradually reducing along this region until being null at the extreme of the chromosomes. This finding is evidence of the strong linkage of unique SNPs to specific-sex (Y-or X-) chromosomes and depicts the accumulation of recombination events across the pseudoautosomal region of the cattle genome. Moreover, haplotypes were found displaying recessive TRD patterns in female-offspring. Functional analyses and validation analysis using independent populations and methodologies showed that the TRD regions identified in this study were related to key biological processes and molecular functions responsible for regulating processes such as spermatogenesis, development of sertoli cells, homeostasis of endometrium tissue and embryonic development.  Table S1. Map file of markers of X-chromosome with number of parentoffspring genotyped trios and Mendelian inconsistency rate; Table S2. Map file of the selected markers of heterosomal part of the X-chromosome for phasing and haplotype analyses; Table S3. Data from BLAST analyses of the X-and Y-chromosomes; Table S4. Full list with detailed information of specific sex-offspring sire-TRD in Holstein X-chromosome; Table S5. Full list with detailed information of significant dam-TRD among heterosomal part in Holstein X-chromosome from haplotype analyses; Table S6. Full list with detailed information of Haplotype windows with recessive TRD pattern on heterosomal part of Holstein X-chromosome; Table S7. The enriched canonical metabolic pathways obtained from functional analysis of the identified genes; Table S8. The diseases and biological functions pathways obtained from functional analysis of the identified genes; Table S9. Detailed TRD regions with the full information of the positional candidate genes; Table S10: Number of animals genotyped for each density of the SNP array.