Mapping QTL Associated with Resistance to Avian Oncogenic Marek’s Disease Virus (MDV) Reveals Major Candidate Genes and Variants

Marek’s disease (MD) represents a significant global economic and animal welfare issue. Marek’s disease virus (MDV) is a highly contagious oncogenic and highly immune-suppressive α-herpes virus, which infects chickens, causing neurological effects and tumour formation. Though partially controlled by vaccination, MD continues to have a profound impact on animal health and on the poultry industry. Genetic selection provides an alternative and complementary method to vaccination. However, even after years of study, the genetic mechanisms underlying resistance to MDV remain poorly understood. The Major Histocompatability Complex (MHC) is known to play a role in disease resistance, along with a handful of other non-MHC genes. In this study, one of the largest to date, we used a multi-facetted approach to identify quantitative trait locus regions (QTLR) influencing resistance to MDV, including an F6 population from a full-sib advanced intercross line (FSIL) between two elite commercial layer lines differing in resistance to MDV, RNA-seq information from virus challenged chicks, and genome wide association study (GWAS) from multiple commercial lines. Candidate genomic elements residing in the QTLR were further tested for association with offspring mortality in the face of MDV challenge in eight pure lines of elite egg-layer birds. Thirty-eight QTLR were found on 19 chicken chromosomes. Candidate genes, microRNAs, long non-coding RNAs and potentially functional mutations were identified in these regions. Association tests were carried out in 26 of the QTLR, using eight pure lines of elite egg-layer birds. Numerous candidate genomic elements were strongly associated with MD resistance. Genomic regions significantly associated with resistance to MDV were mapped and candidate genes identified. Various QTLR elements were shown to have a strong genetic association with resistance. These results provide a large number of significant targets for mitigating the effects of MDV infection on both poultry health and the economy, whether by means of selective breeding, improved vaccine design, or gene-editing technologies.


Introduction
Marek's disease (MD) represents a significant global economic and animal welfare issue. This immunosuppressive disease is responsible for an estimated 2 billion USD annual economic analysis that predicted mutations within genes, miRNAs, and lncRNAs highly associated with MDV response in commercial egg production lines. This analysis not only provides new markers for MD resistance but also reveals more about the biology behind the mechanism of MDV susceptibility, information that should lead to more precise selection strategies in the future.

Experimental Animals
All procedures carried out on the birds involved in this study were conducted in compliance with Hy-Line International Institutional Animal Care and Use Committee guidelines.
2.1.1. Full Sib Advanced Inter-Cross Line (FSIL) F 6 birds from the FSIL were used to map QTL affecting MD resistance. The development of the FSIL F 6 challenge population has been previously described [31,33,34]. It was initiated from a cross of two partially inbred commercially utilized elite White Leghorn lines, known to differ in their resistance to MDV. Five independent FSIL families were developed and expanded over five generations. In all five families, the male parent was from the less resistant line and the female parent was from the more resistant line.
At the F 6 generation, 1615 chicks were challenged with vv+ MDV strain 686 following the protocol of Fulton et al. [7]. The experiment was carried out in two hatches. It was terminated at 152 days of age in the first hatch, and after 145 days of age in the second. Resistance was measured as survival time (age at death). For birds that survived to the end of the experiment, survival time was taken as age at end of the experiment. To standardize the two hatches, survival for all birds that survived to end of experiment was set to 149 days. This measure of resistance was used in all association tests.
This population was segregating for two MHC haplotypes (B 2 and B 15 ). Given the known strong association of MHC-type with MD resistance, all analyses were done within MHC-type within each family.

Pure line Pedigreed Sire Populations with Daughter Progeny Records for MDV Mortality
As part of the routine selection process within the Hy-Line (HL) breeding program, individual males were mated to multiple females to produce 30 half-sib female progeny per sire. The dams were MD vaccinated following normal industry practices. Progeny were vaccinated at 1 day of age with HVT/SB1 and at 7 days of age inoculated subcutaneously with 500 PFU of vv+ MDV (provided by Avian Disease and Oncology Lab, East Lansing MI). Mortality was recorded from 3 to 17 weeks of age (termination of experiment), as described in Fulton et al. [7]. The sire MD tolerance phenotype is the proportion of survivors among the daughters upon MD challenge.. Data were available from 15 generations of 8 elite lines (not the same generations in all lines), with 1081 to 1393 males per line having MD progeny mortality data, for a total of 9391 males. The lines included three different egg production breeds, namely five White Leghorn lines, two White Plymouth Rock lines and one Rhode Island Red line. The sires and their progeny from these 8 pure lines (all with pedigree data) were used to test functional genomic elements (genes, miRNAs, lncRNAs in the mapped QTL regions (QTLR)), for association with MD progeny mortality. Also tested were a group of singleton coding SNPs predicted to have deleterious effects on protein structure and function (henceforth called potentially functional mutations, see below for details).

Genomic Sequencing of Founder Birds of the FSIL
Genome sequence information was produced for the 10 F 0 founder individuals of the F 6 population. This was used to identify variants that differed in the genomes of the parental birds and may have functional impact. Sequencing the DNA from the 10 F 0 founder birds was carried out by the Edinburgh Genomics sequencing facility (Edinburgh, UK). Samples were prepared for sequencing using 1 ug of genomic DNA following the TruSeq PCR free kit protocol (FC-121-3001, Illumina, Little Chesterford, UK). Resulting libraries were quality checked on an Agilent DNA 1000 bioanalyzer (Agilent Technologies, South Queensferry, UK) and then clustered onto HiSeq Rapid V2 flow cell at a concentration of 15 pM. Sequencing was carried out on an Illumina Hiseq 2500 using the HiSeq Rapid v2 SBS reagents (Illumina, Little Chesterford, UK), for 150 cycle paired end reads. Each of the 10 samples was sequenced to around 15× coverage. Quality of sequences was determined using FastQC [35], and mapping to the chicken reference genome (Galgal4) was carried out using BWA (v0.7.0) [36]. The resulting bam file was sorted with Samtools (v0. 1.18) [37]. Picard tools (v1.95) was then used to add read groups and mark duplicates (http://broadinstitute.github.io/picard/). The mpileup program within Samtools carried out SNP calling (with options: -q20 -Q20 -AB -ugf) and these variants were then filtered using the bcftools package within Samtools (v0.1.18).

Annotation of Genome Variants
Variants distinguishing the genomes of the F 6 parental birds were classified as exonicsynonymous or non-synonymous, intronic, 5 -upstream, 3 -downstream or intergenic, as determined by the SNPEff program (v3.6c) [38]. Non-synonymous coding variants were further designated as predicted to be highly deleterious to protein function (henceforth, "potentially functional mutations"), moderately deleterious or having a low likelihood of being deleterious to protein function. In all, a total of 5,718,725-6,154,628 variants (in all 5 males v all 5 females) were annotated within each sample. Some of the identified variants were used to test the QTLR, as described below.
2.4. 600 K High Density SNP Genotyping of F 6 Birds F 6 birds were genotyped genome-wide with the high density 600k Affymetrix SNP array [32] for GWAS. DNA was available from 1615 MDV challenged birds. After quality control (QC), 1192 F 6 birds provided high quality genotypes (178, 234, 357, 221, and 202 birds from Families 1-5, respectively). Genotyping was performed using 200 ng of gDNA with the standard protocol for the Axiom Affymetrix platform (Thermo Fisher Scientific, Inchinnan, UK). Samples were amplified using the Axiom 2.0 Reagent Kit (#901758, Thermo Fisher Scientific), and the resultant product checked for both quantity and quality of fragmentation. QC was performed using absorbance assay for quantity and running a sample on a 4% agarose E-Gel (#G800804, Thermo Fisher Scientific) to check for fragmentation. The samples were then hybridised to the Axiom Chicken Genotyping array (#902148, Thermo Fisher Scientific). Hybridisation, wash, stain and scanning were carried out within the GeneTitan MC Instrument. The resulting. CEL files were loaded into Axiom Analysis Suite (v2.0) for first stage analysis.

GWAS of the F 6 Population to Identify MD Resistance QTL
The F 6 genotypes were used to map QTL that impacted survival following MDV challenge. After applying Affymetrix standard QC the remaining markers were further filtered to remove markers with minor allele frequency ≤ 0.01 and markers with significant deviation from Hardy-Weinberg Equilibrium (p ≤ 0.001). An independent GWAS was then carried out within each of the 5 families of this study, using JMP Genomics SNP-Trait association Trend test (JMP Genomics, Version 9, SAS Institute Inc., Cary, NC, USA, 1989-2019). Survival to the end of the experiment was taken as the Censor Variable; age at death was taken as a survival trait and MHC as a class variable and fixed effect. To obtain experimental significance thresholds, the proportion of false positives (PFP) [39,40] was used to correct for multiple tests.

Identification of QTL and Their Confidence Intervals
Although many marker x family combinations were nominally significant to highly significant (comparison wise p ≤ 0.05 to p ≤ 0.001), very few remained significant after PFP corrections for multiple tests. Nevertheless, visual inspection of the chromosomal Manhattan plots by families showed distinct clusters of markers with high −LogP values intermixed with markers with low −LogP values ( Figure 1). Therefore, following Lipkin et al. [41], we identified QTL by using a moving average of −LogP (mAvg) to smooth the Manhattan plots. We used a window size of~0.1 Mb (27 markers) with a step size of 1 marker and a critical threshold mAvg of −LogP = 2.0 (p = 0.01) to declare significance, and Log drop 1 [42] to define QTLR boundaries.
Many QTLR had overlapping boundaries across two or more families, thus providing replication and increased assurance of significance. Conservatively assuming such overlaps represents the same underlying genetic element, we combined these QTLR within and across families, taking the start and end log-drop boundaries of the QTLR as the first upstream and last downstream marker across all families in the QTLR. Table 1 shows final list of 38 QTLR after consolidation within and across families. Also shown are the number of genes (Ensembl database v79) and non-coding RNAs [43] found in each QTLR.
Genes 2020, 11, x FOR PEER REVIEW 5 of 22 showed distinct clusters of markers with high −LogP values intermixed with markers with low −LogP values ( Figure 1). Therefore, following Lipkin et al. [41], we identified QTL by using a moving average of −LogP (mAvg) to smooth the Manhattan plots. We used a window size of ~0.1 Mb (27 markers) with a step size of 1 marker and a critical threshold mAvg of −LogP = 2.0 (p = 0.01) to declare significance, and Log drop 1 [42] to define QTLR boundaries. Many QTLR had overlapping boundaries across two or more families, thus providing replication and increased assurance of significance. Conservatively assuming such overlaps represents the same underlying genetic element, we combined these QTLR within and across families, taking the start and end log-drop boundaries of the QTLR as the first upstream and last downstream marker across all families in the QTLR. Table 1 shows final list of 38 QTLR after consolidation within and across families. Also shown are the number of genes (Ensembl database v79) and non-coding RNAs [43] found in each QTLR.    QTLR-ordinal number of the QTLR after consolidation within and across families (see Section 2.6); Chr-chromosome; Start/End-Galgal4 QTLR coordinates of the first and last SNP in the QTLR; Length-Size of the QTLR (bp); Genes and lncRNA-number of genes (Ensembl database v79) and non-coding RNAs [43] found in the QTLR.

MDV Challenge Experiment
An experiment measuring host response to MDV viral strain 686 challenge (based on gene expression measured by a whole genome RNA-seq assay) was carried out with 10 male and 10 female birds from HL W36 commercial production hybrids. The parental lines used to produce the W36 commercial hybrid are closely related to the lines used to develop the F 6 experimental cross. Previously it was reported that males are more resistant to MD than females [44][45][46]. On this basis, we compared differentially expressed genes (DEGs) of males and females as surrogate for comparison of chicks from more-and less-resistant lines. Hence, gender balanced groups were used for the RNA-seq challenge experiment, with 5 males + 5 females in each of the challenged and control groups. This allowed us to identify viral response genes in each sex. Thus, this experiment included 4 comparisons: Females-challenged v control; Males-challenged v control; Controls-males v females; Challenged-males v females. Although some of the DEGs in the cross-sex comparisons will be due to sex rather than response to MDV, others can be expected to reveal some of the genes behind the host immune responses and potential differential susceptibility.
To reflect infection in the field, ten 6-day old HL W36 commercial birds (5 males + 5 females) were infected with 500 pfu of the very virulent plus (vv+) MDV strain 686, by subcutaneous injection in the neck (virus kindly provided by the USDA/Avian Disease and Oncology Lab, East Lansing, MI, USA). All birds were vaccinated at 1 day of age with HVT/SB1 vaccine. Spleen tissues from these 10 challenged and from 10 unchallenged control birds were harvested at 4 dpi, flash frozen in liquid nitrogen and stored at −80 • C for subsequent RNA preparation and RNA-Seq analysis. All birds were housed together, with challenged and control chicks separated by 2.44 m.

RNA Preparation
RNA was prepared from the 4-dpi flash-frozen spleen tissues of each of the above 20 chicks after homogenization in Qiazol reagent (Qiagen, Manchester, UK), and subsequent preparation using RNeasy RNA isolation kit (Qiagen, Manchester, UK) as per the manufacturer's guidelines. RNA was resuspended in dH 2 O.

Selection of Candidate Genomic Elements Underlying the 38 QTLR for Further Testing and Validation in the 8 HL Elite Pure Lines
In principal, a random selection of markers from a QTLR region could be used for testing in the 8 elite-sire validation populations. However, we decided to take a step forward, and search the QTLR for attractive candidate elements that could tested in the validation populations. The assumption being that for a marker to show association across a number of populations, it must be in high LD with the causative mutation in all of these populations. For this to hold, the marker must be very close to the causative mutation, which is what we would expect for markers taken from the quantitative trait element itself.

Identification of Candidate Genes within QTLR
The BioMart data mining tool within the Ensembl database (release 79) was used to identify genes lying under the 38 QTLR identified from the F 6 data ( Table 1). This information was then compared with the gene expression data (from RNA-Seq analysis of the MDV challenge experiment) in order to identify potential candidate genes for MD resistance. The DAVID analysis tool (v6.8) (https://david. ncifcrf.gov/home.jsp) identified gene ontology (GO) terms associated with DEGs under each QTLR. Following this, Ingenuity Pathway Analysis software (IPA v2.3) (https://www.qiagenbioinformatics. com/products/ingenuity-pathway-analysis/) revealed which canonical pathways were perturbed by MDV infection in the host, and allowed us to analyze the gene interaction networks involved in the host response. Likely upstream regulators of differentially regulated genes were identified and DEGs in male and female birds were compared.

Selection of Genetic Markers within QTLR for Genotyping
Markers located in genomic elements underlying each QTLR (genes, miRNAs, lncRNAs, and potentially functional mutations) were identified from genome sequences aligned to the reference chicken genome for each of the 8 elite lines utilized in this study. SNPs predicted to alter amino acids were preferentially chosen, but variants were also selected to encompass the entire gene where possible. All SNPs selected were validated using 200 samples from cohorts separated by 13-15 generations from each line to confirm that the marker was polymorphic (i.e., identified both homozygote and heterozygote individuals), and to define the minimum number of markers required to define a haplotype within each genomic element (gene, miRNA, or lncRNA) in each line. This minimum number of SNPs was then used to identify the haplotype status of each individual bird. The elite lines were selected for MD resistance, but the selection was at a low level, as selection emphasis was for production-related traits including egg numbers, shell quality and feed efficiency. If SNPs were homozygous in both generations, then the SNP composition was assumed monomorphic for the intervening and subsequent generations.

Selection of Candidate Coding Genes
Twenty candidate coding genes ( Table 2) were selected for further study in the 8 elite sire lines based on (i) position under a QTLR defined by this or previous genetic studies, (ii) known biology of the gene (e.g., involvement in innate immunity, cell death, or cancer), (iii) polymorphisms (presence of variants), (iv) differential expression between challenged and control, or more and less resistant birds-either in this or previous studies (e.g., Smith et al. 2011 [18]), and (v) allele specific expression [50]. Specific details for the detection assays are provided in Supplementary Table S1.

Selection of Candidate Non-Coding RNAs
With many causal variants for quantitative traits known to reside within regulatory elements [51], we deemed it important to examine non-coding transcripts as well. In contrast to the candidate genes, these were selected more or less on the basis of position underlying QTLR alone, with no information on potential biological function.
Micro RNAs (miRNAs): A total of 46 miRNAs were found underlying the QTLR. Sequence information was used to identify markers segregating in HL lines. For some miRNAs, polymorphisms were not found, while for others multiple polymorphic sites were observed. Since most of the miRNAs are very short, haplotype information could be derived only for some of them. Ultimately, 17 miRNAs were chosen for further association analysis (Table 3). Detection assays for 30 SNPs lying within these 17 transcripts were developed (Supplementary Table S1).
Long non-coding RNAs (lncRNAs): With a view to testing some lncRNAs potentially associated with MDV resistance, five small QTLR were chosen which harboured only a few (1-6) lncRNAs. From these RNAs, up to 5 lncRNAs per QTLR were tested for association with MD. Thus, there was a relatively high likelihood that one of these would be the genomic element housing the causative mutation. From among these, a further 17 transcripts were chosen for further association analysis (Table 4). Two lncRNAs overlapped (labelled with negative distance). Detection assays for 80 SNPs across these 17 transcripts were developed (Supplementary Table S1).

Selection of Potentially Functional Variants in Coding Genes
Analysis of the complete sequences of the 10 F 0 birds showed multiple SNPs distinguishing the founder birds that could impact gene function. Alternative variants fixed in one parental line compared to the other and which were predicted to cause highly deleterious changes in the gene protein product, were chosen as candidates for further study. Across the genome as a whole, a total of 252 variants of this nature were identified in 217 genes, with 22 of these variants lying within genes underlying the 38 QTLR. Twelve of these variants (representing 9 genes) were segregating in one or more of the 8 elite lines and were chosen for further studies in these lines. These 12 SNPs are summarized in Table 5. Specific details on the detection assays are provided in Supplementary Table S1. Markers located in the genomic elements underlying the QTLR (genes, miRNAs, lncRNAs, and singleton variants with potentially functional effects), were identified from genome sequences aligned to the reference chicken genome for each of the lines utilized in this study. For the candidate genes underlying the QTLR, SNPs predicted to alter amino acids were preferentially chosen. For all genomic elements (not including the singleton functional mutations), variants were also selected to encompass the entire gene when possible, so as to define the haplotype status of each bird for subsequent association tests. All SNPs selected were validated using the same approach as defined above.

Genotyping of the Eight Elite Lines
Genomic DNA was extracted from whole blood using salt/ethanol precipitation and stored at −20 • C until use. DNA was diluted in dH 2 O to 25 ng/uL. Genotyping was done as single-plex assays on the SNPline system (LGC, Hoddesdon, UK). All assays used KASP chemistry, which is a fluorescent-based competitive allele-specific detection method [52]. Genotyping was done in 1536 well plates, and results analysed with Kraken software (LGC, Hoddesdon, UK) as previously described [53]. Specific details on primers for each assay are provided in Table S1.

QTLR Association Analysis
Association of markers and marker-haplotypes with sires progeny-tested for post-challenge survival in the 8 HL elite lines was tested using JMP Genomics SNP-Trait association Trend test (JMP Genomics, Version 9, SAS Institute Inc., Cary, NC, USA, 1989-201). When analysing a line-marker combination, the generation and MHC background were taken as class variables and fixed effects, and when analysing a marker combining all lines together, the line was added to the fixed effects. To obtain experimental significance thresholds, the proportion of false positives (PFP) [39,40] was used to correct for multiple tests.

High Resolution MD QTL Mapping in Advanced Full-Sib Inter-Cross F 6 Families
Following genotyping of the F 6 families by the Affymetrix 600K HD SNP chip, 40-43% of SNPs (depending on family) were found to be informative and passed QC (total, 238,777 to 259,098 informative SNPs per family; average of 246,400 SNPs in a family). The distribution of the marker p-values was fairly uniform over the range of allele frequencies. A clear excess of small p-values (an indication of presence of true QTL effects) was not seen. In accordance with this, only in families 2, 4 and 5, were markers significant at PFP ≤ 0.2. However, n 1 , the estimated number of falsified null hypotheses at PFP ≤ 0.2 [39,40], ranged from 1718 in Family 1 (0.7% of all markers tested in the family) to 17,385 in Family 2 (6.7% of markers tested), indicating the actual presence of true QTL effects.
As described in Methods, many markers were nominally significant, and distinct clusters of nominally significant markers intermixed with non-significant markers were clearly seen. Hence, to identify QTL we used a moving average of marker −Log 10 p values (mAvg) across a window size of 0.1 Mb (27 SNPs) to smooth the plots. Examining Manhattan plots (Figure 1 and Supplementary Figure S1), showed that the smoothing did yield a good monotonic plot, allowing identification of QTL and the use of the drop method [42] to define their boundaries.
Using a threshold of mAvg ≥ 2.0 and QTLR boundaries defined by Log Drop 1, a total of 57 regions meeting our criteria for significance were identified, ranging from 4 to 16 in a family. In some families, several regions clustered within 1 Mb of one another. Nine such chromosome-by-family clusters were identified, containing 2-4 significant sub-regions per cluster. Conservatively, we counted each cluster as representing a single within-family QTLR. Similarly, aligning QTLR maps of the five families, showed that six significant regions overlapped or were within 1 Mb of one another across two families. Here too, counting the same QTLR across different families only once, resulted in a final total of 38 QTLR being identified (Table 1). These 38 QTLR were the focus of further examination of variants as defined by multiple methodologies employed below.

Annotation of QTLR
Transcripts of genes located under the 38 mapped QTLR were identified by using the Biomart tool within the Ensembl database (v79) (Supplementary Table S2). The number of known annotated genes in each QTLR ranged from 0 to 86, for a total of 1072 genes. Using the annotation developed and described by Kuo et al. [43], which was based on characterisation of full-length transcripts using long-read sequencing and a bioinformatics pipeline to define coding and non-coding RNAs, we identified 1-99 lncRNAs within each defined QTLR, with a total 1153 lncRNAs (Table 1).

Identification and Analysis of SNPs in F 0 Parental Birds
To investigate variants residing in regions of the genome potentially associated with MDV resistance, the 10 founder parents that gave rise to the F 6 population used in this study were sequenced and variants determined between males and females. Combined across both lines 8,273,112 SNPs were identified. Examining the 38 QTLR within the genome highlighted 133,394 SNPs in these regions, which were fixed for alternative SNP variants with one parental line compared to the other (Supplementary Table S3). With a view to testing for causality, any of these variants could then be tested for genetic association with the progeny-test for survival in the eight elite HL lines.

Transcriptomic Analysis of Male and Female Commercial Birds Challenged with MDV
Based on the observation noted above, that male and female birds tend to show differential resistance, we examined the expression profiles of five male and five female W36 birds challenged with vv + MDV, and five male and five female W36 non-challenged control birds (see Section 2.7). Gene expression results from this transcriptomic analysis are presented in Table S4.
First, we compared gene expression in challenged and control chicks within each sex of W36 birds to find that 185 genes (58 up-and 127 down-regulated) were differentially expressed in male chicks, while there were 114 genes (62 up-and 52 down-regulated) differentially expressed upon challenge of the female birds. When these genes are compared across sex, the individual responses are seen to be quite different, with minimal overlap of DEGs between the male and female groups ( Figure 2). chicks, while there were 114 genes (62 up-and 52 down-regulated) differentially expressed upon challenge of the female birds. When these genes are compared across sex, the individual responses are seen to be quite different, with minimal overlap of DEGs between the male and female groups ( Figure 2). When the different responses were compared, it is seen that biological pathways predominantly involved in the host response in female chicks include activation of cancer signalling, whereas the DEGs identified in the male birds have roles in the Th1 immune response and dendritic cell maturation amongst others ( Figure 3A). The diseases and biofunctions associated with the genes involved in each response are depicted in Figure 3B. Strikingly, with MD being an oncogenic disease, functions related to cancer and tumorigenesis are seen to be up-regulated in female birds, but are repressed in the male chicks. Investigation of genes acting as upstream regulators of identified DEGs again indicates differential regulatory mechanisms in each sex. TGFB appears to be a key regulator of mechanisms in females, whereas STAT1 (an interferon response gene) is suggested to have a more prominent role in the males ( Figure 3C). When the different responses were compared, it is seen that biological pathways predominantly involved in the host response in female chicks include activation of cancer signalling, whereas the DEGs identified in the male birds have roles in the Th1 immune response and dendritic cell maturation amongst others ( Figure 3A). The diseases and biofunctions associated with the genes involved in each response are depicted in Figure 3B. Strikingly, with MD being an oncogenic disease, functions related to cancer and tumorigenesis are seen to be up-regulated in female birds, but are repressed in the male chicks. Investigation of genes acting as upstream regulators of identified DEGs again indicates differential regulatory mechanisms in each sex. TGFB appears to be a key regulator of mechanisms in females, whereas STAT1 (an interferon response gene) is suggested to have a more prominent role in the males ( Figure 3C).  When the different responses were compared, it is seen that biological pathways predominantly involved in the host response in female chicks include activation of cancer signalling, whereas the DEGs identified in the male birds have roles in the Th1 immune response and dendritic cell maturation amongst others ( Figure 3A). The diseases and biofunctions associated with the genes involved in each response are depicted in Figure 3B. Strikingly, with MD being an oncogenic disease, functions related to cancer and tumorigenesis are seen to be up-regulated in female birds, but are repressed in the male chicks. Investigation of genes acting as upstream regulators of identified DEGs again indicates differential regulatory mechanisms in each sex. TGFB appears to be a key regulator of mechanisms in females, whereas STAT1 (an interferon response gene) is suggested to have a more prominent role in the males ( Figure 3C). Examination of the unchallenged control birds from each group (male and female W36 birds) identified 302 genes which were expressed in an inherently different manner between the sexes. So, these DEGs, being independent of MD challenge, are due to gender differences and/or intrinsic immune differences, manifesting in differing resistance to infection. Significant DEGs include the obvious immune related genes, but also various serine proteases and carboxypeptidases, which may suggest a role in resistance to MDV. Genes previously implicated in MDV resistance including MHC genes, IRG1 [18] and SCYC1 [16], were also highlighted.

Analysis of Differentially Expressed Genes Located in QTLR
When DEGs are located within QTLR affecting MD resistance, it is plausible that these genes may be the causative quantitative trait gene (QTG) responsible for the QTLR effect. Based on this hypothesis, the list of genes located in the 38 QTLR identified in this study was compared to the list of DEGs. Genes that were both expressed under challenge and that were also located under one of the 38 mapped QTL, were considered likely candidate QTG (Supplementary Table S5).
To determine if any particular types of genes were represented by the DEGs underlying significant genomic regions (identified either from this or previous studies [31,33,[54][55][56]), the gene ontologies associated with these DEGs were studied. Table S6 shows that immunoglobulins, stimulus response genes and genes involved in regulating the immune response are all highly represented. Analysis of the biological pathways controlled by these genes was also examined. To look for possible common pathways, network analysis was performed. Three networks stood out significantly: cell movement, cell signalling, and cancer ( Figure 4A); cell signalling and nervous system development ( Figure 4B); antimicrobial resistance, inflammation, and cell death ( Figure 4C). When candidate upstream regulators of these genes were investigated, the cytokines TNF, EDN1, IL1B and IFNG were all indicated as potentially regulating many of the genes under study. Interestingly, the TNF gene is located in QTLR 18 on Chr 4,387,071 bp downstream of the top window in this QTLR.

Populations, Genomic Elements and Markers
In a unique aspect of this study, QTLR identified in the F6 mapping population were re-tested for association with MD resistance in 8 pure lines maintained under selection for multiple commercial production traits at HL. Markers for QTLR validation were chosen from among the various classes of genomic elements identified underlying the QTLR, namely genes, miRNAs and lncRNAs. Also

Populations, Genomic Elements and Markers
In a unique aspect of this study, QTLR identified in the F 6 mapping population were re-tested for association with MD resistance in 8 pure lines maintained under selection for multiple commercial production traits at HL. Markers for QTLR validation were chosen from among the various classes of genomic elements identified underlying the QTLR, namely genes, miRNAs and lncRNAs. Also included were a group of singleton coding SNPs predicted to have deleterious effects on protein structure and function (the functional mutations).

Genetic Association Tests
When possible, a number of markers were genotyped in each candidate gene, so that association could be based on marker haplotypes rather than individual markers. When this is done, all of the information is in the 'Haplotypes'. The individual markers are usually in complete LD and hence do not add anything to the analysis. Thus, for statistical analyses what is important are the number of element x line tests, rather than number of marker x line tests. We counted an element x line test as significant if the haplotype was significant, or if one or more markers in the element was significant.
A total of 355 markers located in 66 genetic elements (genes, miRNA, lncRNA, and functional mutations) underlying 26 of the 38 QTLR were tested for association with MD mortality in the 8 elite lines (Supplementary Table S7). In 46 of the elements, multi-marker haplotypes were also tested. The association tests were carried out separately within each of the 8 elite lines, and also in data combined across the 8 lines, yielding 2032 p-values. A total of 387 element by line tests were performed.
Significance of individual candidate genes: Twenty genes located in the QTLR were tested by a total of 127 gene x line tests of markers encompassing the entire gene. A test was considered significant if the haplotype x line test was significant, or if any one of the markers encompassing the haplotype were significant. All but IL4I1, TAGLN and CD79B (in QTLR 32, 35 and 37), were significant in two or more marker x line tests. Thus, 17 genes may be taken as strong candidates to be a QTG (Table S8).
Association of markers located in miRNA: A total of 17 miRNAs were chosen (Table 3), and 28 markers tested for association (Supplementary Table S7) for a total of 96 element x line tests. Only 5 significant to highly significant p-values were obtained. (Supplementary Table S8).
Significance of individual miRNAs: Significant results were found in only five out of the 17 miRNA tested. Most interesting was gg-mir-6553 with the highest proportion of significant tests among the miRNAs (Supplementary Table S8). This miRNA is located in QTLR 5, along with three significant genes (CSTA, LAG3, and C1S). The results indicate two putative causative quantitative elements in this QTLR: one upstream in the region of CSTA, the other downstream in the region of LAG3 and C1S.
Associations of markers located in lncRNAs: A total 17 lncRNAs were chosen (Table 4), and 80 markers were tested for association (Table S7) in a total of 114 element x line tests, of which 9 were significant (Supplementary Table S8).
Significance of individual lncRNAs: Significant results were found with 9 of the 17 lncRNAs tested (Table S8). QTLR worth mentioning are 20 and 21 on Chr5, both tested by lncRNAs only, are detailed in Supplementary File S1.
Associations of QTLR functional mutations in coding genes: Twelve potentially functional mutations were tested in a total of 50 element x line tests, of which six were significant ( Table 5).
Significance of genes harbouring potentially functional mutations: Significant results were found in 5 of the 9 QTLR genes harbouring potentially functional mutations. In Line WPR1, the 2 markers in the gene BG1 (markers MD-29 and MD-30 in QTLR 32 on Chr 16) had practically the same p-values as the markers of the candidate gene TAP1 (Supplementary Table S8), thus strengthening the results of that gene. It should be noted that BG1 and TAP1 lie within the chicken MHC, as noted above, a locus known to impact MDV resistance.

Summary of Element x Line Tests
Supplementary Table S9 summarises the element x line tests by type of element. Tests of the coding genes are outstanding in the high proportion of significant element x line tests; more than double that of the other genomic elements. Since the candidate genes were chosen on the basis of much more information than the other classes of elements, the high proportion of significant tests can be taken to support the proposition that the process by which the candidate genes were selected was effective and that an appreciable fraction of the chosen candidate genes are the actual quantitative trait genes. This must be qualified, however, as the distribution of the different classes of genomic elements among the QTLRs was not random, and thus may be biasing the results.

Validation of QTLR
A total of 26 QTLR were tested for significance of element x line tests. Of these, eight did not have any significant element x line tests, five had one significant test and 14 had 2-6 significant element x line tests. Thus, 18 QTLR can be considered as validated. Of the unconfirmed QTLR, QTLRs 1, 3, and 24 were tested by lncRNAs only, QTLRs 12 and 18 by miRNAs only, QTLR 14 by functional mutations only, and QTLR 25 by both miRNA and functional mutations. More markers need to be tested in these QTLR to decide if the lack of significance is a result of lack of informativity (Type 2 error), or if indeed these QTLR are false positives (Type I error). More detail on the association found within each individual QTLR is presented in Supplementary File S1.

Discussion
QTL mapping in an FSIL F 6 population phenotyped for survival in the face of MDV challenge, identified 38 QTLR distributed over 19 autosomes. Use of such a resource allowed for the identification of QTLR at a higher resolution than have been mapped previously, thus allowing for easier identification of potential candidate genes. The mapped QTLR, along with genomic sequences of the F 0 founder individuals, and transcriptomic information from challenged and control birds, has allowed us to identify genes, miRNAs, lncRNAs, and potentially functional mutations located under these QTLR as candidates for association with progeny mortality from Marek's disease. Genetic association studies in multiple elite lines have confirmed the significant effects of most of these candidates on MD. Here we will discuss the potential role of some of the most significant candidates.
Many of the genes we have associated with MD response in this study have biological roles clearly relevant to the pathogenesis of MDV infection. One of the primary targets of the virus are B-cells and genes known to be associated with B-cells, include two of our candidates, CD7 and TLR4. CD7 is a surface antigen found on naive and memory T-cells, as well as defined NK cell progenitors, and provides survival signaling through its tyrosine kinase activity in association with T-cell activation and is involved in T-cell/B-cell interactions. The Toll-like receptor, TLR4 is found on the surface of B-lymphocytes and macrophages and drives the induction of IL-1 β, TNFα, and other pro-inflammatory cytokines implicated as being important in this study.
After initial infection and a period of latency, T-cells become infected. Genes related to T-cell signalling pathways include our MD-associated ADAMTS5, CD7, HAVCR1, LAG3, RELT, TIMD4 and TREML2. ADAMTS5 (ADAM Metallopeptidase with Thrombospondin Type 1 Motif 5) encodes a metalloproteinase that plays an important role in inflammation and cell migration. It also has a critical role in T-lymphocyte migration from draining lymph nodes following viral infection. HAVCR1, Hepatitis A Virus Cellular Receptor 1 (T-Cell Immunoglobulin Mucin Receptor 1), is a receptor for TIMD4. HAVCR1 plays a critical role in regulating immune cell activity, particularly regarding the host response to viral infection, while TIMD4 is a T-cell immunoglobulin involved in regulating T-cell proliferation and lymphotoxin signalling. LAG3 (Lymphocyte-Activation Gene 3) belongs to the immunoglobulin superfamily and acts as an inhibitory receptor on activated T-cells. It negatively regulates the activation, proliferation, and effector function of both CD8 + and CD4 + T-cells as well as mediating immune tolerance. RELT is a member of the TNF-receptor superfamily. It can activate the NF-kappaB pathway and selectively bind TNF receptor-associated factor 1 (TRAF1). This receptor acts via CD3 signalling to stimulate T-cell proliferation, suggesting its regulatory role in the immune response. TREML2 (Triggering Receptor Expressed on Myeloid Cells Like 2) is a cell surface receptor that may play a role in both the innate and adaptive immune responses. It interacts with CD276 on T-cells, enhancing T-cell activation.
Once infection of T-cells has occurred, the disease can then proceed to become oncogenic. Once again, we see that many of our MD associated genes have functions that have been implicated in cancer, including BG1 which encodes an Ig-superfamily type I transmembrane receptor-like protein that contains an immuno-receptor tyrosine-based inhibition motif (ITIM). BG1 has previously been documented as conferring MHC-associated resistance to MDV-induced lymphoma [57]. Other candidates include BRINP1 (silenced in some bladder cancers), CD7 (associated with leukaemia), CSTA (encodes a stefin that functions as a cysteine protease inhibitor, suggested as a prognostic tool for cancer), and SUPT20H (a known tumour rejection antigen). FLT3 is another significant candidate, with mutations in this gene being common in acute myeloid leukaemia. FLT3 is involved in activation of various pathways including apoptosis, and proliferation and differentiation of hematopoietic cells.
One of the main pathologies of MD is its effect on the nervous system, and so it is interesting to see that some of our MD associated genes are involved in the function/growth of neurons (SCN4A and CTNNA2). SCN4A (Sodium Voltage-Gated Channel α Subunit 4) encodes one member of the sodium channel α subunit gene family involved in generation and propagation of action potentials in neurons and muscle. CTNNA2 (Catenin α 2) is thought to be involved in the regulation of cell-cell adhesion and differentiation in the nervous system. It is required for proper regulation of cortical neuronal migration and neurite growth.
The remaining MD associated genes are seen to have general roles as immune system genes: C1S, TAP1 and SOCS1. C1S (Complement component 1S) encodes a serine protease component of the complement system which enhances the host antibody immune response. TAP1 (Transporter 1, ATP Binding Cassette Subfamily B Member) is involved in the transport of antigens from the cytoplasm to the endoplasmic reticulum for association with MHC class I molecules. SOCS1 (Suppressor of Cytokine Signalling 1) encodes a protein which functions downstream of cytokine receptors, and takes part in a negative feedback loop to attenuate cytokine signaling via the JAK/STAT3 pathway. All of these candidate genes had more than one test significant at p ≤ 0.05.
We would hypothesize that many of our candidate genes are working in conjunction with each other to confer the resistance phenotype, whether that be through their proximity in the genome (e.g., CSTA, LAG3 and C1S on Chr5 and HAVCR1 and TIMD4 on Chr13) or their interaction in biological networks (e.g., SOCS1, TLR4 and TREML2 or RELT, LAG3, HAVCR1 and TIMD4). This would support the idea that some traits may be associated with pathway-level interactions as opposed to discrete gene functions [58,59].
Examination of these genes and their significance of association with MDV resistance across the elite lines indicates a few top candidates, namely: the cluster of genes in QTLR5 (CSTA, C1S and LAG3), FLT3 in QTLR10, CTNNA2 in QTLR19 and TAP1 in QTLR32. It is also interesting to note the distribution of associated genes across the different lines of birds. Examination of the coding candidate genes shows that only two genes, FLT3 and CTNNA2, are significant in all three varieties of birds. The genes ADAMTS5 and TAP1 only showed significant association in Plymouth Rock birds, whereas CSTA, LAG3, C1S, SUPT20H and the functional variant in SCN4A identified as significant candidates in White Leghorn and Rhode Island Red breeds. Finally, several genes are only significant in White Leghorns, namely RELT, TRANK1, HAVCR1, TIMD4, SOCS1, TLR4, CD7, TREML2, along with the functional variant in ENSGALG00000003188.
Genes identified in this analysis include many novel candidates for resistance as well as highlighting genes proposed in previous studies. For example CD8B (T-cell glycoprotein), CTLA4 (immunoglobulin) and CD72 (B-cell associated) are postulated as important lncRNA target genes by You et al. [24] and are found under our QTLRs (CD8B-QTLR19), and differentially expressed in our transcriptomic work (CTLA4 and CD72). Similarly, ATF2 (involved in carcinogenesis is found in QTLR25) was proposed as an important target for the miRNA gga-mir15b during MDV infection [25]. Also in QTLR25 we find gga-mir-10b, previously seen to be upregulated in the spleen during MDV infection [27]. Other potentially interesting miRNA targets include PBEF1 (pre-B-cell enhancing factor) and FCHSD2 (involved in endocytosis) [26] that lie under QTLR2 and 11, respectively. Further genes previously linked with MDV resistance include GH1 (growth hormone) and CD79B (B-cell antigen), both of which lie under our QTLR37.
One of the significant aspects of this research is that it utilized large, commercial production relevant lines, and the challenge virus is a very virulent ++ strain, frequently encountered by production birds in the field. In contrast, most previously published MDV resistance research utilizes specialized research lines, many of which are inbred, and selected for differential response to MDV. These studies utilized laboratory strains of the virus, for which commercial production birds now appear to be resistant. Furthermore, this study investigated MD resistance genes in three distinct breeds of chickens, namely White Leghorn, White Plymouth Rock, and Rhode Island Red, not just one laboratory line. Moreover, these MD association studies replicated the results from the FSIL study increasing our confidence in the causal nature of the QTLR, and possibly the genes and variants in MDV resistance. The response to the virus was measured as mortality in a large progeny group (approximately 30 daughters) for over 9000 sires, using pre-existing information that had been developed within a commercially relevant production trait breeding program. This unique approach increases the relevance of the results to application into a commercial breeding program, while simultaneously provides information on the underlying mechanism of general viral resistance applicable to not only birds, but also other species. This information can provide insights into mechanisms for improving resistance or lead to the development of improved commercial vaccines.

Conclusions
Utilizing an FSIL F 6 population of birds phenotyped for response to Marek's disease virus infection has allowed us to map QTLR for disease resistance at high-resolution. Combining this with expression data from challenged and control birds, we have identified candidate genes, miRNAs, lncRNAs and potentially functional mutations which have been validated in genetic association tests with MD mortality in diverse, commercially relevant, elite lines of poultry. This most comprehensive genetic study to date supplies us with variants in candidate genes that can now go on to be functionally tested for their utility in marker assisted selection, improved vaccine development, and potential future gene editing strategies.