Article

BTN2A1 and BTN3A1 as Novel Coeliac Disease Risk Loci: An In Silico Analysis

1 Department of Pathology, University of Cambridge, Cambridge CB2 1TN, UK
2 Medical Research Council Biostatistics Unit, University of Cambridge, Cambridge CB2 1TN, UK
3 Center for Immunity and Immunotherapies, Seattle Children’s Research Institute, Seattle, WA 98105, USA
4 Division of Infection and Immunity, School of Medicine, Cardiff University, Cardiff CF14 4YS, UK
5 Department of Health Data Science, University of Liverpool, Liverpool L69 7ZX, UK
6 The Kennedy Institute of Rheumatology, University of Oxford, Oxford OX1 2JD, UK
7 Nonacus Ltd., Quinton Business Park, Birmingham B32 1AF, UK
* Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2025, 26(21), 10697; https://doi.org/10.3390/ijms262110697
Submission received: 20 September 2025 / Revised: 18 October 2025 / Accepted: 29 October 2025 / Published: 3 November 2025

Abstract

Coeliac disease (CeD) is a gastrointestinal enteropathy triggered by the consumption of gluten in predisposed individuals. A recent study showed that individuals were at more than 10% risk of having CeD if a first-degree relative also had the disease. However, only around 50% of CeD genetic heritability is attributable to specific loci, with the majority of this heritable risk attributed to the HLA loci, while the remaining 50% of disease risk is currently unidentified. We investigated the butyrophilin family of immunomodulators as novel CeD risk loci. We sequenced the butyrophilin loci of 48 CeD and 46 control patients and carried out gene-based burden testing on the captured single-nucleotide polymorphisms (SNPs). We found a significantly increased BTN2A1 gene burden in CeD patients. To validate these results, the SNP data of 3094 CeD patients and 29,762 control participants from the UK Biobank database were subjected to single-variant analyses. Fourteen BTN2A1, ten BTN3A1, and thirteen BTN3A2 SNPs were significantly associated with CeD status. These results are interesting, as BTN2A1 and BTN3A2 have not been associated with CeD risk previously but are known to modulate the activation of Vγ9+ γδ T cells and NK cells. Twenty of the 37 SNPs above were associated with CeD status independent of the risk-associated HLA genotypes. All twenty of these SNPs, alongside a novel SNP not included in the above SNPs, were associated with CeD in HLA-DQ2.5-matched case-control groups. We reaffirm the association of the BTN3A2 locus with CeD risk and identify BTN2A1 and BTN3A1 as putative novel CeD risk loci.

1. Introduction

1.1. Background to Coeliac Disease

Coeliac disease (CeD) is a T cell-mediated autoimmune enteropathy triggered by the consumption of gluten, a protein found in wheat, rye, and barley [1]. During active CeD, individuals with underlying genetic risk suffer from small intestinal inflammation after the consumption of dietary gluten [2]. This chronic inflammation causes villous atrophy that can lead to symptoms including abdominal pain, diarrhoea, malabsorption, and malnutrition [3]. Currently, the only treatment for CeD is eliminating gluten from the diet of patients with CeD predisposition [4].
The genetic background of CeD predisposition is still not fully understood, as only 50% of the genetic heritability has been attributed to specific loci [1]. The most well-established CeD risk loci are the human leukocyte antigen (HLA) complex [1,5,6,7,8,9,10]. The HLA-DQ2.5, HLA-DQ2.2, and HLA-DQ8 heterodimers are present in more than 80% of CeD patients [11,12,13,14,15]. In contrast, about 20–30% of healthy controls have the CeD-associated risk HLA genotypes [11,12,16]. These HLA genotypes were estimated to explain about 30–40% of the total CeD genetic heritability [17,18]. Although these HLA genotypes greatly contribute to CeD predisposition, non-HLA loci are increasingly becoming regions of interest in exploring the remaining 50% of CeD heritability. In order to further understand CeD susceptibility, genes involved in immunoregulatory pathways must be examined, such as the butyrophilin family of immunomodulators. Recent evidence has shown the butyrophilin family genes to be non-HLA CeD risk loci of interest [19,20,21].

1.2. The Emerging Role of the Butyrophilin Family of Genes and Their Role in Maintaining γδ T Cells

The butyrophilin proteins are a family of immunoglobulin-like cell surface receptors that have been shown to regulate both innate and adaptive immunity, including the activity of dendritic cells (DC), natural killer (NK) cells, αβ T cells, and γδ T cells [22,23,24,25,26]. Members of the butyrophilin family were found to maintain local γδ T cell compartments in the blood and epithelia of both mice and humans (Table 1) [27,28,29,30,31,32,33]. Hayday and Vantourout [34] hypothesised that butyrophilin proteins serve as a steady-state signal that maintains the local γδ T cell population in a quiescent or inactive state. In the duodenum, the BTNL3/BTNL8 heterodimers act as the ligand for Vγ4+/Vδ1+ γδ intraepithelial lymphocytes (IELs) (Figure 1) [21,27,28]. Specifically, the BTNL3/BTNL8 heterodimer binds the germline-encoded hypervariable region 4 (HV4) of T cell receptor gamma (TCR-γ), when the variable (V) gene segment encoding that TCR-γ is the TRGV4 gene.
During active CeD, these γδ T cells, alongside CD4+ and CD8+ αβ IELs, are activated by dietary gluten [40]. Mayassi et al. [21] showed the loss of interaction between BTNL3/BTNL8 heterodimers and the duodenal Vγ4+ γδ T cells as a characteristic of active CeD in a study of 62 active CeD, 57 gluten-free diet (GFD)-treated CeD, and 99 control participants. During chronic inflammation induced by dietary gluten, the expression of the BTNL3/BTNL8 heterodimer was lost in the small intestine of patients with CeD predisposition. This was accompanied by the permanent loss of BTNL3/BTNL8-reactive Vγ4+/Vδ1+ γδ T cells. The chronic inflammation only subsided when patients followed a GFD. Although the BTNL3/BTNL8 expression recovered, the local γδ TCR repertoire was permanently reshaped: the innate-like Vγ4+/Vδ1+ γδ T cells and T cell receptor γ variable region 4 (TRGV4) gene transcripts were significantly decreased [21].

1.3. A Hypothesis for the Role of Butyrophilin Variation and γδ T Cells in CeD Risk

Recently, a common BTNL8*BTNL3 deletion copy number variant (CNV) was described by Aigner et al. [41] in a cohort of more than 4000 samples (Appendix A). The study reported that 58.4% of their 346 samples of European ancestry had at least one BTNL8*BTNL3 deletion allele (Table A1). This CNV has been shown to encode a BTNL8*3 fusion protein, which likely has an impaired ability to bind to the Vγ4Vδ1+ T cells in the small intestine [31]. As Mayassi et al. [21] observed a permanent shift in the duodenal γδ TCR repertoire when the interaction between the T cells and the BTNL3/BTNL8 heterodimer was disrupted, this fusion protein could predispose carriers to CeD.
Alongside BTNL3 and BTNL8, BTNL2 and BTN3A1 were also implicated in CeD risk. Goudey et al. [19] have identified 14 SNPs associated with CeD, independent of the known CeD risk HLA loci, in a study of 763 CeD and 1420 control samples. One of the SNPs was located in the proximity of BTNL2, a gene harbouring among the highest density of GWAS hits in autoimmune and inflammatory diseases from the butyrophilin family [42,43,44,45,46,47]. Goudey et al. [19] showed that this SNP was marked as being an expression quantitative trait locus (eQTL) for the BTNL2 gene in RegulomeDB, a database annotating the function of non-coding SNPs [48,49]. Furthermore, RegulomeDB also reported a high level of evidence for transcription factor binding for this eQTL [19].
In a separate paediatric study of 26 active CeD, 5 treated CeD, and 25 control subjects, BTN3A1 expression was associated with active CeD in children [20]. The study examined the differential expression of more than 25 defence-related genes in the three subject groups, demonstrating the upregulation of BTN3A1 mRNA and protein expression in the intestinal epithelial cells of children with active CeD. This is an intriguing finding, as BTN3A1 is required for the phosphoantigen (pAg)-induced activation of Vγ9Vδ2+ T cells in peripheral blood, a subset of γδ T cells not previously implicated in CeD. These two studies indicate that the full functions and roles of the butyrophilin family of proteins in immunomodulation remain to be explored.
These findings raise a previously unexplored question about CeD heritability. Do certain individuals co-inherit polymorphisms in their butyrophilin family genes and/or their TRGV4 gene segments that predispose them to CeD? The objective of this study was to assess the association of butyrophilin gene-based burden with CeD risk using a 94-patient discovery cohort. The association of butyrophilin SNPs with CeD predisposition was then validated in the UK Biobank’s genome-wide genotyping dataset, using a cohort of 32,856 participants (3094 CeD patients and 29,762 controls). In this study, we show that 14 BTN2A1, 10 BTN3A1, and 13 BTN3A2 SNPs are significantly associated with CeD status, while HV4 sequence variation was not associated with CeD risk.

2. Results

The impact of genetic variation in the butyrophilin family of genes and the HV4 sequence of duodenal γδ T cells on CeD predisposition was examined in three studies (Figure 2). First, 48 CeD and 46 control samples were subjected to targeted sequencing to capture SNPs in 10 butyrophilin family genes known to be expressed in small intestinal tissues and immune cells. The butyrophilin variants sequenced in the CeD samples were burden tested against those in the control samples. Next, these results were validated by subjecting all available BTN2A1, BTN3A1, and BTN3A2 SNPs to single-variant testing in a cohort of 3094 CeD and 29,762 control participants from the UK Biobank genome-wide genotyping database. Finally, targeted sequencing of the TRGV4-HV4 sequence was undertaken in 141 CeD and 238 control samples, to investigate the association between TRGV4-HV4 variation and CeD risk.

2.1. BTN2A1 SNPs Were Significantly Associated with CeD Risk in a Study of 94 Samples

To investigate the association between butyrophilin genes and CeD risk, a cohort of 48 CeD and 46 control patients was examined for SNPs in 10 butyrophilin family genes, selected based on their gene expression profile in the duodenum, small intestines, and immune cells (Table A2 and Table A3) and their role in immunomodulation: BTN2A1, BTN2A2, BTN3A1, BTN3A2, BTN3A3, BTNL2, BTNL3, BTNL8, ERMAP, and MOG.

2.1.1. Risk-Associated HLA Genotypes Were Significantly More Frequent in CeD Patients

First, by way of data quality control, the HLA genotypes of the samples were examined. In accordance with previous literature, 95.8% (46/48) of the CeD patients, compared with 54.3% (25/46) of the control group, had CeD risk-associated HLA genotypes (Fisher’s exact test, p = 5.5 × 10⁻¹⁰) (Table A7, Figure A6) [11,13,50].

2.1.2. The BTNL8*BTNL3 Copy Number Variant Was Not Associated with CeD

Next, the BTNL8-BTNL3 loci were examined for the presence of the deletion CNV. The presence of the CNV was determined using a surrogate SNP, the rs72494581 minor allele known to be associated with the presence of the deletion variant [51]. A total of 58.3% (28/48) of the CeD patients and 47.8% (22/46) of the control participants were found to possess at least one deletion variant (Table A8). Interestingly, 10.9% (5/46) of controls were homozygous for the BTNL8*BTNL3 deletion compared to only 4.2% (2/48) of CeD patients, but this did not reach statistical significance (Table A8, Figure A7, Fisher’s exact test, p = 0.2144).
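As a worked illustration of the test named above, the following minimal sketch implements a two-sided Fisher's exact test for a 2 × 2 table using only the Python standard library. It is applied to the homozygous-deletion counts quoted in the text (2/48 CeD patients, 5/46 controls). The function name is hypothetical, and collapsing the data to "homozygous vs. not homozygous" is an assumption: the text does not state which table layout produced p = 0.2144, so this collapsed table is not expected to reproduce that value.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def weight(k):
        # Unnormalised hypergeometric probability of k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    # Sum every table that is no more probable than the observed one.
    # Integer weights make the comparison exact.
    extreme = sum(weight(k) for k in range(lo, hi + 1) if weight(k) <= observed)
    return extreme / comb(n, col1)

# Rows: CeD, control. Columns: homozygous deletion, not homozygous.
p = fisher_exact_two_sided(2, 46, 5, 41)
print(f"p = {p:.4f}")  # approximately 0.26 for this collapsed table
```

Either way, the collapsed comparison is far from significance, in line with the conclusion drawn in the text.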

2.1.3. BTN2A1 Gene Burden Was Significantly Higher in CeD Patients

To determine whether any of the butyrophilin family variants were associated with CeD risk, gene-based burden testing was carried out using the TRAPD program [52], comparing the non-synonymous coding variants identified in the CeD patients against those in the control samples.
The analysis was carried out on qualifying variants at sites where more than 90% of samples had a read depth coverage of >10. Of the 108 and 58 non-synonymous coding variants discovered in the CeD and control samples, respectively, only 5 bi-allelic SNPs shared by both the CeD and control groups qualified for burden testing (Table 2, Table A9, Table A10 and Table A11). Only BTN2A1 variants were significantly associated with CeD risk gene burden in both the dominant (adjusted p = 1.46 × 10⁻⁵) and the recessive (adjusted p = 3.70 × 10⁻⁸) models, indicating that the presence of a single qualifying BTN2A1 SNP significantly increased CeD risk (Table 2a,b). BTN2A1 variants were more frequent in CeD patients, as 45.8% (22/48) of CeD participants had at least one qualifying BTN2A1 variant compared to 10.9% (5/46) of controls (Table 2a,b). To summarise, the gene burden analysis of butyrophilin genes in CeD patients compared with controls showed a significant association between BTN2A1 gene burden and CeD risk.
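The coverage filter and the two burden models described above can be sketched as follows. This is an illustration under stated assumptions, not the TRAPD implementation: the function names and the input layout are hypothetical, and the dominant and recessive definitions used here (at least one, or at least two, qualifying alternate alleles per individual within a gene) are an assumption about how such models are commonly defined.

```python
def site_qualifies(depths, min_depth=10, min_fraction=0.9):
    """True if more than min_fraction of samples have read depth > min_depth."""
    covered = sum(1 for d in depths if d > min_depth)
    return covered / len(depths) > min_fraction

def carrier_counts(alt_alleles_by_sample):
    """Count carriers under a dominant and a recessive model.

    alt_alleles_by_sample maps a sample ID to a list of alternate-allele
    counts (0, 1 or 2), one entry per qualifying site in the gene.
    """
    dominant = recessive = 0
    for alt_counts in alt_alleles_by_sample.values():
        total_alt = sum(alt_counts)
        if total_alt >= 1:   # at least one qualifying allele
            dominant += 1
        if total_alt >= 2:   # homozygous, or two heterozygous sites
            recessive += 1
    return dominant, recessive

# Invented toy data for four samples and two qualifying sites.
toy = {"s1": [0, 0], "s2": [1, 0], "s3": [1, 1], "s4": [2, 0]}
print(carrier_counts(toy))  # (3, 2)
```

The resulting carrier counts in cases and controls would then be compared with a test such as Fisher's exact test.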
Although these results were promising, the BTN2A1 gene is part of the extended MHC region and lies in close proximity (~4 Mb) to the classical MHC region (6p21.3), so we could not exclude the possibility that this significant association was secondary to the risk-associated HLA genotypes of the CeD patients [54,55]. Therefore, these results were validated in the UK Biobank genome-wide genotyping dataset of approximately 500,000 participants, by single-variant testing of BTN3A1, BTN3A2, BTN2A1, BTNL3, and BTNL8 SNPs.

2.2. BTN3A1, BTN3A2, and BTN2A1 Genes Were Significantly Associated with CeD in HLA-DQ2.5-Matched Participants of the UK Biobank Database

The UK Biobank dataset was used to validate the association between BTN2A1 and CeD risk and to investigate the association between CeD and butyrophilin SNPs in potentially CeD-relevant genes. After removing participants with missing HLA imputation or genotype data, the final cohort consisted of 3094 CeD patients and 29,762 control participants (Appendix H).

2.2.1. Risk-Associated HLA Genotypes Were Significantly More Frequent in CeD Patients of the UK Biobank

First, as a means of quality control for CeD diagnosis, the HLA genotypes of the CeD and control participants of the UK Biobank were examined. The majority of participants selected from the 500,000 genome-wide genotyping dataset had CeD risk HLA genotypes regardless of their CeD status (Table A13, Figure A12). Risk HLA genotypes were found in 92.4% (2860/3094) of CeD patients and 57.6% (17,144/29,762) of controls. In both control and CeD participants, HLA-DQ2.5 was the most frequent HLA genotype at 21.6% (6416/29,762) and 53.4% (1652/3094), respectively. Interestingly, HLA-DQ8 was the second most frequent risk genotype in controls at 14.1% (4203/29,762). Meanwhile, individuals heterozygous for HLA-DQ2.5/HLA-DQ8 were the second most frequent in the CeD group, with 19.6% (606/3094) of participants possessing that risk HLA genotype.
To compare the proportion of CeD risk-associated HLA genotypes in CeD and control participants in the UK Biobank genome-wide genotyping dataset, a chi-square test of independence was used. Similar to the results from the targeted butyrophilin sequencing dataset, the CeD participants had significantly higher proportions of CeD risk HLA genotypes compared with controls (χ² = 4062.5, df = 6, p < 2.2 × 10⁻¹⁶).
Indeed, a binomial regression model fitted to the UK Biobank dataset confirmed the association between the CeD risk HLA genotypes and CeD status. Interestingly, in the regression analysis, all risk HLA genotypes were significantly associated with CeD (adjusted p ≤ 5.13 × 10⁻⁴, Table A14) except the HLA-DQ8 genotype (adjusted p = 0.125).
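For readers who wish to see how such a statistic is formed, the sketch below computes a Pearson chi-square statistic for a contingency table of counts. It is a simplified illustration: the reported test (df = 6) used the full genotype breakdown in Table A13, whereas only the overall risk versus non-risk counts are quoted in the text, so the example collapses the data to a 2 × 2 table (df = 1) and its statistic differs from the reported 4062.5. The function name is hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Rows: CeD, control. Columns: risk HLA genotype, no risk HLA genotype.
table = [[2860, 3094 - 2860], [17144, 29762 - 17144]]
stat, dof = chi_square_statistic(table)
print(f"chi-square = {stat:.1f}, df = {dof}")
```

Even this coarse two-category comparison yields a very large statistic, consistent with the strong association reported.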

2.2.2. BTN2A1, BTN3A1, and BTN3A2 SNPs Were Significantly Associated with CeD Status in the UK Biobank

Single-variant analyses were carried out to test the association between CeD status and SNPs from the BTN2A1, BTN3A1, BTN3A2, BTNL3, and BTNL8 genes in the UK Biobank [56]. Because of the genotyping array used by the UK Biobank, genotype data were available for only a limited number of SNPs from each gene. A total of 101 butyrophilin SNPs were individually tested for association with CeD status in the UK Biobank (Table 3). As the HLA loci were significantly associated with CeD risk [1], and the BTN3A1 and BTN3A2 loci lie in close proximity to them [22,24], the CeD risk HLA genotypes were also taken into account in two ways: by including the risk HLA genotypes in the binomial models, and by analysing the association between butyrophilin SNPs and CeD status in HLA-matched case-control groups. The genetic associations were tested by building binomial regression models, in which the association between each variable and CeD status was examined.
A total of 37 SNPs were significantly associated with CeD status in the UK Biobank: 14 BTN2A1, 10 BTN3A1, and 13 BTN3A2 SNPs (adjusted p ≤ 0.05, Table 4, Table A15 and Table A16). All 37 SNPs were in Hardy–Weinberg equilibrium in the control cohort (Table A17). Most of the significant SNPs were non-coding, with 25 of the 37 SNPs being located in intronic regions. Only one BTN3A1 (rs41266839) and three BTN2A1 (rs13195509, rs3734542, and rs3734543) SNPs were missense variants, and one BTN2A1 (rs13195402) SNP introduced a STOP codon. Of the 37 SNPs, the reference alleles of 30 SNPs were associated with a decreased CeD risk. No BTNL3 or BTNL8 SNPs were significant in predicting CeD status in the UK Biobank dataset after Bonferroni correction.
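The single-variant models and the Bonferroni adjustment described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: it fits a logistic (binomial) regression of CeD status on allele dosage by Newton-Raphson, derives a two-sided Wald p-value, and multiplies it by the 101 SNPs tested. The genotype counts are invented, the function names are hypothetical, and the text does not state the software, genotype coding, or covariates actually used.

```python
import math

N_TESTS = 101  # number of butyrophilin SNPs tested, as stated in the text

def fit_logistic(groups, iterations=25):
    """Fit logit(P(case)) = b0 + b1 * dosage by Newton-Raphson.

    groups: list of (dosage, n_cases, n_controls).
    Returns (b0, b1, standard error of b1).
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # score (gradient of the log-likelihood)
        h00 = h01 = h11 = 0.0    # information matrix X'WX
        for x, cases, controls in groups:
            n = cases + controls
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = cases - n * p
            w = n * p * (1.0 - p)
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta += inverse(information) @ score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard error from the inverse information at the (converged) estimate.
    return b0, b1, math.sqrt(h00 / det)

def wald_p_value(beta, se):
    """Two-sided p-value for beta / se under a standard normal."""
    return math.erfc(abs(beta / se) / math.sqrt(2))

def bonferroni(p, n_tests=N_TESTS):
    return min(1.0, p * n_tests)

# Invented counts: (alternate-allele dosage, cases, controls).
groups = [(0, 10, 90), (1, 30, 70)]
b0, b1, se = fit_logistic(groups)
p = wald_p_value(b1, se)
print(f"log OR = {b1:.3f}, p = {p:.2e}, Bonferroni p = {bonferroni(p):.2e}")
```

With a two-level predictor the fitted coefficient equals the log odds ratio of the 2 × 2 table, which gives a convenient check on the fit. Adding the risk HLA genotype as a further predictor, as in the HLA-adjusted models, would require a multivariable version of the same procedure.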

2.2.3. Twenty Butyrophilin SNPs from the UK Biobank Remained Significantly Associated with CeD Status When the Participants’ Risk HLA Genotypes Were Taken into Account

To investigate whether the butyrophilin SNPs in the UK Biobank remained significantly associated with CeD status when taking the HLA loci into account, a second set of binomial regression models was produced. Single-variant models were built for each of the 101 SNPs of interest, which included the risk HLA genotypes of the UK Biobank participants as an additional predictor variable (Table A18). Only 7 BTN2A1, 2 BTN3A1, and 11 BTN3A2 SNPs remained significantly associated with CeD status after applying Bonferroni correction (adjusted p ≤ 0.05, Table 5). All of the significant SNPs were in Hardy–Weinberg equilibrium in the control cohort (Table A19). Similar to the previous model, the majority of the significant SNPs were non-coding, with the exception of a STOP gained SNP (rs13195402) and three missense SNPs (rs13195509, rs3734542, and rs3734543) in the BTN2A1 gene. Out of the 16 non-coding SNPs, 11 SNPs were located in intronic regions. The reference alleles for all 20 SNPs were associated with a decreased CeD risk, meaning that the alternate alleles were more frequent in CeD patients. As the HLA loci were taken into account, these SNPs are likely to be real associations with CeD status, instead of being caused by linkage disequilibrium (LD) due to the proximity of the BTN and HLA loci on chromosome 6.
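The Hardy–Weinberg check mentioned above can be illustrated with a simple one-degree-of-freedom chi-square test on genotype counts. This is a sketch only: the text does not say whether a chi-square or an exact test was used, the function name is hypothetical, and the genotype counts in the example are invented.

```python
import math

def hwe_chi_square(n_ref_hom, n_het, n_alt_hom):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_ref_hom + n_het + n_alt_hom
    p = (2 * n_ref_hom + n_het) / (2 * n)   # reference allele frequency
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ref_hom, n_het, n_alt_hom)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Invented control genotype counts that sit exactly at equilibrium (p = 0.6).
stat_eq, p_eq = hwe_chi_square(360, 480, 160)
print(f"chi-square = {stat_eq:.3f}, p = {p_eq:.3f}")
```

A large p-value, as in this example, indicates no detectable departure from equilibrium in the control group.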

2.2.4. Twenty-One Butyrophilin SNPs Were Significantly Associated with CeD Status in HLA-DQ2.5-Matched Case-Control Groups of UK Biobank Participants

The final set of analyses was carried out to investigate whether the butyrophilin SNPs were significantly associated with CeD status in all of the CeD risk HLA genotype patients. Therefore, the UK Biobank participants were separated into risk HLA-matched CeD and control groups (Table 6). All 101 butyrophilin SNPs were single-variant tested for their association with CeD status in the HLA-matched groups.
HLA-DQ2.5 was the most common risk HLA genotype in CeD patients, both in the UK Biobank and in previous studies [11,12]. A significant association between butyrophilin SNPs and CeD status was only present in the HLA-DQ2.5-matched UK Biobank participants. The BTN2A1, BTN3A1, and BTN3A2 SNPs significantly associated with CeD status in the HLA single-variant testing models remained significant in the HLA-DQ2.5-matched tests as well (Table 7, Table A20 and Table A21). Interestingly, the allele frequency of all significantly associated SNPs differed significantly from Hardy–Weinberg equilibrium in the control group (Table A22). Additionally, rs7773938, an intronic BTN2A1 SNP, is a novel SNP that was only significantly associated with CeD status in UK Biobank participants with the HLA-DQ2.5 genotype. The reference alleles of all 21 significant SNPs were more frequent in controls compared to CeD individuals, meaning that having the alternate allele at these loci significantly increases an individual’s CeD risk. These results imply that butyrophilin SNPs could only explain additional CeD risk in HLA-DQ2.5-matched individuals of the UK Biobank. As the presence of these reference alleles remained significantly associated with decreased CeD risk even after HLA-matching, the association with these SNPs was unlikely to be caused by LD with the HLA loci. Therefore, the 21 butyrophilin SNPs identified were significantly associated with CeD status and contributed to further CeD risk in UK Biobank participants possessing the HLA-DQ2.5 genotype.

2.3. HV4 Variation Was Not Significantly Associated with CeD Risk in a Study of 379 Samples

2.3.1. TRGV Usage Was Not Significantly Different Between CeD and Control Samples

Previous evidence by our group showed that the γδ T cell repertoire is permanently altered in the duodenum of CeD patients [57]. Mayassi et al. [21] also showed that the BTNL3-reactive duodenal Vγ4+ γδ T cells are lost after active CeD, and the local γδ TCR repertoire is permanently reconfigured.
First, we investigated TRGV usage in the duodenal TRG repertoires of 108 healthy controls and 45 CeD patients (Table 8, Figure A13). The TRGV10, TRGV4, and TRGV2 variable (V) gene segments were the most frequent in this dataset. We focused on the TRGV4 segment usage, which is capable of binding the BTNL3/BTNL8 heterodimer. The mean TRGV4 segment usage did not differ between CeD (18.50% of the TRG repertoire) and healthy control samples (18.06%) (Figure 3, Table A23).

2.3.2. HV4 Sequence Variation Was Not Significantly Associated with CeD Risk

Next, the TRGV4-HV4 amino acid sequences were examined in 141 CeD and 238 healthy control samples (Table 8). As demonstrated by Melandri et al. [29] and Willcox et al. [31], only HV4 loops with the wild-type (reference) KYDTYGSTRKNLRMILR amino acid sequence could directly bind BTNL3. Substitutions in the amino acids underlined (KYDTYGSTRKNLRMILR) were found to disrupt this direct binding between BTNL3 and Vγ4+ T cells, while substitutions in the following underlined amino acids (KYDTYGSTRKNLRMILR) only caused a marginal reduction in binding [31]. As the HV4 is germline-encoded and does not undergo recombination, we hypothesised that variations in the germline-encoded TRGV4-HV4 amino acid sequence could alter the binding of the Vγ4+ γδ T cells to BTNL3 protein in the duodenum, predisposing to CeD.
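Comparing an observed HV4 sequence against the reference can be sketched as below. The reference sequence is taken from the text; the function names and the variant sequence used in the example are hypothetical, and the sketch handles substitutions only.

```python
HV4_REFERENCE = "KYDTYGSTRKNLRMILR"  # wild-type HV4 sequence quoted in the text

def hv4_substitutions(sequence, reference=HV4_REFERENCE):
    """Return (position, reference residue, observed residue) for each mismatch.

    Positions are 1-based. Sequences of a different length are rejected,
    because this sketch does not align insertions or deletions.
    """
    if len(sequence) != len(reference):
        raise ValueError("sequence length differs from the reference")
    return [
        (i, ref, obs)
        for i, (ref, obs) in enumerate(zip(reference, sequence), start=1)
        if ref != obs
    ]

def is_wild_type(sequence):
    return sequence == HV4_REFERENCE

# Invented variant with a single substitution at the last position.
print(hv4_substitutions("KYDTYGSTRKNLRMILK"))  # [(17, 'R', 'K')]
```

Each sample's two alleles could be classified in this way before tabulating wild-type and variant sequences by CeD status.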
Seven unique HV4 amino acid sequences were identified in the dataset (Table 9a,b). The reference HV4 sequence KYDTYGSTRKNLRMILR capable of binding the BTNL3 protein was the most frequent in both the healthy control (95.8%, 228/238) and CeD (97.9%, 138/141) samples. Approximately 84.9% (202/238) of healthy control samples and 82.3% (116/141) of CeD were homozygous for the WT HV4 sequence. There were no significant differences in the HV4 amino acid sequence distribution between CeD and healthy control samples (Fisher’s exact test, p = 0.26, Figure A14, Table A24). Thus, neither TRGV usage nor HV4 amino acid sequence variation could explain CeD risk in a dataset of 379 samples.

3. Discussion

Around 30% of genetic heritability for CeD can be explained by the HLA risk genotypes HLA-DQ2.5, HLA-DQ8, and HLA-DQ2.2, which were first connected to CeD in 1972 [17,18,58,59]. However, an estimated 50% of genetic heritability remains unexplored [60]. Recently, the butyrophilin family of genes were proposed as non-HLA CeD risk loci [19,20,21]. These genes encode transmembrane proteins that were implicated in regulating the activity of innate and adaptive immune cells, alongside maintaining characteristic epithelial γδ T cell populations in mice and humans [22,24]. Prior to this study, four genes were associated with CeD: BTN3A1, BTNL2, BTNL3, and BTNL8 [19,20,21].
Burden testing the non-synonymous coding butyrophilin SNP data of 46 healthy control and 48 CeD samples showed the BTN2A1 gene burden to be significantly higher in CeD patients in both the dominant (adjusted p = 1.46 × 10⁻⁵) and the recessive models (adjusted p = 3.70 × 10⁻⁸). CNV analysis of the BTNL8-BTNL3 region in these samples did not show a significant association with CeD risk.
The significant association between BTN2A1 SNPs and CeD predisposition was validated using the UK Biobank 500,000 genome-wide genotyping dataset. Fourteen BTN2A1, 10 BTN3A1, and 13 BTN3A2 SNPs were significantly associated with CeD (adjusted p ≤ 0.05), the majority of which were non-coding variants. When the risk-associated HLA genotypes of these participants were taken into account, only 7 BTN2A1, 2 BTN3A1, and 11 BTN3A2 SNPs remained significant (adjusted p ≤ 0.05), showing HLA-independent associations with CeD risk. Finally, butyrophilin SNPs were single-variant tested in CeD risk HLA-matched groups. The 20 SNPs above, alongside a novel intronic BTN2A1 SNP, were significant in predicting CeD status in 1652 CeD and 6416 control participants with the HLA-DQ2.5 genotype (adjusted p ≤ 0.05).
We thus identified BTN2A1 and BTN3A2 as novel CeD risk loci and corroborated BTN3A1 as a CeD risk locus. The association between BTN3A1 and CeD is in accordance with evidence shown by Pietz et al. [20], who hypothesised that the pAg presentation by intestinal epithelial cells in active CeD may contribute to IFN-γ production and T cell proliferation. Indeed, all three of these butyrophilin genes are involved in the pAg-mediated, innate-like activation of peripheral blood γδ T cells [37,38,61,62]. Interestingly, the majority of the UK Biobank SNPs significantly associated with CeD predisposition were outside coding regions. Of note, UK Biobank validation indicated that only non-coding BTN3A1 and BTN3A2 SNPs were significantly associated with CeD risk. These results could provide an explanation for these genes not having significantly increased gene burden in CeD patients, as the burden testing only considered coding variants.
Given the association between butyrophilin family members and CeD status, and the involvement of butyrophilin heterodimers in shaping γδ T cell repertoires via binding to Vγ4+ γδ T cells [21,27,28,31], we investigated the likely effects of polymorphisms in the TCR γ V segment, TRGV4, on the interaction between Vγ4+ γδ T cells and the BTNL3/BTNL8 heterodimer. We failed to find any significant association between the TRGV4-HV4 amino acid sequences and CeD risk, which may suggest that the interaction between Vγ4+ γδ T cells and the BTNL3/BTNL8 heterodimer is not a primary event in determining whether or not CeD develops. However, these results could be due to both the coding regions of butyrophilin genes and the HV4 amino acid sequence being conserved via stabilising selection. Such conservation could reflect the importance of the interactions between butyrophilins and γδ T cells, including the BTN3A1- and BTN3A2-mediated pAg-dependent activation of peripheral blood Vγ9+ T cells, and the HV4-BTNL3 interaction, which serves as the maintenance signal for the Vγ4+ γδ T cells in the duodenum [21]. Taking these results together, we provide a new hypothesis for the role of butyrophilins in CeD (Figure 4).
Firstly, these results could imply that BTN2A1 and BTN3A2 act on duodenal Vγ4+ γδ T cells, as well as on peripheral blood Vγ9Vδ2+ γδ T cells, perhaps mediating their pAg-dependent activation (Figure 4a). This hypothesis could explain why the BTNL8*BTNL3 deletion variant, which encodes a BTNL8*3 fusion protein but no full-length BTNL3 or BTNL8 proteins, was not significantly associated with CeD risk in the cohort of 94 samples. Participants who are homozygous for the deletion can only express the truncated BTNL8*3 fusion protein, which lacks the BTNL3-IgV extracellular domain required for maintaining the duodenal TCR of Vγ4+ γδ T cells, which we hypothesised could increase CeD risk [21,29,31,41]. If BTN2A1, BTN3A1, or BTN3A2 could provide a survival signal to the Vγ4Vδ1+ IELs in the healthy small intestine, this could explain why controls could be homozygous for the BTNL3/BTNL8 deletion variant without having CeD.
Secondly, BTN2A1 variants may predispose patients to CeD via BTN2A1’s role as a ligand for DC-SIGN on DCs, which contribute to CeD pathogenesis by presenting gluten antigens to CD4+ αβ T cells [24,25]. Thus, BTN2A1 might regulate the autoimmune response in CeD indirectly via DC activity (Figure 4b). Additionally, previous evidence has shown that BTN3 proteins can provide co-stimulatory signals to αβ T cells, increasing their production of interferon-γ (IFN-γ), a proinflammatory cytokine [26]. This same study showed the dual effect of butyrophilins on NK cell activity: BTN3A1 upregulated, while BTN3A2 downregulated, IFN-γ production and NK cell activation (Figure 4c).
Thirdly, peripheral blood Vγ9Vδ2+ T cells might undergo BTN2A1-mediated pAg-dependent activation in CeD (Figure 4d), either being recruited to infiltrate the small intestine from the peripheral blood or contributing to CeD pathogenesis in an as yet undetermined way. Interestingly, in the analysis of our cohort of 108 healthy control and 45 CeD duodenal samples, only 3–4% of γδ T cells were Vγ9+ T cells (Figure 3, Table 8). There were no significant differences in the proportion of Vγ9+ T cells in CeD and healthy controls (adjusted p = 0.728), a finding which may argue against a key role for Vγ9+ T cells in CeD.
In conclusion, the butyrophilin family of genes are promising immunomodulators involved in connecting adaptive and innate immunity [24]. Our results provide evidence that the butyrophilin genes BTN2A1, BTN3A1, and BTN3A2 are putative CeD risk loci. Due to their important roles in the maintenance, activation, and regulation of γδ T cells, the butyrophilins may be involved in the pathogenesis of other autoimmune and inflammatory disorders. Our work provides a clear rationale for further research into the role of the butyrophilin family of genes in CeD.

4. Materials and Methods

4.1. Participant Selection Criteria

All patient samples used for sequencing were obtained with full ethical approval (IRAS project ID: 162057, REC reference: 04/Q1604/21, PI: Prof. E. Soilleux).
CeD patient samples were selected using hospital records, while control samples were selected to exclude suspected CeD patients.
Control exclusion criteria:
  • Has CeD diagnosis;
  • Malabsorption;
  • Anaemia;
  • Lymphocytosis;
  • On a GFD;
  • Diarrhoea.

4.1.1. Participant Selection for the Butyrophilin Family Gene Sequencing

A total of 48 CeD samples (40 blood, 8 formalin fixed, paraffin-embedded (FFPE) duodenal biopsies) and 46 control samples (38 blood, 8 FFPE duodenal biopsies) were obtained from Cambridge Haematopathology and Oncology Diagnostic Service or Cambridge University Hospitals NHS Foundation Trust Department of Haematology (blood samples) and the Human Research Tissue Bank of Cambridge University Hospitals NHS Foundation Trust (FFPE biopsies).

4.1.2. Validation Cohort Participant Selection from the UK Biobank for Single-Variant Analysis

CeD patients and controls were selected from the anonymised UK Biobank online database using the Cohort Browser program on the online Research Analysis Platform (RAP, https://ukbiobank.dnanexus.com/, application ID: 18532, accessed on 23 May 2022). Participants’ sociodemographic, lifestyle, hospital record information, HLA imputation, and genome-wide genotyping data were available from the UK Biobank online resource centre (https://biobank.ndph.ox.ac.uk/, accessed on 23 May 2022).
Control and CeD participants were selected based on their responses to the CeD online questionnaire (data-field 21068, https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=21086, accessed on 23 May 2022), the dietary web questionnaire (data-field 20086, https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=20086, accessed on 23 May 2022), their hospital inpatient record (category 2000, https://biobank.ctsu.ox.ac.uk/crystal/label.cgi?id=2000, accessed on 23 May 2022), and their death record (category 100093, https://biobank.ctsu.ox.ac.uk/crystal/label.cgi?id=100093, accessed on 23 May 2022). All participant clinical data were classified using the World Health Organisation’s International Classification of Disease (ICD) system [63]. Most of the hospital inpatient data were coded in ICD-10, but some pre-1997 data collected in Scotland used ICD-9 (https://biobank.ndph.ox.ac.uk/ukb/refer.cgi?id=138483, accessed on 23 May 2022).
Control exclusion criteria were the same as for the blood and biopsy cohort, with the CeD online questionnaire, hospital inpatient record, or death record serving as evidence of a CeD diagnosis.
Coeliac disease inclusion criteria included either of the following:
  • Hospital diagnosis record includes coeliac disease: ICD9 (5790), ICD10 (K90.0);
  • Cause of death includes coeliac disease: ICD10 (K90.0).
After removing individuals with missing data, the finalised UK Biobank cohort consisted of 3094 CeD patients and 29,762 control participants.
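The inclusion rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' selection code (which used the UK Biobank Cohort Browser): the function name, argument names, and the assumption that codes are stored exactly as written in the text ("5790", "K90.0") are hypothetical, and real exports may format ICD codes differently.

```python
# Codes taken from the inclusion criteria in the text.
CED_ICD9 = {"5790"}
CED_ICD10 = {"K90.0"}


def is_ced_case(hospital_icd9, hospital_icd10, death_icd10):
    """Return True if any record source carries a coeliac disease code.

    Each argument is an iterable of code strings for one participant
    (hypothetical input layout, for illustration only).
    """
    return bool(
        CED_ICD9 & set(hospital_icd9)
        or CED_ICD10 & set(hospital_icd10)
        or CED_ICD10 & set(death_icd10)
    )


# Hypothetical participants, not study data.
case_by_hospital = is_ced_case([], ["K90.0", "I10"], [])
case_by_death = is_ced_case([], [], ["K90.0"])
not_a_case = is_ced_case(["4019"], ["I10"], [])
```

A participant who satisfies neither criterion is not automatically a control; the control exclusion criteria still apply.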

4.1.3. Samples Selected for the HV4 Analysis

Sequencing data from three datasets, each selected using the same criteria, were used. A total of 141 CeD and 238 healthy control tissue samples were selected for the HV4 analysis (Table 8).

4.2. Analysis of Butyrophilin Family Variation in the Targeted Sequencing Cohort

4.2.1. Sequencing of HLA Loci and Selected Butyrophilin Family Genes by Hybridisation Capture

The expression profiles of the 15 butyrophilin family members outlined by Rhodes et al. [24] were examined in the Human Protein Atlas (HPA, accessed on 27 October 2020) for protein (or, where protein was unavailable, mRNA) expression in the duodenum, small intestine, rectum, and colon (Appendix B, Table A2), as well as mRNA expression in T cells, DCs, NK cells, macrophages, regulatory T cells, and γδ T cells (Table A3) [64]. BTN2A1, BTN2A2, BTN3A1, BTN3A2, BTN3A3, BTNL2, BTNL3, BTNL8, ERMAP, and MOG were selected.
The Genome Reference Consortium Human Build 38 patch release 12 (GRCh38.p12) genomic positions of the 10 butyrophilin genes of interest were determined using the NCBI database [65]. The regions of interest were uploaded to the Nonacus Ltd. probe design platform (panel id: 890, Table A4) [66], and 2× tiling probes were designed to maximise coverage of the target regions while avoiding under- or over-sequencing of any region [67,68]. HLA hybridisation probes were designed and provided by Nonacus Ltd. Hybridisation capture was performed using the Nonacus Cell3 Target Hybridisation & Capture Kit (Nonacus) version (b) protocol (Figure A1, Appendix B.2 and Appendix B.3). Captured libraries were sequenced using the Illumina MiSeq system. Sequencing data obtained are available at https://zenodo.org/records/15203243 (accessed on 12 April 2025).

4.2.2. Germline Short-Variant Discovery and HLA Genotyping

The quality of the sequencing files was assessed using the default FastQC v0.11.9 settings, and the Illumina adapters were removed using Trimmomatic v0.39 [69,70].
The variant call pipeline was built by adapting the GATK best practices for germline short-variant discovery [71], the analysis pipelines of Zhao et al. [72], the Du group [73,74], and Matthews [75] (Appendix C). The code for the pipeline calling SNPs from the raw, unmapped FASTQ sequencing files is available at https://gitlab.developers.cam.ac.uk/path/soilleux/soilleux-group/ced_butyrophilin_phd/-/tree/dropbox/nonacus_miseq_analysis/variant_call (accessed on 19 March 2025).
HLA genotypes were determined from the sequencing data using HLA-HD version 1.7.0 [76], and the CeD risk-associated HLA genotypes (Section 4.3.1) were called from the alleles. The code for the risk HLA genotyping is available at https://gitlab.developers.cam.ac.uk/path/soilleux/soilleux-group/ced_butyrophilin_phd/-/tree/dropbox/nonacus_miseq_analysis/hla_typing (accessed on 24 September 2024).

4.2.3. Copy Number Variation (CNV) Analysis of the BTNL8-BTNL3 Loci

The presence of the 56 kb deletion variant in the BTNL8-BTNL3 loci (chr5:180948027–181003596, GRCh38) was analysed by using a surrogate SNP, the T > C rs72494581 (chr5:181003797, GRCh38) BTNL3 intronic SNP, which is associated with the CNV (Table A5) [51]. Fisher’s exact test was performed to investigate differences in BTNL8-BTNL3 CNV between cohorts.
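The surrogate-SNP logic can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the genotype-to-deletion mapping follows Table A5, but the choice to compare carriers (at least one deletion allele) against non-carriers is an assumption, and the genotype lists are hypothetical.

```python
from math import comb

# Number of chromosome 5 copies carrying the deletion, per rs72494581 genotype (Table A5).
DELETION_COPIES = {"TT": 0, "CT": 1, "TC": 1, "CC": 2}


def count_carriers(genotypes):
    """Return (carriers, non_carriers); a carrier has at least one deletion allele."""
    carriers = sum(1 for g in genotypes if DELETION_COPIES[g] >= 1)
    return carriers, len(genotypes) - carriers


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))


# Hypothetical genotype calls, for illustration only (not study data).
ced = ["CT", "CC", "TT", "CT"]
controls = ["TT", "TT", "CT", "TT"]
ced_carriers, ced_non = count_carriers(ced)
ctrl_carriers, ctrl_non = count_carriers(controls)
p_value = fisher_exact_two_sided(ced_carriers, ced_non, ctrl_carriers, ctrl_non)
```

With real data, an established implementation such as a statistics package's Fisher's exact test would normally be preferred over a hand-written one.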

4.2.4. Burden Testing Analysis

The TRAPD program was used for burden testing the variants found in selected butyrophilin genes in samples of the targeted sequencing cohort, as described in Appendix D.2 (Figure A4) [52,53,77,78].
The variants in the CeD and control groups were burden tested using both the recessive and the dominant models.

4.3. Single-Variant Testing of Butyrophilin Family Variance in the UK Biobank Database

4.3.1. CeD Risk-Associated HLA Genotyping in the UK Biobank Cohort

HLA genotyping was performed using the HLA imputation values of the UK Biobank 500,000 genome-wide genotyping cohort (Appendix E.1) to identify the following CeD risk-associated alleles: HLA-DQA1*02:01 with HLA-DQB1*02:02 (making up the HLA-DQ2.2 heterodimer in the DR2-DQ2 haplotype), HLA-DQA1*05:01 with HLA-DQB1*02:01 (making up the HLA-DQ2.5 heterodimer in the DR3-DQ2 and DR5-DQ7/DR7-DQ2 haplotypes), and HLA-DQA1*03:01 with HLA-DQB1*03:02 (making up the HLA-DQ8 heterodimer in the DR4-DQ8 haplotype).

4.3.2. Single-Variant Testing Using Binomial Regression Models

The single-variant testing model was built in R version 4.2.1 by adapting the UK Biobank analysis of Yu et al. [79]. The code for investigating the association between butyrophilin family SNPs and CeD risk in the UK Biobank is available at https://gitlab.developers.cam.ac.uk/path/soilleux/soilleux-group/ced_butyrophilin_phd/-/tree/dropbox/ukbiobank_butyrophilin_snp/Butyrophilin_SNP_analysis?ref_type=heads (accessed on 13 November 2024).
The UK Biobank individual SNP data were annotated using the reference SNP cluster IDs (rsIDs) from the SNP database (dbSNP) and the reference allele for these SNPs from the Genome Reference Consortium Human Build 37 (GRCh37) [56,65,80,81]. Further methodological information can be found in Appendix E.2.

4.4. Analysis of TRGV Usage and HV4 Variation in CeD and Control Samples

4.4.1. Processing Samples and TCR Sequencing

The methods of DNA extraction, bulk amplification, and sequencing of the TCR repertoires in Dataset 1 and Dataset 2 were described in Foers et al. [57]. For Dataset 3, DNA was extracted from FFPE duodenal samples using the QiaAmp FFPE DNA kit (Qiagen, Hilden, Germany) and from fresh frozen duodenal samples using the DNeasy Blood & Tissue Kit (Qiagen), according to the manufacturer’s instructions (Figure A5).
Hybridisation capture probes were designed for the targeted sequencing of the TCR repertoires of Dataset 3, in collaboration with Nonacus Ltd., Birmingham, UK. Capture probes were designed against the 3′ end of all productive V segments and the 5′ end of all productive J segments available on IMGT, according to their genomic position in the GRCh38.p13 reference genome [82]. Four capture probes (120 bp long) were designed for each productive segment, with the first probe annealing 10 bp away from the junctional end and each subsequent probe positioned 6 bp away from the previous one.
Samples were prepared for hybridisation capture using the Cell3 Target Library Preparation Kit (b) (Nonacus Ltd., Birmingham, UK), according to the manufacturer’s instructions, and sequenced on an Illumina MiSeq platform.

4.4.2. TRGV and HV4 Analysis Pipeline

The paired-end FASTQ files containing the TRG sequencing data were analysed using MiXCR v4.0.0 (Appendix F) [83,84].
To examine if the TRGV usage was significantly different between CeD and healthy control duodenal samples, pairwise Mann–Whitney U tests were carried out for each of the 10 TRGV segments. To eliminate any false positives due to multiple testing, Bonferroni correction was applied to the p-values. For each test, the proportion of the specific TRGV segment was compared between the CeD and the control groups.
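The Bonferroni step can be illustrated with a short Python sketch. It assumes the raw Mann–Whitney U p-values have already been computed elsewhere; the function name, segment labels, and p-values below are hypothetical and are not study results.

```python
def bonferroni(p_values, n_tests=None):
    """Bonferroni-adjust p-values: multiply by the number of tests and cap at 1."""
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for three of the ten TRGV segment tests.
raw = {"segment_a": 0.004, "segment_b": 0.03, "segment_c": 0.5}
adjusted = dict(zip(raw, bonferroni(list(raw.values()), n_tests=10)))
significant = [name for name, p in adjusted.items() if p < 0.05]
```

Passing `n_tests=10` reflects the ten TRGV segments tested, so each raw p-value is multiplied by 10 regardless of how many are shown.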
The germline HV4 analysis was carried out using Python 3, by identifying variations in the amino acid sequence that directly binds BTNL3 [29,31]. The HV4 was defined as amino acids 10–25 of the FR3, as described by Willcox et al. [31]. The HV4 reference amino acid sequence ‘KYDTYGSTRKNLRMILR’ (named WT sequence for the purposes of this analysis) was demonstrated to be capable of binding BTNL3 [29,31]. Patients were designated as homozygous or heterozygous for the WT amino acid sequence of the HV4 loop, with a minimum of 10% of each HV4 sequence being used as a cutoff percentage for heterozygosity. Fisher’s exact test was applied to compare HV4 WT frequency between CeD and healthy control samples [85].
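One way to read the 10% cutoff is sketched below. This is an interpretation, not the authors' Python code: it assumes each sample is summarised as read counts per HV4 amino acid sequence, that sequences below 10% of reads are ignored, and that the labels returned are free to choose. The variant sequence in the example is invented for illustration.

```python
WT_HV4 = "KYDTYGSTRKNLRMILR"  # WT HV4 sequence as given in the text
MIN_FRACTION = 0.10  # minimum share of reads for a sequence to count


def classify_hv4(read_counts):
    """Classify one sample from a dict of HV4 sequence -> read count."""
    total = sum(read_counts.values())
    if total == 0:
        return "no data"
    # Keep only sequences that reach the 10% cutoff.
    kept = {seq for seq, n in read_counts.items() if n / total >= MIN_FRACTION}
    if WT_HV4 not in kept:
        return "no WT"
    return "homozygous WT" if len(kept) == 1 else "heterozygous WT"


# Hypothetical samples; the variant sequence is made up for illustration.
VARIANT = "KYDTYGSTRKNLRMILQ"
sample_hom = classify_hv4({WT_HV4: 95, VARIANT: 5})
sample_het = classify_hv4({WT_HV4: 60, VARIANT: 40})
sample_none = classify_hv4({VARIANT: 100})
```

Under this reading, a low-frequency second sequence (5% of reads in the first example) is treated as noise rather than as evidence of heterozygosity.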

Author Contributions

Conceptualization, E.J.S.; methodology, K.N.L.H., S.E., H.K., K.D., L.S., K.P. and A.F.; software, K.N.L.H., T.W.W., H.K. and A.F.; validation, K.N.L.H., A.F. and E.J.S.; formal analysis, K.N.L.H. and A.F.; investigation, K.N.L.H., K.D. and S.E.; resources, E.J.S. and A.F.; data curation, K.N.L.H., S.E. and K.D.; writing—original draft preparation, K.N.L.H.; writing—review and editing, T.W.W. and E.J.S.; visualization, K.N.L.H.; supervision, E.J.S.; project administration, E.J.S.; funding acquisition, E.J.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the PhD scholarship of the Department of Pathology, University of Cambridge, and a joint grant from Coeliac UK and Innovate UK to Nonacus Ltd. and EJS (INOV01-18). KNLH also received a Sponsored Dissertation Grant from Coeliac UK. The Cambridge University Hospitals Human Research Tissue Bank is supported by the NIHR Cambridge Biomedical Research Centre (NIHR203312).

Institutional Review Board Statement

This study was conducted in accordance with the Declaration of Helsinki and approved by the Oxfordshire Research Ethics Committee A (4 June 2004, IRAS project ID: 162057, REC reference: 04/Q1604/21, Principal Investigator: E. Soilleux).

Informed Consent Statement

Study-specific patient consent was not required under the terms of our ethical approval. Study subject selection and risks and benefits: As this study required anonymised patient tissue or cells, surplus to diagnostic or therapeutic requirements or tissues collected by others for research purposes (for example during clinical trials, or by commercial tissue providers, or other research groups etc.), no patient recruitment or specific consenting was required and no interventions were undertaken.

Data Availability Statement

Targeted sequencing data of selected butyrophilin family genes in a cohort of 94 samples (2.1): https://zenodo.org/records/15203243 (accessed on 12 April 2025). γδ TCR sequencing data of 46 CeD and 97 healthy control duodenal samples (Dataset 1, Table 2): https://dataview.ncbi.nlm.nih.gov/object/PRJNA1330789?reviewer=sp7hjkohgpbo6mt3qv7thd57fp (accessed on 26 October 2025). γδ TCR sequencing data of 11 CeD and 11 healthy control duodenal samples (Dataset 2, Table 2): https://dataview.ncbi.nlm.nih.gov/object/PRJNA1330746?reviewer=dcl0ipo3ftc3m83b637i0hjmpd (accessed on 26 October 2025). γδ TCR sequencing data of 84 CeD and 130 healthy control blood samples (Dataset 3, Table 2): https://dataview.ncbi.nlm.nih.gov/object/PRJNA1330754?reviewer=uoqvavqemvh35apdc0ifn99585 (accessed on 26 October 2025).

Acknowledgments

We are grateful to the Haematopathology and Oncology Diagnostic Service (HODS), Cambridge University Hospitals NHS Foundation Trust, for the provision of patient DNA samples. We thank the Cambridge University Hospitals NHS Foundation Trust Human Tissue Research Biobank (HTRB) for the provision of patient tissue samples. We thank the patients, without whom this research would not have been possible. This research has been conducted using the UK Biobank Resource under Application Number 18532. We thank the participants of the UK Biobank.

Conflicts of Interest

L.S. and K.P. are employees of Nonacus Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
BTN/BTNL Butyrophilin/butyrophilin-like
CeD Coeliac disease
CNV Copy number variation
DC Dendritic cell
GFD Gluten-free diet
FFPE Formalin-fixed, paraffin-embedded
HLA Human leukocyte antigen
HPA Human Protein Atlas
HV4 Hypervariable region 4
IEL Intraepithelial lymphocyte
NK cell Natural Killer cell
TRGV T cell receptor γ variable region
SNP/SNV Single-nucleotide polymorphism/variation
WT Wild-type

Appendix A. The Molecular Background of the 56 kb BTNL8*BTNL3 Deletion Variant

The BTNL3 and BTNL8 loci are segmental duplications and share a high sequence similarity. During meiosis, highly identical sequences are prone to recombination, which can give rise to CNVs. This is the likely explanation for the BTNL8*BTNL3 56 kb deletion copy number variant described by Aigner et al. [41]. That study reported that 58.4% of their 346 samples of European ancestry had at least one BTNL8*BTNL3 deletion allele (Table A1). This CNV has been shown to encode a BTNL8*3 fusion protein, which consists of the transmembrane domain, the extracellular IgV and IgC domains of BTNL8, and the intracellular signalling domain of BTNL3. As the BTNL3-IgV domain is missing in the fusion protein, it is plausible that the BTNL8*3 fusion protein has an impaired ability to bind to the Vγ4Vδ1+ T cells in the small intestine [31].
Table A1. The BTNL8*BTNL3 copy number variation is present in 58.4% of individuals of European ancestry, as first described by Aigner et al. [41]. Carriers are defined as individuals with at least one BTNL8*BTNL3 deletion allele. Abbreviations: CEU: Utah residents with Northern and Western European ancestry; HapMap: International HapMap Project; het.: heterozygous; HGDP: Human Genome Diversity Panel; hom.: homozygous; N: number.
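| Dataset | Population | Hom. for Deletion | Het. for Deletion | Hom. for Full Sequences | Deletion Allele N | Deletion Allele Frequency | Group N | Carriers N | Carriers % |
|---|---|---|---|---|---|---|---|---|---|
| HapMap | CEU | 17 | 56 | 68 | 90 | 0.319 | 141 | 73 | 51.8 |
| HapMap | Toskani, Italia | 9 | 45 | 34 | 63 | 0.358 | 88 | 54 | 61.4 |
| HGDP | France | 7 | 28 | 17 | 42 | 0.404 | 52 | 35 | 67.3 |
| HGDP | Italy | 5 | 18 | 13 | 28 | 0.389 | 36 | 23 | 63.9 |
| HGDP | Italy (Bergamo) | 3 | 2 | 9 | 8 | 0.286 | 14 | 5 | 35.7 |
| HGDP | Orkney Islands | 1 | 11 | 3 | 13 | 0.433 | 15 | 12 | 80.0 |
| Total | European ancestry | 42 | 160 | 144 | 244 | 0.353 | 346 | 202 | 58.4 |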

Appendix B. Selecting and Sequencing the Butyrophilin Genes of Interest

Appendix B.1. HPA Expression Profiles of Butyrophilin Family Genes

Table A2. The expression of the butyrophilin family members in intestinal tissues provided by the HPA. Butyrophilin protein expression in (a) the duodenum and in (b) the small intestine, colon, and rectum was extracted from the Tissue section of the Human Protein Atlas database [64,86].
(a) Butyrophilin family expression in the duodenum

| Gene | Reliability as Defined by the HPA | Protein Expression in Duodenum (IHC) *If RNA Data Only* | Comment | Included? |
|---|---|---|---|---|
| ERMAP | Uncertain | High | Uncertain Tissue Atlas reliability score; high expression in digestive tissues; low–medium expression in immune cells | Yes |
| MOG | Enhanced | None | Not expressed in immune cells or digestive tissues; genomic variance control for the significance of butyrophilin variation in CeD risk | Yes |
| BTN1A1 | Supported | None | Not expressed in digestive tissues or immune cells | No |
| BTN2A1 | Approved | Medium | Included due to reliability score and expression; implicated in the stimulation of Vγ9Vδ2+ T cells [37] | Yes |
| BTN2A2 | Uncertain | High | Uncertain Tissue Atlas reliability score; medium–high expression in digestive tissues; expressed in immune cells | Yes |
| BTN3A1 | Approved | Medium | Included due to reliability score and expression; implicated in the stimulation of Vγ9Vδ2+ T cells [30] | Yes |
| BTN3A2 | Uncertain | Medium | Uncertain Tissue Atlas reliability score; implicated in the stimulation of Vγ9Vδ2+ T cells [27] | Yes |
| BTN3A3 | Enhanced | Medium | Included due to reliability score and expression | Yes |
| BTNL2 | Pending | *None* | Pending Tissue Atlas reliability score; HLA-independent significant association with CeD [19] | Yes |
| BTNL3 | Pending | *High* | Pending Tissue Atlas reliability score; no protein expression data available for the intestinal tissues; previously documented role in CeD [21] | Yes |
| BTNL8 | Enhanced | Medium | Previously documented role in CeD [21] | Yes |
| BTNL9 | Pending | *Low* | Pending Tissue Atlas reliability score; no protein expression data available for the intestinal tissues; not expressed in immune cells | No |
| BTNL10 | NA | NA | No entry | No |
| SKINT1L | NA | NA | No entry | No |
| BTN2A3P | NA | NA | No entry | No |

(b) Expression of selected butyrophilin family members in the small intestine, colon, and rectum

Tissue expression (IHC) *if RNA data only*

| Gene | Reliability as Defined by the HPA | Small Intestine (Glandular) | Duodenum (Glandular) | Rectum (Glandular) | Colon (Glandular) | Included in the Panel? |
|---|---|---|---|---|---|---|
| ERMAP | Uncertain | High | High | High | High | Yes |
| MOG | Enhanced | none | none | none | none | Yes |
| BTN2A1 | Approved | Low | Medium | Medium | Medium | Yes |
| BTN2A2 | Uncertain | High | High | Medium | Medium | Yes |
| BTN3A1 | Approved | High | Medium | High | High | Yes |
| BTN3A2 | Uncertain | High | Medium | Medium | Medium | Yes |
| BTN3A3 | Enhanced | Medium | Medium | Medium | Medium | Yes |
| BTNL2 | Pending | *Very low* | none | none | none | Yes |
| BTNL3 | Pending | *High* | *High* | *High* | *High* | Yes |
| BTNL8 | Enhanced | Medium | Medium | none | none | Yes |
The reliability score of each entry was provided by the HPA. The score was based on the reliability between the RNA sequencing and antibody staining data. For most genes, the HPA provided immunohistochemical evidence for the protein expression of the genes. Only RNA sequencing data were available for BTNL2, BTNL3, and BTNL9 expression, denoted with *. Evidence linking the butyrophilins to immune cell function and CeD risk was also used to determine inclusion in the custom sequencing panel [19,21,30]. The expression of butyrophilin family members in (b) is shown only for the genes that were selected for the custom probe panel [data accessed in 2021].
Table A3. The expression of butyrophilin family genes of interest in immune cells provided by the HPA. The butyrophilin family mRNA expression data in immune cells were accessed from the Human Protein Atlas (HPA) [64,86].
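Immune cell expression (RNA sequencing)

| Gene | Reliability as Defined by the HPA | γδ T Cells | T Cells | T-Reg | DCs | Macrophages | NK Cells | Included in the Panel? |
|---|---|---|---|---|---|---|---|---|
| ERMAP | Uncertain | Low | Medium | Medium | Medium | Medium | Low | Yes |
| MOG | Enhanced | Very low | None | Very low | None | None | None | Yes |
| BTN2A1 | Approved | Medium | Medium | Medium | Medium | High | Low | Yes |
| BTN2A2 | Uncertain | Low | Medium | Medium | High | High | Medium | Yes |
| BTN3A1 | Approved | High | High | High | Low | Medium | High | Yes |
| BTN3A2 | Uncertain | High | High | High | Medium | High | High | Yes |
| BTN3A3 | Enhanced | High | High | High | Medium | Medium | High | Yes |
| BTNL2 | Pending | None | None | None | None | None | None | Yes |
| BTNL3 | Pending | None | None | None | None | None | None | Yes |
| BTNL8 | Enhanced | None | None | None | None | None | None | Yes |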
The reliability score of each entry was provided by the HPA. The score was based on the reliability between the RNA sequencing and antibody staining data. The RNA expression data from T cells, DCs, NK cells, and macrophages were selected from the Single cell type section of the gene entries. The RNA expression data from T-regs and γδ T cells were accessed from the Immune cell type section of the gene entries [data accessed in 2021].
Table A4. The GRCh38.p12 genomic location of the selected butyrophilin genes.
| Gene of Interest | Location (GRCh38.p12) |
|---|---|
| BTN2A1 | chr6:26,457,955–26,476,622 |
| BTN2A2 | chr6:26,382,893–26,394,874 |
| BTN3A1 | chr6:26,402,269–26,415,216 |
| BTN3A2 | chr6:26,365,170–26,378,320 |
| BTN3A3 | chr6:26,440,504–26,453,415 |
| BTNL2 | chr6:32,393,339–32,408,879 |
| BTNL3 | chr5:180,988,846–181,006,727 |
| BTNL8 | chr5:180,899,097–180,952,166 |
| ERMAP | chr1:42,817,122–42,844,991 |
| MOG | chr6:29,657,092–29,672,365 |

Appendix B.2. Modified Nonacus Cell3 Hybridisation Capture and Illumina Sequencing

To summarise the modified protocol, the DNA quality of all samples was measured initially, to acquire 200 ng input DNA for the fragmentation step (1.B) of the hybridisation capture protocol (Figure A1). Fragmentation time in step 1.B was modified to 30 min to achieve 200 bp DNA fragments. Afterwards, the Genomic Tapestation kit (Agilent Technologies, Santa Clara, CA, USA) was used to check the correct fragment size.
Figure A1. The Nonacus Cell3 hybridisation capture method was modified for the HLA and butyrophilin sequencing panel. All modifications to the manufacturer’s protocol are marked with *. The HLA probes were provided by Nonacus Ltd.
Next, in step 1.C, unique molecular identifier (UMI) adapters were ligated to the DNA fragments. During the magnetic bead clean up using NGS Target Pure Clean-Up Beads (Nonacus), the adapter-ligated DNA fragments were incubated for 6 min on the magnetic strip. Nuclease-free water was used in the final step of the clean up.
The pre-hybridisation PCR in step 1.D was carried out for 4 cycles. Afterwards, the DNA concentration of each reaction was measured in step 1.E. Samples with DNA concentrations lower than 10 ng/μL were subjected to additional rounds of amplification, and steps 1.D and 1.E were repeated. CB22, CB24, CB26, CB27, and CB30 had low DNA concentration after the amplification step; therefore, they were subjected to 5 additional cycles of PCR. Sample NB7 was also subjected to 6 more cycles of PCR.
In step 2.A, the samples were pooled together, each library containing DNA fragments from 8 patients. For each sample, 125 ng of DNA was used, and the pooled libraries were dried using a vacuum concentrator for 30 min. Each pooled library was hybridised overnight at 65 °C with the designed butyrophilin probes and a 1:50 dilution of HLA probes provided by Nonacus.
In step 2.B, the hybridised library was captured on Dynabeads M-270 Streptavidin beads (Invitrogen/Thermo Fisher Scientific, Waltham, MA, USA). Post-hybridisation PCR was carried out for 20 cycles in step 2.C. This was followed by bead clean up using NGS Target Pure Clean-Up Beads. Similar to step 1.C, the beads and the amplified captured library were incubated for 6 min on the magnetic strip. Nuclease-free water was used in the final step of the clean up.
In step 2.D, the concentration of the captured library was measured using Qubit (CAT Q32851, lot 2313066), and the size of the DNA fragments in each hybridisation library was quantified using the 4200 Tapestation (CAT G2991A, lot DEDAA01701).
Illumina MiSeq sequencing required each captured hybridisation library to be diluted to 10 nM concentration. The concentration of each sample in nM was calculated using the following equation:
$$\text{concentration (nM)} = \frac{\text{concentration (ng/}\mu\text{L)}}{660 \times \text{DNA fragment size (bp)}} \times 10^{6}$$
where the concentration (ng/μL) was the DNA concentration of the captured library as quantified by Qubit, and the DNA fragment size (bp) was the average DNA fragment size as measured by Tapestation.
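The conversion can be checked with a short Python sketch. The constant 660 is the approximate average molar mass (g/mol) of one DNA base pair; the function names and the example values (20 ng/μL, 400 bp) are hypothetical and are not measurements from this study.

```python
def library_concentration_nm(conc_ng_per_ul, fragment_size_bp):
    """Convert a dsDNA concentration in ng/uL to nM for a given mean fragment size."""
    return conc_ng_per_ul / (660 * fragment_size_bp) * 1e6


def dilution_factor(conc_nm, target_nm=10.0):
    """Fold dilution needed to bring a library down to the target concentration."""
    return conc_nm / target_nm


# Hypothetical library: 20 ng/uL by Qubit, 400 bp mean size by Tapestation.
conc_nm = library_concentration_nm(20.0, 400)
fold = dilution_factor(conc_nm)
```

For these example inputs the library is roughly 75.8 nM, so it would need about a 7.6-fold dilution to reach 10 nM.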
After each of the 12 hybridised libraries was diluted to 10 nM, 2 μL of each diluted library was mixed together. Afterwards, 10 μL of the mixture was sent to the Department of Biochemistry, University of Cambridge, UK for sequencing using the Illumina MiSeq system (San Diego, CA, USA).

Appendix B.3. Measuring DNA Quantity and Fragment Size

A Qubit 2.0 fluorometer (Invitrogen) was used to measure nucleic acid quantity, using the Qubit dsDNA High Sensitivity Quantification Assay kit (Invitrogen) according to the manufacturers’ instructions.
The 4200 Tapestation System (Agilent Technologies) was used to measure the fragment size of DNA samples. The Genomic DNA ScreenTape Analysis kit (Agilent Technologies) was used to measure the fragment sizes of DNA samples after step 1.B of the Nonacus hybridisation capture protocol (Nonacus). The D1000 ScreenTape Assay kit (Agilent Technologies) was used to measure the DNA sizes of the pooled hybridisation libraries in step 2.D of the Nonacus hybridisation capture protocol.

Appendix C. Detailed Germline Short-Variant Discovery Protocol

Appendix C.1. Per Sample Preprocesses and Variant Call Using GATK

The variant discovery process for the targeted sequencing cohort was split into two parts. In the first part, each patient sample was processed separately. The variants were called per sample as recommended by the GATK v4.2.6.0 documentation. In the second part of the variant discovery process, the variant-called samples were consolidated, and genotyping was performed jointly for the whole cohort (Appendix C.2).
The workflow management software Snakemake 7.12.1 (accessed on 1 August 2022) was used to orchestrate the per sample preprocessing and variant calling part of the pipeline (Figure A2) [87].
Figure A2. The genetic variants were called per sample for the hybridisation capture samples by adapting GATK best practices. Each box symbolises a step or rule in the Snakemake workflow [87]. The directed acyclic graph was created using the Snakemake software’s built-in commands.
Based on the preprocessing methods of Cucco et al. [73,74], the adapter-trimmed raw sequencing files were mapped to the Genome Reference Consortium Human Build 38 (GRCh38) human reference genome using the Burrows-Wheeler Aligner (bwa) v0.7.17 program in the ‘align_bwamem’ rule [88,89]. The resulting sequence alignment map (SAM) files were converted to binary alignment map (BAM) files (‘sam_to_bam’) and then sorted (‘sort_bam’) and indexed using the SAMtools version 1.16.1 program [77]. The mapping efficiency of the sorted BAM files was assessed using the command ‘samtools stats’.
Next, we used GATK v4.2.6.0 to carry out germline short-variant discovery in accordance with GATK best practices [71]. First, any duplicate reads that were derived from the same original DNA sample were marked using the MarkDuplicates tool (‘mark_duplicates’). This was followed by calculating (‘base_recalibrate’) and correcting any errors detected in the base quality scores (‘apply_bqsr’) using the BaseRecalibrator and ApplyBQSR tools, respectively. Following these preprocessing steps, the SNP and indel variants were called for each sample using the HaplotypeCaller tool in GVCF mode (‘variant_call’).

Appendix C.2. Joint Genotyping Using GATK and Variant Annotation Using VCFtools and ANNOVAR

In the second part of the variant discovery process, the samples that had undergone variant calling were subjected to consolidation, followed by joint genotyping. Here, the samples were separated into CeD and control groups before consolidating the samples into a joint dataset.
The joint genotyping was carried out using GATK programs with default settings (Figure A3). First, the germline cohort data were created by consolidating the per sample genomic variant call format (GVCF) files created by the HaplotypeCaller tool as described above. The sample consolidation step was carried out by the GenomicsDBImport tool (‘consolidate_gvcfs’). The resulting cohort database was passed to the GenotypeGVCFs joint genotyping tool (‘jointcall_cohort’). Next, the raw variants were filtered in a two-step process. First, the variant quality scores on the log-odds scale (VQSLOD) were calculated using the VariantRecalibrator tool (‘variant_recalibration’). A filtering threshold was applied to these variant quality scores to produce a set of high-quality variant calls using the ApplyVQSR tool (‘applyvqsr’). The output of the above joint cohort processes was a recalibrated VCF file that contained all genotyping data of the cohort. The recalibrated VCF files were annotated by applying VCFtools v0.1.17 in frequency (‘run_vcffreq’), count (‘run_vcfcounts’), and comparison (‘compare_vcf’) mode [90]. Variant annotation was also carried out using the table_annovar program from ANNOVAR version 8 June 2020 with default settings (‘table_annovar’) [91].
Figure A3. The samples of the targeted sequencing cohort (n = 94) were jointly genotyped, and the variants were annotated using an adapted GATK workflow. Each box symbolises a step or rule in the Snakemake workflow [87]. The directed acyclic graph was created using the Snakemake software’s built-in commands.

Appendix D. Detailed CNV Analysis and Burden Testing Protocols

Appendix D.1. BTNL8*BTNL3 CNV Analysis

Table A5. The rs72494581 surrogate SNP was used to infer the CNV at the BTNL8-BTNL3 region of chromosome 5.
| rs72494581 Genotype | Zygosity | Associated CNV at BTNL8-BTNL3 Region of Chromosome 5 |
|---|---|---|
| TT | Homozygous for reference allele | Full-length BTNL8-BTNL3 region on both copies of chromosome 5 |
| CT | Heterozygous | One copy has full-length BTNL8-BTNL3 region; one copy has BTNL8*BTNL3 deletion |
| CC | Homozygous for alternative allele | BTNL8*BTNL3 deletion on both copies of chromosome 5 |

Appendix D.2. Detailed Burden Testing Protocol

To summarise, qualifying variants within a gene were selected that had a low minor allele frequency or were predicted to be pathogenic. Any qualifying SNPs with more than two alleles, called multi-allelic sites, were split into SNPs with two alleles for the analysis: the reference allele and one of the alternative alleles. These variants are termed bi-allelic variants [53]. The disease risk burden, or the number of minor alleles, in the control and CeD cohorts was counted and compared. The burden testing was performed using dominant models and recessive models in TRAPD. The dominant model considers individuals as carriers for gene burden, if they have at least one qualifying variant from the selected sites within the gene, while the recessive model requires the presence of two or more variants to be labelled a carrier [53]. As gene burden is an additive value, the zygosity of the qualifying sites does not matter, only the number of qualifying variants. For example, in a gene with three qualifying sites, an individual who is homozygous for the alternate allele for one qualifying site carries the same amount of gene burden as an individual who is heterozygous for two of the qualifying sites. The analysis was modified from the one described by Guo [53], to adapt it to this cohort, as the original pipeline used an external control dataset.
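The carrier definitions described above can be sketched in Python. This is a minimal illustration of the additive logic as stated in the text, not the TRAPD implementation; the function names and the encoding of each qualifying site as an alternate-allele count (0, 1, or 2) are assumptions.

```python
def gene_burden(alt_allele_counts):
    """Additive gene burden: total alternate alleles across a gene's qualifying sites."""
    return sum(alt_allele_counts)


def is_carrier(alt_allele_counts, model):
    """Apply the dominant (>= 1 variant) or recessive (>= 2 variants) carrier rule."""
    burden = gene_burden(alt_allele_counts)
    if model == "dominant":
        return burden >= 1
    if model == "recessive":
        return burden >= 2
    raise ValueError(f"unknown model: {model}")


# The example from the text: a gene with three qualifying sites.
homozygous_one_site = [2, 0, 0]      # homozygous alternate at one site
heterozygous_two_sites = [1, 1, 0]   # heterozygous at two sites
single_het = [1, 0, 0]               # heterozygous at one site only

same_burden = gene_burden(homozygous_one_site) == gene_burden(heterozygous_two_sites)
```

Both example individuals with a burden of two count as carriers under either model, whereas the individual with a single heterozygous site is a carrier only under the dominant model.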
The GATK processed sequences were subjected to further preprocessing before being burden tested with TRAPD, as recommended by Guo [53]. First, multi-allelic variants were separated using BCFtools version 1.16, as required by the TRAPD manual [77]. Next, the control and CeD cohort sequencing files were annotated using Ensembl Variant Effect Predictor (VEP) 109.3, and the SNPs were filtered to contain only non-synonymous coding variants [78]. The hybridisation capture sequencing files were then analysed after read depth filtering (Figure A4).
Figure A4. The variants in the CeD cohort (n = 48) were burden tested against the controls (n = 46) using the TRAPD program. Test Rare vAriants with Public Data (TRAPD) was used to burden test the variants in the hybridisation capture CeD cohort (n = 48) against the control cohort (n = 46) [52]. The annotated CeD and control files from the GATK pipeline were preprocessed as recommended by the manual [53]. The variants were burden tested after read depth filtering. Steps in which the code was modified are marked with *.
The cohort files were read depth filtered using VEP to select sites where more than 90% of samples had a read depth of >10. The final step of the preprocessing was to index and intersect the CeD and control sequencing files to obtain the SNPs common to the two groups.
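The read depth rule can be illustrated with a minimal Python sketch. It is a stand-in for the VEP-based filtering, not a reproduction of it; the function names and the representation of a site as a list of per-sample depths are assumptions made for this example.

```python
def passes_depth_filter(depths, min_depth=10, min_fraction=0.9):
    """True if more than min_fraction of samples have a read depth > min_depth.

    depths holds one value per sample; None marks a sample with no call.
    """
    if not depths:
        return False
    covered = sum(1 for dp in depths if dp is not None and dp > min_depth)
    return covered / len(depths) > min_fraction


def common_sites(ced_sites, control_sites):
    """Sites that passed the filter in both groups (set intersection)."""
    return set(ced_sites) & set(control_sites)


# Hypothetical site covered at depth 30 in 19 of 20 samples (95% > 90%).
print(passes_depth_filter([30] * 19 + [4]))
```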
Following the preprocessing, the TRAPD code was applied using Python 2.7 to create the SNP file from the CeD and the control cohort sequencing files using ‘make_snp_file.py’, which contains the qualifying variants from each gene. Carriers of the qualifying SNPs from both the control and the CeD files were counted using the ‘count_cases.py’ file. The ‘burden.R’ code was modified to adapt it to the targeted sequencing cohort, as the original pipeline used an external database as the control, while this analysis uses the control sequences from the same cohort.

Appendix E. Detailed Protocol for the Single-Variant Testing Analysis of Selected Butyrophilin SNPs in the UK Biobank

Appendix E.1. HLA Genotyping in the UK Biobank Using the HLA Imputation Data

The HLA typing code using the UK Biobank HLA imputation data is available at https://gitlab.developers.cam.ac.uk/path/soilleux/soilleux-group/ced_butyrophilin_phd/-/tree/dropbox/ukbiobank_hla_typing/hla_imputation_only (accessed on 7 March 2025).
To summarise, the code used the HLA imputation values from data-field 22182. These values describe the likelihood of each HLA allele, of which 14 were HLA-DQA1 alleles and 18 were HLA-DQB1 alleles. The HLA alleles were imputed by the UK Biobank from SNP data using the HLA*IMP:02 program [92]. In resource 182 (https://biobank.ctsu.ox.ac.uk/crystal/refer.cgi?id=182, accessed on 18 May 2022), the UK Biobank suggested using a posterior threshold of 0.7. Any HLA allele with an imputation value below 0.7 was treated as absent. The code applied this posterior threshold to the HLA imputation data for each participant, and the output was a list of the HLA alleles that each participant had.
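The thresholding step can be sketched as follows. This is a simplified illustration, not the code from the linked repository: the function name, the dictionary layout, and the example allele labels and values are assumptions.

```python
POSTERIOR_THRESHOLD = 0.7  # threshold suggested in UK Biobank resource 182


def call_alleles(imputation_values, threshold=POSTERIOR_THRESHOLD):
    """Return the alleles whose imputation value is at or above the threshold.

    imputation_values maps an allele label to its imputation value; alleles
    below the threshold are treated as absent.
    """
    return sorted(a for a, value in imputation_values.items() if value >= threshold)


# Hypothetical participant; the values are invented for illustration.
participant = {
    "DQA1*05:01": 0.98,
    "DQA1*02:01": 0.65,
    "DQB1*02:01": 1.00,
    "DQB1*03:02": 0.10,
}
print(call_alleles(participant))
```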
Afterwards, the CeD risk-associated HLA genotypes were called from the HLA allele data. The code calling CeD risk genotypes from the HLA imputation-derived alleles is available at https://gitlab.developers.cam.ac.uk/path/soilleux/soilleux-group/ced_butyrophilin_phd/-/blob/dropbox/ukbiobank_hla_typing/hla_imputation_only/ukbhla_fullcohort.ipynb (accessed on 2 September 2024).
To identify whether a participant had a CeD risk genotype, the code looked for the presence of the risk alleles at the HLA-DQA1 and HLA-DQB1 loci. Participants who did not have alleles present at either locus were removed from the analysis. A participant was determined to have a CeD-associated HLA risk genotype if at least one copy of both the risk HLA-DQA1 allele and the risk HLA-DQB1 allele was present. If alleles for more than one HLA risk genotype were present, the participant was typed as possessing both HLA risk genotypes.
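A minimal sketch of this calling logic is given below. The allele pairs in `RISK_GENOTYPES` are commonly cited definitions and are an assumption here, as this appendix does not list the exact alleles used in the notebook. The exclusion rule is likewise one reading of the text (no allele called at HLA-DQA1 or HLA-DQB1 at all), and all names are hypothetical.

```python
# Assumed allele pairs (risk HLA-DQA1 allele, risk HLA-DQB1 allele) per genotype.
RISK_GENOTYPES = {
    "HLA-DQ2.5": ("DQA1*05:01", "DQB1*02:01"),
    "HLA-DQ8": ("DQA1*03:01", "DQB1*03:02"),
    "HLA-DQ2.2": ("DQA1*02:01", "DQB1*02:02"),
}


def call_risk_genotypes(alleles):
    """Return the list of risk genotypes carried, or None if the participant
    has no called HLA-DQA1/HLA-DQB1 allele and is removed from the analysis."""
    present = {a for a in alleles if a.startswith(("DQA1*", "DQB1*"))}
    if not present:
        return None
    return [
        name
        for name, (dqa1, dqb1) in RISK_GENOTYPES.items()
        if dqa1 in present and dqb1 in present
    ]


print(call_risk_genotypes(["DQA1*05:01", "DQB1*02:01", "DQA1*03:01", "DQB1*03:02"]))
```

An empty list corresponds to the "Other" category, that is, a typed participant without any of the risk genotypes.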

Appendix E.2. Detailed Single-Variant Testing of BTN2A1, BTN3A1, and BTN3A2 SNPs in the UK Biobank

As the SNPs were identified by their dbSNP rsIDs rather than by their genomic positions, and dbSNP had been updated to GRCh38 at the time of this analysis, the SNP data could be used without further modification.
The individual SNP data in the UK Biobank were provided as the number of dbSNP reference alleles at each site, where 2 indicates that the individual is homozygous for the reference allele, while 1 indicates heterozygosity. If a participant had 0 reference alleles at a site, this could indicate homozygosity for the alternate allele or heterozygosity for two of the alternate alleles, depending on the number of potential alternate alleles. However, at the time of the analysis, the dataset did not provide information on which alternate allele was present.
To summarise the butyrophilin variant association analysis, firstly, the UK Biobank genome-wide genotyping dataset was curated and preprocessed for analysis. The downloaded per-chromosome UK Biobank genotyping data were loaded into R as a BEDMatrix object using the BGData 2.4.1 R package [80]. Afterwards, the genotype and phenotype data for the selected UK Biobank participants were merged. Next, data for all SNPs recorded in the human BTN2A1, BTN3A1, BTN3A2, BTNL3, and BTNL8 genes were obtained from the National Center for Biotechnology Information (NCBI) SNP database [65,81]. These SNPs were intersected with the genome-wide genotyping data to identify the butyrophilin SNPs present in the UK Biobank dataset, namely 30 BTN2A1, 27 BTN3A1, 21 BTN3A2, 10 BTNL3, and 13 BTNL8 SNPs. The final butyrophilin genotyping data in the UK Biobank dataset were provided as count data for the number of reference alleles at each SNP, identified by their rsIDs. Due to multiple testing, the resulting p-values were adjusted using Bonferroni correction.
Secondly, the association between butyrophilin variants and CeD risk was tested using binomial regression models (binomial generalised linear models). In all of the models, the response variable was CeD status (CeD or no CeD, Table A6). The tests assumed that the predictor variables were independent of each other. In the first test, the association between CeD risk HLA genotypes and CeD status was tested using one binomial model. In the second group of tests, individual binomial models were used to analyse the association between each butyrophilin family SNP and CeD risk. In the third group of tests, iterative binomial models were used to analyse the combined effect of butyrophilin family SNPs and HLA risk genotypes on CeD risk. In the fourth group of tests, the association between butyrophilin SNPs and CeD risk was analysed in HLA-matched groups.
Table A6. The binomial models tested the association between HLA risk genotypes and/or the individual butyrophilin family SNPs. Due to multiple testing, the resulting p-values were adjusted using Bonferroni correction. Abbreviations: CeD: coeliac disease; HLA: human leukocyte antigen; SNP: single-nucleotide polymorphism.
Test/Group Number | Association Being Tested | Predictor Variable(s) | Response Variable
First test | Association between HLA risk genotypes and CeD risk | HLA risk genotype | CeD status: CeD or control
Second group of tests (101 models) | Association between individual butyrophilin SNPs and CeD risk | Butyrophilin SNP: 2, 1, or 0 reference alleles | CeD status: CeD or control
Third group of tests (101 models) | Association between the combined effect of HLA genotypes and butyrophilin SNPs and CeD risk | HLA risk genotype; butyrophilin SNP: 2, 1, or 0 reference alleles | CeD status: CeD or control
Fourth group of tests | Association between butyrophilin SNPs and CeD risk in HLA-matched groups | Butyrophilin SNP: 2, 1, or 0 reference alleles | CeD status: CeD or control
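To make the structure of one model from the second group of tests concrete, the sketch below fits a single-predictor binomial (logistic) model by Newton-Raphson in plain Python. It is a stand-in, not the authors' code: the function names and the counts are invented, and the reference-allele count is treated as a numeric predictor, which is an assumption about how the SNP was coded.

```python
import math


def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Fit logit(P(y = 1)) = b0 + b1 * x by Newton-Raphson.

    Returns (b0, b1, se_b1); se_b1 comes from the inverse of the
    information matrix evaluated at the final iteration.
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1, math.sqrt(h00 / det)


def expand(table):
    """Turn {allele_count: (n_cases, n_controls)} into per-person x and y lists."""
    x, y = [], []
    for count, (cases, controls) in table.items():
        x += [count] * (cases + controls)
        y += [1] * cases + [0] * controls
    return x, y


# Invented counts: reference-allele count -> (CeD cases, controls).
x, y = expand({0: (8, 4), 1: (6, 6), 2: (4, 8)})
b0, b1, se = fit_logistic(x, y)
print(f"ln(OR) per reference allele = {b1:.3f}, OR = {math.exp(b1):.3f}, SE = {se:.3f}")
```

For the third group of tests, the HLA risk genotype would enter the same model as an additional predictor; that extension is omitted here to keep the sketch short.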
Thirdly, the odds ratio (OR), the p-value, and the 95% confidence interval were calculated for each binomial model assessing the association between butyrophilin SNPs and CeD risk. The direction of effect of each SNP was determined from the natural logarithm of the OR (ln(OR)): ln(OR) < 0 (OR < 1) indicated that the SNP decreased CeD risk, while ln(OR) > 0 (OR > 1) indicated that the SNP increased CeD risk. Due to multiple testing, Bonferroni correction was applied for each group of tests.
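The summary step can be sketched as follows. On the log scale the neutral value is ln(1) = 0, so the sign of the model coefficient gives the direction of effect. The function name, the Wald-type interval (coefficient ± 1.96 standard errors), and the example numbers are assumptions for illustration; the source does not state how the confidence intervals were computed.

```python
import math


def summarise_model(beta, se, p_value, n_tests, z=1.96):
    """Summarise one model: OR, Wald 95% CI, direction, Bonferroni p-value.

    beta is the fitted coefficient, i.e. ln(OR); n_tests is the number of
    models in the same group of tests.
    """
    if beta > 0:
        direction = "increased risk"
    elif beta < 0:
        direction = "decreased risk"
    else:
        direction = "no effect"
    return {
        "odds_ratio": math.exp(beta),
        "ci_95": (math.exp(beta - z * se), math.exp(beta + z * se)),
        "direction": direction,
        "p_bonferroni": min(1.0, p_value * n_tests),
    }


# Invented coefficient, standard error and p-value for one of 101 models.
print(summarise_model(beta=-0.25, se=0.05, p_value=1e-6, n_tests=101))
```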
The rsnps 0.5.0.0 R package was used to annotate the butyrophilin family SNPs in the UK Biobank significantly associated with CeD using the NCBI database [93]. The Hardy–Weinberg equilibrium of each SNP in the control group was assessed using the HardyWeinberg 1.7.7 R package [94].
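For orientation, a Hardy–Weinberg check on genotype counts can be sketched with a one-degree-of-freedom chi-square test. This is only one of several possible tests and the source does not state which one was run with the HardyWeinberg package, so the sketch below is an assumption; the function name and the counts are invented.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df, no continuity correction) for Hardy-Weinberg
    equilibrium from genotype counts. Returns (chi2, p_value)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 df, P(chi-square > x) equals erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Invented control-group counts for the three genotypes at one SNP.
chi2, p_value = hwe_chi_square(640, 320, 40)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```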

Appendix F. Detailed TRGV4 Usage and HV4 Amino Acid Sequence Analysis Pipeline

Figure A5. CeD and healthy control patient TRG sequencing data from three cohorts were used to analyse differences in TRGV usage and germline HV4 sequences.
At the time of analysis, MiXCR had a built-in reference library and could identify the clonotypes and gene segments used in the repertoire. The ‘analyze amplicon’ command was a one-step command that aligned the sequencing data and then assembled and exported the clonotypes found in the TCR repertoire. For the purposes of analysing the germline HV4 sequence, the region of interest was set to include the FR3 of the TRGV genes (‘--region-of-interest {FR3Begin:CDR3End}’). The resulting text file contained the nucleotide sequence (‘targetSequences’), amino acid sequence (‘aaSeq’), clone count (‘cloneCount’), and V and J segment usage (‘allVHitsWithScore’, ‘allJHitsWithScore’) for each unique TRG sequence. The results from the MiXCR output were analysed using the pipeline available at https://gitlab.developers.cam.ac.uk/path/soilleux/soilleux-group/ced_butyrophilin_phd/-/tree/dropbox/trgv_hv4_analysis (accessed on 20 March 2025).
To identify the differences in the TRGV usage of the duodenal TRG repertoire, the read count of the TRGV section (‘allVHitsWithScore’) from the MiXCR output was processed using Python 3. Only samples from the duodenum were analysed for TRGV usage. Therefore, TRGV data from blood samples were not subjected to TRGV usage analysis.
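The TRGV usage calculation can be sketched as below. This is not the code from the linked repository. It assumes a tab-separated export with the ‘cloneCount’ and ‘allVHitsWithScore’ columns named above, assumes that each hit is written as a gene name followed by an allele and a score (for example `TRGV4*00(1176)`), and takes the first listed hit as the assigned gene; the function names and example rows are invented.

```python
import csv
import io
import re
from collections import Counter


def top_v_gene(hits):
    """'TRGV4*00(1176),TRGV2*00(900)' -> 'TRGV4' (first hit, allele and score dropped)."""
    first_hit = hits.split(",")[0]
    return re.split(r"[*(]", first_hit)[0]


def trgv_usage(tsv_text):
    """Return {TRGV gene: fraction of total clone count} for one sample."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        counts[top_v_gene(row["allVHitsWithScore"])] += float(row["cloneCount"])
    total = sum(counts.values())
    return {gene: count / total for gene, count in counts.items()}


# Invented two-clonotype export for a single duodenal sample.
example = (
    "cloneCount\tallVHitsWithScore\n"
    "30\tTRGV4*00(1176)\n"
    "10\tTRGV2*00(1100),TRGV4*00(900)\n"
)
print(trgv_usage(example))
```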

Appendix G. Supplementary Materials for Results Section 2.1

Figure A6. CeD patients (n = 48) had significantly higher proportions of CeD risk HLA genotypes compared to controls (n = 46). The (a) number and (b) percentage of individuals with CeD risk-associated HLA genotypes were significantly higher in CeD patients compared to controls in this dataset (Fisher’s exact test, p = 5.5 × 10⁻¹⁰). The HLA genotypes for participants were called using HLA-HD [95].
Table A7. CeD-associated HLA genotypes were found in 95.8% of CeD patients (n = 48) and 45.7% of controls (n = 46). (a) The CeD risk-associated HLA genotypes were called using HLA-HD [95]. (b) The HLA-DQA1 and HLA-DQB1 alleles of the two CeD patients who did not have CeD risk-associated HLA genotypes. The HLA-DQA1 allele of sample CD1 could not be typed by HLA-HD. CeD patients possessed a significantly higher proportion of risk HLA genotypes when compared with controls in this dataset (Figure A6, Fisher’s exact test, p = 5.5 × 10⁻¹⁰).
(a)
HLA Genotypes | HLA-DQ2.5 | HLA-DQ8 | HLA-DQ2.2 | HLA-DQ2.5 and DQ8 | HLA-DQ2.5 and DQ2.2 | HLA-DQ8 and DQ2.2 | Other
CeD patients | 22 | 3 | 2 | 6 | 13 | 0 | 2
Control participants | 10 | 2 | 6 | 0 | 1 | 2 | 25
(b)
Sample | HLA-DQA1 Allele | HLA-DQB1 Allele | Potential HLA-DQ Type
CB26 | HLA-DQA1*01:01:01, HLA-DQA1*02:01 | HLA-DQB1*02:01:01, HLA-DQB1*05:01:01 | HLA-DQ5.1
CD1 | Not typed | HLA-DQB1*02:01:01, HLA-DQB1*06:02:01 | Unknown
Table A8. At least 47% of CeD patients (n = 48) and controls (n = 46) had the BTNL8*BTNL3 deletion variant, as inferred from the rs72494581 surrogate SNP. The BTNL8*BTNL3 deletion variant encodes a truncated BTNL8-BTNL3 fusion protein [41]. Dart et al. [51] identified rs72494581 in the intronic region of BTNL3, which serves as a surrogate SNP, and the alleles are associated with the BTNL8*BTNL3 copy number variant. The major allele, the T allele, is associated with the full-length BTNL3 and BTNL8 genes, while the minor allele, the C allele, is associated with the BTNL8*BTNL3 deletion. The differences between the frequencies in CeD and control individuals failed to reach statistical significance (Figure A7, Fisher’s exact test, p = 0.2144).
rs72494581 Genotype | TT | CT | CC
BTNL8-BTNL3 genes | Homozygous for full-length sequence | Heterozygous for BTNL8*BTNL3 deletion | Homozygous for BTNL8*BTNL3 deletion
Coeliac disease patients | 20 | 26 | 2
Control participants | 24 | 17 | 5
Figure A7. There were no significant differences in the frequency of the BTNL8*BTNL3 copy number variants between CeD patients (n = 48) and controls (n = 46). The BTNL8*BTNL3 deletion variant encodes a truncated BTNL8-BTNL3 fusion protein [41]. Dart et al. [51] identified rs72494581 in the intronic region of BTNL3, which serves as a surrogate SNP, and the alleles are associated with the BTNL8*BTNL3 copy number variants. The major allele, the T allele, is associated with the full-length BTNL3 and BTNL8 genes, while the minor allele, the C allele, is associated with the BTNL8*BTNL3 deletion. Fisher’s exact testing showed that there were no significant differences (adjusted p = 0.2144) in the frequency of the BTNL8*BTNL3 deletion variant-associated rs72494581 genotypes in the CeD and control groups.
Table A9. Less than 40% of sites found in CeD samples were shared with controls. Shared polymorphic sites between CeD and control samples in (a) the whole hybridisation capture dataset, (b) in non-synonymous coding sites, (c) in read depth-filtered non-synonymous coding sites. Read depth was defined as the number of sequence reads per site. Read depth filtering was applied as a quality control step. Sites where the read depth (dp) was more than 10, in more than 90% of the samples in each group, passed the read depth filter. Sites described above can be multi-allelic, meaning they may have more than one alternative allele.
(a)
Group | Number of Butyrophilin Family Sites | Sites Unique to Group | Shared Sites | Non-Matching Overlapping Sites
Coeliac | 1168 | 701 | 435 | 32
Control | 769 | 302 | 435 | 32
(b)
Group | Number of Butyrophilin Family Non-Synonymous Coding Sites | Sites Unique to Group | Shared Sites
Coeliac | 108 | 79 | 21
Control | 58 | 29 | 21
(c)
Group | Number of Butyrophilin Family Non-Synonymous Coding Sites (>90% Samples in Group dp > 10) | Sites Unique to Group | Shared Sites
Coeliac | 60 | 54 | 6
Control | 21 | 15 | 6
Non-matching overlapping sites were defined as polymorphic sites, where the reference and/or alternate alleles identified in each group were different alleles. For example, a non-matching overlapping SNP would have the same base pair position on the chromosome with the same reference allele, but the alternate allele in one group is different from the other group’s. Comparisons were carried out using vcftools 0.1.17 [90].
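The site categories used in Table A9 can be illustrated with a short Python sketch. The published comparison was carried out with vcftools; the code below is only a conceptual stand-in, and the function name and the representation of a site as a `(chromosome, position)` key with its reference and alternate alleles are assumptions.

```python
def compare_sites(ced, control):
    """Classify polymorphic sites from two groups.

    Each argument maps (chrom, pos) -> (ref_allele, frozenset of alt alleles).
    """
    shared, non_matching = set(), set()
    for site in ced.keys() & control.keys():
        if ced[site] == control[site]:
            shared.add(site)        # same position and same alleles
        else:
            non_matching.add(site)  # same position, different alleles
    return {
        "unique_to_ced": ced.keys() - control.keys(),
        "unique_to_control": control.keys() - ced.keys(),
        "shared": shared,
        "non_matching_overlapping": non_matching,
    }


# Invented sites: one shared, one non-matching overlapping, one unique to each group.
ced_sites = {
    ("6", 100): ("A", frozenset({"G"})),
    ("6", 200): ("C", frozenset({"T"})),
    ("6", 300): ("G", frozenset({"A"})),
}
control_sites = {
    ("6", 100): ("A", frozenset({"G"})),
    ("6", 200): ("C", frozenset({"A"})),
    ("6", 400): ("T", frozenset({"C"})),
}
print(compare_sites(ced_sites, control_sites))
```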
Table A10. Percentage data of burden testing of butyrophilin variants in CeD samples against controls from the hybridisation capture dataset.
Gene | Qual. SNPs | CeD % (≥1 HET) | CeD % (≥2 HET) | CeD % (HOM ALT) | CeD Total Qual. Allele Freq | Control % (≥1 HET) | Control % (≥2 HET) | Control % (HOM ALT) | Control Total Qual. Allele Freq | Dominant Model p-Value | Recessive Model p-Value
BTN2A1 | 3 | 45.8 | 43.8 | 6.3 | 0.281 | 10.9 | 8.7 | 0.0 | 0.047 | 1.46 × 10⁻⁵ | 3.70 × 10⁻⁸
BTN3A2 | 1 | 10.4 | 0.0 | 2.1 | 0.073 | 19.6 | 0.0 | 2.2 | 0.120 | 0.929 | 0.946
ERMAP | 1 | 43.8 | 0.0 | 16.7 | 0.385 | 43.5 | 0.0 | 15.2 | 0.370 | 0.516 | 0.988
Burden testing was carried out on butyrophilin variants in CeD samples against controls from the dataset with read depth filtering. During read depth filtering, only sites where more than 90% of samples had a read depth coverage of >10 were selected. Percentage values in columns 3–5 and 7–9 show the percentage of individuals with each genotype within the CeD and the control groups of the dataset, respectively. Significant results are highlighted in bold. Abbreviations: CeD: coeliac disease; freq: frequency; HET: heterozygous; HOM ALT: homozygous for the alternative allele; N: number; qual.: qualifying; SNP: single-nucleotide polymorphism.
Table A11. Per sample genotype of participants at significant BTN2A1 qualifying SNPs from the hybridisation capture cohort.
(a) CeD participants
Position | 6:26463432 | 6:26468098 | 6:26468317
rsID | rs13195509 | rs3734542 | rs3734543
CB22 | hom alt | hom alt | hom alt
CB24 | hom alt | hom alt | hom alt
CB26 | het | het | het
CB27 | het | het | het
CB29 | het | het | het
CB31 | het | het | het
CB32 | hom ref | hom ref | hom ref
CB33 | hom ref | hom ref | hom ref
CB34 | het | het | het
CB35 | het | het | het
CB36 | het | het | het
CB37 | het | het | het
CB38 | het | het | het
CB39 | hom ref | hom ref | hom ref
CB40 | hom ref | hom ref | hom ref
CB41 | hom ref | hom ref | hom ref
CB43 | hom ref | hom ref | hom ref
CB44 | hom ref | hom ref | hom ref
CB45 | het | het | het
CB46 | hom ref | hom ref | hom ref
CB48 | hom ref | hom ref | hom ref
CB49 | hom ref | hom ref | hom ref
CB50 | hom ref | hom ref | hom ref
CB51 | het | het | het
CB52 | hom ref | hom ref | hom ref
CB53 | het | het | het
CB54 | het | het | het
CB55 | het | het | het
CB56 | hom ref | hom ref | hom ref
CB57 | hom ref | hom ref | hom ref
CB58 | het | het | het
CB59 | hom ref | hom ref | hom ref
CB60 | hom ref | hom ref | hom ref
CB61 | het | het | het
CB62 | het | het | het
CB63 | hom alt | hom alt | hom alt
CB65 | hom ref | hom ref | hom ref
CB67 | het | het | het
CB69 | het | het | het
CB70 | hom ref | hom ref | hom ref
CD1 | hom ref | het | hom ref
CD2 | hom ref | hom ref | hom ref
CD3 | hom ref | hom ref | hom ref
CD4 | het | het | het
CD5 | het | het | het
CD6 | hom ref | hom ref | hom ref
CD7 | hom ref | hom ref | hom ref
CD8 | hom ref | hom ref | hom ref
(b) Control participants
Position | 6:26463432 | 6:26468098 | 6:26468317
rsID | rs13195509 | rs3734542 | rs3734543
NB11 | hom ref | hom ref | hom ref
NB13 | hom ref | hom ref | hom ref
NB14 | hom ref | hom ref | hom ref
NB16 | hom ref | hom ref | hom ref
NB19 | hom ref | hom ref | hom ref
NB2 | hom ref | hom ref | hom ref
NB20 | hom ref | hom ref | hom ref
NB21 | het | het | het
NB22 | hom ref | hom ref | hom ref
NB25 | hom ref | hom ref | hom ref
NB28 | het | het | het
NB29 | hom ref | hom ref | hom ref
NB3 | hom ref | hom ref | hom ref
NB30 | hom ref | hom ref | hom ref
NB31 | hom ref | hom ref | hom ref
NB32 | hom ref | hom ref | hom ref
NB34 | hom ref | hom ref | hom ref
NB35 | hom ref | hom ref | hom ref
NB37 | hom ref | hom ref | hom ref
NB38 | hom ref | hom ref | hom ref
NB39 | hom ref | hom ref | hom ref
NB41 | hom ref | hom ref | hom ref
NB42 | hom ref | hom ref | hom ref
NB44 | hom ref | hom ref | hom ref
NB46 | hom ref | hom ref | hom ref
NB47 | hom ref | hom ref | hom ref
NB48 | hom ref | hom ref | hom ref
NB49 | hom ref | hom ref | hom ref
NB5 | hom ref | hom ref | hom ref
NB50 | het | het | het
NB52 | hom ref | hom ref | hom ref
NB56 | hom ref | hom ref | hom ref
NB58 | hom ref | hom ref | hom ref
NB68 | hom ref | hom ref | hom ref
NB69 | hom ref | hom ref | hom ref
NB7 | hom ref | hom ref | hom ref
NB70 | hom ref | hom ref | hom ref
NB71 | hom ref | hom ref | hom ref
ND1 | hom ref | hom ref | hom ref
ND10 | hom ref | hom ref | hom ref
ND2 | hom ref | hom ref | hom ref
ND5 | het | het | het
ND6 | hom ref | hom ref | hom ref
ND7 | hom ref | hom ref | hom ref
ND8 | hom ref | hom ref | hom ref
ND9 | hom ref | het | hom ref
Abbreviations: het: heterozygous; hom alt: homozygous for the alternative allele; hom ref: homozygous for the reference allele.

Appendix H. Demographic Data of the Selected UK Biobank Participants

Firstly, the ages at recruitment of the participants were analysed. Age at recruitment was not normally distributed in either the control or the CeD group (Figure A8). The mean age at recruitment of controls was 62, which was significantly higher than that of CeD patients, whose mean age at recruitment was 58 (t = 27.297, df = 3539.3, p < 2.2 × 10⁻¹⁶).
Figure A8. CeD patients were significantly younger than control individuals when they were recruited for the initial UK Biobank study. The distribution of the age at recruitment of (a) 29,762 control participants and (b) 3094 CeD patients from the UK Biobank dataset did not follow a normal distribution. (c) The mean age at recruitment was significantly higher in control individuals (t = 27.297, df = 3539.3, p-value < 2.2 × 10⁻¹⁶).
Secondly, the sex of CeD and control participants in the UK Biobank was investigated (Figure A9). Interestingly, the proportion of female participants was significantly higher in the CeD group (64.8%, 2005/3094) than in the control group (40.4%, 12,010/29,762) (X-squared = 683.91, df = 1, p < 2.2 × 10⁻¹⁶).
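As an arithmetic check, the reported statistic can be approximated from the counts given above with a 2 × 2 chi-square test. The sketch below assumes that Yates's continuity correction was applied (the source does not say so) and derives the non-female counts by subtraction from the group totals; the function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Pearson chi-square statistic for the 2 x 2 table [[a, b], [c, d]],
    optionally with Yates's continuity correction."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


female_ced, total_ced = 2005, 3094
female_control, total_control = 12010, 29762

chi2 = chi_square_2x2(
    female_ced,
    total_ced - female_ced,          # non-female CeD participants (derived)
    female_control,
    total_control - female_control,  # non-female controls (derived)
)
print(round(chi2, 2))
```

Under these assumptions, the result should agree with the reported X-squared = 683.91 to within rounding.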
Figure A9. A significantly higher proportion of individuals diagnosed with CeD were female participants, compared with the control cohort in the UK Biobank dataset. The sex of 3094 CeD patients and 29,762 control participants was analysed from the UK Biobank. The figure shows the (a) number and (b) percentage of each sex for CeD and control participants. The CeD group in the UK Biobank had a significantly higher proportion of female participants (X-squared = 683.91, df = 1, p < 2.2 × 10⁻¹⁶).
Thirdly, the ethnic background of the UK Biobank participants was analysed. In both CeD and control groups, the majority of participants had White British backgrounds, at 91.8% (2841/3094) and 90.2% (26,850/29,762), respectively (Figure A10). In both groups, Irish and any other white background were the second and third most frequent ethnic backgrounds, respectively. When the ethnic background of CeD and control participants was compared (X-squared = 42.294, df = 21, p = 0.00384), no significant difference was present after Bonferroni correction for multiple testing (adjusted p > 0.05).
Figure A10. White British was the most common ethnic background in the UK Biobank participants, regardless of CeD status. The ethnic backgrounds of 3094 CeD patients and 29,762 control participants were analysed from the UK Biobank. There were no significant differences in the ethnic background of CeD and control participants after Bonferroni correction.
Finally, the dietary web questionnaire answers of CeD and control individuals were examined for any differences between the two groups. The gluten-free diet (GFD) was excluded from the statistical analysis of dietary differences between CeD and control individuals in the UK Biobank for two reasons. First, being on a GFD was one of the exclusion criteria for the control cohort, to exclude potential, undiagnosed CeD cases. Therefore, this diet would have zero occurrences in the control group. Second, CeD patients generally follow a GFD, as it is currently the only treatment for CeD [4]. This created a higher occurrence of GFD in CeD patients compared to controls selected from the UK Biobank dataset. The majority of UK Biobank participants did not adhere to any special diet, with 94.7% of controls (28,185/29,762) and 71.2% of CeD patients (2204/3094) reporting no special diet (Figure A11). A low-calorie diet was the most common special diet in controls (3.1%, 926/29,762).
Figure A11. The majority of UK Biobank participants reported no special diet in both control and CeD groups. The diet of 3094 CeD patients and 29,762 control participants was analysed from the UK Biobank. The majority of UK Biobank participants reported no special diet in both control (94.7%, 28,185/29,762) and CeD (71.2%, 2204/3094) groups. When participants following a gluten-free diet were excluded from the analysis, the proportions of participants following each special diet were significantly different between the CeD and control groups (X-squared = 23.303, df = 11, p = 0.01602). This was likely caused by having to exclude patients who followed multiple diets that included a gluten-free diet. Participants on a gluten-free diet were excluded from the control group.
A GFD was the most common special diet in CeD participants of the UK Biobank (22.2%, 686/3094), with an additional 3.6% (112/3094) of patients following a GFD in combination with other special diets. The second and third most common special diets in the CeD group were a combination of a GFD with a low-calorie diet and a combination of a GFD with a lactose-free diet, at 1.5% (45/3094) and 1.3% (40/3094), respectively. When any variation of a GFD was excluded from the CeD group, the most common diet was a low-calorie diet (1.3%, 39/3094). Interestingly, 74.2% (2296/3094) of CeD participants did not follow a diet that excluded gluten.
When excluding a GFD from the analyses, the proportions of participants following the special diets within the control group were significantly different from the CeD group (X-squared = 23.303, df = 11, p = 0.01602). The lactose-free–low-calorie, vegan, lactose-free–low-calorie–vegetarian, lactose-free–vegetarian, and lactose-free–low-calorie–vegan diets were only found in the control group after excluding CeD patients adhering to a GFD. This could be due to CeD patients following the aforementioned diets in addition to being on a GFD (Table A12).
Table A12. The significant difference between the special diets of controls (n = 29,762) and CeD participants not on a gluten-free diet (n = 2296) may stem from CeD patients following multiple diets in addition to following a gluten-free diet.
Diet | Without GFD: N Controls on Diet | Without GFD: % Controls on Diet | Without GFD: N CeD on Diet | Without GFD: % CeD on Diet (Out of 2296) | With GFD: N CeD on Diet | With GFD: % CeD on Diet (Out of 3094)
Lactose-free–low calorie | 20 | 0.067 | 0 | 0 | 5 | 0.162
Vegan | 15 | 0.050 | 0 | 0 | 1 | 0.032
Lactose-free–low calorie–vegetarian | 2 | 0.007 | 0 | 0 | 1 | 0.032
Lactose-free–vegetarian | 2 | 0.007 | 0 | 0 | 1 | 0.032
Lactose-free–low calorie–vegan | 1 | 0.003 | 0 | 0 | 2 | 0.065

Appendix I. Supplementary Materials for Results Section 2.2

Table A13. CeD-associated HLA genotypes were found in 92.4% of CeD (n = 3094) and 57.6% of control (n = 29,762) participants from the UK Biobank’s 500,000 genome-wide genotyping dataset. The HLA genotype of selected participants from the 500,000 genome-wide genotyping dataset was called using the HLA imputation values provided by the UK Biobank. CeD risk HLA genotypes were significantly more frequent in CeD patients compared to control participants (X-squared = 4062.5, df = 6, p < 2.2 × 10⁻¹⁶).
Table A13. CeD-associated HLA genotypes were found in 92.4% of CeD (n = 3094) and 57.6% of control (n = 29,762) participants from the UK Biobank’s 500,000 genome-wide genotyping dataset. The HLA genotype of selected participants from the 500,000 genome-wide genotyping dataset was called using the HLA imputation values provided by the UK Biobank. CeD risk HLA genotypes were significantly more frequent in CeD patients compared to control participants (X-squared = 4062.5, df = 6, p < 2.2 × 10−16).
HLA Genotypes | HLA-DQ2.5 | HLA-DQ8 | HLA-DQ2.2 | HLA-DQ2.5 and DQ8 | HLA-DQ2.5 and DQ2.2 | HLA-DQ8 and DQ2.2 | Other
CeD participants | 1652 | 171 | 199 | 606 | 182 | 50 | 234
Control participants | 6416 | 4203 | 4154 | 895 | 886 | 590 | 12,618
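The chi-squared comparison reported for Table A13 can be re-derived from the counts alone. The following is a minimal sketch, assuming a plain Pearson chi-squared test on the 2 × 7 table without continuity correction; the function name `pearson_chi_squared` and the variable names are illustrative and are not taken from the authors' analysis code. Under that assumption the statistic should land close to the reported X-squared = 4062.5 with df = 6.

```python
# Counts transcribed from Table A13, in the table's column order.
genotypes = ["DQ2.5", "DQ8", "DQ2.2", "DQ2.5 and DQ8",
             "DQ2.5 and DQ2.2", "DQ8 and DQ2.2", "Other"]
ced = [1652, 171, 199, 606, 182, 50, 234]
control = [6416, 4203, 4154, 895, 886, 590, 12618]


def pearson_chi_squared(rows):
    """Return (statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            # Expected count under independence of status and genotype.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof


statistic, dof = pearson_chi_squared([ced, control])

# Share of participants carrying any CeD risk genotype (everything but "Other").
risk_share_ced = 1 - ced[-1] / sum(ced)
risk_share_control = 1 - control[-1] / sum(control)

print(f"X-squared = {statistic:.1f}, df = {dof}")
print(f"Risk genotypes: CeD {risk_share_ced:.1%}, controls {risk_share_control:.1%}")
```

The same counts also give the 92.4% and 57.6% risk-genotype shares quoted in the caption, which is a quick consistency check on the table.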
Figure A12. CeD patients (n = 3094) had significantly higher proportions of CeD risk-associated HLA genotypes compared to controls (n = 29,762) in the UK Biobank’s 500,000 genome-wide genotyping dataset. The (a) number and (b) percentage of participants with CeD risk HLA genotypes were significantly more frequent in CeD patients compared to control participants (X-squared = 4062.5, df = 6, p < 2.2 × 10^−16). All participants’ HLA genotypes were called using the HLA imputation values provided by the UK Biobank.
Table A14. Coefficients of the binomial regression model investigating CeD risk HLA genotypes as a predictor variable for CeD status in the UK Biobank dataset. Results of the binomial generalised linear model testing the association between CeD status and CeD risk-associated HLA genotypes in the UK Biobank dataset. Abbreviations: NA: not applicable; ns: not significant.
HLA Genotype | Coefficient Estimate | Standard Error | z Value | p-Value | CeD Risk
HLA-DQ2.2, HLA-DQ2.5 | 2.649 | 0.090 | 29.550 | <2 × 10^−16 | Increase
HLA-DQ2.2, HLA-DQ8 | 0.570 | 0.164 | 3.474 | 5.13 × 10^−4 | Increase
HLA-DQ2.5 | 1.682 | 0.078 | 21.662 | <2 × 10^−16 | Increase
HLA-DQ2.5, HLA-DQ8 | 1.456 | 0.109 | 13.352 | <2 × 10^−16 | Increase
HLA-DQ8 | −0.163 | 0.107 | −1.533 | 0.125 | ns
Other HLA genotype | −0.949 | 0.098 | −9.677 | <2 × 10^−16 | Decrease
Constant | −3.039 | 0.073 | −41.872 | <2 × 10^−16 | NA
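The coefficients in Table A14 are on the log-odds scale, so they can be turned into odds ratios and predicted probabilities. The sketch below assumes a model with the HLA genotype as its only predictor and assumes that the reference level is the genotype missing from the table (apparently HLA-DQ2.2 alone); both points are inferences from the table, and the function names are hypothetical.

```python
import math

# Estimates transcribed from Table A14 (log-odds scale).
intercept = -3.039
coefficients = {
    "HLA-DQ2.2, HLA-DQ2.5": 2.649,
    "HLA-DQ2.2, HLA-DQ8": 0.570,
    "HLA-DQ2.5": 1.682,
    "HLA-DQ2.5, HLA-DQ8": 1.456,
    "HLA-DQ8": -0.163,
    "Other HLA genotype": -0.949,
}


def odds_ratio(beta):
    """Odds ratio relative to the (assumed) reference genotype."""
    return math.exp(beta)


def predicted_probability(beta, constant):
    """Inverse logit of constant + beta: modelled probability of CeD status."""
    return 1.0 / (1.0 + math.exp(-(constant + beta)))


# Probability for the assumed reference level (coefficient of zero).
baseline = predicted_probability(0.0, intercept)
print(f"Reference genotype: P(CeD) = {baseline:.3f}")

for genotype, beta in coefficients.items():
    print(f"{genotype}: OR = {odds_ratio(beta):.2f}, "
          f"P(CeD) = {predicted_probability(beta, intercept):.3f}")
```

These probabilities describe CeD frequency within the selected case-control sample, not population risk, because cases are over-represented relative to the general population.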
Table A15. Single-variant analysis of butyrophilin SNPs and CeD status without taking the HLA loci into account using the UK Biobank dataset. SNPs significantly associated with CeD status are in bold. Bonferroni correction was applied due to multiple testing.
SNP Name | Gene | OR | Upper | Lower | Adjusted p-Value | ln(OR)
rs10484441_G | BTN2A1 | 1.138386 | 1.249955 | 1.038835 | 0.60789 | 0.129612
rs12660069_C | BTN2A1 | 1.106915 | 1.299592 | 0.948994 | 1 | 0.101577
rs13195402_G | BTN2A1 | 0.397002 | 0.424644 | 0.371261 | 4.67 × 10^−158 | −0.92381
rs13195509_G | BTN2A1 | 0.424358 | 0.45231 | 0.398239 | 1.61 × 10^−151 | −0.85718
rs13437351_G | BTN2A1 | 1.485538 | 1.91326 | 1.176198 | 0.14151 | 0.395777
rs1407045_A | BTN2A1 | 1.314112 | 1.385827 | 1.246309 | 6.07 × 10^−22 | 0.273161
rs142951857_A | BTN2A1 | 1.391065 | 3.107641 | 0.723169 | 1 | 0.33007
rs143104579_G | BTN2A1 | 1.14068 | 1.405753 | 0.935552 | 1 | 0.131625
rs146399224_T | BTN2A1 | 10963.24 | NA | 0.003938 | 1 | 9.302303
rs148111655_G | BTN2A1 | 1.130613 | 2.537072 | 0.583563 | 1 | 0.12276
rs2273558_A | BTN2A1 | 0.673125 | 0.711994 | 0.636428 | 1.69 × 10^−41 | −0.39582
rs2893856_T | BTN2A1 | 0.839708 | 0.911101 | 0.772766 | 0.0032259 | −0.1747
rs2893857_C | BTN2A1 | 1.142895 | 1.255236 | 1.042695 | 0.48083 | 0.133565
rs3734539_C | BTN2A1 | 4032.285 | NA | 0.007 | 1 | 8.302088
rs3734542_G | BTN2A1 | 0.425283 | 0.453292 | 0.39911 | 8.59 × 10^−151 | −0.855
rs3734543_G | BTN2A1 | 0.430012 | 0.458993 | 0.402974 | 1.59 × 10^−140 | −0.84394
rs3799380_T | BTN2A1 | 0.577632 | 0.611726 | 0.545555 | 8.59 × 10^−77 | −0.54882
rs56296968_C | BTN2A1 | 0.54644 | 0.579504 | 0.515375 | 9.70 × 10^−89 | −0.60433
rs6456724_T | BTN2A1 | 0.838697 | 0.91004 | 0.771803 | 0.00287 | −0.17591
rs6907857_T | BTN2A1 | 1.433106 | 1.834039 | 1.140956 | 0.29414 | 0.359844
rs6911470_C | BTN2A1 | 1.472809 | 1.963726 | 1.132407 | 0.57829 | 0.387171
rs6929846_T | BTN2A1 | 0.813898 | 0.875261 | 0.756001 | 3.60 × 10^−6 | −0.20592
rs7773913_C | BTN2A1 | 1.433061 | 1.833981 | 1.14092 | 0.29439 | 0.359813
rs7773938_C | BTN2A1 | 0.548927 | 0.582092 | 0.517765 | 1.15 × 10^−87 | −0.59979
rs77870445_T | BTN2A1 | 1.219723 | 1.609943 | 0.942473 | 1 | 0.198624
rs9348718_A | BTN2A1 | 1.267016 | 1.454243 | 1.109411 | 0.06114 | 0.236664
rs9358943_C | BTN2A1 | 1.768069 | 31.86429 | 0.363137 | 1 | 0.569888
rs9358944_A | BTN2A1 | 0.546443 | 0.579355 | 0.515512 | 1.55 × 10^−89 | −0.60432
rs9358945_A | BTN2A1 | 0.54552 | 0.578367 | 0.514649 | 4.37 × 10^−90 | −0.60602
rs9461254_G | BTN2A1 | 1.464419 | 1.953756 | 1.125151 | 0.66799 | 0.381458
rs10456045_G | BTN3A1 | 0.638826 | 0.674371 | 0.605202 | 2.97 × 10^−57 | −0.44812
rs10807008_G | BTN3A1 | 1.091584 | 1.192289 | 1.001092 | 1 | 0.087629
rs12200782_C | BTN3A1 | 1.138999 | 1.250121 | 1.039809 | 0.56581 | 0.13015
rs12207930_C | BTN3A1 | 1.147602 | 1.241418 | 1.062201 | 0.05418 | 0.137674
rs12208447_C | BTN3A1 | 1.284779 | 1.624554 | 1.030908 | 1 | 0.250586
rs12214924_T | BTN3A1 | 1.147514 | 1.241076 | 1.062324 | 0.05283 | 0.137597
rs143476765_A | BTN3A1 | 1.108868 | 4.616795 | 0.396888 | 1 | 0.10334
rs144114619_T | BTN3A1 | 1.212854 | 2.147941 | 0.740604 | 1 | 0.192976
rs145059723_A | BTN3A1 | 1.412767 | 4.034311 | 0.629895 | 1 | 0.34555
rs1741738_A | BTN3A1 | 1.144367 | 1.242176 | 1.05573 | 0.11623 | 0.134851
rs17610161_G | BTN3A1 | 1.097034 | 1.199223 | 1.005306 | 1 | 0.09261
rs1796520_C | BTN3A1 | 0.758646 | 0.800004 | 0.719297 | 2.40 × 10^−22 | −0.27622
rs3799378_A | BTN3A1 | 0.585559 | 0.61957 | 0.553499 | 2.92 × 10^−75 | −0.53519
rs3857549_C | BTN3A1 | 1.247408 | 1.401099 | 1.114594 | 0.01526 | 0.221068
rs3902051_A | BTN3A1 | 1.090365 | 1.187959 | 1.002366 | 1 | 0.086513
rs41266839_G | BTN3A1 | 0.396746 | 0.423498 | 0.37178 | 2.12 × 10^−168 | −0.92446
rs4609015_T | BTN3A1 | 1.151594 | 1.245578 | 1.066033 | 0.03817 | 0.141147
rs4712990_C | BTN3A1 | 1.101764 | 1.203269 | 1.010539 | 1 | 0.096912
rs55676749_T | BTN3A1 | 1.13993 | 1.38368 | 0.948273 | 1 | 0.130967
rs56161420_G | BTN3A1 | 1.138909 | 1.230815 | 1.055142 | 0.09381 | 0.130071
rs6900725_T | BTN3A1 | 1.149401 | 1.242804 | 1.064341 | 0.04333 | 0.139241
rs6912853_C | BTN3A1 | 1.16488 | 1.257236 | 1.080577 | 0.00785 | 0.152618
rs6920986_C | BTN3A1 | 1.148238 | 1.241884 | 1.062975 | 0.04995 | 0.138228
rs6921148_T | BTN3A1 | 1.165176 | 1.411589 | 0.971033 | 1 | 0.152872
rs742090_A | BTN3A1 | 0.759078 | 0.800536 | 0.719639 | 3.58 × 10^−22 | −0.27565
rs7770214_G | BTN3A1 | 1.145779 | 1.239155 | 1.060758 | 0.06048 | 0.136085
rs80153343_G | BTN3A1 | 1.136899 | 1.439622 | 0.910733 | 1 | 0.128305
rs11758089_T | BTN3A2 | 1.192878 | 1.288514 | 1.105674 | 0.00063 | 0.176369
rs12176317_A | BTN3A2 | 0.445307 | 0.474099 | 0.41837 | 6.72 × 10^−140 | −0.80899
rs12194095_C | BTN3A2 | 1.118798 | 1.235914 | 1.015164 | 1 | 0.112255
rs12199613_C | BTN3A2 | 0.67037 | 0.706851 | 0.635751 | 1.76 × 10^−47 | −0.39993
rs12205731_G | BTN3A2 | 1.114755 | 1.232063 | 1.011029 | 1 | 0.108634
rs144016445_G | BTN3A2 | 81101.36 | NA | 217.0251 | 1 | 11.30346
rs1977_A | BTN3A2 | 0.445697 | 0.474831 | 0.418455 | 1.17 × 10^−136 | −0.80811
rs1979_G | BTN3A2 | 0.44537 | 0.474171 | 0.418426 | 8.23 × 10^−140 | −0.80885
rs1985732_A | BTN3A2 | 0.63289 | 0.668226 | 0.599467 | 2.87 × 10^−59 | −0.45746
rs2073526_G | BTN3A2 | 0.74435 | 0.785579 | 0.705115 | 9.15 × 10^−25 | −0.29524
rs35183513_G | BTN3A2 | 1.102861 | 1.203659 | 1.012184 | 1 | 0.097908
rs58367598_T | BTN3A2 | 1.234256 | 1.449383 | 1.057523 | 0.89091 | 0.210469
rs7765566_G | BTN3A2 | 1.269787 | 1.499163 | 1.083157 | 0.39847 | 0.23885
rs9104_G | BTN3A2 | 1.092904 | 1.187567 | 1.007255 | 1 | 0.088838
rs9358934_G | BTN3A2 | 0.447824 | 0.476845 | 0.420678 | 2.34 × 10^−137 | −0.80335
rs9379855_T | BTN3A2 | 0.447682 | 0.47666 | 0.420575 | 8.85 × 10^−138 | −0.80367
rs9379858_T | BTN3A2 | 0.448449 | 0.477474 | 0.421299 | 3.19 × 10^−137 | −0.80196
rs9379859_C | BTN3A2 | 0.447843 | 0.476903 | 0.420662 | 5.37 × 10^−137 | −0.80331
rs9379861_G | BTN3A2 | 1.225507 | 1.619363 | 0.945713 | 1 | 0.203355
rs9393713_G | BTN3A2 | 0.442958 | 0.471602 | 0.416159 | 1.07 × 10^−141 | −0.81428
rs9393714_G | BTN3A2 | 0.443419 | 0.47214 | 0.416551 | 6.99 × 10^−141 | −0.81324
rs186813312_C | BTNL3 | 0.103966 | NA | NA | NA | −2.26369
rs199970076_G | BTNL3 | 0.544013 | 10.42476 | 0.087697 | 1 | −0.60878
rs201534771_G | BTNL3 | 0.108957 | NA | NA | NA | −2.2168
rs201813197_C | BTNL3 | 1.141842 | 4.74926 | 0.409599 | 1 | 0.132642
rs35157246_C | BTNL3 | 1.069831 | 1.242264 | 0.926052 | 1 | 0.067501
rs4700774_G | BTNL3 | 0.943446 | 0.999778 | 0.89061 | 1 | −0.05822
rs59220426_C | BTNL3 | 1.006054 | 1.132599 | 0.896669 | 1 | 0.006036
rs73815153_G | BTNL3 | 1.009988 | 1.138564 | 0.899028 | 1 | 0.009938
rs7713324_A | BTNL3 | 1.004294 | 1.130736 | 0.895001 | 1 | 0.004284
rs7726604_C | BTNL3 | 1.004751 | 1.13121 | 0.895444 | 1 | 0.00474
rs112469887_G | BTNL8 | 1.084234 | 1.330225 | 0.893133 | 1 | 0.080874
rs113071395_G | BTNL8 | 0.867372 | 1.006007 | 0.751681 | 1 | −0.14229
rs113534626_A | BTNL8 | 1.019956 | 1.206531 | 0.868252 | 1 | 0.019759
rs141492316_T | BTNL8 | 0.891806 | 1.095588 | 0.733483 | 1 | −0.11451
rs145199317_A | BTNL8 | 0.907765 | 1.254932 | 0.673198 | 1 | −0.09677
rs151174174_C | BTNL8 | 0.770441 | 0.932722 | 0.641625 | 0.63031 | −0.26079
rs17704291_C | BTNL8 | 0.940216 | 0.996283 | 0.88763 | 1 | −0.06165
rs200633883_C | BTNL8 | 0.311552 | 1.114968 | 0.108457 | 1 | −1.16619
rs201214790_T | BTNL8 | 4044.713 | NA | 1.08 × 10^−7 | 1 | 8.305166
rs201891387_G | BTNL8 | 0.62355 | 2.663098 | 0.210953 | 1 | −0.47233
rs2276995_A | BTNL8 | 0.983681 | 1.038422 | 0.931987 | 1 | −0.01645
rs2619739_C | BTNL8 | 1.101246 | 1.212402 | 1.002386 | 1 | 0.096442
rs7724813_G | BTNL8 | 1.078621 | 1.169543 | 0.996169 | 1 | 0.075683
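The ln(OR) column in Table A15 is the natural logarithm of the OR column, so a negative value marks an odds ratio below 1 and a positive value an odds ratio above 1. A minimal sketch of that relationship, using three rows transcribed from the table, is given here; the helper name `summarise` is hypothetical, and the interval check is unadjusted, so it is not a substitute for the Bonferroni-adjusted p-values that define significance in the table.

```python
import math

# (SNP, OR, upper bound, lower bound) transcribed from Table A15.
rows = [
    ("rs13195402_G", 0.397002, 0.424644, 0.371261),
    ("rs1407045_A", 1.314112, 1.385827, 1.246309),
    ("rs12660069_C", 1.106915, 1.299592, 0.948994),
]


def summarise(odds_ratio, upper, lower):
    """Return ln(OR) and whether the reported interval excludes OR = 1."""
    log_or = math.log(odds_ratio)
    interval_excludes_one = lower > 1.0 or upper < 1.0
    return log_or, interval_excludes_one


for snp, odds_ratio, upper, lower in rows:
    log_or, excludes_one = summarise(odds_ratio, upper, lower)
    direction = "lower odds" if log_or < 0 else "higher odds"
    print(f"{snp}: ln(OR) = {log_or:.5f} ({direction} of CeD), "
          f"interval excludes 1: {excludes_one}")
```

Working on the log scale makes effects symmetric around zero, which is why an OR of 0.5 and an OR of 2 have the same magnitude of ln(OR).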
Table A16. SNP and allele count data for the significant SNPs from the non-HLA model. These SNPs were significantly associated with CeD status in single-variant testing of the UK Biobank dataset. The SNPs in bold remained significantly associated with CeD status in the binomial regression models that took the HLA genotype of individuals into account.
SNP, Reference Allele | Gene | Number of SNPs in Control | Number of SNPs in CeD | Total Allele Count in Control | Total Allele Count in CeD | Total Number of SNPs in the UK Biobank | Total Allele Count in UK Biobank
rs13195402_G | BTN2A1 | 52,060 | 4631 | 58,392 | 6028 | 56,691 | 64,420
rs13195509_G | BTN2A1 | 52,271 | 4654 | 59,474 | 6176 | 56,925 | 65,650
rs1407045_A | BTN2A1 | 30,596 | 3603 | 59,306 | 6172 | 34,199 | 65,478
rs2273558_A | BTN2A1 | 34,599 | 3248 | 51,134 | 5570 | 37,847 | 56,704
rs2893856_T | BTN2A1 | 7815 | 697 | 59,452 | 6182 | 8512 | 65,634
rs3734542_G | BTN2A1 | 52,209 | 4656 | 59,432 | 6180 | 56,865 | 65,612
rs3734543_G | BTN2A1 | 52,002 | 4644 | 59,146 | 6114 | 56,646 | 65,260
rs3799380_T | BTN2A1 | 46,887 | 4213 | 59,354 | 6164 | 51,100 | 65,518
rs56296968_C | BTN2A1 | 47,941 | 4295 | 59,422 | 6170 | 52,236 | 65,592
rs6456724_T | BTN2A1 | 7813 | 696 | 59,418 | 6178 | 8509 | 65,596
rs6929846_T | BTN2A1 | 10,355 | 903 | 59,460 | 6180 | 11,258 | 65,640
rs7773938_C | BTN2A1 | 47,953 | 4289 | 59,472 | 6162 | 52,242 | 65,634
rs9358944_A | BTN2A1 | 47,929 | 4294 | 59,462 | 6182 | 52,223 | 65,644
rs9358945_A | BTN2A1 | 47,944 | 4292 | 59,478 | 6182 | 52,236 | 65,660
rs10456045_G | BTN3A1 | 41,458 | 3680 | 59,434 | 6174 | 45,138 | 65,608
rs1796520_C | BTN3A1 | 28,075 | 2504 | 59,240 | 6176 | 30,579 | 65,416
rs3799378_A | BTN3A1 | 45,113 | 4017 | 59,206 | 6152 | 49,130 | 65,358
rs3857549_C | BTN3A1 | 55,572 | 5848 | 59,430 | 6170 | 61,420 | 65,600
rs41266839_G | BTN3A1 | 52,985 | 4720 | 59,430 | 6178 | 57,705 | 65,608
rs4609015_T | BTN3A1 | 50,759 | 5371 | 59,452 | 6170 | 56,130 | 65,622
rs6900725_T | BTN3A1 | 50,682 | 5378 | 59,392 | 6182 | 56,060 | 65,574
rs6912853_C | BTN3A1 | 50,145 | 5329 | 59,434 | 6176 | 55,474 | 65,610
rs6920986_C | BTN3A1 | 50,787 | 5381 | 59,464 | 6182 | 56,168 | 65,646
rs742090_A | BTN3A1 | 28,172 | 2506 | 59,438 | 6174 | 30,678 | 65,612
rs11758089_T | BTN3A2 | 50,176 | 5354 | 59,432 | 6182 | 55,530 | 65,614
rs12176317_A | BTN3A2 | 51,604 | 4597 | 59,488 | 6182 | 56,201 | 65,670
rs12199613_C | BTN3A2 | 36,321 | 3178 | 59,396 | 6178 | 39,499 | 65,574
rs1977_A | BTN3A2 | 50,506 | 4497 | 58,436 | 6074 | 55,003 | 64,510
rs1979_G | BTN3A2 | 51,551 | 4590 | 59,448 | 6176 | 56,141 | 65,624
rs1985732_A | BTN3A2 | 41,492 | 3674 | 59,442 | 6174 | 45,166 | 65,616
rs2073526_G | BTN3A2 | 26,272 | 2288 | 59,434 | 6176 | 28,560 | 65,610
rs9358934_G | BTN3A2 | 51,477 | 4591 | 59,412 | 6174 | 56,068 | 65,586
rs9379855_T | BTN3A2 | 51,458 | 4579 | 59,392 | 6162 | 56,037 | 65,554
rs9379858_T | BTN3A2 | 51,474 | 4586 | 59,422 | 6170 | 56,060 | 65,592
rs9379859_C | BTN3A2 | 51,530 | 4595 | 59,452 | 6174 | 56,125 | 65,626
rs9393713_G | BTN3A2 | 51,601 | 4590 | 59,472 | 6178 | 56,191 | 65,650
rs9393714_G | BTN3A2 | 51,581 | 4594 | 59,456 | 6180 | 56,175 | 65,636
Table A17. The genotypes and Hardy–Weinberg equilibrium of the significant SNPs in the control participants from the non-HLA model. The Hardy–Weinberg equilibrium of each SNP in the control group was assessed using the HardyWeinberg R package [94]. Bonferroni correction was applied due to multiple testing.
SNP, Reference Allele | Gene | Number of Controls Homozygous for the Reference Allele | Number of Controls Heterozygous for the Reference Allele | Number of Control Individuals Without the Reference Allele | Allele Freq in Controls | HWE Adjusted p-Value
rs13195402_G | BTN2A1 | 23,188 | 5684 | 324 | 0.892 | 1
rs13195509_G | BTN2A1 | 23,018 | 6235 | 484 | 0.879 | 0.340
rs1407045_A | BTN2A1 | 7951 | 14,694 | 7008 | 0.516 | 1
rs2273558_A | BTN2A1 | 11,804 | 10,991 | 2772 | 0.677 | 0.179
rs2893856_T | BTN2A1 | 514 | 6787 | 22,425 | 0.869 | 1
rs3734542_G | BTN2A1 | 22,978 | 6253 | 485 | 0.878 | 1
rs3734543_G | BTN2A1 | 22,866 | 6270 | 437 | 0.879 | 1
rs3799380_T | BTN2A1 | 18,595 | 9697 | 1385 | 0.790 | 1
rs56296968_C | BTN2A1 | 19,326 | 9289 | 1096 | 0.807 | 1
rs6456724_T | BTN2A1 | 515 | 6783 | 22,411 | 0.869 | 1
rs6929846_T | BTN2A1 | 971 | 8413 | 20,346 | 0.826 | 1
rs7773938_C | BTN2A1 | 19,333 | 9287 | 1116 | 0.806 | 1
rs9358944_A | BTN2A1 | 19,319 | 9291 | 1121 | 0.806 | 1
rs9358945_A | BTN2A1 | 19,326 | 9292 | 1121 | 0.806 | 1
rs10456045_G | BTN3A1 | 14,460 | 12,538 | 2719 | 0.698 | 1
rs1796520_C | BTN3A1 | 6734 | 14,607 | 8279 | 0.526 | 1
rs3799378_A | BTN3A1 | 17,152 | 10,809 | 1642 | 0.762 | 1
rs3857549_C | BTN3A1 | 26,084 | 3404 | 227 | 0.935 | 1
rs41266839_G | BTN3A1 | 23,662 | 5661 | 392 | 0.892 | 1
rs4609015_T | BTN3A1 | 21,657 | 7445 | 624 | 0.854 | 1
rs6900725_T | BTN3A1 | 21,625 | 7432 | 639 | 0.853 | 1
rs6912853_C | BTN3A1 | 21,168 | 7809 | 740 | 0.844 | 1
rs6920986_C | BTN3A1 | 21,672 | 7443 | 617 | 0.854 | 1
rs742090_A | BTN3A1 | 6727 | 14,718 | 8274 | 0.526 | 1
rs11758089_T | BTN3A2 | 21,182 | 7812 | 722 | 0.844 | 1
rs12176317_A | BTN3A2 | 22,420 | 6764 | 560 | 0.867 | 1
rs12199613_C | BTN3A2 | 11,047 | 14,227 | 4424 | 0.612 | 1
rs1977_A | BTN3A2 | 21,828 | 6850 | 540 | 0.864 | 1
rs1979_G | BTN3A2 | 22,388 | 6775 | 561 | 0.867 | 1
rs1985732_A | BTN3A2 | 14,433 | 12,626 | 2662 | 0.698 | 1
rs2073526_G | BTN3A2 | 5874 | 14,524 | 9319 | 0.558 | 1
rs9358934_G | BTN3A2 | 22,330 | 6817 | 559 | 0.866 | 1
rs9379855_T | BTN3A2 | 22,327 | 6804 | 565 | 0.866 | 1
rs9379858_T | BTN3A2 | 22,329 | 6816 | 566 | 0.866 | 1
rs9379859_C | BTN3A2 | 22,358 | 6814 | 554 | 0.867 | 1
rs9393713_G | BTN3A2 | 22,422 | 6757 | 557 | 0.868 | 1
rs9393714_G | BTN3A2 | 22,404 | 6773 | 551 | 0.868 | 1
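For readers who want to see what a Hardy–Weinberg check does with genotype counts such as those in Table A17, the sketch below runs a plain chi-squared goodness-of-fit test with one degree of freedom and no continuity correction. The authors used the HardyWeinberg R package; which of its tests and settings they used is not stated here, so this Python version is an illustrative stand-in, and the function name `hwe_chi_squared` is hypothetical. The p-value it returns is unadjusted.

```python
import math


def hwe_chi_squared(n_hom_ref, n_het, n_hom_alt):
    """Chi-squared Hardy-Weinberg test from three genotype counts.

    Returns (reference allele frequency, chi-squared statistic, raw p-value).
    """
    n = n_hom_ref + n_het + n_hom_alt
    p = (2 * n_hom_ref + n_het) / (2 * n)  # reference allele frequency
    q = 1.0 - p
    observed = (n_hom_ref, n_het, n_hom_alt)
    # Genotype counts expected under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, chi2, p_value


# Control genotype counts for rs13195402_G, transcribed from Table A17.
freq, chi2, p_value = hwe_chi_squared(23188, 5684, 324)
print(f"allele frequency = {freq:.3f}, chi-squared = {chi2:.3f}, raw p = {p_value:.3f}")
```

The allele frequency computed this way matches the 0.892 listed for this SNP, and twice the homozygote count plus the heterozygote count reproduces the allele counts given for the control group in Table A16.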
Table A18. Single-variant analysis of butyrophilin SNPs and CeD status in binomial regression models that took the HLA loci into account using the UK Biobank dataset. SNPs significantly associated with CeD status are in bold. Bonferroni correction was applied due to multiple testing.
SNP Name | Gene | OR | Upper | Lower | Adjusted p-Value | ln(OR)
rs10484441_G | BTN2A1 | 0.979307 | 1.082186 | 0.887707 | 1 | −0.02091
rs12660069_C | BTN2A1 | 0.987937 | 1.171444 | 0.837975 | 1 | −0.01214
rs13195402_G | BTN2A1 | 0.812801 | 0.876887 | 0.753657 | 8.15 × 10^−6 | −0.20727
rs13195509_G | BTN2A1 | 0.824983 | 0.886694 | 0.767819 | 1.62 × 10^−5 | −0.19239
rs13437351_G | BTN2A1 | 1.359939 | 1.781839 | 1.056236 | 1 | 0.30744
rs1407045_A | BTN2A1 | 1.061681 | 1.125065 | 1.001972 | 1 | 0.059854
rs142951857_A | BTN2A1 | 1.189393 | 2.744421 | 0.589877 | 1 | 0.173443
rs143104579_G | BTN2A1 | 1.045598 | 1.306706 | 0.844534 | 1 | 0.044589
rs146399224_T | BTN2A1 | 2678.29 | NA | 0.001107 | 1 | 7.892934
rs148111655_G | BTN2A1 | 1.06888 | 2.495456 | 0.522042 | 1 | 0.066611
rs2273558_A | BTN2A1 | 0.924046 | 0.983637 | 0.868186 | 1 | −0.07899
rs2893856_T | BTN2A1 | 0.978844 | 1.068109 | 0.895906 | 1 | −0.02138
rs2893857_C | BTN2A1 | 0.983276 | 1.0868 | 0.891136 | 1 | −0.01687
rs3734539_C | BTN2A1 | 12320.17 | NA | 2.26 × 10^−5 | 1 | 9.418993
rs3734542_G | BTN2A1 | 0.828358 | 0.890318 | 0.770964 | 2.94 × 10^−5 | −0.18831
rs3734543_G | BTN2A1 | 0.845824 | 0.910556 | 0.785966 | 0.000823 | −0.16744
rs3799380_T | BTN2A1 | 0.906299 | 0.966317 | 0.850232 | 0.260718 | −0.09839
rs56296968_C | BTN2A1 | 0.889118 | 0.949206 | 0.833053 | 0.042016 | −0.11753
rs6456724_T | BTN2A1 | 0.975504 | 1.064451 | 0.892856 | 1 | −0.0248
rs6907857_T | BTN2A1 | 1.296362 | 1.68837 | 1.012123 | 1 | 0.259562
rs6911470_C | BTN2A1 | 1.225626 | 1.666065 | 0.92177 | 1 | 0.203452
rs6929846_T | BTN2A1 | 0.956691 | 1.034347 | 0.884033 | 1 | −0.04427
rs7773913_C | BTN2A1 | 1.29982 | 1.692843 | 1.014844 | 1 | 0.262226
rs7773938_C | BTN2A1 | 0.892443 | 0.95269 | 0.83623 | 0.062934 | −0.11379
rs77870445_T | BTN2A1 | 0.971098 | 1.300178 | 0.738108 | 1 | −0.02933
rs9348718_A | BTN2A1 | 1.100624 | 1.274443 | 0.954703 | 1 | 0.095877
rs9358943_C | BTN2A1 | 0.455962 | 8.266602 | 0.091903 | 1 | −0.78535
rs9358944_A | BTN2A1 | 0.888825 | 0.948639 | 0.833001 | 0.038293 | −0.11786
rs9358945_A | BTN2A1 | 0.886765 | 0.946407 | 0.831099 | 0.02906 | −0.12018
rs9461254_G | BTN2A1 | 1.206876 | 1.642976 | 0.90642 | 1 | 0.188035
rs10456045_G | BTN3A1 | 0.918688 | 0.975087 | 0.865666 | 0.527112 | −0.08481
rs10807008_G | BTN3A1 | 0.94089 | 1.033981 | 0.857412 | 1 | −0.06093
rs12200782_C | BTN3A1 | 0.987553 | 1.090084 | 0.896183 | 1 | −0.01253
rs12207930_C | BTN3A1 | 0.994811 | 1.081946 | 0.91566 | 1 | −0.0052
rs12208447_C | BTN3A1 | 0.971864 | 1.246094 | 0.767755 | 1 | −0.02854
rs12214924_T | BTN3A1 | 0.997777 | 1.08479 | 0.918712 | 1 | −0.00223
rs143476765_A | BTN3A1 | 0.541997 | 2.391824 | 0.175983 | 1 | −0.61249
rs144114619_T | BTN3A1 | 0.937876 | 1.708275 | 0.552209 | 1 | −0.06414
rs145059723_A | BTN3A1 | 0.851224 | 2.534279 | 0.356178 | 1 | −0.16108
rs1741738_A | BTN3A1 | 1.055187 | 1.152999 | 0.966854 | 1 | 0.053718
rs17610161_G | BTN3A1 | 0.948685 | 1.04337 | 0.863865 | 1 | −0.05268
rs1796520_C | BTN3A1 | 0.925332 | 0.980491 | 0.873166 | 0.877114 | −0.0776
rs3799378_A | BTN3A1 | 0.866517 | 0.922476 | 0.814111 | 0.000704 | −0.14327
rs3857549_C | BTN3A1 | 1.20727 | 1.366146 | 1.0703 | 0.249929 | 0.188361
rs3902051_A | BTN3A1 | 0.947893 | 1.038919 | 0.865991 | 1 | −0.05351
rs41266839_G | BTN3A1 | 0.806793 | 0.868508 | 0.749711 | 1.06 × 10^−6 | −0.21469
rs4609015_T | BTN3A1 | 0.999771 | 1.087084 | 0.920447 | 1 | −0.00023
rs4712990_C | BTN3A1 | 0.953022 | 1.047148 | 0.8686 | 1 | −0.04812
rs55676749_T | BTN3A1 | 1.055136 | 1.295119 | 0.867029 | 1 | 0.05367
rs56161420_G | BTN3A1 | 1.003319 | 1.090026 | 0.92446 | 1 | 0.003313
rs6900725_T | BTN3A1 | 0.998519 | 1.085343 | 0.919613 | 1 | −0.00148
rs6912853_C | BTN3A1 | 1.060948 | 1.150125 | 0.979684 | 1 | 0.059162
rs6920986_C | BTN3A1 | 0.997232 | 1.08425 | 0.918167 | 1 | −0.00277
rs6921148_T | BTN3A1 | 1.075901 | 1.317605 | 0.885986 | 1 | 0.073159
rs742090_A | BTN3A1 | 0.927391 | 0.982792 | 0.875004 | 1 | −0.07538
rs7770214_G | BTN3A1 | 0.992737 | 1.079363 | 0.914026 | 1 | −0.00729
rs80153343_G | BTN3A1 | 1.144379 | 1.465945 | 0.904679 | 1 | 0.134862
rs11758089_T | BTN3A2 | 1.093163 | 1.188249 | 1.00675 | 1 | 0.089075
rs12176317_A | BTN3A2 | 0.820862 | 0.880647 | 0.765377 | 3.50 × 10^−6 | −0.1974
rs12194095_C | BTN3A2 | 0.971352 | 1.079478 | 0.875824 | 1 | −0.02907
rs12199613_C | BTN3A2 | 0.884297 | 0.93714 | 0.834446 | 0.003312 | −0.12296
rs12205731_G | BTN3A2 | 0.970415 | 1.079025 | 0.874528 | 1 | −0.03003
rs144016445_G | BTN3A2 | 38750.66 | NA | 151.7424 | 1 | 10.5649
rs1977_A | BTN3A2 | 0.816781 | 0.87678 | 0.761121 | 2.06 × 10^−6 | −0.20238
rs1979_G | BTN3A2 | 0.820729 | 0.880499 | 0.765255 | 3.40 × 10^−6 | −0.19756
rs1985732_A | BTN3A2 | 0.896055 | 0.951463 | 0.84398 | 0.033488 | −0.10975
rs2073526_G | BTN3A2 | 0.925312 | 0.980986 | 0.872648 | 0.940603 | −0.07762
rs35183513_G | BTN3A2 | 0.951537 | 1.044843 | 0.867784 | 1 | −0.04968
rs58367598_T | BTN3A2 | 1.089293 | 1.290835 | 0.924221 | 1 | 0.085529
rs7765566_G | BTN3A2 | 1.145085 | 1.363507 | 0.967785 | 1 | 0.135479
rs9104_G | BTN3A2 | 0.945089 | 1.032973 | 0.86575 | 1 | −0.05648
rs9358934_G | BTN3A2 | 0.824601 | 0.884767 | 0.76877 | 7.53 × 10^−6 | −0.19286
rs9379855_T | BTN3A2 | 0.82361 | 0.883636 | 0.767905 | 6.04 × 10^−6 | −0.19406
rs9379858_T | BTN3A2 | 0.825671 | 0.885867 | 0.769811 | 8.99 × 10^−6 | −0.19156
rs9379859_C | BTN3A2 | 0.824802 | 0.885058 | 0.768893 | 8.10 × 10^−6 | −0.19261
rs9379861_G | BTN3A2 | 1.039521 | 1.399242 | 0.785634 | 1 | 0.03876
rs9393713_G | BTN3A2 | 0.814157 | 0.873453 | 0.759123 | 9.27 × 10^−7 | −0.2056
rs9393714_G | BTN3A2 | 0.818022 | 0.877663 | 0.762671 | 2.08 × 10^−6 | −0.20087
rs186813312_C | BTNL3 | 0.387143 | NA | NA | NA | −0.94896
rs199970076_G | BTNL3 | 0.890096 | 19.02105 | 0.105649 | 1 | −0.11643
rs201534771_G | BTNL3 | 0.373628 | NA | NA | NA | −0.98449
rs201813197_C | BTNL3 | 0.906025 | 3.98319 | 0.295074 | 1 | −0.09869
rs35157246_C | BTNL3 | 1.061269 | 1.243749 | 0.909647 | 1 | 0.059466
rs4700774_G | BTNL3 | 0.953121 | 1.014114 | 0.896074 | 1 | −0.04801
rs59220426_C | BTNL3 | 0.964396 | 1.094724 | 0.852069 | 1 | −0.03625
rs73815153_G | BTNL3 | 0.979858 | 1.113514 | 0.864825 | 1 | −0.02035
rs7713324_A | BTNL3 | 0.962434 | 1.092563 | 0.850278 | 1 | −0.03829
rs7726604_C | BTNL3 | 0.963325 | 1.093541 | 0.851096 | 1 | −0.03736
rs112469887_G | BTNL8 | 1.04641 | 1.299262 | 0.850631 | 1 | 0.045365
rs113071395_G | BTNL8 | 0.908391 | 1.065436 | 0.777899 | 1 | −0.09608
rs113534626_A | BTNL8 | 1.008072 | 1.204841 | 0.848563 | 1 | 0.00804
rs141492316_T | BTNL8 | 0.930335 | 1.160408 | 0.752616 | 1 | −0.07221
rs145199317_A | BTNL8 | 0.884686 | 1.250181 | 0.639686 | 1 | −0.12252
rs151174174_C | BTNL8 | 0.828069 | 1.016412 | 0.679378 | 1 | −0.18866
rs17704291_C | BTNL8 | 0.952291 | 1.013025 | 0.895481 | 1 | −0.04888
rs200633883_C | BTNL8 | 0.419543 | 1.675223 | 0.123618 | 1 | −0.86859
rs201214790_T | BTNL8 | 740.6942 | NA | 1.97 × 10^−8 | 1 | 6.607588
rs201891387_G | BTNL8 | 0.495896 | 2.269513 | 0.147322 | 1 | −0.70139
rs2276995_A | BTNL8 | 0.984754 | 1.043162 | 0.929755 | 1 | −0.01536
rs2619739_C | BTNL8 | 1.078254 | 1.194765 | 0.974928 | 1 | 0.075343
rs7724813_G | BTNL8 | 1.054983 | 1.150319 | 0.968751 | 1 | 0.053525
Table A19. The genotypes and Hardy–Weinberg equilibrium of the significant SNPs in the control participants from the binomial models that took the HLA loci into account. The Hardy–Weinberg equilibrium of each SNP in the control group was assessed using the HardyWeinberg R package [94]. Bonferroni correction was applied due to multiple testing.
SNP, Reference Allele | Gene | Number of Controls Homozygous for the Reference Allele | Number of Controls Heterozygous for the Reference Allele | Number of Control Individuals Without the Reference Allele | Allele Freq in Controls | HWE Adjusted p-Value
rs13195402_G | BTN2A1 | 23,188 | 5684 | 324 | 0.892 | 1
rs13195509_G | BTN2A1 | 23,018 | 6235 | 484 | 0.879 | 0.184
rs3734542_G | BTN2A1 | 22,978 | 6253 | 485 | 0.878 | 0.246
rs3734543_G | BTN2A1 | 22,866 | 6270 | 437 | 0.879 | 1
rs56296968_C | BTN2A1 | 19,326 | 9289 | 1096 | 0.807 | 1
rs9358944_A | BTN2A1 | 19,319 | 9291 | 1121 | 0.806 | 1
rs9358945_A | BTN2A1 | 19,326 | 9292 | 1121 | 0.806 | 1
rs3799378_A | BTN3A1 | 17,152 | 10,809 | 1642 | 0.762 | 1
rs41266839_G | BTN3A1 | 23,662 | 5661 | 392 | 0.892 | 1
rs12176317_A | BTN3A2 | 22,420 | 6764 | 560 | 0.867 | 1
rs12199613_C | BTN3A2 | 11,047 | 14,227 | 4424 | 0.612 | 1
rs1977_A | BTN3A2 | 21,828 | 6850 | 540 | 0.864 | 1
rs1979_G | BTN3A2 | 22,388 | 6775 | 561 | 0.867 | 1
rs1985732_A | BTN3A2 | 14,433 | 12,626 | 2662 | 0.698 | 1
rs9358934_G | BTN3A2 | 22,330 | 6817 | 559 | 0.866 | 1
rs9379855_T | BTN3A2 | 22,327 | 6804 | 565 | 0.866 | 1
rs9379858_T | BTN3A2 | 22,329 | 6816 | 566 | 0.866 | 1
rs9379859_C | BTN3A2 | 22,358 | 6814 | 554 | 0.867 | 1
rs9393713_G | BTN3A2 | 22,422 | 6757 | 557 | 0.868 | 1
rs9393714_G | BTN3A2 | 22,404 | 6773 | 551 | 0.868 | 1
Table A20. Single-variant analysis of butyrophilin SNPs and CeD status using binomial regression models on the HLA-DQ2.5-matched case-control cohort of the UK Biobank database. SNPs significantly associated with CeD status are in bold. Bonferroni correction was applied due to multiple testing.
SNP Name | Gene | OR | Upper | Lower | Adjusted p-Value | ln(OR)
rs10484441_G | BTN2A1 | 1.026879 | 1.182655 | 0.894513 | 1 | 0.026524
rs12660069_C | BTN2A1 | 0.954382 | 1.207579 | 0.761803 | 1 | −0.04669
rs13195402_G | BTN2A1 | 0.757206 | 0.831328 | 0.689864 | 5.10 × 10^−7 | −0.27812
rs13195509_G | BTN2A1 | 0.77459 | 0.846648 | 0.708837 | 1.75 × 10^−6 | −0.25542
rs13437351_G | BTN2A1 | 1.400742 | 2.090478 | 0.973921 | 1 | 0.337002
rs1407045_A | BTN2A1 | 1.080193 | 1.170756 | 0.996977 | 1 | 0.07714
rs142951857_A | BTN2A1 | 1.287748 | 4.431929 | 0.486536 | 1 | 0.252895
rs143104579_G | BTN2A1 | 1.331616 | 1.885897 | 0.963027 | 1 | 0.286393
rs146399224_T | BTN2A1 | 0.25741 | NA | NA | NA | −1.35708
rs148111655_G | BTN2A1 | 0.944297 | 4.178925 | 0.294481 | 1 | −0.05731
rs2273558_A | BTN2A1 | 0.892885 | 0.971388 | 0.820718 | 0.848901 | −0.1133
rs2893856_T | BTN2A1 | 0.927404 | 1.048809 | 0.818043 | 1 | −0.07537
rs2893857_C | BTN2A1 | 1.040565 | 1.199279 | 0.905854 | 1 | 0.039764
rs3734539_C | BTN2A1 | 27166.76 | NA | 9.99 × 10^−14 | 1 | 10.20975
rs3734542_G | BTN2A1 | 0.77697 | 0.849181 | 0.711073 | 2.52 × 10^−6 | −0.25235
rs3734543_G | BTN2A1 | 0.792317 | 0.867952 | 0.723463 | 5.43 × 10^−5 | −0.23279
rs3799380_T | BTN2A1 | 0.869548 | 0.945573 | 0.799787 | 0.107572 | −0.13978
rs56296968_C | BTN2A1 | 0.852755 | 0.928301 | 0.783513 | 0.02331 | −0.15928
rs6456724_T | BTN2A1 | 0.924581 | 1.045479 | 0.815654 | 1 | −0.07841
rs6907857_T | BTN2A1 | 1.401132 | 2.091061 | 0.974192 | 1 | 0.33728
rs6911470_C | BTN2A1 | 1.565436 | 2.679723 | 0.978088 | 1 | 0.448164
rs6929846_T | BTN2A1 | 0.870873 | 0.973832 | 0.777255 | 1 | −0.13826
rs7773913_C | BTN2A1 | 1.408409 | 2.10183 | 0.979326 | 1 | 0.34246
rs7773938_C | BTN2A1 | 0.854233 | 0.929778 | 0.784985 | 0.026611 | −0.15755
rs77870445_T | BTN2A1 | 1.022451 | 1.530146 | 0.704151 | 1 | 0.022203
rs9348718_A | BTN2A1 | 1.308813 | 1.636793 | 1.057849 | 1 | 0.26912
rs9358943_C | BTN2A1 | 0.257602 | NA | NA | NA | −1.35634
rs9358944_A | BTN2A1 | 0.850121 | 0.924966 | 0.781482 | 0.016059 | −0.16238
rs9358945_A | BTN2A1 | 0.847905 | 0.922571 | 0.77943 | 0.012607 | −0.16499
rs9461254_G | BTN2A1 | 1.565001 | 2.678973 | 0.977818 | 1 | 0.447886
rs10456045_G | BTN3A1 | 0.872024 | 0.944198 | 0.805371 | 0.074344 | −0.13694
rs10807008_G | BTN3A1 | 0.988264 | 1.128972 | 0.867514 | 1 | −0.01181
rs12200782_C | BTN3A1 | 0.969241 | 1.112166 | 0.84719 | 1 | −0.03124
rs12207930_C | BTN3A1 | 1.059407 | 1.19361 | 0.94228 | 1 | 0.05771
rs12208447_C | BTN3A1 | 1.183033 | 1.71485 | 0.838562 | 1 | 0.168081
rs12214924_T | BTN3A1 | 1.059617 | 1.192843 | 0.943249 | 1 | 0.057908
rs143476765_A | BTN3A1 | 0.385935 | 2.931895 | 0.063895 | 1 | −0.95209
rs144114619_T | BTN3A1 | 0.800306 | 1.80142 | 0.391948 | 1 | −0.22276
rs145059723_A | BTN3A1 | 1.546934 | 29.22579 | 0.264012 | 1 | 0.436275
rs1741738_A | BTN3A1 | 1.126398 | 1.288465 | 0.987689 | 1 | 0.119025
rs17610161_G | BTN3A1 | 0.999336 | 1.142965 | 0.876283 | 1 | −0.00066
rs1796520_C | BTN3A1 | 0.901974 | 0.977599 | 0.831877 | 1 | −0.10317
rs3799378_A | BTN3A1 | 0.829346 | 0.900397 | 0.763968 | 0.00081 | −0.18712
rs3857549_C | BTN3A1 | 1.307821 | 1.557482 | 1.105147 | 0.217446 | 0.268362
rs3902051_A | BTN3A1 | 0.979611 | 1.113404 | 0.864043 | 1 | −0.0206
rs41266839_G | BTN3A1 | 0.753767 | 0.824792 | 0.68903 | 7.25 × 10^−8 | −0.28267
rs4609015_T | BTN3A1 | 1.055122 | 1.187787 | 0.939241 | 1 | 0.053657
rs4712990_C | BTN3A1 | 1.009758 | 1.153906 | 0.886126 | 1 | 0.00971
rs55676749_T | BTN3A1 | 1.356015 | 1.867802 | 1.007293 | 1 | 0.30455
rs56161420_G | BTN3A1 | 1.060354 | 1.193295 | 0.944184 | 1 | 0.058603
rs6900725_T | BTN3A1 | 1.061349 | 1.194313 | 0.945175 | 1 | 0.059541
rs6912853_C | BTN3A1 | 1.087423 | 1.216203 | 0.974151 | 1 | 0.08381
rs6920986_C | BTN3A1 | 1.062734 | 1.196651 | 0.9458 | 1 | 0.060845
rs6921148_T | BTN3A1 | 1.340282 | 1.834522 | 1.000858 | 1 | 0.29288
rs742090_A | BTN3A1 | 0.903554 | 0.979423 | 0.833241 | 1 | −0.10142
rs7770214_G | BTN3A1 | 1.053031 | 1.185369 | 0.937421 | 1 | 0.051673
rs80153343_G | BTN3A1 | 0.979216 | 1.347528 | 0.724804 | 1 | −0.021
rs11758089_T | BTN3A2 | 1.191327 | 1.352007 | 1.052499 | 0.618146 | 0.175068
rs12176317_A | BTN3A2 | 0.766628 | 0.836949 | 0.702371 | 2.82 × 10^−7 | −0.26575
rs12194095_C | BTN3A2 | 0.943672 | 1.092386 | 0.818034 | 1 | −0.05798
rs12199613_C | BTN3A2 | 0.842231 | 0.911373 | 0.77819 | 0.002057 | −0.1717
rs12205731_G | BTN3A2 | 0.933998 | 1.081909 | 0.809117 | 1 | −0.06828
rs144016445_G | BTN3A2 | 73914.07 | NA | 1.75 × 10^−6 | 1 | 11.21066
rs1977_A | BTN3A2 | 0.764904 | 0.835801 | 0.700168 | 2.99 × 10^−7 | −0.268
rs1979_G | BTN3A2 | 0.76816 | 0.838582 | 0.70381 | 3.63 × 10^−7 | −0.26376
rs1985732_A | BTN3A2 | 0.860169 | 0.932012 | 0.793857 | 0.023502 | −0.15063
rs2073526_G | BTN3A2 | 0.90252 | 0.979048 | 0.831612 | 1 | −0.10256
rs35183513_G | BTN3A2 | 1.000653 | 1.140268 | 0.880486 | 1 | 0.000652
rs58367598_T | BTN3A2 | 1.153984 | 1.470445 | 0.915197 | 1 | 0.143221
rs7765566_G | BTN3A2 | 1.171754 | 1.497136 | 0.928192 | 1 | 0.158502
rs9104_G | BTN3A2 | 1.010674 | 1.145196 | 0.894134 | 1 | 0.010617
rs9358934_G | BTN3A2 | 0.772404 | 0.843549 | 0.707423 | 8.85 × 10^−7 | −0.25825
rs9379855_T | BTN3A2 | 0.771306 | 0.842174 | 0.706564 | 6.76 × 10^−7 | −0.25967
rs9379858_T | BTN3A2 | 0.774407 | 0.845625 | 0.709352 | 1.18 × 10^−6 | −0.25566
rs9379859_C | BTN3A2 | 0.770242 | 0.841254 | 0.705384 | 6.31 × 10^−7 | −0.26105
rs9379861_G | BTN3A2 | 0.987622 | 1.586757 | 0.639198 | 1 | −0.01246
rs9393713_G | BTN3A2 | 0.760996 | 0.830868 | 0.697149 | 1.06 × 10^−7 | −0.27313
rs9393714_G | BTN3A2 | 0.765074 | 0.835356 | 0.700859 | 2.25 × 10^−7 | −0.26778
rs186813312_C | BTNL3 | 0.257602 | NA | NA | NA | −1.35634
rs199970076_G | BTNL3 | 0.272926 | 6.904492 | 0.010788 | 1 | −1.29856
rs201534771_G | BTNL3 | 0.273671 | NA | NA | NA | −1.29583
rs201813197_C | BTNL3 | 0.514697 | 3.715301 | 0.100372 | 1 | −0.66418
rs35157246_C | BTNL3 | 1.03751 | 1.287524 | 0.842551 | 1 | 0.036824
rs4700774_G | BTNL3 | 0.978837 | 1.065525 | 0.899706 | 1 | −0.02139
rs59220426_C | BTNL3 | 1.017762 | 1.216405 | 0.856252 | 1 | 0.017606
rs73815153_G | BTNL3 | 1.02474 | 1.225825 | 0.861442 | 1 | 0.024438
rs7713324_A | BTNL3 | 1.018119 | 1.216841 | 0.856546 | 1 | 0.017957
rs7726604_C | BTNL3 | 1.020763 | 1.219953 | 0.85881 | 1 | 0.02055
rs112469887_G | BTNL8 | 1.002918 | 1.333287 | 0.765187 | 1 | 0.002914
rs113071395_G | BTNL8 | 0.93155 | 1.162634 | 0.752338 | 1 | −0.07091
rs113534626_A | BTNL8 | 0.9371 | 1.185234 | 0.747845 | 1 | −0.06496
rs141492316_T | BTNL8 | 0.963798 | 1.314391 | 0.718601 | 1 | −0.03687
rs145199317_A | BTNL8 | 1.12544 | 1.912393 | 0.697448 | 1 | 0.118174
rs151174174_C | BTNL8 | 0.729764 | 0.953457 | 0.564223 | 1 | −0.31503
rs17704291_C | BTNL8 | 0.969451 | 1.055155 | 0.891212 | 1 | −0.03103
rs200633883_C | BTNL8 | 0.171372 | 1.035119 | 0.022558 | 1 | −1.76392
rs201214790_T | BTNL8 | 0.256248 | NA | NA | NA | −1.36161
rs201891387_G | BTNL8 | 0.171476 | 1.035747 | 0.022572 | 1 | −1.76331
rs2276995_A | BTNL8 | 1.002057 | 1.084332 | 0.926283 | 1 | 0.002055
rs2619739_C | BTNL8 | 0.966968 | 1.107812 | 0.846435 | 1 | −0.03359
rs7724813_G | BTNL8 | 1.041603 | 1.171988 | 0.927725 | 1 | 0.040761
Table A21. SNP and allele count of the SNPs significantly associated with CeD in UK Biobank participants with the HLA-DQ2.5 genotype. These SNPs were significantly associated with CeD status in the HLA-DQ2.5-matched single-variant testing of the UK Biobank dataset.
Participants with the HLA-DQ2.5 Genotype
SNP, Reference Allele | Gene | Number of SNPs in Controls with HLA-DQ2.5 | Number of SNPs in CeD with HLA-DQ2.5 | Total Allele Count in Control with HLA-DQ2.5 | Total Allele Count in CeD with HLA-DQ2.5 | Total Number of SNPs in the UK Biobank with HLA-DQ2.5 | Total Allele Count in UK Biobank with HLA-DQ2.5
rs13195402_G | BTN2A1 | 9398 | 2256 | 12,514 | 3206 | 11,654 | 15,720
rs13195509_G | BTN2A1 | 9408 | 2265 | 12,820 | 3296 | 11,673 | 16,116
rs3734542_G | BTN2A1 | 9386 | 2266 | 12,808 | 3300 | 11,652 | 16,108
rs3734543_G | BTN2A1 | 9366 | 2265 | 12,722 | 3258 | 11,631 | 15,980
rs56296968_C | BTN2A1 | 8609 | 2107 | 12,798 | 3290 | 10,716 | 16,088
rs7773938_C | BTN2A1 | 8606 | 2106 | 12,804 | 3290 | 10,712 | 16,094
rs9358944_A | BTN2A1 | 8603 | 2108 | 12,818 | 3304 | 10,711 | 16,122
rs9358945_A | BTN2A1 | 8607 | 2106 | 12,818 | 3302 | 10,713 | 16,120
rs3799378_A | BTN3A1 | 8129 | 1959 | 12,768 | 3286 | 10,088 | 16,054
rs41266839_G | BTN3A1 | 9577 | 2298 | 12,820 | 3298 | 11,875 | 16,118
rs12176317_A | BTN3A2 | 9347 | 2242 | 12,826 | 3302 | 11,589 | 16,128
rs12199613_C | BTN3A2 | 6396 | 1515 | 12,802 | 3300 | 7911 | 16,102
rs1977_A | BTN3A2 | 9135 | 2197 | 12,574 | 3248 | 11,332 | 15,822
rs1979_G | BTN3A2 | 9330 | 2242 | 12,808 | 3302 | 11,572 | 16,110
rs1985732_A | BTN3A2 | 7353 | 1777 | 12,816 | 3294 | 9130 | 16,110
rs9358934_G | BTN3A2 | 9333 | 2240 | 12,810 | 3292 | 11,573 | 16,102
rs9379855_T | BTN3A2 | 9324 | 2239 | 12,802 | 3294 | 11,563 | 16,096
rs9379858_T | BTN3A2 | 9321 | 2242 | 12,804 | 3296 | 11,563 | 16,100
rs9379859_C | BTN3A2 | 9340 | 2244 | 12,806 | 3296 | 11,584 | 16,102
rs9393713_G | BTN3A2 | 9345 | 2237 | 12,812 | 3298 | 11,582 | 16,110
rs9393714_G | BTN3A2 | 9346 | 2241 | 12,818 | 3300 | 11,587 | 16,118
Table A22. The genotypes and Hardy–Weinberg equilibrium of the significant SNPs in the control participants from the HLA-DQ2.5 matched case-control models. The frequency of all the examined SNPs significantly differed from the Hardy–Weinberg equilibrium. The Hardy–Weinberg equilibrium of each SNP in the control group was assessed using the HardyWeinberg R package [94]. Bonferroni correction was applied due to multiple testing.
SNP, Reference Allele | Gene | Number of Controls Homozygous for the Reference Allele | Number of Controls Heterozygous for the Reference Allele | Number of Control Individuals Without the Reference Allele | Allele Freq in Controls | HWE Adjusted p-Value
rs13195402_G | BTN2A1 | 3353 | 2692 | 212 | 0.751 | 2.65 × 10^−31
rs13195509_G | BTN2A1 | 3303 | 2802 | 305 | 0.734 | 3.26 × 10^−20
rs3734542_G | BTN2A1 | 3290 | 2806 | 308 | 0.733 | 3.69 × 10^−20
rs3734543_G | BTN2A1 | 3278 | 2810 | 273 | 0.736 | 1.35 × 10^−26
rs56296968_C | BTN2A1 | 2737 | 3135 | 527 | 0.673 | 2.65 × 10^−31
rs7773938_C | BTN2A1 | 2737 | 3132 | 533 | 0.672 | 2.65 × 10^−31
rs9358944_A | BTN2A1 | 2734 | 3135 | 540 | 0.671 | 2.65 × 10^−31
rs9358945_A | BTN2A1 | 2737 | 3133 | 539 | 0.671 | 2.65 × 10^−31
rs3799378_A | BTN3A1 | 2446 | 3237 | 701 | 0.637 | 2.65 × 10^−31
rs41266839_G | BTN3A1 | 3429 | 2719 | 262 | 0.747 | 2.65 × 10^−31
rs12176317_A | BTN3A2 | 3267 | 2813 | 333 | 0.729 | 2.65 × 10^−31
rs12199613_C | BTN3A2 | 1505 | 3386 | 1510 | 0.500 | 2.65 × 10^−31
rs1977_A | BTN3A2 | 3172 | 2791 | 324 | 0.726 | 2.65 × 10^−31
rs1979_G | BTN3A2 | 3260 | 2810 | 334 | 0.728 | 2.65 × 10^−31
rs1985732_A | BTN3A2 | 1974 | 3405 | 1029 | 0.574 | 2.65 × 10^−31
rs9358934_G | BTN3A2 | 3259 | 2815 | 331 | 0.729 | 2.65 × 10^−31
rs9379855_T | BTN3A2 | 3257 | 2810 | 334 | 0.728 | 2.65 × 10^−31
rs9379858_T | BTN3A2 | 3253 | 2815 | 334 | 0.728 | 2.65 × 10^−31
rs9379859_C | BTN3A2 | 3263 | 2814 | 326 | 0.729 | 2.65 × 10^−31
rs9393713_G | BTN3A2 | 3269 | 2807 | 330 | 0.729 | 2.65 × 10^−31
rs9393714_G | BTN3A2 | 3267 | 2812 | 330 | 0.729 | 2.65 × 10^−31

Appendix J. Supplementary Materials for Results Section 2.3

Table A23. There were no significant differences in the TRGV usage of FFPE CeD (n = 45) and healthy control (n = 108) samples after Bonferroni correction was applied. Raw p-values from Mann–Whitney U (MWU) tests were adjusted using Bonferroni correction to account for false positives due to multiple testing.
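FFPE CeD (n = 45) vs. FFPE Healthy Control (n = 108)

| TRGV Segment | Raw p-Value (MWU) | Adjusted p-Value |
|---|---|---|
| TRGV2 | 0.775 | 1 |
| TRGV3 | 0.411 | 1 |
| TRGV4 | 0.812 | 1 |
| TRGV5 | 0.906 | 1 |
| TRGV5P | 0.684 | 1 |
| TRGV7 | 0.566 | 1 |
| TRGV8 | 0.382 | 1 |
| TRGV9 | 0.070 | 0.70 |
| TRGV10 | 0.248 | 1 |
| TRGV11 | 0.025 | 0.25 |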
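The adjusted values in Table A23 are consistent with a standard Bonferroni correction over the ten TRGV segments tested, i.e., each raw p-value multiplied by the number of tests and capped at 1. A minimal Python sketch of that arithmetic follows; the helper name `bonferroni` is hypothetical, and the raw p-values are copied from the table rather than recomputed from the underlying usage data.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Raw Mann-Whitney U p-values from Table A23
raw = {
    "TRGV2": 0.775, "TRGV3": 0.411, "TRGV4": 0.812, "TRGV5": 0.906,
    "TRGV5P": 0.684, "TRGV7": 0.566, "TRGV8": 0.382, "TRGV9": 0.070,
    "TRGV10": 0.248, "TRGV11": 0.025,
}
adjusted = dict(zip(raw, bonferroni(list(raw.values()))))

for segment, p_adj in adjusted.items():
    print(f"{segment}\t{p_adj:.2f}")
```

Note that TRGV11 would pass an unadjusted 0.05 threshold (raw p = 0.025) but not the adjusted one, which is why the caption specifies that no differences remained after correction.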
Figure A13. TRGV usage was not normally distributed in either (a) the healthy control (n = 108) or (b) the CeD (n = 45) FFPE duodenal samples subjected to TRGV usage analysis.
Figure A14. More than 82% of both healthy control and CeD samples were homozygous for the reference HV4 amino acid sequence in the combined cohort. The HV4 analysis was carried out on a cohort of 238 healthy controls and 141 CeD samples. The homozygous reference HV4 sequence was the most common phenotype in both CeD and healthy control samples. Only 10 healthy control and 3 CeD samples did not have any WT HV4 sequences.
Table A24. There were no significant differences in the HV4 distribution between healthy controls (n = 238) and CeD patients (n = 141). The reference amino acid sequence KYDTYGSTRKNLRMILR is noted as WT in the table. Sequences with amino acid substitutions are provided in full. Pairwise Fisher’s exact tests with Bonferroni correction were applied to the different HV4 phenotypes in CeD patients and healthy controls.
141 CeD vs. 238 Healthy Control Samples

| HV4 Phenotype Comparison | Raw p-Value (Fisher) | Adjusted p-Value |
|---|---|---|
| WT vs. WT, KYDTYGSTRQNLRMILR | 0.3925 | 1 |
| WT vs. KYDTYGSTRQNLRMILR | 0.3392 | 1 |
| WT vs. KYDTYGSTRQNLRMILR, KYDTYGSTR_ELENDTA | 0.3668 | 1 |
| WT vs. WT, KYNTYGSTRKNLRMILR | 0.3668 | 1 |
| WT vs. WT, KYDTYGNTRKNLRMILR | 1 | 1 |
| WT vs. WT, KYDTYGSTRKSLRMILR | 0.6258 | 1 |
| WT vs. KYDTYGSTRKSLRMILR | 1 | 1 |
| WT vs. WT, KYDTYGSIRKNLRMILR | 0.3668 | 1 |
| WT, KYDTYGSTRQNLRMILR vs. KYDTYGSTRQNLRMILR | 0.1697 | 1 |
| WT, KYDTYGSTRQNLRMILR vs. KYDTYGSTRQNLRMILR, KYDTYGSTR_ELENDTA | 0.4524 | 1 |
| WT, KYDTYGSTRQNLRMILR vs. WT, KYNTYGSTRKNLRMILR | 0.4524 | 1 |
| WT, KYDTYGSTRQNLRMILR vs. WT, KYDTYGNTRKNLRMILR | 1 | 1 |
| WT, KYDTYGSTRQNLRMILR vs. WT, KYDTYGSTRKSLRMILR | 1 | 1 |
| WT, KYDTYGSTRQNLRMILR vs. KYDTYGSTRKSLRMILR | 1 | 1 |
| WT, KYDTYGSTRQNLRMILR vs. WT, KYDTYGSIRKNLRMILR | 0.4524 | 1 |
| KYDTYGSTRQNLRMILR vs. KYDTYGSTRQNLRMILR, KYDTYGSTR_ELENDTA | 0.25 | 1 |
| KYDTYGSTRQNLRMILR vs. WT, KYNTYGSTRKNLRMILR | 0.25 | 1 |
| KYDTYGSTRQNLRMILR vs. WT, KYDTYGNTRKNLRMILR | 1 | 1 |
| KYDTYGSTRQNLRMILR vs. WT, KYDTYGSTRKSLRMILR | 0.5165 | 1 |
| KYDTYGSTRQNLRMILR vs. KYDTYGSTRKSLRMILR | 1 | 1 |
| KYDTYGSTRQNLRMILR vs. WT, KYDTYGSIRKNLRMILR | 0.25 | 1 |
| KYDTYGSTRQNLRMILR, KYDTYGSTR_ELENDTA vs. WT, KYNTYGSTRKNLRMILR | 1 | 1 |
| KYDTYGSTRQNLRMILR, KYDTYGSTR_ELENDTA vs. WT, KYDTYGNTRKNLRMILR | 1 | 1 |
| KYDTYGSTRQNLRMILR, KYDTYGSTR_ELENDTA vs. WT, KYDTYGSTRKSLRMILR | 1 | 1 |
| KYDTYGSTRQNLRMILR, KYDTYGSTR_ELENDTA vs. KYDTYGSTRKSLRMILR | 1 | 1 |
| KYDTYGSTRQNLRMILR, KYDTYGSTR_ELENDTA vs. WT, KYDTYGSIRKNLRMILR | 1 | 1 |
| WT, KYNTYGSTRKNLRMILR vs. WT, KYDTYGNTRKNLRMILR | 1 | 1 |
| WT, KYNTYGSTRKNLRMILR vs. WT, KYDTYGSTRKSLRMILR | 1 | 1 |
| WT, KYNTYGSTRKNLRMILR vs. KYDTYGSTRKSLRMILR | 1 | 1 |
| WT, KYNTYGSTRKNLRMILR vs. WT, KYDTYGSIRKNLRMILR | 1 | 1 |
| WT, KYDTYGNTRKNLRMILR vs. WT, KYDTYGSTRKSLRMILR | 1 | 1 |
| WT, KYDTYGNTRKNLRMILR vs. KYDTYGSTRKSLRMILR | 1 | 1 |
| WT, KYDTYGNTRKNLRMILR vs. WT, KYDTYGSIRKNLRMILR | 1 | 1 |
| WT, KYDTYGSTRKSLRMILR vs. KYDTYGSTRKSLRMILR | 1 | 1 |
| WT, KYDTYGSTRKSLRMILR vs. WT, KYDTYGSIRKNLRMILR | 1 | 1 |
| KYDTYGSTRKSLRMILR vs. WT, KYDTYGSIRKNLRMILR | 1 | 1 |
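Table A24 lists 36 comparisons, which is every pair among the nine observed HV4 phenotypes. The sketch below illustrates one way such a pairwise post hoc analysis could be set up in Python: each pair of phenotypes forms a 2 × 2 table (phenotype × CeD/control), a two-sided Fisher's exact test is run on it, and the raw p-value is Bonferroni-adjusted by the number of pairs. This is an assumption about the procedure rather than the authors' code. The per-phenotype counts are not given in this section, so the counts in `counts` are invented purely for illustration, and the function name `fisher_exact_two_sided` is hypothetical.

```python
from itertools import combinations
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Nine phenotypes give C(9, 2) = 36 pairwise comparisons, as in Table A24
n_comparisons_table_a24 = comb(9, 2)

# Hypothetical (CeD, control) counts for three placeholder phenotypes;
# these numbers are NOT taken from the paper.
counts = {
    "WT": (120, 200),
    "WT, variant_1": (12, 20),
    "variant_1": (3, 10),
}

pairs = list(combinations(counts, 2))
results = {}
for first, second in pairs:
    raw_p = fisher_exact_two_sided(*counts[first], *counts[second])
    results[(first, second)] = (raw_p, min(1.0, raw_p * len(pairs)))

for (first, second), (raw_p, adj_p) in results.items():
    print(f"{first} vs. {second}: raw p = {raw_p:.4f}, adjusted p = {adj_p:.4f}")
```

With 36 comparisons, any raw p-value above roughly 0.028 is capped at an adjusted value of 1, which matches the uniform adjusted column in the table.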

References

  1. Abadie, V.; Sollid, L.M.; Barreiro, L.B.; Jabri, B. Integration of genetic and immunological insights into a model of celiac disease pathogenesis. Annu. Rev. Immunol. 2011, 29, 493–525.
  2. Jabri, B.; Sollid, L.M. Tissue-mediated control of immunopathology in coeliac disease. Nat. Rev. Immunol. 2009, 9, 858–870.
  3. Trier, J.S. Diagnosis of celiac sprue. Gastroenterology 1998, 115, 211–216.
  4. NICE. Coeliac Disease: Recognition, Assessment and Management. Available online: https://www.nice.org.uk/guidance/ng20/chapter/Recommendations (accessed on 27 November 2020).
  5. Al-Toma, A.; Goerres, M.S.; Meijer, J.W.; Pena, A.S.; Crusius, J.B.; Mulder, C.J. Human leukocyte antigen-DQ2 homozygosity and the development of refractory celiac disease and enteropathy-associated T-cell lymphoma. Clin. Gastroenterol. Hepatol. 2006, 4, 315–319.
  6. Ayesh, B.M.; Zaqout, E.K.; Yassin, M.M. HLA-DQ2 and -DQ8 haplotypes frequency and diagnostic utility in celiac disease patients of Gaza strip, Palestine. Autoimmun Highlights 2017, 8, 11.
  7. Björck, S.; Brundin, C.; Lörinc, E.; Lynch, K.F.; Agardh, D. Screening detects a high proportion of celiac disease in young HLA-genotyped children. J. Pediatr. Gastroenterol. Nutr. 2010, 50, 49–53.
  8. Karell, K.; Louka, A.S.; Moodie, S.J.; Ascher, H.; Clot, F.; Greco, L.; Ciclitira, P.J.; Sollid, L.M.; Partanen, J.; European Genetics Cluster on Celiac Disease. HLA types in celiac disease patients not carrying the DQA1*05-DQB1*02 (DQ2) heterodimer: Results from the European Genetics Cluster on Celiac Disease. Hum. Immunol. 2003, 64, 469–477.
  9. Karhus, L.L.; Thuesen, B.H.; Skaaby, T.; Rumessen, J.J.; Linneberg, A. The distribution of HLA DQ2 and DQ8 haplotypes and their association with health indicators in a general Danish population. United Eur. Gastroenterol. 2018, 6, 866–878.
  10. Murad, H.; Jazairi, B.; Khansaa, I.; Olabi, D.; Khouri, L. HLA-DQ2 and -DQ8 genotype frequency in Syrian celiac disease children: HLA-DQ relative risks evaluation. BMC Gastroenterol. 2018, 18, 70.
  11. Sollid, L.M.; Thorsby, E. HLA susceptibility genes in celiac disease: Genetic mapping and role in pathogenesis. Gastroenterology 1993, 105, 910–922.
  12. Sollid, L.M. Molecular basis of celiac disease. Annu. Rev. Immunol. 2000, 18, 53–81.
  13. Sollid, L.M.; Markussen, G.; Ek, J.; Gjerde, H.; Vartdal, F.; Thorsby, E. Evidence for a primary association of celiac disease to a particular HLA-DQ alpha/beta heterodimer. J. Exp. Med. 1989, 169, 345–350.
  14. Sollid, L.M.; Thorsby, E. The primary association of celiac disease to a given HLA-DQ alpha/beta heterodimer explains the divergent HLA-DR associations observed in various Caucasian populations. Tissue Antigens 1990, 36, 136–137.
  15. Rubio-Tapia, A.; Hill, I.D.; Kelly, C.P.; Calderwood, A.H.; Murray, J.A.; American College of Gastroenterology. ACG clinical guidelines: Diagnosis and management of celiac disease. Am. J. Gastroenterol. 2013, 108, 656–676.
  16. Djilali-Saiah, I.; Caillat-Zucman, S.; Schmitz, J.; Chaves-Vieira, M.L.; Bach, J.F. Polymorphism of antigen processing (TAP, LMP) and HLA class II genes in celiac disease. Hum. Immunol. 1994, 40, 8–16.
  17. Hunt, K.A.; Zhernakova, A.; Turner, G.; Heap, G.A.; Franke, L.; Bruinenberg, M.; Romanos, J.; Dinesen, L.C.; Ryan, A.W.; Panesar, D.; et al. Newly identified genetic risk variants for celiac disease related to the immune response. Nat. Genet. 2008, 40, 395–402.
  18. Dubois, P.C.; van Heel, D.A. Translational mini-review series on the immunogenetics of gut disease: Immunogenetics of coeliac disease. Clin. Exp. Immunol. 2008, 153, 162–173.
  19. Goudey, B.; Abraham, G.; Kikianty, E.; Wang, Q.; Rawlinson, D.; Shi, F.; Haviv, I.; Stern, L.; Kowalczyk, A.; Inouye, M. Interactions within the MHC contribute to the genetic architecture of celiac disease. PLoS ONE 2017, 12, e0172826.
  20. Pietz, G.; De, R.; Hedberg, M.; Sjoberg, V.; Sandstrom, O.; Hernell, O.; Hammarstrom, S.; Hammarstrom, M.L. Immunopathology of childhood celiac disease-Key role of intestinal epithelial cells. PLoS ONE 2017, 12, e0185025.
  21. Mayassi, T.; Ladell, K.; Gudjonson, H.; McLaren, J.E.; Shaw, D.G.; Tran, M.T.; Rokicka, J.J.; Lawrence, I.; Grenier, J.C.; van Unen, V.; et al. Chronic inflammation permanently reshapes tissue-resident immunity in celiac disease. Cell 2019, 176, 967–981.
  22. Rhodes, D.A.; Stammers, M.; Malcherek, G.; Beck, S.; Trowsdale, J. The cluster of BTN genes in the extended major histocompatibility complex. Genomics 2001, 71, 351–362.
  23. Arnett, H.A.; Viney, J.L. Immune modulation by butyrophilins. Nat. Rev. Immunol. 2014, 14, 559–569.
  24. Rhodes, D.A.; Reith, W.; Trowsdale, J. Regulation of immunity by butyrophilins. Annu. Rev. Immunol. 2016, 34, 151–172.
  25. Malcherek, G.; Mayr, L.; Roda-Navarro, P.; Rhodes, D.; Miller, N.; Trowsdale, J. The B7 homolog butyrophilin BTN2A1 is a novel ligand for DC-SIGN. J. Immunol. 2007, 179, 3804–3811.
  26. Messal, N.; Mamessier, E.; Sylvain, A.; Celis-Gutierrez, J.; Thibult, M.L.; Chetaille, B.; Firaguay, G.; Pastor, S.; Guillaume, Y.; Wang, Q.; et al. Differential role for CD277 as a co-regulator of the immune signal in T and NK cells. Eur. J. Immunol. 2011, 41, 3443–3454.
  27. Di Marco Barros, R.; Roberts, N.A.; Dart, R.J.; Vantourout, P.; Jandke, A.; Nussbaumer, O.; Deban, L.; Cipolat, S.; Hart, R.; Iannitto, M.L.; et al. Epithelia use butyrophilin-like molecules to shape organ-specific gamma delta T cell compartments. Cell 2016, 167, 203–218.
  28. Jandke, A.; Melandri, D.; Monin, L.; Ushakov, D.S.; Laing, A.G.; Vantourout, P.; East, P.; Nitta, T.; Narita, T.; Takayanagi, H.; et al. Butyrophilin-like proteins display combinatorial diversity in selecting and maintaining signature intraepithelial gammadelta T cell compartments. Nat. Commun. 2020, 11, 3769.
  29. Melandri, D.; Zlatareva, I.; Chaleil, R.A.G.; Dart, R.J.; Chancellor, A.; Nussbaumer, O.; Polyakova, O.; Roberts, N.A.; Wesch, D.; Kabelitz, D.; et al. The γδ TCR combines innate immunity with adaptive immunity by utilizing spatially distinct regions for agonist selection and antigen responsiveness. Nat. Immunol. 2018, 19, 1352–1365.
  30. Vantourout, P.; Laing, A.; Woodward, M.J.; Zlatareva, I.; Apolonia, L.; Jones, A.W.; Snijders, A.P.; Malim, M.H.; Hayday, A.C. Heteromeric interactions regulate butyrophilin (BTN) and BTN-like molecules governing gammadelta T cell biology. Proc. Natl. Acad. Sci. USA 2018, 115, 1039–1044.
  31. Willcox, C.R.; Vantourout, P.; Salim, M.; Zlatareva, I.; Melandri, D.; Zanardo, L.; George, R.; Kjaer, S.; Jeeves, M.; Mohammed, F.; et al. Butyrophilin-like 3 Directly Binds a Human Vγ4(+) T Cell Receptor Using a Modality Distinct from Clonally-Restricted Antigen. Immunity 2019, 51, 813–825 e814.
  32. Lewis, J.M.; Girardi, M.; Roberts, S.J.; Barbee, S.D.; Hayday, A.C.; Tigelaar, R.E. Selection of the cutaneous intraepithelial γδ+ T cell repertoire by a thymic stromal determinant. Nat. Immunol. 2006, 7, 843–850.
  33. Cano, C.E.; Pasero, C.; De Gassart, A.; Kerneur, C.; Gabriac, M.; Fullana, M.; Granarolo, E.; Hoet, R.; Scotet, E.; Rafia, C.; et al. BTN2A1, an immune checkpoint targeting Vγ9Vδ2 T cell cytotoxicity against malignant cells. Cell Rep. 2021, 36, 109359.
  34. Hayday, A.C.; Vantourout, P. The innate biologies of adaptive antigen receptors. Annu. Rev. Immunol. 2020, 38, 487–510.
  35. Karunakaran, M.M.; Gobel, T.W.; Starick, L.; Walter, L.; Herrmann, T. Vγ9 and Vδ2 T cell antigen receptor genes and butyrophilin 3 (BTN3) emerged with placental mammals and are concomitantly preserved in selected species like alpaca (Vicugna pacos). Immunogenetics 2014, 66, 243–254.
  36. Fichtner, A.S.; Karunakaran, M.M.; Gu, S.; Boughter, C.T.; Borowska, M.T.; Starick, L.; Nohren, A.; Gobel, T.W.; Adams, E.J.; Herrmann, T. Alpaca (Vicugna pacos), the first nonprimate species with a phosphoantigen-reactive Vγ9Vδ2 T cell subset. Proc. Natl. Acad. Sci. USA 2020, 117, 6697–6707.
  37. Rigau, M.; Ostrouska, S.; Fulford, T.S.; Johnson, D.N.; Woods, K.; Ruan, Z.; McWilliam, H.E.G.; Hudson, C.; Tutuka, C.; Wheatley, A.K.; et al. Butyrophilin 2A1 is essential for phosphoantigen reactivity by gammadelta T cells. Science 2020, 367, eaay5516.
  38. Sandstrom, A.; Peigne, C.M.; Leger, A.; Crooks, J.E.; Konczak, F.; Gesnel, M.C.; Breathnach, R.; Bonneville, M.; Scotet, E.; Adams, E.J. The intracellular B30.2 domain of butyrophilin 3A1 binds phosphoantigens to mediate activation of human Vγ9Vδ2 T cells. Immunity 2014, 40, 490–500.
  39. Hu, W.; Shang, R.; Yang, J.; Chen, C.; Liu, Z.; Liang, G.; He, W.; Luo, G. Skin γδ T cells and their function in wound healing. Front. Immunol. 2022, 13, 875076.
  40. Han, A.; Newell, E.W.; Glanville, J.; Fernandez-Becker, N.; Khosla, C.; Chien, Y.H.; Davis, M.M. Dietary gluten triggers concomitant activation of CD4+ and CD8+ alphabeta T cells and gammadelta T cells in celiac disease. Proc. Natl. Acad. Sci. USA 2013, 110, 13073–13078.
  41. Aigner, J.; Villatoro, S.; Rabionet, R.; Roquer, J.; Jimenez-Conde, J.; Marti, E.; Estivill, X. A common 56-kilobase deletion in a primate-specific segmental duplication creates a novel butyrophilin-like protein. BMC Genet. 2013, 14, 61.
  42. Mitsunaga, S.; Hosomichi, K.; Okudaira, Y.; Nakaoka, H.; Kunii, N.; Suzuki, Y.; Kuwana, M.; Sato, S.; Kaneko, Y.; Homma, Y.; et al. Exome sequencing identifies novel rheumatoid arthritis-susceptible variants in the BTNL2. J. Hum. Genet. 2013, 58, 210–215.
  43. Sirota, M.; Schaub, M.A.; Batzoglou, S.; Robinson, W.H.; Butte, A.J. Autoimmune disease classification by inverse association with SNP alleles. PLoS Genet. 2009, 5, e1000792.
  44. Orozco, G.; Eerligh, P.; Sanchez, E.; Zhernakova, S.; Roep, B.O.; Gonzalez-Gay, M.A.; Lopez-Nevot, M.A.; Callejas, J.L.; Hidalgo, C.; Pascual-Salcedo, D.; et al. Analysis of a functional BTNL2 polymorphism in type 1 diabetes, rheumatoid arthritis, and systemic lupus erythematosus. Hum. Immunol. 2005, 66, 1235–1241.
  45. Traherne, J.A.; Barcellos, L.F.; Sawcer, S.J.; Compston, A.; Ramsay, P.P.; Hauser, S.L.; Oksenberg, J.R.; Trowsdale, J. Association of the truncating splice site mutation in BTNL2 with multiple sclerosis is secondary to HLA-DRB1*15. Hum. Mol. Genet. 2006, 15, 155–161.
  46. Hippich, M.; Beyerlein, A.; Hagopian, W.A.; Krischer, J.P.; Vehik, K.; Knoop, J.; Winker, C.; Toppari, J.; Lernmark, A.; Rewers, M.J.; et al. Genetic contribution to the divergence in type 1 diabetes risk between children from the general population and children from affected families. Diabetes 2019, 68, 847–857.
  47. He, C.; Hamon, S.; Li, D.; Barral-Rodriguez, S.; Ott, J.; Diabetes Genetics Consortium. MHC fine mapping of human type 1 diabetes using the T1DGC data. Diabetes Obes. Metab. 2009, 11 (Suppl. 1), 53–59.
  48. Boyle, A.P.; Hong, E.L.; Hariharan, M.; Cheng, Y.; Schaub, M.A.; Kasowski, M.; Karczewski, K.J.; Park, J.; Hitz, B.C.; Weng, S.; et al. Annotation of functional variation in personal genomes using RegulomeDB. Genome Res. 2012, 22, 1790–1797.
  49. Dong, S.; Zhao, N.; Spragins, E.; Kagda, M.S.; Li, M.; Assis, P.; Jolanki, O.; Luo, Y.; Cherry, J.M.; Boyle, A.P.; et al. Annotating and prioritizing human non-coding variants with RegulomeDB. bioRxiv 2022.
  50. Spurkland, A.; Sollid, L.M.; Polanco, I.; Vartdal, F.; Thorsby, E. HLA-DR and -DQ genotypes of celiac disease patients serologically typed to be non-DR3 or non-DR5/7. Hum. Immunol. 1992, 35, 188–192.
  51. Dart, R.J.; Zlatareva, I.; Vantourout, P.; Theodoridis, E.; Amar, A.; Kannambath, S.; East, P.; Recaldin, T.; Mansfield, J.C.; Lamb, C.A.; et al. Conserved gammadelta T cell selection by BTNL proteins limits progression of human inflammatory bowel disease. Science 2023, 381, eadh0301.
  52. Guo, M.H.; Plummer, L.; Chan, Y.M.; Hirschhorn, J.N.; Lippincott, M.F. Burden Testing of Rare Variants Identified through Exome Sequencing via Publicly Available Control Data. Am. J. Hum. Genet. 2018, 103, 522–534.
  53. Guo, M.H. Burden Testing Against Public Controls. Available online: https://github.com/mhguo1/TRAPD (accessed on 24 April 2023).
  54. Viken, M.K.; Blomhoff, A.; Olsson, M.; Akselsen, H.E.; Pociot, F.; Nerup, J.; Kockum, I.; Cambon-Thomsen, A.; Thorsby, E.; Undlien, D.E.; et al. Reproducible association with type 1 diabetes in the extended class I region of the major histocompatibility complex. Genes Immun. 2009, 10, 323–333.
  55. Horton, R.; Wilming, L.; Rand, V.; Lovering, R.C.; Bruford, E.A.; Khodiyar, V.K.; Lush, M.J.; Povey, S.; Talbot, C.C., Jr.; Wright, M.W.; et al. Gene map of the extended human MHC. Nat. Rev. Genet. 2004, 5, 889–899.
  56. Bycroft, C.; Freeman, C.; Petkova, D.; Band, G.; Elliott, L.T.; Sharp, K.; Motyer, A.; Vukcevic, D.; Delaneau, O.; O’Connell, J.; et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 2018, 562, 203–209.
  57. Foers, A.D.; Shoukat, M.S.; Welsh, O.E.; Donovan, K.; Petry, R.; Evans, S.C.; FitzPatrick, M.E.; Collins, N.; Klenerman, P.; Fowler, A.; et al. Classification of intestinal T-cell receptor repertoires using machine learning methods can identify patients with coeliac disease regardless of dietary gluten status. J. Pathol. 2021, 253, 279–291.
  58. Falchuk, Z.M.; Rogentine, G.N.; Strober, W. Predominance of histocompatibility antigen HL-A8 in patients with gluten-sensitive enteropathy. J. Clin. Investig. 1972, 51, 1602–1605.
  59. Stokes, P.L.; Asquith, P.; Holmes, G.K.; Mackintosh, P.; Cooke, W.T. Histocompatibility antigens associated with adult coeliac disease. Lancet 1972, 2, 162–164.
  60. Lindfors, K.; Ciacci, C.; Kurppa, K.; Lundin, K.E.A.; Makharia, G.K.; Mearin, M.L.; Murray, J.A.; Verdu, E.F.; Kaukinen, K. Coeliac disease. Nat. Rev. Dis. Primers 2019, 5, 3.
  61. Karunakaran, M.M.; Willcox, C.R.; Salim, M.; Paletta, D.; Fichtner, A.S.; Noll, A.; Starick, L.; Nohren, A.; Begley, C.R.; Berwick, K.A.; et al. Butyrophilin-2A1 directly binds germline-encoded regions of the Vγ9Vδ2 TCR and is essential for phosphoantigen sensing. Immunity 2020, 52, 487–498 e486.
  62. Rhodes, D.A.; Chen, H.C.; Price, A.J.; Keeble, A.H.; Davey, M.S.; James, L.C.; Eberl, M.; Trowsdale, J. Activation of human gammadelta T cells by cytosolic interactions of BTN3A1 with soluble phosphoantigens and the cytoskeletal adaptor periplakin. J. Immunol. 2015, 194, 2390–2398.
  63. Sudlow, C.; Gallacher, J.; Allen, N.; Beral, V.; Burton, P.; Danesh, J.; Downey, P.; Elliott, P.; Green, J.; Landray, M.; et al. UK Biobank: An open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015, 12, e1001779.
  64. Human Protein Atlas, P. Human Protein Atlas. Available online: http://www.proteinatlas.org (accessed on 20 April 2021).
  65. Sayers, E.W.; Bolton, E.E.; Brister, J.R.; Canese, K.; Chan, J.; Comeau, D.C.; Connor, R.; Funk, K.; Kelly, C.; Kim, S.; et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2022, 50, D20–D26.
  66. Nonacus. Nonacus Probe Design Tool. Available online: https://mynonacus.nonacus.com/view-panel-designs (accessed on 8 February 2021).
  67. Simms, V. 5 Tips for Using the Nonacus Panel Design Tool. Available online: https://nonacus.com/blog-get-great-coverage-for-the-genes-you-care-about/ (accessed on 27 September 2024).
  68. Nonacus. Custom NGS Panel Design Tool. Available online: https://nonacus.com/panel-design/ (accessed on 2 October 2024).
  69. Andrews, S. FastQC: A Quality Control Analysis Tool for High Throughput Sequencing Data. Available online: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ (accessed on 2 October 2024).
  70. Bolger, A.M.; Lohse, M.; Usadel, B. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 2014, 30, 2114–2120.
  71. Van der Auwera, G.A.; O’Connor, B.D. Genomics in the Cloud: Using Docker, GATK, and WDL in Terra, 1st ed.; O’Reilly Media: Sebastopol, CA, USA, 2020.
  72. Zhao, S.; Agafonov, O.; Azab, A.; Stokowy, T.; Hovig, E. Accuracy and efficiency of germline variant calling pipelines for human genome data. Sci. Rep. 2020, 10, 20222.
  73. Cucco, F.; Barrans, S.; Sha, C.; Clipson, A.; Crouch, S.; Dobson, R.; Chen, Z.; Thompson, J.S.; Care, M.A.; Cummin, T.; et al. Distinct genetic changes reveal evolutionary history and heterogeneous molecular grade of DLBCL with MYC/BCL2 double-hit. Leukemia 2020, 34, 1329–1341.
  74. Cucco, F.; Clipson, A.; Kennedy, H.; Sneath Thompson, J.; Wang, M.; Barrans, S.; van Hoppe, M.; Ochoa Ruiz, E.; Caddy, J.; Hamid, D.; et al. Mutation screening using formalin-fixed paraffin-embedded tissues: A stratified approach according to DNA quality. Lab Investig. 2018, 98, 1084–1092.
  75. Matthews, J. A Snakemake Pipeline for Analysing (Cancer) DNA Sequencing Data. Available online: https://gitlab.com/jdm204/dnaseq_snakemake (accessed on 21 October 2022).
  76. Kawaguchi, S. HLA-HD. Available online: https://w3.genome.med.kyoto-u.ac.jp/HLA-HD/ (accessed on 17 March 2023).
  77. Danecek, P.; Bonfield, J.K.; Liddle, J.; Marshall, J.; Ohan, V.; Pollard, M.O.; Whitwham, A.; Keane, T.; McCarthy, S.A.; Davies, R.M.; et al. Twelve years of SAMtools and BCFtools. Gigascience 2021, 10, giab008.
  78. McLaren, W.; Gil, L.; Hunt, S.E.; Riat, H.S.; Ritchie, G.R.; Thormann, A.; Flicek, P.; Cunningham, F. The Ensembl Variant Effect Predictor. Genome Biol. 2016, 17, 122.
  79. Yu, Y.; Fedele, G.; Celardo, I.; Loh, S.H.Y.; Martins, L.M. Parp mutations protect from mitochondrial toxicity in Alzheimer’s disease. Cell Death Dis. 2021, 12, 651.
  80. Grueneberg, A.; de Los Campos, G. BGData—A Suite of R Packages for Genomic Analysis with Big Data. G3 2019, 9, 1377–1383.
  81. NCBI. SNP. Available online: https://www.ncbi.nlm.nih.gov/snp (accessed on 25 June 2024).
  82. Lefranc, M.P.; Giudicelli, V.; Duroux, P.; Jabado-Michaloud, J.; Folch, G.; Aouinti, S.; Carillon, E.; Duvergey, H.; Houles, A.; Paysan-Lafosse, T.; et al. IMGT®, the international ImMunoGeneTics information system® 25 years on. Nucleic Acids Res. 2015, 43, D413–D422.
  83. Bolotin, D.A.; Poslavsky, S.; Mitrophanov, I.; Shugay, M.; Mamedov, I.Z.; Putintseva, E.V.; Chudakov, D.M. MiXCR: Software for comprehensive adaptive immunity profiling. Nat. Methods 2015, 12, 380–381.
  84. Bolotin, D.A.; Poslavsky, S.; Davydov, A.N.; Frenkel, F.E.; Fanchi, L.; Zolotareva, O.I.; Hemmers, S.; Putintseva, E.V.; Obraztsova, A.S.; Shugay, M.; et al. Antigen receptor repertoire profiling from RNA-seq data. Nat. Biotechnol. 2017, 35, 908–911.
  85. McDonald, J.H. Handbook of Biological Statistics, 3rd ed.; Sparky House Publishing: Baltimore, MD, USA, 2014.
  86. Uhlen, M.; Fagerberg, L.; Hallstrom, B.M.; Lindskog, C.; Oksvold, P.; Mardinoglu, A.; Sivertsson, A.; Kampf, C.; Sjostedt, E.; Asplund, A.; et al. Tissue-based map of the human proteome. Science 2015, 347, 1260419.
  87. Mölder, F.; Jablonski, K.; Letcher, B.; Hall, M.; Tomkins-Tinch, C.; Sochat, V.; Forster, J.; Lee, S.; Twardziok, S.; Kanitz, A.; et al. Sustainable data analysis with Snakemake [version 1; peer review: 1 approved, 1 approved with reservations]. F1000Research 2021, 10, 33.
  88. Schneider, V.A.; Graves-Lindsay, T.; Howe, K.; Bouk, N.; Chen, H.C.; Kitts, P.A.; Murphy, T.D.; Pruitt, K.D.; Thibaud-Nissen, F.; Albracht, D.; et al. Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Res. 2017, 27, 849–864.
  89. Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv 2013.
  90. Danecek, P.; Auton, A.; Abecasis, G.; Albers, C.A.; Banks, E.; DePristo, M.A.; Handsaker, R.E.; Lunter, G.; Marth, G.T.; Sherry, S.T.; et al. The variant call format and VCFtools. Bioinformatics 2011, 27, 2156–2158.
  91. Wang, K.; Li, M.; Hakonarson, H. ANNOVAR: Functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010, 38, e164.
  92. Dilthey, A.; Leslie, S.; Moutsianas, L.; Shen, J.; Cox, C.; Nelson, M.R.; McVean, G. Multi-population classical HLA type imputation. PLoS Comput. Biol. 2013, 9, e1002877.
  93. Gustavsen, J.; Rüeger, S.; Chamberlain, S.; Ushey, K.; Zhu, H. rsnps: Get ‘SNP’ (‘Single-Nucleotide’ ‘Polymorphism’) Data on the Web. 2024. Available online: https://github.com/ropensci/rsnps/ (accessed on 21 July 2024).
  94. Graffelman, J. Exploring Diallelic Genetic Markers: The HardyWeinberg Package. J. Stat. Softw. 2015, 64, 1–23.
  95. Kawaguchi, S.; Higasa, K.; Shimizu, M.; Yamada, R.; Matsuda, F. HLA-HD: An accurate HLA typing algorithm for next-generation sequencing data. Hum. Mutat. 2017, 38, 788–797.
Figure 1. The germline-encoded HV4 loop of the T cell receptor (TCR) of Vγ4+ γδ IELs directly binds to BTNL3. (a) HV4 is located at amino acid positions 10–25 in the FR3 of the TRGV4 segment [31]. (b) The HV4 of Vγ4+ γδ T cells binds to the C, C”, F, and G canonical immunoglobulin-fold β-strands (CFG face) of the BTNL3 protein [29]. Abbreviations: CDR: complementarity-determining region; FR: framework region; HV4: hypervariable region 4; TRGJ: T cell receptor γ joining region; TRGV: T cell receptor γ variable region.
Figure 2. Workflow of the study on the association of the butyrophilin family loci and the TRGV4-HV4 sequences with CeD predisposition.
Figure 3. There were no significant differences in the TRGV usage of CeD (n = 45) and healthy control (n = 108) duodenal samples.
Figure 4. BTN2A1, BTN3A1, and BTN3A2 may be involved in CeD pathogenesis by modulating T cell and innate immune cell activity. BTN2A1 gene burden was significantly higher in CeD patients in a cohort of 94 samples, and BTN2A1, BTN3A1, and BTN3A2 SNPs were significantly associated with CeD status in the UK Biobank database. Based on our results and evidence on the immunomodulatory role of butyrophilins on innate and adaptive immune cells, butyrophilins could contribute to CeD pathogenesis in several ways [25,26,37,38,61]: (a) via the novel, hypothesised pAg-dependent activation of Vγ4+ γδ T cells; (b) via the interaction of BTN2A1 with dendritic cells (DCs) through the DC-SIGN receptor on the DC surface; (c) by increasing the co-stimulation and IFN-γ production of CD4+ αβ T cells, or by modulating the activity and IFN-γ production of NK cells depending on whether BTN3A1 or BTN3A2 is expressed predominantly on the surface of the NK cell; or (d) via the pAg-dependent activation of potentially gut-homing Vγ9Vδ2+ γδ T cells in the small intestine.
Table 1. Butyrophilins maintain and activate the γδ T cell compartments of mice and humans. Human butyrophilin family members are shortened with all letters capitalised, while only the first letter of mouse butyrophilins is capitalised [24].
| Tissue | Butyrophilins | γδ T Cell Subset | Role of Butyrophilins | References |
|---|---|---|---|---|
| Peripheral blood | Mouse unidentified | Unidentified | Unidentified | NA |
| Peripheral blood | Alpaca BTN3 | Vγ9Vδ2+ T cells | No interaction has been identified | [35,36] |
| Peripheral blood | Human BTN3A homodimers/heterodimers and BTN2A1 homodimer | Vγ9Vδ2+ T cells | Phosphoantigen-mediated, CDR3-independent γδ T cell activation | [30,33,37,38] |
| Skin | Mouse Skint1 and Skint2 | Vγ5Vδ1+ DETC | Thymic selection, tissue homing of dendritic epidermal T cells to the skin | [27,28,32] |
| Skin | Human? | Vδ1+ T cells | Unidentified, unknown if there is butyrophilin involvement | [39] |
| Intestinal epithelium | Mouse Btnl1 and Btnl6 | Vγ7+ IEL | Phenotypic maintenance of the intestinal IEL compartment | [27,28] |
| Intestinal epithelium | Human BTNL3 and BTNL8 | Vγ4Vδ1+ IEL | Phenotypic maintenance of the intestinal IEL compartment | [21,27,29] |
Table 2. Gene-based burden testing of butyrophilin family non-synonymous coding variants in CeD patients (n = 48) against controls (n = 46) showed significant differences in the disease burden of BTN2A1 variants. Non-synonymous coding variants that were predicted to be pathogenic or had low minor allele frequencies were considered qualifying variants for burden testing. (a) Burden tests were carried out using the TRAPD program [52] on butyrophilin family qualifying variants in CeD patients (n = 48) against controls (n = 46). Multi-allelic sites were separated into bi-allelic SNPs, as required by the TRAPD documentation [53]. The dominant model defines carriers for gene burden as individuals with at least one qualifying variant within a gene, while the recessive model requires at least two qualifying variants. Significant results are highlighted in bold. A version of table (a) with the percentage of individuals and alleles within the CeD and the control groups can be found in Table A10. (b) The BTN2A1 qualifying SNPs demonstrated a significant burden in CeD samples. Count data of individuals and alleles are in parentheses after the percentage value in columns 5–8 and columns 9–10, respectively. The percentage and count data were calculated from the per-sample genotypes found in Table A11.
(a)

| Gene | Qual. SNPs | CeD N(≥1 HET) | CeD N(≥2 HET) | CeD N(HOM ALT) | CeD Total Allele Count | Control N(≥1 HET) | Control N(≥2 HET) | Control N(HOM ALT) | Control Total Allele Count | Dominant Model p-Value | Recessive Model p-Value |
|---|---|---|---|---|---|---|---|---|---|---|---|
| BTN2A1 | 3 | 22 | 21 | 3 | 81 | 5 | 4 | 0 | 13 | 1.46 × 10^−5 | 3.70 × 10^−8 |
| BTN3A2 | 1 | 5 | 0 | 1 | 7 | 9 | 0 | 1 | 11 | 0.929 | 0.946 |
| ERMAP | 1 | 21 | 0 | 8 | 37 | 20 | 0 | 7 | 34 | 0.516 | 0.988 |

(b) BTN2A1 variants significantly associated with CeD risk

| Position (GRCh38) | rsID | Variation | Impact | HET CeD | HOM ALT CeD | HET Control | HOM ALT Control | Alt Allele in CeD | Alt Allele in Controls |
|---|---|---|---|---|---|---|---|---|---|
| 6:26463432 | rs13195509 | G > A | Missense variant, Val > Met | 43.8% (21) | 6.3% (3) | 8.7% (4) | 0.0% (0) | 27.1% (26) | 4.3% (4) |
| 6:26468098 | rs3734542 | G > A | Missense variant, Arg > Gln | 45.8% (22) | 6.3% (3) | 10.9% (5) | 0.0% (0) | 29.2% (28) | 5.4% (5) |
| 6:26468317 | rs3734543 | G > C | Missense variant, Gly > Ala | 43.8% (21) | 6.3% (3) | 8.7% (4) | 0.0% (0) | 28.1% (27) | 4.3% (4) |
Abbreviations: Alt allele: alternative or minor allele; CeD: coeliac disease; GRCh38: Genome Reference Consortium Human Build 38; HET: heterozygous; HOM ALT: homozygous for alternative allele; N(≥1 HET): number of individuals carrying at least one heterozygous qualifying variant within the gene; N(≥2 HET): number of individuals carrying at least two heterozygous qualifying variants within the gene; N(HOM ALT): number of individuals carrying at least one homozygous qualifying variant within the gene; qual: qualifying; SNP: single-nucleotide polymorphism.
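The dominant and recessive carrier definitions in the caption of Table 2 can be illustrated with a minimal Python sketch. It is not the TRAPD implementation: the function names, the choice to treat a homozygous alternative genotype as two qualifying alleles, and the use of a one-sided Fisher's exact test are assumptions made for illustration, and the example counts are hypothetical rather than taken from Table 2.

```python
from math import comb


def carrier_status(alt_allele_counts):
    """Classify one individual from per-variant alt-allele counts (0, 1 or 2).

    Assumption: a homozygous alternative genotype counts as two qualifying
    alleles, so it satisfies the recessive definition on its own.
    """
    total_alt = sum(alt_allele_counts)
    return {"dominant": total_alt >= 1, "recessive": total_alt >= 2}


def fisher_one_sided(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact test: P(X >= case_carriers) under the
    hypergeometric null that carriers are spread evenly across groups."""
    n_all = case_total + control_total
    k_all = case_carriers + control_carriers
    upper = min(k_all, case_total)
    denom = comb(n_all, case_total)
    tail = sum(
        comb(k_all, k) * comb(n_all - k_all, case_total - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / denom


# Hypothetical genotypes at three qualifying variants of one gene.
print(carrier_status([0, 1, 0]))  # one heterozygous variant
print(carrier_status([1, 1, 0]))  # two heterozygous variants
print(carrier_status([2, 0, 0]))  # one homozygous alternative variant

# Hypothetical carrier counts: 3 of 3 cases versus 0 of 3 controls.
print(fisher_one_sided(3, 3, 0, 3))
```

Because the test and carrier rules here are simplified, the sketch should not be expected to reproduce the p-values reported in Table 2.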
Table 3. SNPs of selected butyrophilin genes present in the UK Biobank.
| Gene | SNPs in NCBI | Unique SNPs in NCBI | SNPs in UK Biobank |
|---|---|---|---|
| BTN2A1 | 7912 | 7605 | 30 |
| BTN3A1 | 5348 | 5164 | 27 |
| BTN3A2 | 5905 | 5611 | 21 |
| BTNL3 | 6164 | 5929 | 10 |
| BTNL8 | 18,889 | 18,197 | 13 |
Table 4. SNPs from BTN2A1, BTN3A1, and BTN3A2 genes were significantly associated with CeD status in the UK Biobank. The name of the SNPs in the UK Biobank database is a combination of the reference SNP ID (rsID) from the SNP database (dbSNP) and the reference allele. All BTN2A1, BTN3A1, BTN3A2, BTNL3, and BTNL8 SNPs in the UK Biobank were subjected to single-variant testing to examine their association with CeD. Due to multiple testing, Bonferroni correction was applied. SNPs with a negative ln(OR) are associated with lower CeD risk in this binomial model, meaning that the reference allele is less frequent in CeD patients. SNPs in bold remained significantly associated with CeD in the binomial regression models that also took the HLA genotype into account. SNP count and allele count data for the significant SNPs can be found in Table A16. All significant SNPs in control participants were in Hardy–Weinberg equilibrium (Table A17).
| Position (GRCh38) | SNP, Reference Allele | Gene | SNP Consequence | CeD Allele Freq | Control Allele Freq | Total Allele Freq | ln(OR) | CeD Risk | Adjusted p-Value |
|---|---|---|---|---|---|---|---|---|---|
| 6:26463347 | rs13195402 | BTN2A1 | STOP gained | 0.768 | 0.892 | 0.880 | −0.924 | decrease | 4.67 × 10^−158 |
| 6:26463432 | rs13195509 | BTN2A1 | missense | 0.754 | 0.879 | 0.867 | −0.857 | decrease | 1.61 × 10^−151 |
| 6:26475927 | rs1407045 | BTN2A1 | intronic | 0.584 | 0.516 | 0.522 | 0.273 | increase | 6.07 × 10^−22 |
| 6:26465807 | rs2273558 | BTN2A1 | intronic | 0.583 | 0.677 | 0.667 | −0.396 | decrease | 1.69 × 10^−41 |
| 6:26460493 | rs2893856 | BTN2A1 | intronic | 0.113 | 0.131 | 0.130 | −0.175 | decrease | 3.23 × 10^−3 |
| 6:26468098 | rs3734542 | BTN2A1 | missense | 0.753 | 0.878 | 0.867 | −0.855 | decrease | 8.59 × 10^−151 |
| 6:26468317 | rs3734543 | BTN2A1 | missense | 0.760 | 0.879 | 0.868 | −0.844 | decrease | 1.59 × 10^−140 |
| 6:26466954 | rs3799380 | BTN2A1 | intronic | 0.683 | 0.790 | 0.780 | −0.549 | decrease | 8.59 × 10^−77 |
| 6:26474343 | rs56296968 | BTN2A1 | intronic | 0.696 | 0.807 | 0.796 | −0.604 | decrease | 9.70 × 10^−89 |
| 6:26456215 | rs6456724 | BTN2A1 | 2 kb upstream | 0.113 | 0.131 | 0.130 | −0.176 | decrease | 2.87 × 10^−3 |
| 6:26458037 | rs6929846 | BTN2A1 | 5′ UTR | 0.146 | 0.174 | 0.172 | −0.206 | decrease | 3.60 × 10^−6 |
| 6:26473816 | rs7773938 | BTN2A1 | intronic | 0.696 | 0.806 | 0.796 | −0.600 | decrease | 1.15 × 10^−87 |
| 6:26469647 | rs9358944 | BTN2A1 | intronic | 0.695 | 0.806 | 0.796 | −0.604 | decrease | 1.55 × 10^−89 |
| 6:26471886 | rs9358945 | BTN2A1 | intronic | 0.694 | 0.806 | 0.796 | −0.606 | decrease | 4.37 × 10^−90 |
| 6:26404730 | rs10456045 | BTN3A1 | intronic | 0.596 | 0.698 | 0.688 | −0.448 | decrease | 2.97 × 10^−57 |
| 6:26410572 | rs1796520 | BTN3A1 | intronic | 0.405 | 0.474 | 0.467 | −0.276 | decrease | 2.40 × 10^−22 |
| 6:26404146 | rs3799378 | BTN3A1 | intronic | 0.653 | 0.762 | 0.752 | −0.535 | decrease | 2.92 × 10^−75 |
| 6:26405825 | rs3857549 | BTN3A1 | intronic | 0.948 | 0.935 | 0.936 | 0.221 | increase | 1.53 × 10^−2 |
| 6:26409662 | rs41266839 | BTN3A1 | missense | 0.764 | 0.892 | 0.880 | −0.924 | decrease | 2.12 × 10^−168 |
| 6:26407180 | rs4609015 | BTN3A1 | intronic | 0.871 | 0.854 | 0.855 | 0.141 | increase | 3.82 × 10^−2 |
| 6:26412860 | rs6900725 | BTN3A1 | intronic | 0.870 | 0.853 | 0.855 | 0.139 | increase | 4.33 × 10^−2 |
| 6:26401210 | rs6912853 | BTN3A1 | 2 kb upstream | 0.863 | 0.844 | 0.846 | 0.153 | increase | 7.85 × 10^−3 |
| 6:26413007 | rs6920986 | BTN3A1 | intronic | 0.870 | 0.854 | 0.856 | 0.138 | increase | 4.99 × 10^−2 |
| 6:26415409 | rs742090 | BTN3A1 | 500 b downstream | 0.406 | 0.474 | 0.468 | −0.276 | decrease | 3.58 × 10^−22 |
| 6:26374321 | rs11758089 | BTN3A2 | intronic | 0.866 | 0.844 | 0.846 | 0.176 | increase | 6.30 × 10^−4 |
| 6:26372558 | rs12176317 | BTN3A2 | intronic | 0.744 | 0.867 | 0.856 | −0.809 | decrease | 6.72 × 10^−140 |
| 6:26366990 | rs12199613 | BTN3A2 | intronic | 0.514 | 0.612 | 0.602 | −0.400 | decrease | 1.76 × 10^−47 |
| 6:26377318 | rs1977 | BTN3A2 | 3′ UTR | 0.740 | 0.864 | 0.853 | −0.808 | decrease | 1.17 × 10^−136 |
| 6:26377363 | rs1979 | BTN3A2 | 3′ UTR | 0.743 | 0.867 | 0.855 | −0.809 | decrease | 8.23 × 10^−140 |
| 6:26375933 | rs1985732 | BTN3A2 | intronic | 0.595 | 0.698 | 0.688 | −0.457 | decrease | 2.87 × 10^−59 |
| 6:26374430 | rs2073526 | BTN3A2 | intronic | 0.370 | 0.442 | 0.435 | −0.295 | decrease | 9.15 × 10^−25 |
| 6:26363527 | rs9358934 | BTN3A2 | 2 kb upstream | 0.744 | 0.866 | 0.855 | −0.803 | decrease | 2.34 × 10^−137 |
| 6:26364702 | rs9379855 | BTN3A2 | 2 kb upstream | 0.743 | 0.866 | 0.855 | −0.804 | decrease | 8.85 × 10^−138 |
| 6:26367461 | rs9379858 | BTN3A2 | intronic | 0.743 | 0.866 | 0.855 | −0.802 | decrease | 3.19 × 10^−137 |
| 6:26369321 | rs9379859 | BTN3A2 | intronic | 0.744 | 0.867 | 0.855 | −0.803 | decrease | 5.37 × 10^−137 |
| 6:26373450 | rs9393713 | BTN3A2 | intronic | 0.743 | 0.868 | 0.856 | −0.814 | decrease | 1.07 × 10^−141 |
| 6:26373512 | rs9393714 | BTN3A2 | intronic | 0.743 | 0.868 | 0.856 | −0.813 | decrease | 6.99 × 10^−141 |
Abbreviations: CeD: coeliac disease; freq: frequency; GRCh38: Genome Reference Consortium Human Build 38; kb: kilobase; ln(OR): natural logarithm of the odds ratio; SNP: single-nucleotide polymorphism; UTR: untranslated region.
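To help read the ln(OR) column of Table 4, the following Python sketch converts a log odds ratio back to an odds ratio and computes a crude allelic ln(OR) directly from the case and control reference-allele frequencies. The function names are illustrative. The crude estimate uses rounded frequencies and ignores whatever else the authors' binomial regression model accounts for, so it is only expected to land near the tabulated value, not to match it.

```python
import math


def odds_ratio_from_ln(ln_or):
    """Convert a natural-log odds ratio to an odds ratio."""
    return math.exp(ln_or)


def crude_allelic_ln_or(case_freq, control_freq):
    """Crude ln(OR) for the reference allele from two allele frequencies."""
    case_odds = case_freq / (1 - case_freq)
    control_odds = control_freq / (1 - control_freq)
    return math.log(case_odds / control_odds)


# rs13195402 (Table 4): ln(OR) = -0.924, CeD freq 0.768, control freq 0.892.
print(round(odds_ratio_from_ln(-0.924), 3))
print(round(crude_allelic_ln_or(0.768, 0.892), 3))
```

An odds ratio below 1 corresponds to the "decrease" label in the CeD Risk column, and an odds ratio above 1 to "increase".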
Table 5. Twenty SNPs from BTN2A1, BTN3A1, and BTN3A2 genes were significantly associated with CeD status in the UK Biobank when HLA genotypes were included in the single-variant testing models. BTN2A1, BTN3A1, BTN3A2, BTNL3, and BTNL8 SNPs in the UK Biobank were subjected to single-variant testing to examine their association with CeD. Due to multiple testing, Bonferroni correction was applied. SNPs with a negative ln(OR) are associated with lower CeD risk in this binomial model. All significant SNPs in control participants were in Hardy–Weinberg equilibrium (Table A19).
| SNP, Reference Allele | Gene | SNP Consequence | ln(OR) | CeD Risk | Adjusted p-Value |
|---|---|---|---|---|---|
| rs13195402 | BTN2A1 | STOP gained | −0.20727 | decrease | 8.15 × 10^−6 |
| rs13195509 | BTN2A1 | missense | −0.19239 | decrease | 1.62 × 10^−5 |
| rs3734542 | BTN2A1 | missense | −0.18831 | decrease | 2.94 × 10^−5 |
| rs3734543 | BTN2A1 | missense | −0.16744 | decrease | 8.23 × 10^−4 |
| rs56296968 | BTN2A1 | intronic | −0.11753 | decrease | 4.20 × 10^−2 |
| rs9358944 | BTN2A1 | intronic | −0.11786 | decrease | 3.83 × 10^−2 |
| rs9358945 | BTN2A1 | intronic | −0.12018 | decrease | 2.91 × 10^−2 |
| rs3799378 | BTN3A1 | intronic | −0.14327 | decrease | 7.04 × 10^−4 |
| rs41266839 | BTN3A1 | missense | −0.21469 | decrease | 1.06 × 10^−6 |
| rs12176317 | BTN3A2 | intronic | −0.1974 | decrease | 3.50 × 10^−6 |
| rs12199613 | BTN3A2 | intronic | −0.12296 | decrease | 3.31 × 10^−3 |
| rs1977 | BTN3A2 | 3′ UTR | −0.20238 | decrease | 2.06 × 10^−6 |
| rs1979 | BTN3A2 | 3′ UTR | −0.19756 | decrease | 3.40 × 10^−6 |
| rs1985732 | BTN3A2 | intronic | −0.10975 | decrease | 3.35 × 10^−2 |
| rs9358934 | BTN3A2 | 2 kb upstream | −0.19286 | decrease | 7.53 × 10^−6 |
| rs9379855 | BTN3A2 | 2 kb upstream | −0.19406 | decrease | 6.04 × 10^−6 |
| rs9379858 | BTN3A2 | intronic | −0.19156 | decrease | 8.99 × 10^−6 |
| rs9379859 | BTN3A2 | intronic | −0.19261 | decrease | 8.10 × 10^−6 |
| rs9393713 | BTN3A2 | intronic | −0.2056 | decrease | 9.27 × 10^−7 |
| rs9393714 | BTN3A2 | intronic | −0.20087 | decrease | 2.08 × 10^−6 |
Abbreviations: CeD: coeliac disease; ln(OR): natural logarithm of the odds ratio; SNP: single-nucleotide polymorphism.
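The captions of Tables 4, 5 and 7 state that Bonferroni correction was applied. As a minimal sketch of that adjustment, the Python function below multiplies each raw p-value by the number of tests and caps the result at 1. The number of tests used here, 101, is only an assumption obtained by summing the "SNPs in UK Biobank" column of Table 3 (30 + 27 + 21 + 10 + 13); the raw p-values are hypothetical, and the authors' exact procedure is not specified in this section.

```python
def bonferroni_adjust(p_values, n_tests=None):
    """Bonferroni adjustment: p_adj = min(1, p * m).

    n_tests defaults to the number of p-values supplied.
    """
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values; m = 101 is an assumption based on Table 3.
raw = [1e-8, 4e-4, 0.02]
for p, p_adj in zip(raw, bonferroni_adjust(raw, n_tests=101)):
    print(f"raw={p:.2e} adjusted={p_adj:.2e} significant={p_adj < 0.05}")
```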
Table 6. Single-variant testing in HLA-matched groups from the UK Biobank dataset only identified significant SNPs associated with CeD status in individuals with HLA-DQ2.5 genotypes. The CeD and control participants of the UK Biobank dataset were divided into HLA-matched case-control groups for single-variant testing. The association between BTN2A1, BTN3A1, BTN3A2, BTNL3, and BTNL8 SNPs and CeD status was investigated. Significant association between the SNPs and CeD status was only present in HLA-DQ2.5-matched individuals (in bold).
| HLA Genotype of Individuals in Model | Number of CeD Participants | Number of Controls | Number of Significant SNPs |
|---|---|---|---|
| HLA-DQ2.2 | 199 | 4154 | 0 |
| HLA-DQ2.5 | 1652 | 6416 | 21 |
| HLA-DQ8 | 171 | 4203 | 0 |
| HLA-DQ2.2, HLA-DQ2.5 | 606 | 895 | 0 |
| HLA-DQ2.2, HLA-DQ8 | 50 | 590 | 0 |
| HLA-DQ2.5, HLA-DQ8 | 182 | 886 | 0 |
| Other | 234 | 12,618 | 0 |
Table 7. Butyrophilin SNPs only remained significantly associated with CeD status in the HLA-DQ2.5-restricted UK Biobank analysis. The name of the SNPs in the UK Biobank database is a combination of the SNP name and the reference allele. All BTN2A1, BTN3A1, BTN3A2, BTNL3, and BTNL8 SNPs in the UK Biobank were subjected to single-variant testing to examine their association with CeD. Due to multiple testing, Bonferroni correction was applied. SNPs with a negative ln(OR) are associated with lower CeD risk in this binomial model. The SNP in bold was a novel SNP significantly associated with CeD unique to the HLA-DQ2.5 model, while the other SNPs were also significant in the non-HLA and the HLA models. SNP count and allele count data for the significant SNPs can be found in Table A21. The allele frequency of all the significantly associated SNPs in the control group significantly differed from the Hardy–Weinberg equilibrium (Table A22).
| SNP, Reference Allele | Gene | SNP Consequence | CeD Allele Freq | Control Allele Freq | Total Allele Freq | ln(OR) | CeD Risk | Adjusted p-Value |
|---|---|---|---|---|---|---|---|---|
| rs13195402 | BTN2A1 | STOP gained | 0.704 | 0.751 | 0.741 | −0.27812 | decrease | 5.10 × 10^−7 |
| rs13195509 | BTN2A1 | missense | 0.687 | 0.734 | 0.724 | −0.25542 | decrease | 1.75 × 10^−6 |
| rs3734542 | BTN2A1 | missense | 0.687 | 0.733 | 0.723 | −0.25235 | decrease | 2.52 × 10^−6 |
| rs3734543 | BTN2A1 | missense | 0.695 | 0.736 | 0.728 | −0.23279 | decrease | 5.43 × 10^−5 |
| rs56296968 | BTN2A1 | intronic | 0.640 | 0.673 | 0.666 | −0.15928 | decrease | 2.33 × 10^−2 |
| rs7773938 | BTN2A1 | intronic | 0.640 | 0.672 | 0.666 | −0.15755 | decrease | 2.66 × 10^−2 |
| rs9358944 | BTN2A1 | intronic | 0.638 | 0.671 | 0.664 | −0.16238 | decrease | 1.61 × 10^−2 |
| rs9358945 | BTN2A1 | intronic | 0.638 | 0.671 | 0.665 | −0.16499 | decrease | 1.26 × 10^−2 |
| rs3799378 | BTN3A1 | intronic | 0.596 | 0.637 | 0.628 | −0.18712 | decrease | 8.10 × 10^−4 |
| rs41266839 | BTN3A1 | missense | 0.697 | 0.747 | 0.737 | −0.28267 | decrease | 7.25 × 10^−8 |
| rs12176317 | BTN3A2 | intronic | 0.679 | 0.729 | 0.719 | −0.26575 | decrease | 2.82 × 10^−7 |
| rs12199613 | BTN3A2 | intronic | 0.459 | 0.500 | 0.491 | −0.1717 | decrease | 2.06 × 10^−3 |
| rs1977 | BTN3A2 | 3′ UTR | 0.676 | 0.726 | 0.716 | −0.268 | decrease | 2.99 × 10^−7 |
| rs1979 | BTN3A2 | 3′ UTR | 0.679 | 0.728 | 0.718 | −0.26376 | decrease | 3.63 × 10^−7 |
| rs1985732 | BTN3A2 | intronic | 0.539 | 0.574 | 0.567 | −0.15063 | decrease | 2.35 × 10^−2 |
| rs9358934 | BTN3A2 | 2 kb upstream | 0.680 | 0.729 | 0.719 | −0.25825 | decrease | 8.85 × 10^−7 |
| rs9379855 | BTN3A2 | 2 kb upstream | 0.680 | 0.728 | 0.718 | −0.25967 | decrease | 6.76 × 10^−7 |
| rs9379858 | BTN3A2 | intronic | 0.680 | 0.728 | 0.718 | −0.25566 | decrease | 1.18 × 10^−6 |
| rs9379859 | BTN3A2 | intronic | 0.681 | 0.729 | 0.719 | −0.26105 | decrease | 6.31 × 10^−7 |
| rs9393713 | BTN3A2 | intronic | 0.678 | 0.729 | 0.719 | −0.27313 | decrease | 1.06 × 10^−7 |
| rs9393714 | BTN3A2 | intronic | 0.679 | 0.729 | 0.719 | −0.26778 | decrease | 2.25 × 10^−7 |
Abbreviations: CeD: coeliac disease; freq: frequency; ln(OR): natural logarithm of the odds ratio; SNP: single-nucleotide polymorphism.
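The table captions refer to Hardy–Weinberg equilibrium (HWE) checks in control participants. This section does not say which HWE test was used, so the sketch below assumes the common one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. The function name and the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_ref_ref, n_ref_alt, n_alt_alt):
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    Returns (chi2, p_value). Monomorphic sites return (0.0, 1.0).
    """
    n = n_ref_ref + n_ref_alt + n_alt_alt
    p = (2 * n_ref_ref + n_ref_alt) / (2 * n)  # reference allele frequency
    q = 1 - p
    if p == 0 or q == 0:
        return 0.0, 1.0
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ref_ref, n_ref_alt, n_alt_alt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts (ref/ref, ref/alt, alt/alt).
print(hwe_chi_square(25, 50, 25))  # matches HWE expectations exactly
print(hwe_chi_square(50, 0, 50))   # no heterozygotes: strong deviation
```

A small p-value indicates that the observed genotype counts differ from the proportions expected under HWE for the estimated allele frequency.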
Table 8. The coeliac disease and healthy control patient TRG datasets analysed for TRGV usage and HV4 sequence variations. FFPE: formalin-fixed, paraffin-embedded.
| Dataset | Coeliac Disease | Healthy Control | Sequencing Method |
|---|---|---|---|
| Dataset 1 | 34 FFPE, 12 fresh frozen duodenal | 97 FFPE duodenal | Lymphotrack (Invivoscribe Inc., San Diego, CA, USA) and Illumina Miseq micro (San Diego, CA, USA) |
| Dataset 2 | 11 FFPE duodenal | 11 FFPE duodenal | Lymphotrack (Invivoscribe Inc.) and Illumina Miseq |
| Dataset 3 | 84 blood | 130 blood | Illumina NextSeq |
| Combined | 84 blood, 48 FFPE duodenal, 12 fresh frozen duodenal | 130 blood, 108 FFPE duodenal | NA |
Table 9. More than 95% of participants possessed at least one reference HV4 loop regardless of their CeD status. The dataset consisted of 238 healthy controls and 141 CeD samples. (a) Seven unique HV4 amino acid sequences were identified in the dataset. (b) The homozygous WT HV4 phenotype was the most frequent in both the healthy control and CeD groups.
(a)

| HV4 Amino Acid Sequence | Amino Acid Change | Effect | Freq. in Healthy Control Samples (n = 238) | Freq. in CeD Samples (n = 141) | Predicted Change in Binding [31] |
|---|---|---|---|---|---|
| KYDTYGSTRKNLRMILR (WT) | - | - | 430/476 = 0.903 | 254/282 = 0.901 | - |
| KYDTYGSTRQNLRMILR | Lysine > Glutamine | Positive charge > polar uncharged | 41/476 = 0.086 | 23/282 = 0.082 | Marginal reduction in binding |
| KYDTYGSTRKSLRMILR | Asparagine > Serine | Polar uncharged > polar uncharged | 4/476 = 0.008 | 2/282 = 0.007 | Unknown |
| KYDTYGSTR_ELENDTA | Lysine > frameshift | Positive charge > different sequence | 0 | 1/282 = 0.003 | Unknown |
| KYNTYGSTRKNLRMILR | Aspartic acid > Asparagine | Negative charge > polar uncharged | 0 | 1/282 = 0.003 | Disrupted binding |
| KYDTYGNTRKNLRMILR | Serine > Asparagine | Polar uncharged > polar uncharged | 1/476 = 0.002 | 0 | Unknown |
| KYDTYGSIRKNLRMILR | Threonine > Isoleucine | Polar uncharged > apolar | 0 | 1/282 = 0.003 | Unknown |

(b)

| Phenotype | Combined Healthy Control Samples (n = 238) | Combined CeD Samples (n = 141) |
|---|---|---|
| WT | 202 | 116 |
| WT, KYDTYGSTRQNLRMILR | 23 | 18 |
| KYDTYGSTRQNLRMILR | 9 | 2 |
| WT, KYDTYGSTRKSLRMILR | 2 | 2 |
| KYDTYGSTRQNLRMILR, KYDTYGSTR_ELENDTA | 0 | 1 |
| WT, KYNTYGSTRKNLRMILR | 0 | 1 |
| WT, KYDTYGNTRKNLRMILR | 1 | 0 |
| KYDTYGSTRKSLRMILR | 1 | 0 |
| WT, KYDTYGSIRKNLRMILR | 0 | 1 |
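The allele frequencies in Table 9 (a) can be derived from the phenotype counts in Table 9 (b). The Python sketch below does this under one assumption: a phenotype listing a single sequence is treated as homozygous and contributes two copies of that sequence. The variable names (for example `VAR_Q`) are local labels introduced for the sketch, not nomenclature from the article. It also computes the share of participants carrying at least one WT HV4 loop, the quantity summarised in the caption.

```python
from collections import Counter

WT = "KYDTYGSTRKNLRMILR"
VAR_Q = "KYDTYGSTRQNLRMILR"
VAR_KS = "KYDTYGSTRKSLRMILR"
VAR_FS = "KYDTYGSTR_ELENDTA"
VAR_YN = "KYNTYGSTRKNLRMILR"
VAR_GN = "KYDTYGNTRKNLRMILR"
VAR_SI = "KYDTYGSIRKNLRMILR"

# (two alleles, healthy control count, CeD count), taken from Table 9 (b).
PHENOTYPES = [
    ((WT, WT), 202, 116),
    ((WT, VAR_Q), 23, 18),
    ((VAR_Q, VAR_Q), 9, 2),
    ((WT, VAR_KS), 2, 2),
    ((VAR_Q, VAR_FS), 0, 1),
    ((WT, VAR_YN), 0, 1),
    ((WT, VAR_GN), 1, 0),
    ((VAR_KS, VAR_KS), 1, 0),
    ((WT, VAR_SI), 0, 1),
]

CONTROL, CED = 0, 1


def allele_counts(group):
    """Count HV4 alleles in one group (CONTROL or CED)."""
    counts = Counter()
    for row in PHENOTYPES:
        for allele in row[0]:
            counts[allele] += row[1 + group]
    return counts


def fraction_with_wt(group):
    """Fraction of participants carrying at least one WT HV4 sequence."""
    total = sum(row[1 + group] for row in PHENOTYPES)
    with_wt = sum(row[1 + group] for row in PHENOTYPES if WT in row[0])
    return with_wt / total


for name, group in (("control", CONTROL), ("CeD", CED)):
    counts = allele_counts(group)
    n_alleles = sum(counts.values())
    print(name, "WT allele frequency:", round(counts[WT] / n_alleles, 3))
    print(name, "carrying >=1 WT:", round(fraction_with_wt(group), 3))
```

Under the homozygosity assumption, the derived allele counts agree with the numerators and denominators listed in Table 9 (a).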
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Luu Hoang, K.N.; Evans, S.; Willis, T.W.; Davies, K.; Kockelbergh, H.; Silcock, L.; Piechocki, K.; Fowler, A.; Soilleux, E.J. BTN2A1 and BTN3A1 as Novel Coeliac Disease Risk Loci: An In Silico Analysis. Int. J. Mol. Sci. 2025, 26, 10697. https://doi.org/10.3390/ijms262110697
