Comparative Genetic Association Analysis of Human Genetic Susceptibility to Pulmonary and Lymph Node Tuberculosis

Background: Tuberculosis (TB) manifests primarily in the lungs as pulmonary disease (PTB) and sometimes disseminates to other organs to cause extra-pulmonary TB, such as lymph node TB (LNTB). This study aimed to investigate the role of host genetic polymorphisms in immunity-related genes to find a genetic basis for such differences. Methods: Sixty-three single nucleotide polymorphisms (SNPs) in twenty-three TB-immunity-related genes, including eleven innate immunity genes (SLC11A1, VDR, TLR2, TLR4, TLR8, IRGM, P2RX7, LTA4H, SP110, DCSIGN and NOS2A) and twelve cytokine genes (TNFA, IFNG, IL2, IL12, IL18, IL1B, IL10, IL6, IL4, rs1794068, IL8 and TNFB), were investigated for genetic association in both PTB and LNTB as compared to healthy community controls. Serum cytokine levels were tested for association with the genotypes. Results: PTB and LNTB showed differential genetic associations. Genetic variants in the cytokine genes (IFNG, IL12, IL4, TNFB and IL1RA) and in TLR2 and TLR4 were associated with PTB susceptibility and cytokine levels but not with LNTB (p < 0.05). Conversely, genetic variants in LTA4H, P2RX7, DCSIGN and SP110 showed association with susceptibility to LNTB and not PTB. Pathway analysis showed an abundance of cytokine-related variants for PTB and apoptosis-related variants for LNTB. Conclusions: The PTB and LNTB outcomes of TB infection have a genetic component, which should be considered in future functional studies and in studies on susceptibility to pulmonary and extra-pulmonary TB.


Introduction
Tuberculosis (TB), a major health hazard worldwide, is characterized by different clinical manifestations, including localized infection in the lungs, or pulmonary TB (PTB), and various forms of extra-pulmonary TB (EPTB). PTB accounts for 80% of all forms of TB [1], while EPTB constitutes about 15-20% of all TB cases in immunocompetent individuals and 50% of cases in individuals infected with Human Immunodeficiency Virus (HIV) [2]. The most common form of EPTB is tubercular lymphadenitis (LNTB), with 50% of the cases involving the peripheral lymph nodes [3]. The basis of this variability in disease manifestation caused by the same infectious organism is unclear. It is not well understood why some individuals develop EPTB, in which the infection reaches other sites such as lymph nodes, while most have infection localized to the lungs.
The propensity for such different manifestations can be attributed to environmental exposures, pathogen virulence traits and the host genetics of the immune response. It is not known which of these factors is the most important. As India is an endemic country for tuberculosis, with the highest number of incident TB cases in 2021 [1], environmental exposures are unlikely to be the driving factor in this population. As for pathogen virulence traits, there is an association between the infectivity of the Mycobacterium tuberculosis (Mtb) strain and extra-pulmonary infections [4]. Pathogen

Study Population
Venous blood samples were collected from TB hospitals in and around New Delhi between 2009 and 2011. PTB cases were 15 years or older and were culture-confirmed or clinically diagnosed, with sputum smear microscopy for acid-fast bacilli (AFB), culture and chest X-ray data. Individuals included were not on any anti-tubercular therapy. For LNTB, patients were carefully selected to have only peripheral lymph node tuberculosis. Fine needle aspiration cytology (FNAC) was used for histological confirmation of granulomatous structure, and the FNAC sample was stained for AFB. Patients who were either histologically confirmed or AFB positive were considered for the LNTB group of the study. Patients with mixed EPTB and PTB infection were excluded from the study. All the enrolled patients were HIV negative. HIV-negative community controls were enrolled from in and around New Delhi. The controls were confirmed to have never been diagnosed with TB and had no family history of TB. For the discovery cohort for cytokine genes, we enrolled PTB, n = 110; LNTB, n = 35; and healthy controls (HC), n = 78. For the validation cohort for 9 SNPs from six cytokine genes, the sample size was PTB, n = 160; LNTB, n = 50; and HC, n = 265. The validation control group included genotypes for additional ethnicity-matched controls (n = 135) obtained from the Indian Genome Variation Consortium database [25]. For the innate panel, we had PTB, n = 125; LNTB, n = 50; and HC, n = 125.
All patients and volunteers were informed about the study, and written informed consent was obtained from all study participants. This study (ID: 60(0081)/07/EMR-II) was approved by the Institutional Ethics Committee of Vallabhbhai Patel Chest Institute, University of Delhi.

SNP Selection
For the discovery panel, thirty-nine SNPs from twelve cytokine genes were selected. The SNPs were mostly in the intronic, exonic and 3′ UTR regions. As no data were available on the Indian population when the study was initiated, we relied on the HapMap database (www.hapmap.org accessed on 3 November 2022) to avoid selecting non-polymorphic loci. SNPs were selected based on the following criteria: 1. Reported frequency >10% in at least 3 world populations in the HapMap database; 2. Reported frequency >20% in at least 2 world populations in the HapMap database; 3. Average heterozygosity, a measure of genetic diversity at population scale, taken from dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP accessed on 3 November 2022). Average heterozygosity indicates the average proportion of heterozygous individuals across all the SNP data submitted to dbSNP, and considering it reduces the selection of non-polymorphic loci. We have previously used this strategy successfully to identify novel genetic associations for TB [26]. New associations identified from the discovery panel in this study were combined with the strong associations with PTB identified previously [26] and used for the validation panel.
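The two frequency criteria above can be sketched as a simple filter. This is an illustration only: the text does not state how criteria 1 and 2 were combined, so the sketch assumes by default that meeting either one is sufficient, and the function name, the `combine` parameter and the example frequencies are hypothetical. The heterozygosity criterion is left out because no threshold is given.

```python
def passes_frequency_criteria(pop_freqs, combine=any):
    """Check the two HapMap frequency criteria for one SNP.

    pop_freqs: mapping of population label -> reported allele frequency.
    combine:   `any` (assumed default: either criterion suffices) or `all`.
    """
    n_over_10 = sum(1 for f in pop_freqs.values() if f > 0.10)
    n_over_20 = sum(1 for f in pop_freqs.values() if f > 0.20)
    criterion_1 = n_over_10 >= 3   # >10% in at least 3 populations
    criterion_2 = n_over_20 >= 2   # >20% in at least 2 populations
    return combine([criterion_1, criterion_2])


# Hypothetical frequencies, for illustration only.
example = {"CEU": 0.12, "YRI": 0.15, "JPT": 0.11, "CHB": 0.04}
print(passes_frequency_criteria(example))               # meets criterion 1 only
print(passes_frequency_criteria(example, combine=all))  # fails criterion 2
```

Passing `combine=all` gives the stricter reading in which both criteria must hold.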

SNP Genotyping
Genomic DNA was extracted using the QIAamp DNA kit (Qiagen, Hilden, Germany). The concentrations of the DNA samples were determined by Nanodrop using the NanoQuant™ plate of the Infinite® Pro 200 system (Tecan, Männedorf, Switzerland), checked for purity on a 1% agarose gel and stored at −20 °C until further analysis. All the cytokine SNPs were genotyped using matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (Sequenom Inc., San Diego, CA, USA). Assays for all SNPs were designed using Spectro DESIGNER software (Sequenom Inc., San Diego, CA, USA) and genotyped using the iPLEX assays (www.sequenom.com/iplex accessed on 3 November 2022) as described previously [26].
The innate panel SNPs were typed using tetra-primer Amplification refractory mutation system (ARMS) PCR following the method of Ye et al. [27] with modifications optimized for our SNP panel.
Briefly, primers for each selected polymorphism were designed using primer design software available at http://cedar.genetics.soton.ac.uk/public_html/primer.html (accessed on 3 November 2022). The software has been optimized to include two deliberate mismatches in the inner primer sets, at the 3′ terminus and at position −2 from the 3′ terminus, to aid allele specificity. Each PCR reaction was carried out in a total volume of 10 µL containing 30 ng of template DNA, 10 pmol each of the outer and inner primers, 200 mM of dNTPs, an appropriate concentration of MgCl2 (2.5-3.5 nm), 20 mM Tris-Cl pH 8.4, 50 mM KCl, 0.05% (v/v) W1 and 0.5 units of thermostable Taq polymerase. The PCR conditions included 2 min at 95 °C, 1 min annealing (annealing temperatures differed according to the primers) and 1 min extension (72 °C), with an additional two-minute extension at 72 °C at the end of 35 cycles. Representative images of all genotyped polymorphisms are available in Figure S1A-O.
To confirm that genotypes detected by ARMS PCR match those from Sequenom genotyping, we selected a SNP (rs3212220) from IL12. This SNP was genotyped on the Sequenom platform as well as by ARMS-PCR for all individuals in the control group, and full concordance was observed (Table S1). The LTA4H gene SNP rs17525495 was typed using an allelic discrimination assay (catalog number: 4351379, Applied Biosystems) following the manufacturer's instructions.

ELISA for Serum Cytokine Measurement
Circulating cytokine levels in serum collected from the abovementioned cohort were quantified by enzyme-linked immunosorbent assay (ELISA) using cytokine kits following the manufacturer's instructions. Unknown values were read from a standard curve within its linear range.
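Reading unknowns from the linear range of a standard curve can be sketched as follows. The straight-line fit, the function names and the standard values are assumptions made for illustration; the curve model specified by each kit is not stated here and may differ.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_od(od, standards):
    """Convert an optical density to a concentration.

    standards: list of (concentration, optical_density) pairs that lie in
    the linear range. Readings outside the range spanned by the standards
    return None instead of being extrapolated.
    """
    concs, ods = zip(*standards)
    if not (min(ods) <= od <= max(ods)):
        return None
    slope, intercept = fit_line(concs, ods)
    return (od - intercept) / slope


# Hypothetical standards (pg/mL, OD), for illustration only.
standards = [(0, 0.05), (50, 0.30), (100, 0.55), (200, 1.05)]
print(concentration_from_od(0.80, standards))  # within range
print(concentration_from_od(1.50, standards))  # outside range -> None
```

Returning `None` for out-of-range readings keeps every reported value inside the calibrated interval.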

Statistical Analysis
Hardy-Weinberg equilibrium (HWE) was assessed in cases and controls for all tested variants using Haploview v 4.2 (http://www.broad.mit.edu/mpg/haploview/ accessed on 3 November 2022). The following stringent cut-offs in Haploview v 4.2 were used as filtering criteria for further analysis: a minimum genotyping rate of 75%, a minimum minor allele frequency of 0.0010 and an HWE p > 0.05 in controls. Samples and SNPs failing these criteria were excluded from further analysis. PLINK v 1.07 (http://pngu.mgh.harvard.edu/purcell/plink/ accessed on 3 November 2022) was used to correct for multiple comparisons using the Bonferroni method; in the validation panel, significance was judged on the corrected p-value. Haplotype blocks were generated using the algorithm of Gabriel et al., 2002 [28] as implemented in Haploview, which was also used for initial association testing. Genetic association testing was done using a 2 × 2 contingency table. Odds ratios and two-tailed p-values were calculated for alleles using GraphPad Prism (version 5.00 for Windows, Graph Pad Software, San Diego, CA, USA, www.graphpad.com accessed on 3 November 2022). A two-tailed p-value < 0.05 was considered statistically significant. The odds ratios were confirmed in PLINK v 1.07 using a general model with the Fisher's exact test option. Multidimensional scaling using pairwise identity-by-state distances, inferred from the genotypes of the 34 cytokine SNPs, was carried out in PLINK v 9.0. Raw distances were plotted on a 3D plot in R using the 'rgl' package.
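The main steps of this workflow (an HWE check, an allelic 2 × 2 test and Bonferroni correction) can be sketched in a few lines. This is a minimal illustration, not the code used by Haploview, PLINK or GraphPad Prism: the HWE check here is a chi-square goodness-of-fit test, which may differ from the test those tools apply, and all function names and counts are hypothetical.

```python
from math import comb, erfc, sqrt


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """Chi-square (1 df) goodness-of-fit p-value for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return erfc(sqrt(chi2 / 2))              # survival function of chi-square, 1 df


def allelic_odds_ratio(a, b, c, d):
    """Odds ratio for the table [[a, b], [c, d]] =
    [[case risk allele, case other allele], [control risk, control other]]."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value: sum of the probabilities of all tables
    with the same margins that are no more likely than the observed table."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical allele counts, for illustration only.
p_raw = fisher_exact_two_sided(30, 70, 15, 85)
print(allelic_odds_ratio(30, 70, 15, 85), p_raw, bonferroni(p_raw, 15))
```

A SNP would be retained when `hwe_chi2_p` in controls exceeds 0.05, and called significant in the validation panel when the corrected p-value falls below 0.05.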
For gene-gene interaction analysis, we applied semi-exhaustive testing for pairwise interaction using PLINK v 1.07. The --fast-epistasis option was used together with the --case-only option for this purpose. This has been described as a powerful approach by some workers. It provides a logistic regression test for interaction. This analysis exploits the fact that, under certain conditions, an interaction term in the logistic regression equation corresponds to dependency or correlation between the relevant predictor variables within the population of cases. It uses an allelic model for both main effects and interactions and requires that the genotypes are not otherwise correlated.
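The idea behind a case-only interaction test can be sketched as follows. The sketch follows the allele-count collapse described in the PLINK documentation for its fast-epistasis test, as far as we understand it; it is an illustration and not PLINK's implementation, and the function names and genotype counts are hypothetical.

```python
from math import erfc, exp, log, sqrt


def collapse_to_alleles(genotype_table):
    """Collapse a 3 x 3 table of joint genotype counts in cases into a
    2 x 2 table of allele counts.

    Rows are SNP 1 genotypes (AA, Aa, aa); columns are SNP 2 genotypes
    (BB, Bb, bb).
    """
    (a, b, c), (d, e, f), (g, h, i) = genotype_table
    n_AB = 4 * a + 2 * b + 2 * d + e
    n_Ab = 4 * c + 2 * b + 2 * f + e
    n_aB = 4 * g + 2 * h + 2 * d + e
    n_ab = 4 * i + 2 * h + 2 * f + e
    return n_AB, n_Ab, n_aB, n_ab


def case_only_interaction(genotype_table):
    """Return (odds ratio, Z, two-sided p) for allelic association between
    two SNPs among cases. All four collapsed cells must be non-zero."""
    n_AB, n_Ab, n_aB, n_ab = collapse_to_alleles(genotype_table)
    log_or = log((n_AB * n_ab) / (n_Ab * n_aB))
    se = sqrt(1 / n_AB + 1 / n_Ab + 1 / n_aB + 1 / n_ab)
    z = log_or / se
    p = erfc(abs(z) / sqrt(2))   # two-sided normal tail probability
    return exp(log_or), z, p


# Hypothetical joint genotype counts in cases, for illustration only.
cases = [[20, 10, 2], [10, 10, 10], [2, 10, 20]]
print(case_only_interaction(cases))
```

An odds ratio away from 1 among cases is read as interaction only if the two SNPs are independent in the source population, which is the assumption the case-only design relies on.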

SNP Targeted Pathway Analysis
We performed a SNP-targeted pathway analysis using the PANOGA protocol [29] as shown by Bakir-Gungor et al. [30]. The PANOGA protocol takes association information in the form of p-values and creates files that can be used as input for the jActiveModules application [32] in Cytoscape [31], which maps the genes containing the SNPs onto the whole human protein-protein interaction network and derives networks and sub-networks based on the query genes. The jActiveModules output, consisting of networks, is then used as input for the ClueGO app [33], in which gene annotations can be retrieved from various sources, including KEGG, WikiPathways [34] and the GO database for immunological, biological, molecular and other networks, to visualize the pathways at the gene-interaction scale. We chose WikiPathways to visualize the genes in the results.

The Study Population Was Devoid of Population Stratification
False-positive associations can arise as a result of population stratification [35]. To investigate any hint of population substructure, the self-reported ethnicity of each subject and his/her parents was carefully considered. To rule out population stratification, a multidimensional scaling (MDS) analysis was carried out on the genotyping data from the groups. It generated a single compact cluster without separation, indicating that the patients (PTB and LNTB) and control subjects (HC) were homogeneous, with no substructure (Figure 1).
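The input to such an MDS analysis is a matrix of pairwise genetic distances. A minimal sketch of an identity-by-state (IBS) distance is given below; the 0/1/2 genotype coding, the scaling and the function names are assumptions for illustration and may not match PLINK's exact definition. The MDS step itself is not shown.

```python
def ibs_distance(genotypes_1, genotypes_2):
    """IBS-based distance between two individuals.

    Genotypes are coded as minor-allele counts (0, 1 or 2); None marks a
    missing call. The distance is the mean number of alleles NOT shared
    per locus, scaled to [0, 1], over loci typed in both individuals.
    """
    shared_loci = [(x, y) for x, y in zip(genotypes_1, genotypes_2)
                   if x is not None and y is not None]
    if not shared_loci:
        raise ValueError("no loci genotyped in both individuals")
    mismatches = sum(abs(x - y) for x, y in shared_loci)
    return mismatches / (2 * len(shared_loci))


def distance_matrix(genotypes):
    """Symmetric matrix of pairwise IBS distances."""
    n = len(genotypes)
    return [[ibs_distance(genotypes[i], genotypes[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical genotypes at four SNPs for three individuals.
samples = [[0, 1, 2, 0], [0, 1, 1, None], [2, 1, 0, 0]]
for row in distance_matrix(samples):
    print(row)
```

If cases and controls came from distinct subpopulations, distances within groups would tend to be smaller than distances between groups, and the MDS plot would show separate clusters.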
Figure 1. Hamming distances as multidimensional scaling (MDS) co-ordinates are plotted on X, Y and Z-axes, to visualize genetic distance between the study groups.

Cytokine Genetic Variants Show Significant Allelic and Haplotypic Association in PTB and Not in LNTB
For analysis of the genetic association of the cytokine variants, we initially constituted a discovery panel of 39 SNPs, listed in Table S2. We genotyped these cytokine polymorphisms in 110 PTB cases, 78 HC and 35 LNTB cases. A combined (PTB + LNTB) comparison (Table 1) was followed by separate PTB and LNTB comparisons for association (Table 1), with the aim of identifying SNPs linked to differential susceptibility to PTB and LNTB. The 26 SNPs for PTB, 23 SNPs for LNTB and 24 SNPs for the combined group that passed the filtering criteria and were analyzed for allelic association are listed in Table S3. The significant and borderline significant associations are listed in Table 1.
When PTB and LNTB cases were analyzed together (combined), we could identify only one significant variant, rs1878672 of the IL10 gene, while certain others showed a trend for association, such as rs746868 of TNFB, rs1143643 of IL1B and rs419598 of IL1RA. In the PTB-only analysis, we could identify only one significant variant, rs1878672 of the IL10 gene, with the C allele showing a 3.4-fold risk of developing PTB. Certain others that showed a trend for association in the combined TB group, such as rs746868 of TNFB (1.6-fold risk) and rs1143643 of IL1B (3.2-fold risk), showed association in the PTB group, indicating that disease type has a bearing on susceptibility to TB. In the LNTB-only analysis, we could identify only one significant variant, rs1548216 of the IL6 gene, with the C allele showing a 4.4-fold risk of developing LNTB. Analysis of the gene structure in the combined analysis revealed two haplotype blocks formed by SNPs in TNFB and IL18 (Table 2). The haplotype frequencies did not differ significantly between cases and controls for either the PTB or the combined group (Table 2). While the combined and PTB analyses showed two combinations each for TNFB and IL8, LNTB showed four combinations for TNFB, one of which, TTC, showed a 4.4-fold risk of developing LNTB (Table 2). No correction for multiple testing was carried out at this stage of the analysis; the aim was not to discard SNPs prematurely but to select them for further validation in a larger sample. Having found cytokine gene polymorphism associations with PTB and, to a lesser extent, with LNTB, we wanted to validate these findings independently before drawing a conclusion. We selected 15 SNPs from eight cytokine genes for the validation panel (Table S2b). Seven of these SNPs were selected on the basis of association in the discovery panel (Tables 1 and S3), and eight came from our previous study of 25 SNPs (Table S2b) from six cytokine genes [26]. The replication sample comprised 160 PTB cases, 50 LNTB cases and 265 controls, giving the validation panel 91% power to detect an odds ratio of 1.8 or above for nine cytokine SNPs (Table S2b) [36].
Upon analysis of the 15 cytokine SNPs (Table 3), for PTB we found association for IFNG at rs1861493, IL4 at rs2853694 and IL12 at rs3212220 after Bonferroni correction for multiple testing. This replicated our previous findings that cytokine gene polymorphisms increase the risk of PTB [26]. In contrast, for LNTB two variants, rs2070874 of IL4 and rs2853694 of IL12, showed significance, but these associations were lost after correction for multiple testing (Table 3). Interestingly, out of the 7 SNPs from the discovery panel that were significantly associated (Tables 1 and 2), only rs3024498 of the IL10 gene achieved borderline significance (p = 0.07) (Table S4a), and the rest were not replicated in the validation cohort for LNTB. This could be related to the smaller size of the LNTB replication sample; these SNPs therefore need a larger sample for validation. Here, we could validate our previous cytokine gene association findings [26]. This analysis showed that cytokine genetic variants increase the risk of PTB but not LNTB. These observations support the argument that genetic polymorphisms play a critical role in the manifestation of TB as pulmonary or extra-pulmonary disease.

Gene-Gene Epistatic Interaction Analysis Reveals a Higher Risk for Cytokine Genes Majorly in PTB and Not LNTB
After determining that cytokine gene polymorphisms contributed to increased risk of PTB, we applied semi-exhaustive epistatic testing for pairwise interaction among the significantly associated SNPs from a previous panel [26] and the current cytokine gene validation panel to understand their genetic interaction. Thirteen significant interactions were identified and are listed in Table 4. Interestingly, the IL4 locus showed interaction in LNTB as well, highlighting a critical role for this SNP in the north Indian population (Table 4). This approach also identified some SNPs that were not associated in the single-locus analysis. IL1RA emerged as a gene with significant interactions with IL12, IL4 and TNFB genetic variants. Most of the IL1RA interactions were protective, with odds ratios <1. Only one interaction, between two of its own SNPs, showed an eight-fold risk (p = 3.066 × 10⁻⁵). This interaction could be important in defining the genetic susceptibility to TB. The other important player was IL4, which showed interaction with variants of IL12, IL1RA and TNFB. Interestingly, all the interactions of IL4, an anti-inflammatory cytokine, with proinflammatory cytokines such as IL12 and TNFB showed a very high risk (18-fold) with highly significant p-values. These genetic interactions enabled us to test the hypothesis that the disease outcome in tuberculosis can be due to interaction between cytokine gene polymorphisms. Also, many of the loci identified here were not significant in the single-variant association analysis. This analysis supported the finding that cytokine gene polymorphisms affect the outcome of PTB more than LNTB, adding evidence for the role of genetic polymorphisms in differential disease manifestation in TB.

Lack of Major Association of Cytokine Levels with Genotypes in LNTB
We have previously shown that cytokine levels are affected by genotype, and that individuals with PTB carrying certain genotypes secrete more or less of a cytokine into their serum [37]. Since we observed such stark differences in the association of cytokine gene polymorphisms between PTB and LNTB, we carried out a similar analysis for the LNTB samples in this study. Overall, LNTB showed higher levels of the cytokines as compared to the healthy controls (Figure 2A). Out of 34 SNPs tested, none of the cytokine genotypes except IL8 at rs3882891 showed any significant genotype-dependent difference in cytokine levels (Figure 2B), lending credibility to a major role of cytokine gene polymorphisms in PTB but not LNTB.

Innate Immunity Related Genes Are Majorly Associated with LNTB and Not PTB
Innate immunity forms the first line of defense, and multiple innate immune genes have been implicated in susceptibility to PTB in various populations of the world where TB is endemic [22]. Since we observed such stark differences in cytokine gene polymorphisms, we hypothesized that a number of these gene polymorphisms would be PTB or LNTB specific. Widely studied polymorphisms were selected for the study, as they offer a useful comparison with other world populations for TB susceptibility. The allelic association of the innate genes is listed in Table 5. The C allele of the P2RX7 −762 T/C variant (rs2393799) showed a 7-fold risk for the development of LNTB; this variant was only marginally associated with the risk of developing PTB. For rs3751143 we did not detect any association. Out of the three studied variants of the VDR gene, i.e., FokI (rs2228570), TaqI (rs731236) and BsmI (rs1544410), rs1544410 was found to be non-polymorphic (only one allele detected), and the other two polymorphisms were not associated with TB (either PTB or LNTB) risk or protection in this population. We did not find any association between either PTB or LNTB and the IRGM genetic variant rs9637876 (Table 5). The results indicated that NRAMP1/SLC11A1 gene polymorphic variants may not be associated with susceptibility to TB in the studied population. In fact, we could detect only a single genotype in all cases and controls: a CC genotype for rs3731865 and a heterozygous AG genotype for rs17235409. No haplotypes were observed. The G allele of the TLR2 genetic variant rs6265786 (Arg677Trp) showed a high risk for PTB but not for LNTB. Similar results were obtained for rs4986790 of the TLR4 gene, where the G allele showed a 2-fold risk of developing PTB.
For DCSIGN (CD209), allele 'A' of rs4804803 was overrepresented in PTB cases as compared to healthy controls, showing a highly significant association that conferred a 4-fold risk of developing PTB in north Indians and a 1.9-fold risk in LNTB cases (Table 5). Neither of the two tested NOS2A variants, rs2274894 and rs7215373, showed a significant association in PTB or LNTB, although rs7215373 showed marginal association for both. Of the three polymorphisms studied in the LTA4H gene (rs1978331, rs2660898 and rs17525495), none showed allelic association in either PTB or LNTB. LTA4H gene polymorphisms have been shown to provide heterozygous protection, implying that a heterozygous genotype is protective against TB [12]. When we compared the heterozygous genotypes with the homozygotes, as proposed by Tobin et al., we observed that, of the three typed variants, rs1978331 had a protective association (odds ratio < 1) in the combined and LNTB groups but not in PTB (Table 6). Similar odds were observed for the rs1978331-rs2660898 haplotype: when both SNPs are heterozygous, they are borderline protective for LNTB and not PTB (Table S4). We have previously shown that SP110 gene polymorphisms were associated with risk of LNTB and not PTB in this population [19]. To continue exploring this gene in an independent cohort, we genotyped the SP110 variants rs6436915, rs1346311 and rs7580900. As shown previously, none of these showed any allelic association in PTB (Table 5). Due to limited independent samples for LNTB, these variants were not genotyped in that group.
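The heterozygote-versus-homozygote comparison described above for LTA4H amounts to collapsing the three genotype classes into two before computing an odds ratio. A minimal sketch is given below; the function name and the genotype counts are hypothetical, and the confidence interval uses the standard log-odds (Woolf) approximation, which is an assumption about presentation rather than the method used for Table 6.

```python
from math import exp, log, sqrt


def het_vs_hom_odds_ratio(case_counts, control_counts):
    """Odds ratio for carrying a heterozygous genotype, cases vs controls.

    Each argument is a tuple (hom_major, het, hom_minor) of genotype counts.
    Both homozygous classes are pooled. Returns the odds ratio and an
    approximate 95% confidence interval. All pooled cells must be non-zero.
    """
    case_het = case_counts[1]
    case_hom = case_counts[0] + case_counts[2]
    ctrl_het = control_counts[1]
    ctrl_hom = control_counts[0] + control_counts[2]

    odds_ratio = (case_het * ctrl_hom) / (case_hom * ctrl_het)
    se = sqrt(1 / case_het + 1 / case_hom + 1 / ctrl_het + 1 / ctrl_hom)
    lower = exp(log(odds_ratio) - 1.96 * se)
    upper = exp(log(odds_ratio) + 1.96 * se)
    return odds_ratio, (lower, upper)


# Hypothetical genotype counts, for illustration only.
print(het_vs_hom_odds_ratio((20, 15, 15), (30, 60, 35)))
```

An odds ratio below 1 with an interval that excludes 1 would correspond to the heterozygous protection reported for rs1978331.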
Genes 2023, 14, x FOR PEER REVIEW
Since we observed a uniform TLR gene polymorphism risk for both PTB and LNTB, we also genotyped four TLR8 gene polymorphisms, as their genetic association has been shown to be important for the outcome of TB. Uniquely, TLR8 is located on the X chromosome, so males, who have only one copy of the X chromosome, are hemizygous. Analyzing the risk in males and females separately revealed, as expected, a higher risk for males carrying the A allele of rs3788935 (17-fold risk) and rs3761624 (4-fold risk). A risk for females was also detected for rs3761624, but it was lost after correction for multiple testing. If the A allele is a risk factor for males, who carry only one copy, the corresponding homozygous genotype can be a risk factor for females too, as seen for rs3761624. The LNTB group (n = 50) was too small for an analysis stratified by sex, so this was not carried out for LNTB (Table 7). Interestingly, in a case vs control analysis not stratified by sex, PTB showed an increased risk for three TLR8 variants (rs3788935, rs3761624, rs3764880) and LNTB for two (rs3761624, rs3764879) among the four variants typed (Table S5).
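Because males are hemizygous for X-linked loci such as TLR8, allele counting has to be done separately by sex. The sketch below illustrates this bookkeeping only; the data layout, the function name and the example genotypes are hypothetical.

```python
def x_linked_allele_counts(individuals, risk_allele="A"):
    """Count risk and non-risk alleles at an X-linked SNP, stratified by sex.

    individuals: iterable of (sex, genotype) pairs, where sex is "M" or "F"
    and genotype is a string of alleles: one letter for hemizygous males,
    two letters for females.
    Returns {"M": [risk, other], "F": [risk, other]}.
    """
    counts = {"M": [0, 0], "F": [0, 0]}
    for sex, genotype in individuals:
        expected_copies = 1 if sex == "M" else 2
        if len(genotype) != expected_copies:
            raise ValueError(f"unexpected genotype {genotype!r} for sex {sex!r}")
        for allele in genotype:
            counts[sex][0 if allele == risk_allele else 1] += 1
    return counts


# Hypothetical genotypes, for illustration only.
cases = [("M", "A"), ("M", "G"), ("F", "AG"), ("F", "AA")]
print(x_linked_allele_counts(cases))
```

The counts for one sex in cases and in controls form the 2 × 2 table from which a sex-specific odds ratio is calculated.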

Pathway Analysis Reveals an Apoptotic Axis for LNTB and a Cytokine Axis for PTB
SNP associations and their respective p-values from the study were used as input to identify associated modules from a protein-protein interaction network, which were used to identify the associated pathways, and the results obtained were subjected to gene-ontology annotation using WikiPathways [29,30]. All the SNPs significantly associated in this study and in previous studies on this population [19,26] were considered. An abundance of cytokine pathways was seen for PTB (Figure 3A), while an abundance of apoptosis modules was seen for LNTB (Figure 3B). This pathway-level difference is in line with the genetic association findings presented above, suggesting that these differential genetic associations could contribute to differential pathway activation and hence activate the immune response distinctly. This adds another level of evidence that differential host genotypes are responsible for the different manifestations of PTB and LNTB.

Discussion
Human genetic diversity is hugely impacted by co-evolving pathogens such as Mtb [38]. Candidate gene studies using the case-control design provide one of the most direct means of identifying human genetic variants that currently affect susceptibility to infectious disease. Such information would improve the understanding of disease pathogenesis and disease resistance at an individual level and could inform targeted, genotype-based intervention strategies, as has been successfully implemented [12].
Several studies have shown that TB susceptibility has a genetic component (summarized in [22,39]), but comparative studies on genetic susceptibility to different forms of TB are limited [21,40]. Such studies can provide insight into the role of genetic polymorphisms in different manifestations of TB. Even rarer are studies on genetic susceptibility to EPTB. The forms of EPTB studied so far include EPTB involving multiple sites [16], LNTB [19,41,42], TB meningitis [42,43], intestinal TB [44], bone TB [42] and pleural TB [42]. In this study, we aimed to compare PTB and LNTB to investigate differential genetic associations between them. We tested several candidate gene polymorphisms never investigated before for association with differential TB susceptibility (Table S1). In addition, we validated susceptibility loci previously identified in other populations [22,39] and in our previous studies [19,26]. This is important, as ethnic validation of commonly reported genetic variants in different populations is desirable. In total, 63 polymorphisms across 23 genes from both the innate and adaptive branches of immunity to TB were selected and genotyped in the north Indian population; their allele frequencies were compared, and linkage disequilibrium (LD) and haplotypes were investigated. Thus, we have employed a comprehensive coverage of SNPs and genes to compare the genetic susceptibility differences between PTB and LNTB.
In our study, genetic variants in the cytokine genes were validated as significant risk factors for PTB (Table 3). We also showed that significant gene-gene interactions among cytokine SNPs may further accentuate the importance of the identified SNPs in governing genetic susceptibility to PTB. Interestingly, we did not find cytokine gene polymorphisms significantly associated with LNTB. An important difference was the lack of a major association between cytokine SNPs and serum cytokine levels in LNTB; such an association has been shown in PTB in multiple studies [37,45-47]. This difference suggests that PTB and LNTB susceptibility have distinct genetic coordinates. The highly enriched cytokine pathways in PTB, and their limited enrichment in LNTB (Figure 3), add strength to this argument.
Interestingly, the innate immunity genes P2RX7 [48] and DCSIGN [49-51], which are critical for the immune response to Mtb, were risk factors for both PTB and LNTB, as expected. P2RX7 has been very widely studied as a risk factor for both PTB and EPTB. Macrophages from EPTB patients homozygous for the loss-of-function allele of rs3751143 could not kill Mtb in vitro [16]. We did not observe any association with this variant in our study. Interestingly, another functional variant, rs2393799, showed an increased risk for both PTB and LNTB, but the risk was much higher for LNTB (7-fold as compared to 1.5-fold for PTB). P2RX7 is known to have a role in apoptosis of Mtb-infected macrophages [24]. A similar theme was seen for the SP110 gene, for which we previously identified a risk for rs1427294 in LNTB but not in pulmonary TB [19]. Recently, this gene has been shown to inhibit apoptosis of infected macrophages, thereby resisting Mtb infection [23]. This, in conjunction with the identification of more apoptotic pathways, suggests that the apoptotic axis may be important in LNTB. The other genetic variants of importance, in the LTA4H gene, showed heterozygous protection in LNTB and not PTB (Table 6). We therefore show in the current study that genetic variations in the innate immune genes have a closer relation to the development of LNTB, whereas the cytokine genetic variants have little influence on, and few associations with, LNTB. Similarly, among the pattern recognition receptors, TLR2 and TLR4 showed risks for PTB but not for LNTB. TLR8 genetic variants showed risk for both PTB and LNTB, with more risk for males in PTB. This adds to the theme of differential association between PTB and LNTB. TLR variants have been shown to be critical risk factors for TB [52-55] but have not been studied for LNTB. Although the analysis is limited by sample size, there appear to be differences in TLR gene polymorphisms between PTB and LNTB.
Similar to our study, a few other studies have shown a selective genetic association with EPTB, for example, like our study (  [42]. Similarly, a GWAS could identify 4 loci that were associated only with EPTB and not PTB [21]. These studies support the view that genetic polymorphism in EPTB is distinct from that in PTB. Similar studies in larger samples and in multiple populations are warranted to test whether genetic polymorphisms associate with the various forms of tuberculosis. A limitation of this study is that for certain polymorphisms we could not achieve an adequate sample size, and the results therefore need to be validated in a larger sample.

Conclusions
Our study contributes to the growing knowledge that PTB and EPTB manifestations have a genetic basis. The highlight of our study was finding more polymorphic cytokine genes in PTB and more polymorphic apoptosis/innate genes in LNTB.