Host-Pathogen Interactions in K. pneumoniae Urinary Tract Infections: Investigating Genetic Risk Factors in the Taiwanese Population

Background: Klebsiella pneumoniae (K. pneumoniae) urinary tract infections pose a significant challenge in Taiwan, largely because of growing concerns about the antibiotic resistance of K. pneumoniae. This study therefore aimed to uncover potential genomic risk factors in Taiwanese patients with K. pneumoniae urinary tract infections through a genome-wide association study (GWAS). Methods: Genotyping data were obtained from participants with a history of urinary tract infections enrolled at the Tri-Service General Hospital as part of the Taiwan Precision Medicine Initiative (TPMI). A case-control GWAS was designed to detect potential susceptibility single-nucleotide polymorphisms (SNPs) in patients with K. pneumoniae-related urinary tract infections. The associated genes were determined using a genome browser, and their expression profiles were validated via the GTEx database. The GO, Reactome, DisGeNET, and MalaCards databases were also consulted to determine further connections among these genes in terms of biological functions, molecular pathways, and associated diseases. Results: The analysis identified 11 genetic variants with higher odds ratios in cases than in controls. These variants are implicated in processes such as adhesion, protein depolymerization, Ca2+-activated potassium channels, SUMOylation, and protein ubiquitination, which could potentially influence the host immune response. Conclusions: This study suggests that certain risk variants may be linked to K. pneumoniae infections by affecting diverse molecular functions that can potentially impact host immunity. Additional research and follow-up studies are necessary to elucidate the influence of these risk variants on infectious diseases and to develop targeted interventions for mitigating the spread of K. pneumoniae urinary tract infections.


Introduction
Urinary tract infection (UTI) is the most common bacterial infection worldwide. It is also an important issue in hospitalized patients because of increasing antibiotic resistance and increased morbidity and mortality in immunocompromised patients or those with cancer [1]. In both ambulatory and hospitalized patients, K. pneumoniae is the second most common bacterial pathogen associated with UTIs around the world, and the frequency of K. pneumoniae infection has increased in hospitals and nursing homes [2]. Although K. pneumoniae is considered an opportunistic pathogen, several specific capsular serotypes with increased production of capsule polysaccharide are considered hypervirulent K. pneumoniae. In addition to the increased production of capsule polysaccharide, which helps the bacteria resist phagocytosis, these hypervirulent strains are believed to affect host immune reactions and cause invasive infections [3]. K. pneumoniae strains capable of hydrolyzing several kinds of antibiotics, especially carbapenems, are the so-called Klebsiella pneumoniae carbapenemase producers; their resistance to multiple antibiotics makes them a threat to vulnerable patients [4] and causes considerable healthcare costs [2].
To further understand the possible risk factors linking K. pneumoniae and its hosts, we used a genome-wide association study (GWAS) to identify common genetic factors in patients with UTIs caused by K. pneumoniae.

Study Participants and Ethical Approval
All participants in this study were recruited from the Tri-Service General Hospital (Taipei, Taiwan; TSGH) to join the Taiwan Precision Medicine Initiative (TPMI) [5]. The TPMI is led by Academia Sinica (Taipei, Taiwan) in partnership with 16 top medical centers in Taiwan and aims to establish a database consisting of comprehensive clinical data and the genetic profiles of one million Taiwanese Han participants. Participants were recruited from the medical centers and genotyped at Academia Sinica. The protocol of this study was reviewed and approved by the Institutional Review Board of the Tri-Service General Hospital (NO.: 2-108-05-038).
The profiles of all patients included in our study are detailed in Table 1. Additionally, we have listed several underlying diseases that are considered to increase vulnerability to UTI, including diabetes mellitus, malignancy, chronic kidney disease, and a history of urinary tract stones. These medical histories were documented for each patient at the time of enrollment in the TPMI project.

Pathogen Identification in a Patient with UTI
Urine samples for bacterial culture were collected at the time the patients were transferred to our hospital because of UTI-related symptoms. All urine culture results were confirmed by MALDI-TOF with VITEK® MS (bioMérieux, Marcy-l'Étoile, France).

Genotyping
Genotyping of TPMI participants was performed as follows: First, approximately 3 mL of peripheral blood per participant was collected into EDTA vacutainers. Genomic DNA was extracted from blood cells using the QIAsymphony™ SP standard protocol (Qiagen, Hilden, Germany). Next, variants in the genomic DNA were characterized using a customized SNP array called the Axiom Genome-Wide TPM plate, which was developed by Academia Sinica (Taipei, Taiwan) and Thermo Fisher Scientific Inc. (Waltham, MA, USA).

GWAS Analysis
The steps of the GWAS are presented in Figure 1. A total of 1860 patients with UTI were selected from among the TSGH TPMI participants. After removing unqualified samples, 1825 patients remained. The 62 patients with UTIs caused by K. pneumoniae were defined as the case group, and the remaining 1763 patients with UTIs caused by other pathogens, such as E. coli, were included in the control group (Table 1). Genotyping data from the TPM array were first filtered to remove SNPs with a low call rate (<80%) and then subjected to association analysis by chi-squared test (case group vs. control group) using PLINK 1.9 (https://zzz.bwh.harvard.edu/plink/ (accessed on 16 January 2023)) software [6]. Low-quality variants (minor allele frequency less than 0.05 or Hardy-Weinberg equilibrium p-value less than 1 × 10⁻⁶) were removed using PLINK, and variants with highly significant p-values (less than 1 × 10⁻⁵) were selected for further analysis. For linkage disequilibrium (LD) analysis, the GWAS results were loaded into LocusZoom [7] to observe the LD relationship of each variant.
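The case-versus-control comparison described in this section can be illustrated with a minimal Python sketch of an allelic chi-squared test on a 2 × 2 table of allele counts. This is not the PLINK implementation used in the study. The function name allelic_chi2 and the allele counts in the usage lines are hypothetical, chosen only to be consistent with the group sizes (62 cases and 1763 controls, i.e., 124 and 3526 alleles), and no continuity correction is assumed.

```python
import math


def allelic_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-squared test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value, odds_ratio). No continuity correction is applied.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_sums)
    if min(row_sums + col_sums) == 0:
        raise ValueError("every row and column must contain at least one allele")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper tail of the chi-squared distribution with 1 df:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))

    # Odds of carrying the alt allele in cases relative to controls.
    denominator = case_ref * ctrl_alt
    odds_ratio = (case_alt * ctrl_ref) / denominator if denominator else math.inf
    return chi2, p_value, odds_ratio


# Hypothetical allele counts for one SNP (illustration only, not study data).
chi2, p_value, odds_ratio = allelic_chi2(case_alt=40, case_ref=84,
                                         ctrl_alt=500, ctrl_ref=3026)
print(f"chi2={chi2:.2f}, p={p_value:.2e}, OR={odds_ratio:.2f}")
```

An odds ratio above 1 in this layout means the alt allele is more frequent among cases than among controls, which is the sense in which the risk variants are reported.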

Variants from GWAS Analysis
After GWAS analysis, the filtered case and control groups (case group: 62 patients; control group: 1763 patients) were retained, together with 241,217 SNPs that passed the SNP call rate, minor allele frequency, and Hardy-Weinberg equilibrium filters. There were 13 significant variants obtained from the case group, and the Manhattan plot of the GWAS results is presented in Figure 2. The variants from the case group had higher odds ratios than those in the control group (Table 2), and the allele frequencies of most SNPs (rs10411896, rs11672710, rs12313615, rs61875193, rs62126347, rs62126348, rs76541491, and rs117166327) were similar to those of the East Asian and Taiwan Han (TPMI and Taiwan Biobank) populations and differed from those of other populations (Table 3).
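The three per-SNP quality filters named here (call rate, minor allele frequency, and Hardy-Weinberg equilibrium) can be sketched as follows. The thresholds are the ones stated in the Methods (call rate of at least 80%, minor allele frequency of at least 0.05, Hardy-Weinberg p-value of at least 1 × 10⁻⁶). The function name snp_qc and the genotype counts are hypothetical. PLINK itself uses an exact test for Hardy-Weinberg equilibrium; the chi-squared goodness-of-fit test below is a simpler stand-in and is an assumption of this sketch.

```python
import math


def snp_qc(n_aa, n_ab, n_bb, n_missing,
           min_call_rate=0.80, min_maf=0.05, min_hwe_p=1e-6):
    """Apply call-rate, MAF, and Hardy-Weinberg filters to one biallelic SNP.

    n_aa, n_ab, n_bb are genotype counts; n_missing is the number of samples
    without a genotype call. Returns a dict with the metrics and a pass flag.
    """
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return {"call_rate": 0.0, "maf": 0.0, "hwe_p": 0.0, "passed": False}

    call_rate = n_called / (n_called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1.0 - freq_a)

    # Expected genotype counts under Hardy-Weinberg proportions.
    expected = [n_called * freq_a ** 2,
                2 * n_called * freq_a * (1.0 - freq_a),
                n_called * (1.0 - freq_a) ** 2]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Chi-squared approximation with 1 df (stand-in for PLINK's exact test).
    hwe_p = math.erfc(math.sqrt(chi2 / 2.0))

    passed = (call_rate >= min_call_rate
              and maf >= min_maf
              and hwe_p >= min_hwe_p)
    return {"call_rate": call_rate, "maf": maf, "hwe_p": hwe_p, "passed": passed}


# Hypothetical genotype counts (illustration only, not study data).
print(snp_qc(n_aa=490, n_ab=420, n_bb=90, n_missing=0))
```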
According to the RefSeq results, these risk alleles belonged to the genes C12orf75, CASC18, HSPBP1, IQSEC1, KCNN3, MAGEC2, MICAL2, NUP210, PTCHD1-AS, SPANXN4, SUSD5, and TEX14 in the case group (Table 2). Interestingly, MIR4435-2HG and PTCHD1-AS are non-coding RNAs, which regulate target genes without encoding proteins.
Further analysis using linkage disequilibrium (LD) showed that, in the case group, the variant rs61875193 was highly correlated with rs58910113 (Figure 3A), and rs73387413 was likewise highly associated with rs12313615 (Figure 3B). Surprisingly, rs11672710 was in high LD not only with rs62126347, rs62126348, and rs10411896 in HSPBP1 but also with rs4806651 and rs4806653 in PPP6R1 and rs4337407 in TMEM86B (Figure 3C).
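As an illustration of what a high LD r² between two variants means, the sketch below estimates r² as the squared Pearson correlation of genotype dosages (0, 1, or 2 copies of the alt allele per person). LocusZoom reports LD from a reference population panel, so this direct, unphased estimate is only an approximation and is not the procedure used in the study. The function name ld_r2 and the dosage vectors are hypothetical.

```python
def ld_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between the dosages of two SNPs.

    Each input is a sequence of 0/1/2 alt-allele counts, one per individual,
    with None marking a missing call. Pairs with a missing call are skipped.
    """
    pairs = [(a, b) for a, b in zip(dosages_a, dosages_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two individuals with both genotypes")

    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        raise ValueError("r^2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)


# Hypothetical dosages for eight individuals (illustration only).
lead_snp = [0, 1, 2, 1, 0, 0, 1, 2]
nearby_snp = [0, 1, 2, 1, 0, 1, 1, 2]
print(f"r^2 = {ld_r2(lead_snp, nearby_snp):.3f}")
```

Values close to 1 indicate that the two variants are nearly always inherited together, so an association signal at one cannot be separated statistically from the other.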

Functional Annotations of Risk Genes
Table 4 lists the 35 significant GO terms that were identified, and Figure 4A groups them into 14 classes: carbohydrate catabolism, cation-potassium transport, cell-cell interaction, heart development, keratinocyte migration, lipid metabolism, mitotic phase, ncRNA transport, protein depolymerization, protein phosphoprotein, protein ubiquitination, purine metabolism, RNA catabolism, and tRNA catabolism. These biological functions relate to cell growth, metabolism, migration, and interactions. Notably, several genes are involved in Ca2+/K+ regulation (KCNN3), protein ubiquitination (HSPBP1 and MAGEC2), and the breakdown of protein polymers (MICAL2), which might be connected to K. pneumoniae infections. TMEM86B and PPP6R1, which were identified from the LD analysis, play roles in lipid metabolism and phosphatase activity, respectively, which may affect immune cell activity and susceptibility to K. pneumoniae infection.

Pathway Analysis
A search of the Reactome database identified 26 significant pathways (Table 5), which are grouped into 11 classes in Figure 4B: cation-potassium transport, MET, mRNA metabolism, mRNA transport, NS1/NS2 protein, nuclear pore complex, phosphatidylcholine acyl chain, Rev/Vpr protein, ribonucleoproteins, SUMOylation, and TPR in papillary thyroid carcinoma. Many of these pathways involve mRNA regulation and SUMOylation through the gene NUP210, which may alter the immune response to UTIs caused by K. pneumoniae. The pathways identified for KCNN3 (Ca2+-activated K+ channels) and TMEM86B (acyl chain remodeling of phosphatidylcholine) agree with the GO results, suggesting that patients with abnormal cation-potassium transport or abnormal lipid metabolism may be at high risk of K. pneumoniae infection.
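The text reports GO and Reactome terms at p-value < 0.05 but does not state which test produced these p-values. A common choice for this kind of over-representation analysis is a one-sided hypergeometric test, sketched below purely as an assumption about how such a p-value can be obtained. The function name enrichment_p and all counts in the usage lines are hypothetical.

```python
from math import comb


def enrichment_p(n_background, n_in_term, n_query, n_overlap):
    """One-sided hypergeometric p-value for over-representation.

    n_background: genes in the annotated background
    n_in_term:    background genes annotated to the term or pathway
    n_query:      genes in the submitted list
    n_overlap:    submitted genes annotated to the term or pathway
    Returns P(X >= n_overlap) when n_query genes are drawn at random.
    """
    max_overlap = min(n_in_term, n_query)
    favourable = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_query - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return favourable / comb(n_background, n_query)


# Hypothetical counts (illustration only): 15 submitted genes, 2 of which
# fall in a term that annotates 40 of 20,000 background genes.
p = enrichment_p(n_background=20000, n_in_term=40, n_query=15, n_overlap=2)
print(f"p = {p:.2e}, significant at 0.05: {p < 0.05}")
```

With a short gene list, a single shared gene can be enough to reach p < 0.05 for a small term, which is one reason enrichment results from few genes are usually read as hypotheses rather than confirmed mechanisms.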

Investigation of Risk Gene Expressions
According to the Protein Atlas results in Table 6, RNA expression of the genes C12orf75, FAM209B, HSPBP1, IQSEC1, KCNN3, LSM3, MICAL2, NUP210, PPP6R1, SOCAR, SUSD5, TMEM86B, and TNS3 was detected in immune cells. In particular, some of these genes, such as C12orf75, FAM209B, KCNN3, MICAL2, PPP6R1, and TNS3, showed high immune cell specificity. Combined with the GO and Reactome analysis results, this suggests that the variants may further regulate the activity of immune cells. Unfortunately, few expression patterns were recorded in the GTEx database for these variants; only rs10411896 in HSPBP1 and rs4337407 in TMEM86B were associated with changes in gene expression in whole blood compared with the non-mutated alleles (Figure 5A: HSPBP1 expression decreased with the rs10411896 variant; Figure 5B: TMEM86B expression increased with the rs4337407 variant). These findings are worthy of further study to determine the relationship between the effects of alleles specific to the Taiwan Han population on immune cells and K. pneumoniae infection in the urinary tract.
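The eQTL criteria given in the Figure 5 legend (p-value < 0.01, and an m-value of at least 0.9 as the indication that a tissue is predicted to have an eQTL effect) can be expressed as a simple filter. Applying both criteria together is an assumption of this sketch, as are the record type TissueEqtl, the function select_eqtls, and every number in the usage lines; none of these are values from GTEx or from the study.

```python
from dataclasses import dataclass


@dataclass
class TissueEqtl:
    tissue: str
    nes: float      # normalized effect size of the alt allele vs the ref allele
    p_value: float
    m_value: float  # posterior probability that the eQTL effect exists


def select_eqtls(records, max_p=0.01, min_m=0.9):
    """Keep tissues with p < max_p and m-value >= min_m."""
    return [r for r in records if r.p_value < max_p and r.m_value >= min_m]


def direction(record):
    """Sign of the NES: the magnitude itself has no direct biological reading."""
    return "decreased" if record.nes < 0 else "increased"


# Hypothetical records (illustration only, not GTEx values).
records = [
    TissueEqtl("Whole Blood", nes=-0.21, p_value=2e-4, m_value=1.0),
    TissueEqtl("Lung", nes=-0.05, p_value=0.30, m_value=0.4),
    TissueEqtl("Spleen", nes=0.12, p_value=0.004, m_value=0.7),
]
for r in select_eqtls(records):
    print(r.tissue, direction(r))
```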

Disease Association Analysis
K. pneumoniae is a Gram-negative bacterium commonly found in the human gut and the environment, and it is also known to be a major cause of healthcare-associated infections, especially pneumonia, bloodstream infections, and urinary tract infections. Previous studies have reported that K. pneumoniae infection is a rare cause of community-acquired pneumonia (CAP) in North America, Europe, and Australia [15]. However, investigations in eight Asian countries, including Taiwan, have reported that K. pneumoniae is highly prevalent [16]. Several drug-resistant K. pneumoniae variants, such as carbapenem-resistant K. pneumoniae (CRKP), extended-spectrum beta-lactamase (ESBL)-producing K. pneumoniae, and hypervirulent K. pneumoniae (hvKp), also pose a significant public health threat because they are difficult to treat and are associated with higher mortality rates. Given the widespread resistance to antibiotics and the possibility of more severe infections, it is crucial to understand the specific resistance patterns and infectious pathogenesis, which can contribute to improving disease prevention, infection control, and further antibiotic development.
Patients with chronic diseases such as diabetes, cancer, and chronic kidney disease are believed to have a higher risk of developing K. pneumoniae UTIs [17]. Our research revealed no significant differences between the case group and the control group in the proportion of underlying conditions and diseases, which included diabetes mellitus, malignancy, chronic kidney disease, and other chronic viral infections, but we did observe significant differences in certain genetic variants. This suggests that which pathogen causes the urinary tract infection (UTI) in these patients might be more closely related to their genes than to their underlying conditions.
Nevertheless, all the chronic diseases mentioned above have a prolonged and enduring course, and predicting whether patients in either group will develop these diseases later on remains challenging. Even so, this information could still provide valuable insights into the potential pathogenesis of UTI with K. pneumoniae infection.

Genotyping
By October 2022, the TPMI team had 382,259 genotyped samples. This genomic information is also combined with other clinical records, such as ICD-10 disease codes, laboratory tests, medications (drug usage), vital signs, image descriptions, and operation notes, which makes the TPMI the largest genetic health association study compared to other similar projects held in the United States or Japan. The TPMI SNP array was modified from the Axiom Genome-Wide SNP Array Plate and can test approximately 130 thousand known risk variants, 580 thousand mapping SNPs, and 20 thousand copy number variant markers based on Taiwanese reference genomes and Taiwan Biobank whole genome sequencing data.
For most of the risk variants identified by the GWAS in this study, allele frequencies were highest in the East Asian population, which were also close to those of the Taiwanese population, compared with other populations. This pattern was also observed for the variants rs4337407, rs4806651, rs4806653, and rs58910113 from the LD analysis, which not only showed a higher LD association with the lead variants than in other populations but also had higher minor allele frequencies in African or Asian population groups. Some variants, namely rs530922, rs1034726, rs6678353, rs12687449, and rs117166327, have no or only rare frequency records in published databases and may be Taiwan-specific risk variants for K. pneumoniae infection.
K. pneumoniae is an opportunistic pathogen that mainly affects hosts whose immune systems are compromised or weakened by other infections; host defense relies on phagocytosis by epithelial cells, macrophages, neutrophils, and dendritic cells (DCs) [4]. As shown in Table 6 and Figure 5, the genes C12orf75, FAM209B, HSPBP1, IQSEC1, KCNN3, LSM3, MAGEC2, MICAL2, NUP210, PPP6R1, SOCAR, SPANXN4, SUSD5, TMEM86B, and TNS3 are expressed in immune cells, and GTEx analysis showed that HSPBP1 and TMEM86B expression could be altered by specific genetic variants. Thus, our findings can also contribute to further clarifying the role of these risk genes in the relationship between K. pneumoniae infection and immune system defects.
Galeas-Pena reported that K. pneumoniae infection activates nuclear factor kappa B (NF-κB) signaling, promoting the recruitment of immune cells [18]. The gene PPP6R1, which plays an important role in protein phosphorylation, has been shown to limit the activation of the NF-κB pathway by reducing IκBε phosphorylation [19]. This mechanism is probably caused by Slfn2 interacting with PPP6R1, leading to reduced type I IFN-induced activation of NF-κB signaling [20]. PPP6R1 is also known as a regulatory subunit of PP6, which can negatively regulate NF-κB signaling [21], highlighting the importance of phosphatase activity in the immune activation process during K. pneumoniae infection.
Interactions between bacteria and the host cell surface are central to pathogen infection. Previous reports have demonstrated that alterations in focal adhesion and the actin cytoskeleton play important roles in the bacterial invasion of host cells [22]. In addition, Hsu et al. indicated that Rho is involved in the activation of focal adhesion through the phosphorylation of focal adhesion kinase, which could affect the induction of cytoskeleton rearrangements. The same research also indicated that Cdc42 and the PI3K/Akt pathway are activated to induce cytoskeleton rearrangement upon K. pneumoniae adhesion [23]. The functional annotations of IQSEC1 and MICAL2, which include focal adhesion and actin filament depolymerization, suggest that mutations in both genes could increase the risk of K. pneumoniae invasion.
In addition to the innate immune perturbations associated with K. pneumoniae, dysregulation of electrolyte homeostasis [24] and protein ubiquitination may also affect host immunity. Immune responses activated by ion channels and transporters for calcium, magnesium, sodium, potassium, and zinc have been reported, and pathogen infection also requires changes in ion equilibrium in the cellular environment [25]. According to studies of Galleria mellonella, which is used as a model organism for K. pneumoniae infection, the amounts of calcium, potassium, magnesium, and phosphorus are altered during K. pneumoniae infection [26]. Zhang also indicated that the calcium signaling pathway increased macrophage activity resulting from K. pneumoniae infection [27], and further research showed that TRPC1 (transient receptor potential channel 1) could mediate Ca2+ entry and activate NF-κB/Jun, leading to a proinflammatory response [28]. Thus, mutation of KCNN3, which belongs to the Ca2+-activated potassium channels, may increase Ca2+ accumulation in living cells through TRPC1 and raise Ca2+ levels during K. pneumoniae infection.
Research has shown that K. pneumoniae can interfere with or suppress host immunity and inflammatory reactions by affecting the ubiquitination of various cellular signals [29,30]. Moreover, small ubiquitin-related modifier (SUMO) proteins are involved in various biological processes, including immunopathology and inflammation [31], which can also be affected by K. pneumoniae and lead to extensive inflammatory damage [32]. Taken together, these results indicate that protein ubiquitination plays a role in the activation of the innate immune response and the subsequent release of cytokines in response to bacterial invasion. Furthermore, the risk genes for ubiquitination and SUMOylation identified in our study may play an important part in the cellular immune responses against K. pneumoniae invasion. Thus, patients with risk genes affecting the regulation of the innate immune response may be more susceptible to K. pneumoniae infection, although further research is needed to identify the relationships between these factors and K. pneumoniae infection, especially UTIs.

Limitations
Despite the findings described above, our research has some limitations. Although the results reached statistical significance, the small number of case-group specimens means that the results may not extrapolate to the general population. Our research excluded several samples that had co-infections with other pathogens in order to demonstrate our findings more clearly; however, these samples might hold further information relevant to the pathogenesis.
Moreover, UTI is known to have a high recurrence rate [2], but it is difficult to compare the recurrence rate and the prognosis between the case group and the control group because it is difficult to define the first episode of UTI and to track the subsequent medical history of recurrent UTI events. More detailed studies are still needed to reveal the differences between these patients. However, we believe our findings can pave the way for future research.

Diagnostics 2024, 15

Figure 1. GWAS analysis pipeline: Case groups (UTI with K. pneumoniae only) and control groups (UTI with other pathogens), with genotypes identified by the TPMI project, were loaded into PLINK, and the chi-squared test was used to detect risk factors. High-significance variants were selected according to a p-value < 0.05.

Figure 3. Linkage disequilibrium (LD) relationship of variants in the case group: Using the lower variant as the leading SNP (violet diamond) and East Asian population frequency, we detected (A) rs61875193 and rs58910113 in chr11 and (B) rs73387413 and rs12313615 in chr12. (C) rs62126347, rs62126348, rs11672710, and rs10411896 have a high LD r-squared correlation. Moreover, rs4806651 and rs4806653 in gene PPP6R1 and rs4337407 in gene TMEM86B also have a high LD association with rs11672710. Thin and bold lines are intron and exon regions; the arrows indicate transcription direction (left to right: positive strand; right to left: negative strand).

Figure 4. Summary of GO (A) and Reactome pathway (B) annotation: Variant genes were uploaded to the Gene Ontology and Reactome databases, and the identified GO and Reactome terms (p-value < 0.05) were collected. Similar functional terms were pooled together in the pie chart.

Figure 5. GTEx tissue eQTLs of variants in HSPBP1 (A) and TMEM86B (B): All tissue eQTLs were selected based on a p-value < 0.01. The normalized effect size (NES) is defined as the effect of the alt allele relative to the ref allele in the human genome reference; it is computed in a normalized space where magnitude has no direct biological interpretation. The m-values were computed using METASOFT and represent the posterior probability of an eQTL effect; an m-value ≥ 0.9 indicates that the tissue is predicted to have an eQTL effect.

Table 1. Sample group information for GWAS analysis.

Table 2. Variants obtained from the GWAS result with high significance in the case group.
Columns: CHR; Position; SNP ID; Ref a; Alt b; p-Value c.
a Allele from the control group. b Allele from the case group. c Filtered by p < 0.05.

Table 3. Variant allele frequency from other genetic projects.

Table 4. GO annotation of variant genes a.

Table 5. Reactome annotation of variant genes a.
a Selected from p-value < 0.05.

Table 6. Disease association analysis from DisGeNET a.