The Lipid A from the Lipopolysaccharide of the Phototrophic Bacterium Rhodomicrobium vannielii ATCC 17100 Revisited

The structure of lipid A from the lipopolysaccharide (LPS) of Rhodomicrobium vannielii ATCC 17100 (Rv), a phototrophic, budding bacterium, was re-investigated using high-resolution mass spectrometry, NMR, and chemical degradation protocols. The (GlcpN)-disaccharide lipid A backbone was found to be substituted by a GalpA residue connected to C-1 of the proximal GlcpN. Part of this GalpA was β-eliminated during alkaline de-acylation, which pointed to the possible presence of another, so far unidentified substituent at C-4 in non-stoichiometric amounts. One Manp residue substituted C-4′ of the distal GlcpN. The lipid A backbone was acylated by 16:0(3-OH) at C-2 of the proximal GlcpN, and by 16:0(3-OH), i17:0(3-OH), or 18:0(3-OH) at C-2′ of the distal GlcpN. Two acyloxy-acyl moieties, mainly formed by 14:0(3-O-14:0) and 16:0(3-O-22:1), occupied the distal GlcpN of lipid A. Genes possibly involved in the modification of Rv lipid A were compared with bacterial genes of known function. The biological activity was tested in human mononuclear cells (MNC), showing that Rv lipid A alone does not significantly stimulate MNC. At low concentrations of toxic Escherichia coli O111:B4 LPS, pre-incubation with Rv lipid A resulted in a substantial reduction of activity, but, when higher concentrations of E. coli LPS were used, the stimulatory effect was increased.


Introduction
The budding, purple non-sulphur, Gram-negative bacterium Rhodomicrobium vannielii (Rv) possesses a unique and complex life cycle, which is characterized by three different cell types, i.e., budding cells, swarmer cells, and exospores [1]. Its cell envelope contains lipopolysaccharide (LPS) in its outer membrane, the composition of which has been investigated in several strains [2,3]. In addition, phenol-chloroform-light petroleum (PCP) extraction of the LPS resulted in the isolation of several compounds, namely the LPS and novel hopanoids [4,5], of which 35-aminobacteriohopane-32,33,34-triol represents the basic moiety. Two additional hopanoids were substituted at the amino group at C-35 by either tryptophan or ornithine.
Additionally, the structure of the phosphate-free lipid A from R. vannielii ATCC 17100 (Rv) was investigated [6]; however, its structure could not be determined completely. It consisted of a central β-(1→6)-linked GlcpN disaccharide, some 30% of which were substituted by Manp in (1→4)-linkage at the non-reducing GlcpN. The reduction experiments with NaBH₄ indicated that the reducing GlcpN might not be substituted at C-1. The mannose-substituted backbone structure of the lipid A was proposed as β-Manp-(1→4)-β-GlcpN-(1→6)-GlcpN. 16:0(3-OH) was linked to the amino group of the reducing GlcpN. The residue at the amino group of the non-reducing GlcpN could not be identified. Holst and co-workers [6] found the following O-acyl residues: 14:0(3-O-14:0), 14:0(3-OH), 22:1, and O-acetyl groups. All of the 3-OH fatty acids had the R-configuration. The position of the double bond in 22:1 was established as Δ14. Furthermore, Holst and co-workers [6] indicated that Manp was not substituted.
Because this lipid A structure was incomplete and mostly based on chemical analysis and GC-MS data, we decided to re-investigate it, applying modern NMR, mass spectrometry, and chemical degradation protocols that now allow us to report a revised structure. Additionally, a comparative analysis of genes that are responsible for lipid A backbone biosynthesis was carried out. The selected Rv gene sequences were compared to the analogous sequences described for Rhizobium leguminosarum bv. viciae and Mesorhizobium loti, which were used as reference strains. In addition, the endotoxic properties of Rv lipid A were investigated in human mononuclear cells (MNC). A biologically active, bi-phosphorylated GlcpN-based lipid A of Escherichia coli O111:B4 LPS was used in these studies.

Results
LPS extracted from Rv cells by the hot phenol/water method was mainly found (>95%) in the water phase. Fatty acid analysis of the LPS revealed a complex set of 3-hydroxy fatty acids containing 14 to 18 carbon atoms. Among them, iso and anteiso isomers, as well as 18:1, were present (Figure 1, Table 1). The most abundant were 16:0(3-OH) and 14:0(3-OH). Among the non-polar fatty acids, two long-chain unsaturated ones were present in high amounts, namely 22:1ω7 and 24:1ω7, accompanied by their saturated forms. The positions of the double bonds in the unsaturated acyl residues were established by analyzing the pyrrolidide derivatives of Rv LPS fatty acids [7]. All of the hydroxy fatty acids, with the sole exception of 14:0(3-OH), were found to be amide-bound to the lipid A backbone, as confirmed by GLC-MS analysis (Figure 1, region from 36 to 41 min). During chromatographic analysis of lipid A fatty acids (Figure 1), we found that the solvolysis conditions applied were insufficient to quantitatively liberate the amide-linked fatty acids. Some of them remained in the form of N-acyl-GlcpN residues. Their trimethylsilyl derivatives eluted late in the chromatogram and were identified as N-(3-hydroxyacyl)glucosamines, namely 16:0(3-OH)-GlcpN (main peak), i17:0(3-OH)-GlcpN and 18:1(3-OH)-GlcpN (both small peaks), and 18:0(3-OH)-GlcpN (traces). They were present in a relative ratio of 1.00:0.04:0.04:0.01, respectively, as determined from the total ion current peak areas. The EI-MS fragmentation pattern of the main N-acyl-GlcpN peak (containing three indicative ions at m/z 457, 510, and 543) was the same as that described by Bhat and co-workers in the analysis of R. leguminosarum lipid A [8]. A very similar spectrum was also presented by Choma and Komaniecka in the work describing lipid A from Azospirillum lipoferum [9]. Lipid A obtained from Rv LPS by mild acid hydrolysis was subjected to sugar analysis.
D-GlcpN, D-Manp, and D-galactopyranosuronic acid (D-GalpA) were identified as the components of the lipid A backbone.
To determine the molecular mass and fatty acid distribution of the lipid A from Rv, the O-de-acylated and native lipid A preparations were analyzed using electrospray ionization (ESI) mass spectrometry in the negative and positive modes of ionization. […] Figure 3 shows the deciphered […]. All of the ions described, particularly type B ions, indicated D-Manp to be linked at C-4′. Thus, the D-GlcpN disaccharide should be terminated with D-GalpA on the reducing end.
Comparative genomic studies in silico were performed in order to confirm the above observations (Table 2). The proximal part of Rv lipid A resembles that of mesorhizobial lipid A [10,11]. Lipid A of M. loti possesses a D-GalpA residue linked at position C-1 of the amino sugar backbone. Two enzymes, encoded by the genes lpxE (lipid A 1-phosphatase) and rgtF (α-(1,1)-GalA transferase), are responsible for this substitution. The presence of the sequences Rvan_2973 (as well as Rvan_3636), with significant similarity to lpxE, and Rvan_0660, with significant similarity to rgtF, indicated that position C-1 of the reducing D-GlcpN was substituted by D-GalpA linked by an α-(1→1)-glycosidic bond. On the other hand, R. leguminosarum bv. viciae 3841 decorates its lipid A backbone with D-GalpA exclusively at its distal part. Two enzymes (LpxF and RgtD) are engaged in this process. Putative ORFs for these genes were not detected in the genome of Rv strain ATCC 17100 (Table 2).
The ESI mass spectrum of native lipid A (Figure 4) exhibited two sets of peaks with prominent signals, one at m/z between 1714 and 1780 and one above m/z 1942. The first group of signals corresponded to a tetra-acylated and the second group to a penta-acylated lipid A. At least eight different molecular variants of lipid A could be distinguished in the second (main) group of signals. The difference of 14 u between successive signals showed that the diversity originated from the acylation pattern of the lipid A backbone, as well as from primary fatty acids of different chain lengths. This variety in the acylation pattern mainly originated from ester-linked residues, since the amino groups of both GlcpN were almost exclusively substituted by 16:0(3-OH). The mass difference of 228 u between the signals of the two groups of ions corresponded to the mass of 14:0 (see Figures 4 and 5). This indicated that the signals from m/z 1714 to 1780 either resulted from degradation (elimination of a fatty acid in the ion source) of lipid A molecules carrying five acyl residues or reflected a natural heterogeneity of this lipid A. In summary, the LPS of Rv mainly contained penta-acylated lipid A species.
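The two mass differences discussed above (14 u between neighbouring signals, 228 u between the two groups) can be checked with a short calculation. The sketch below is only illustrative: it assumes monoisotopic masses, assumes that the 14 u step is one methylene unit, and assumes that 14:0 is lost as the neutral free acid (C14H28O2); the function names are hypothetical.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def mono_mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def saturated_fatty_acid(n_carbons):
    """Formula of a straight-chain saturated fatty acid, CnH2nO2."""
    return {"C": n_carbons, "H": 2 * n_carbons, "O": 2}


# Assumption: 14:0 leaves as the free acid, one CH2 separates homologues.
myristic_acid = mono_mass(saturated_fatty_acid(14))
methylene = mono_mass({"C": 1, "H": 2})

# Ion selected for MS/MS in the text, minus one 14:0 residue.
after_loss = 1942.389 - myristic_acid

print(f"14:0 (free acid): {myristic_acid:.3f} u")
print(f"CH2 increment:    {methylene:.3f} u")
print(f"1942.389 - 14:0:  {after_loss:.3f}")
```

Under these assumptions the loss of 14:0 from the ion at m/z 1942.389 falls at the lower edge of the tetra-acylated group described in the text.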
To establish the fatty acid distribution in the native lipid A, we chose the ion at m/z 1942.389 for ESI MS-MS analysis (Figure 5). The presence of a B2⁻ ion at m/z 1105.837 in this spectrum (a composition with 16:0(3-O-22:1) and 14:1 residues) provided the basis for assigning the asymmetric distribution of fatty acids. Similar conclusions could be drawn from the Y2⁻ ion at m/z 608.328 (see Figure 5 and the inserted chemical formula). Additionally, the presence of a 0,4A3⁻ ion (at m/z 618.351), which results from a characteristic cross-ring fragmentation, indicated that position C-3 was not occupied by a 3-OH fatty acid (Figure 5).
Thus, Rv native lipid A contained two secondary non-polar fatty acids. One of them, which was connected to the primary fatty acid at the C-2′ position, belonged to the group of long-chain fatty acids and possessed either 22 or 24 carbon atoms. The enzyme transferring this residue from ACP (acyl carrier protein) would be expected to be an orthologue of the E. coli LpxM protein.
Computational analysis revealed the presence of the Rvan_3546 ORF in the Rv genome, which showed only 26% identity and an E value of 4 × 10⁻⁹ when compared with E. coli K-12 lpxM (msbB). An analogous analysis carried out for lpxL (E. coli K-12) did not lead to the detection of a similar open reading frame in Rv.
NMR spectroscopy of lipid A completed the structural analyses. The signals were assigned by various two-dimensional (2D) techniques, as listed in Materials and Methods, and by comparison to data published earlier [12]. The ¹H NMR spectrum contained nine major signals in the region δ 5.40-4.60, of which those at δ 5.36 and 5.32 originated from the double bonds of 22:1/24:1, and those at δ 5.19 […] Table 3, and the corresponding cross-peaks of the lipid A sugar region were assigned as depicted in Figure 6.
In a further experiment, lipid A was completely de-acylated by mild hydrazinolysis and hot KOH treatment, and the product was purified by gel filtration. Because MS had not indicated the presence of any other components, we expected to obtain the tetrasaccharide α-D-Manp-(1→4)-β-D-GlcpN-(1→6)-α-D-GlcpN-(1→1)-α-D-GalpA. However, a mixture of two compounds was obtained, namely this tetrasaccharide and, in somewhat smaller amounts, the tetrasaccharide α-D-Manp-(1→4)-β-D-GlcpN-(1→6)-α-D-GlcpN-(1→1)-β-L-threo-hex-4-enuronopyranose (E'→F'→D'→C'). This clearly showed that, in a portion of the lipid A, the GalpA residue was substituted at C-4, which resulted in β-elimination under the harsh conditions of KOH treatment. Unfortunately, the substituting compound could not be identified so far.
The structure of this second tetrasaccharide was elucidated by NMR spectroscopy, including an HMBC spectrum that identified the connectivity of H-4 C'/C-5 C' (δ 5.87/146.1) [12]. […] F'); however, only some ¹H shift differences could be observed. The chemical shifts of the Manp residues remained unchanged. All of the data are summarized in Table 4 and shown in Figure 7.
In summary, the structure of the main component of Rv lipid A (corresponding to the ion at m/z 1942.389 in the ESI-MS spectrum shown in Figure 4) is proposed as depicted in Figure 8. It should be noted that the NMR data of the de-acylated product indicated the presence of another molecular species containing a so far unidentified residue linked to C-4 of GalpA in non-stoichiometric amounts. Four fatty acid residues were linked to the non-reducing β-D-GlcpN, and one to the reducing α-D-GlcpN (4 + 1 distribution).
The biological activity of Rv lipid A was investigated in human mononuclear cells (MNC) (Figure 9). Rv lipid A itself did not possess any significant biological activity in terms of activating human immune cells, measured as the ability to induce the production of IL-1β and TNF-α. However, upon pre-incubation of the MNC with 10 µg/mL of Rv lipid A, the dose-response curve of highly active E. coli O111:B4 LPS was shifted. The activity of low concentrations (0.1 and 1 ng/mL of E. coli O111:B4 LPS) was decreased, whereas the activity of higher concentrations (100 and 1000 ng/mL of E. coli O111:B4 LPS) was increased.

Discussion
Earlier structural studies on the phosphate-free lipid A of Rv [6] showed that this molecule differed considerably from the lipid A structures known at that time, described, e.g., for Salmonella and several other Gram-negative bacteria. The authors postulated that " . . . the position C-1 of reducing glucosamine can be either not substituted, or the substituent can be removed during the preparation of free lipid A or during reduction with NaBH₄" [6]. In the current structural studies, modern analytical techniques and protocols, including high-field two-dimensional NMR spectroscopy and high-resolution MS, made it possible to revise the structure of this lipid A. In particular, it was found that C-1 was occupied by α-GalpA, and that position C-4 of β-D-GlcpN was substituted by α-D-Manp, for which a β-linkage had been proposed in 1983. Additionally, the fatty acid substitution pattern could be identified by high-resolution MS, revealing a penta-acyl lipid A with a fatty acid distribution of 4 + 1 (Figure 8), which lacked a 3-OH fatty acid at C-3 of the proximal GlcpN of the lipid A backbone.
An α-GalpA residue linked to C-1 of lipid A has been reported several times for different bacteria [9–11]. According to Brown and co-workers [11], the transferase encoded by the gene designated as rgtF is responsible for the transfer of GalpA from the donor dodecyl-P-GalA to lipid A. Similar genes have been identified in R. leguminosarum, M. loti, M. opportunistum, Caulobacter crescentus, Aquifex aeolicus, Azospirillum lipoferum [11], as well as in at least four strains of Phyllobacterium [13], and in many bacteria from the class α-proteobacteria, where, in several cases, structural studies have confirmed the presence of α-GalpA at C-1 of lipid A ([14], and this work). Among them are also bradyrhizobia [15], the photosynthetic Bradyrhizobium BTAi1 [16], and Rhodopseudomonas palustris BisA53 [17]. All of these strains, together with Rv, possess an ortholog of lpxE, which is required for the dephosphorylation of the lipid A precursor at C-1 before the action of the RgtF protein (Table 2, and data from the NCBI gene database).
Notably, some of the Rv lipid A molecules possessed a GalpA that was substituted at C-4 by a so far unidentified compound, as clearly shown by the structural analysis of the products obtained after mild hydrazinolysis and hot KOH treatment of lipid A. A β-eliminated GalA (β-L-threo-hex-4-enuronopyranose) had been identified earlier after such treatment of certain core region and O-antigen structures [12,18]. In contrast, the MS and NMR data of lipid A did not allow us to identify any additional substituent. However, there were some cross signals in the ¹H,¹³C HSQC-DEPT spectrum that could not be assigned.
The presence of Manp at C-4 of Rv lipid A, as observed earlier by Holst and co-workers [6], was confirmed in this work. However, the β-anomeric configuration proposed in 1983, which was based on the rather preliminary results of chromic oxide treatment, could now be revised to α on the basis of data obtained from modern two-dimensional NMR spectroscopy. Manp at C-4 of the lipid A backbone had been identified earlier in several lipid A species, also in the form of Manp disaccharides or in phosphorylated form [19–21].
In the Rhodomicrobium spp. genomes, groups of genes encoding proteins associated with the synthesis and modification of lipid A could be detected. Among them, the NCBI database indicates the presence of two sequences encoding acyloxyacyl hydrolases (accession: WP_088342779.1 (206 aa) and WP_088346973.1 (159 aa)), synonym "lipid A 3-O-deacylase, pagL", the protein that is responsible for removing an acyl substituent from C-3 of the lipid A precursor. These sequences do not have an obvious counterpart in the genome of Rv. Using BLAST (tblastn), we could conclude that the gene Rvan_3506, annotated in the genome of strain ATCC 17100 as encoding ATP/cobalamin adenosyltransferase, possessed a fragment containing motifs characteristic of pagL (accession: CP002292.1). However, demonstrating its modifying effect on Rv lipid A would require mutagenesis to inactivate this gene, which goes beyond the scope of the present work.
The lipid A from Rv did not possess any significant stimulatory activity in human mononuclear cells, as seen in Figure 9. Based on the proposed structure, this was not unexpected, since lipid A molecules that differ from the canonical hexa-acylated E. coli lipid A structure are usually much less active or not active at all [22]. However, the effect on the stimulatory activity of E. coli O111:B4 LPS is interesting: at low concentrations of LPS (0.1 and 1 ng/mL), pre-incubation with lipid A from Rv led to a substantial reduction of activity. This reduction could be explained by the 10,000-100,000-fold excess (by mass) of the inactive Rv lipid A and by the competition of the inactive and active compounds for the molecules that are required for proper LPS signaling (LBP, CD14, TLR4, MD-2) [23]. Surprisingly, when higher concentrations of active LPS were used (100-1000 ng/mL, corresponding to a mass ratio of 1:100 to 1:10 of active vs. inactive molecules), the stimulatory capacity was increased upon pre-incubation with Rv lipid A. This effect can hardly be explained by the competition that takes place at lower concentrations. Rather, this ratio may have led to an enhanced accessibility of active LPS molecules, potentially making it easier for the above-mentioned signaling molecules to extract LPS molecules from LPS micelles.
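The excess of Rv lipid A over E. coli LPS at each dose follows directly from the concentrations stated in the Results (10 µg/mL of Rv lipid A; 0.1, 1, 100, and 1000 ng/mL of LPS). The sketch below computes it on a mass basis only; molar ratios would additionally require the molecular masses of both compounds, which are not assumed here. The variable and function names are hypothetical.

```python
# Pre-incubation concentration of Rv lipid A, converted from 10 µg/mL to ng/mL.
RV_LIPID_A_NG_PER_ML = 10 * 1000

# E. coli O111:B4 LPS doses named in the text (ng/mL).
LPS_DOSES_NG_PER_ML = [0.1, 1, 100, 1000]


def fold_excess(lps_ng_per_ml):
    """Mass-based excess of Rv lipid A over LPS at a given LPS dose."""
    return RV_LIPID_A_NG_PER_ML / lps_ng_per_ml


excess_by_dose = {dose: fold_excess(dose) for dose in LPS_DOSES_NG_PER_ML}

for dose, excess in excess_by_dose.items():
    print(f"{dose:>6} ng/mL LPS -> {excess:,.0f}-fold excess of Rv lipid A")
```

On this basis the two low doses correspond to a 100,000- and 10,000-fold excess, and the two high doses to a 100- and 10-fold excess.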

Isolation of LPS
The bacterial cells were harvested, and the cell pellet (125 g wet mass) was washed twice with saline. The bacterial mass was subjected to de-lipidation and enzymatic digestion according to the method described by Choma and co-workers (2012) [24]. The LPS was extracted from the enzymatically degraded cells using the hot 45% phenol/water extraction method [25,26], and purified by ultracentrifugation (104,000 × g, 4 °C, 4 h). The yield of LPS (a water-soluble fraction) was 4.4% of the dry bacterial mass.

Isolation and Purification of Lipid A
Lipid A was isolated by mild hydrolysis of the LPS derived from the water phase (120 mg LPS, 18 mL acetic acid/sodium acetate buffer, pH 4.4, 100 °C, 2.5 h). The hydrolysate was cooled and then converted into a two-phase Bligh-Dyer system, i.e., chloroform/methanol/hydrolysate, 2:2:1.8 (v/v/v), by adding adequate amounts of chloroform and methanol. The chloroform phase containing lipid A was separated by centrifugation and then washed twice with the aqueous phase from a freshly prepared two-phase Bligh-Dyer mixture [21,27]. The lipid A-containing fractions were combined, dried by rotary evaporation, and then stored at −20 °C in chloroform/methanol (3:1, v/v). As yield, 36 mg (1.2% of the LPS) of pure lipid A was obtained.

Lipid A De-Acylation
For O-de-acylation, 1 mg of lipid A was incubated in a mixture containing chloroform/methanol/1 M aqueous NaOH (2:3:1, by vol.) for 1 h at room temperature, according to a previously described method [15,28].

Fatty Acids and Sugars Analysis
The sugar composition was established after the hydrolysis of lipid A with 2 M trifluoroacetic acid (100 °C, 4 h), after which the liberated monosaccharides were converted into (amino)alditol acetates [30]. The absolute configurations of the monosaccharides were established by the analysis of acetylated R-(-)-butyl glycosides according to the procedure of Gerwig and co-workers [31]. The presence of uronic acids in lipid A was established after carboxyl reduction (NaBD₄, 4 °C, 48 h), hydrolysis, and conversion of the products into alditol acetates.
The qualitative fatty acid composition was established after the methanolysis of LPS with 2 M HCl/MeOH (85 °C, 18 h). The quantitative fatty acid analysis was performed after hydrolysis of LPS using 4 M aqueous HCl (100 °C, 5 h), extraction of the free fatty acids into chloroform, and methanolysis (0.5 M HCl/methanol, 85 °C, 2 h). In both procedures, the liberated hydroxy fatty acid methyl esters were converted into their trimethylsilyl derivatives. To establish the position of the double bond in unsaturated fatty acids, pyrrolidide derivatives of the Rv LPS fatty acids were prepared and analyzed as described earlier by Andersson and Holman [7].
Sugar and fatty acid derivatives were analyzed using a 7890A gas chromatograph (Agilent Technologies, Inc., Wilmington, DE, USA) connected to a mass selective detector (MSD 5975C, inert XL EI/CI; Agilent Technologies, Inc., Wilmington, DE, USA) (GLC-MS). Helium was used as the carrier gas, with a flow rate of 1.0 mL/min. The chromatograph was equipped with an HP-5MS column (30 m × 0.25 mm). The temperature program was 150 °C for 5 min, followed by an increase of 5 °C/min to 310 °C, which was then held for 10 min.
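The oven program above fixes both the total run time and the oven temperature at any retention time. The following minimal sketch works this out for a single linear ramp between two isothermal holds; the function names are hypothetical, and instrument equilibration or cool-down times are not considered.

```python
def run_time_min(initial_hold, start_c, ramp_c_per_min, end_c, final_hold):
    """Total run time (min) of a hold / linear ramp / hold oven program."""
    return initial_hold + (end_c - start_c) / ramp_c_per_min + final_hold


def oven_temp_c(t_min, initial_hold=5, start_c=150, ramp_c_per_min=5, end_c=310):
    """Oven temperature (°C) at time t_min for the program given in the text."""
    if t_min <= initial_hold:
        return float(start_c)
    # After the initial hold the temperature rises linearly until end_c.
    return float(min(end_c, start_c + (t_min - initial_hold) * ramp_c_per_min))


total = run_time_min(initial_hold=5, start_c=150, ramp_c_per_min=5,
                     end_c=310, final_hold=10)

print(f"Total run time: {total:.0f} min")
print(f"Oven temperature at 21 min: {oven_temp_c(21):.0f} °C")
```

With the stated values the ramp takes 32 min, so the final temperature is reached 37 min into the run.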

Mass Spectrometry
ESI-MS was performed with a SYNAPT G2-Si HDMS instrument (Waters Corporation, Milford, MA, USA) operating in negative and positive ion electrospray mode. The data were acquired using MassLynx software, version 4.1 SCN916 (Waters Corporation, Wilmslow, United Kingdom).
Lipid A samples (native and O-de-acylated) were dissolved in chloroform/methanol (3:1, v/v) at a concentration of 10 µg/µL. For the negative ion mode, a sample of 50 µL was transferred into a 2 mL vial and then diluted with 450 µL of 2-propanol/water/triethylamine (50:50:0.001, by vol.), pH 8.5. For the positive ion mode, 50 µL was diluted with 450 µL of water/acetonitrile/acetic acid (30:10:0.4, by vol.) [21]. The samples were introduced by infusion at a flow rate of 20 µL/min. The capillary and cone voltages were set at 3.0 kV and 40 V for positive electrospray mode and at 3.8 kV and 40 V for negative electrospray mode. The source temperature was set to 100 °C, the cone gas flow rate to 100 L/h, and the desolvation gas (nitrogen) flow rate to 600 L/h. For MS/MS experiments, isolated precursor ions were fragmented using collision voltages of 60, 75, and 90 V. The data were collected for 120 s for each precursor ion. Mass spectra were assigned with a multi-point external calibration using sodium iodide (Sigma) in positive and negative ion modes.
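The sample preparation above implies a tenfold dilution, and the infusion settings determine how much material is consumed per precursor ion. The short calculation below is a sketch that assumes the two volumes are simply additive and that infusion runs at the stated rate for the full acquisition time; the variable names are hypothetical.

```python
# Values taken from the text.
stock_conc_ug_per_ul = 10.0   # lipid A stock in chloroform/methanol
aliquot_ul = 50.0             # volume of stock transferred
diluent_ul = 450.0            # volume of spray solvent added
flow_ul_per_min = 20.0        # infusion flow rate
acquisition_s = 120.0         # acquisition time per precursor ion

# Assumption: volumes are additive on mixing.
final_volume_ul = aliquot_ul + diluent_ul
final_conc_ug_per_ul = stock_conc_ug_per_ul * aliquot_ul / final_volume_ul

# Volume and mass of lipid A infused during one acquisition.
infused_ul = flow_ul_per_min * acquisition_s / 60.0
infused_ug = infused_ul * final_conc_ug_per_ul

print(f"Final concentration: {final_conc_ug_per_ul:.1f} µg/µL")
print(f"Infused per precursor ion: {infused_ul:.0f} µL ({infused_ug:.0f} µg)")
```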

NMR Spectroscopy
Homo- and heteronuclear 1D (¹H, ¹³C) and 2D NMR experiments, i.e., correlation spectroscopy (¹H,¹H-COSY), double-quantum filtered phase-sensitive correlation spectroscopy (DQF-COSY), total correlation spectroscopy (¹H,¹H-TOCSY), rotating frame nuclear Overhauser effect spectroscopy (¹H,¹H-ROESY), heteronuclear single quantum coherence-distortionless enhancement by polarization transfer spectroscopy (¹H,¹³C-HSQC-DEPT), and heteronuclear multiple bond correlation (¹H,¹³C-HMBC), were recorded on solutions of lipid A in DMSO-²H₆/C²H₃Cl (1:1, v/v) at 27 °C with a Bruker DRX Avance 700 MHz spectrometer equipped with a 5 mm CPQCI multinuclear-inverse cryoprobe head with a z gradient and Bruker software. The frequencies used were 700.75 MHz for ¹H NMR and 176.2 MHz for ¹³C NMR. The NMR spectra of the de-acylated lipid A sample, obtained by mild hydrazinolysis and hot KOH treatment, were recorded in ²H₂O. All ¹H and ¹³C spectra were calibrated to internal acetone (δH 2.225, δC 31.45).
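The two observation frequencies quoted above are linked by the ratio of the ¹³C and ¹H resonance frequencies in the same magnetic field. As a consistency check, the sketch below derives the ¹³C frequency from the ¹H frequency; it assumes the IUPAC-recommended frequency ratio Ξ(¹³C) = 25.145020% (TMS reference), which is not stated in the text.

```python
# Assumption: IUPAC unified-scale frequency ratio for 13C relative to 1H (TMS).
XI_13C_PERCENT = 25.145020

# 1H observation frequency given in the text (MHz).
proton_mhz = 700.75

# 13C resonates at a fixed fraction of the 1H frequency in the same field.
carbon_mhz = proton_mhz * XI_13C_PERCENT / 100.0

print(f"Expected 13C frequency: {carbon_mhz:.2f} MHz")
```

Rounded to one decimal place, the result agrees with the 176.2 MHz given for ¹³C.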

Assay in Human Mononuclear Cells (MNC)
Peripheral blood MNC from healthy human volunteers (prepared from heparinized blood by gradient centrifugation using Biocoll, Merck, Darmstadt, Germany) were incubated at a concentration of 1 × 10⁶/mL in 96-well tissue culture plates at a volume of 150 µL using RPMI-1640 medium supplemented with 100 U/mL penicillin, 100 µg/mL streptomycin (both PAA Laboratories, GmbH, Cölbe, Germany), and 10% heat-inactivated FCS (Merck Millipore, Biochrom AG, Berlin, Germany). The cells were then either stimulated with increasing concentrations of lipid A from Rv or of E. coli O111:B4 LPS, or pre-incubated with Rv lipid A (10 µg/mL) for 1 h and then stimulated with increasing concentrations of E. coli O111:B4 LPS. After a culture period of 20 h at 37 °C, the culture supernatants were harvested, and the levels of IL-1β and TNF-α were determined by ELISA according to the manufacturer's protocol (Invitrogen GmbH, Karlsruhe, Germany). The data shown represent the mean ± SD from n = 3 independent experiments.

Bioinformatics Tools
Standard BLASTP was used to search for genes encoding putative proteins engaged in the biosynthetic pathway of Rv lipid A. R. leguminosarum 3841 and M. loti MAFF 303099 protein sequences were used as queries in BLASTP searches against the Rv genome registered in the Genomes OnLine Database. Individual protein sequences were then compared across their entire span with the on-line Global Alignment tool (using the Needleman-Wunsch algorithm) provided by the National Center for Biotechnology Information (NCBI).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.
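For the global alignment step described under Bioinformatics Tools, the principle of the Needleman-Wunsch algorithm can be illustrated with a minimal sketch. This is not the NCBI implementation: it assumes a simple match/mismatch score and a linear gap penalty instead of a substitution matrix with affine gaps, and the function names and the toy sequences are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


# Toy sequences for illustration only (not taken from the study).
nw_score, al_a, al_b = needleman_wunsch("GATTACA", "GATACA")
print(al_a)
print(al_b)
print(f"score = {nw_score}, identity = {percent_identity(al_a, al_b):.1f}%")
```

Identity values such as those reported for the Rv ORFs depend on the scoring scheme, so figures from this simplified scoring are not comparable with those of the NCBI tool.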

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.