Acinetobacter baumannii K106 and K112: Two Structurally and Genetically Related 6-Deoxy-l-talose-Containing Capsular Polysaccharides

Whole genome sequences of two Acinetobacter baumannii clinical isolates, 48-1789 and MAR24, revealed that they carry the KL106 and KL112 capsular polysaccharide (CPS) biosynthesis gene clusters, respectively, at the chromosomal K locus. The KL106 and KL112 gene clusters are related to the previously described KL11 and KL83 gene clusters, sharing genes for the synthesis of l-rhamnose (l-Rhap) and 6-deoxy-l-talose (l-6dTalp). CPS material isolated from 48-1789 and MAR24 was studied by sugar analysis and Smith degradation along with one- and two-dimensional 1H and 13C NMR spectroscopy. The K106 and K112 oligosaccharide repeats (K units) share a tetrasaccharide fragment ending in l-6dTalp-(1→3)-D-GlcpNAc, and the genes responsible for it are shared by the respective gene clusters. The K106 and K83 CPSs also have the same linkage between K units. The KL112 cluster includes an additional glycosyltransferase gene, gtr183, and the K112 unit includes an l-Rhap side chain that is not found in the K106 structure. K112 further differs in the linkage between K units formed by the Wzy polymerase, and a different wzy gene is found in KL112. Although both KL106 and KL112 share the atr8 acetyltransferase gene with KL83, only K83 is acetylated.


Introduction
Acinetobacter baumannii is one of the leading bacterial agents of difficult-to-treat serious nosocomial infections on a global scale. Due to increasing and widespread resistance to carbapenems, one of the last line antibiotics, the World Health Organization listed A. baumannii of highest priority for the development of novel therapeutics [1]. However, development of effective alternate therapeutics is made challenging by the highly variable capsular polysaccharide (CPS), which surrounds the A. baumannii cell and protects the bacteria from the action of immune system components, as well as disinfectants, desiccation, and certain antimicrobial compounds [2][3][4] and from attack by many phages [5][6][7].
A wide range of CPS structures has been reported for the species (for examples, see [8][9][10][11][12]). This high structural diversity is predominantly due to extensive variation in the genetic content at the chromosomal K locus (KL) that drives CPS biosynthesis [13], and to date, more than 140 KL gene clusters have been identified at this location [14]. Chemical structures are now available for more than 40 different A. baumannii CPSs [15], and generally, the resolved CPS structures are consistent with the genes located at the K locus. One aspect of the diversity is that several structures include sugars that are either found only in A. baumannii or otherwise rarely occur in nature; for example, 5,7-di-N-acetylacinetaminic acid [16], 5,7-di-N-acetyl-8-epiacinetaminic acid [17], N-acetylviosamine [18], and 6-deoxy-L-talose [19].
As further KL gene clusters are found and their corresponding CPS structures are determined, groups of related KL that share genetic features and direct synthesis of related CPS structures have emerged. Previously, a group of eight related KL gene clusters was reported [19]; these carry a novel tle epimerase gene for the conversion of dTDP-L-rhamnose (dTDP-L-Rhap) to dTDP-6-deoxy-L-talose (dTDP-L-6dTalp) and rmlBDAC genes for the synthesis of dTDP-L-Rhap. Five of these gene clusters, KL11, KL83, KL29, KL105, and KL106, shared additional features, differing from each other predominantly in the specific combination of gtr glycosyltransferase and atr acetyltransferase genes present. The structures reported for K11 and K83 (Figure 1) allowed the encoded enzymes to be assigned to the formation of specific linkages on the basis of the features the structures share. However, the K29, K105, and K106 structures remained to be established.
Figure 1. A. baumannii K11 and K83 CPS structures established previously [19]. Glycosyltransferases are indicated next to the linkage they were assigned to.
In this work, we determine the K106 structure from A. baumannii isolate 48-1789 and correlate the structural data with the gene clusters of KL106 and related KL. We also identify a novel KL112 gene cluster in A. baumannii isolate MAR24, which also belongs to this group, and determine the corresponding K112 structure.

Characterization of the KL106 and KL112 CPS Biosynthesis Gene Clusters
Whole genome sequences were obtained for A. baumannii clinical isolates 48-1789 and MAR24. The K locus in the 48-1789 genome was found to carry the KL106 CPS biosynthesis gene cluster, sharing 98.1% identity (99% sequence coverage) with KL106 from A. baumannii isolate 219_ABAU (WGS accession number JVPN01000008.1) described in a previous study [19]. KL106 includes rmlBDAC genes for dTDP-L-Rhap synthesis and a tle epimerase gene to generate dTDP-L-6dTalp. A novel but related gene cluster with rmlBDAC and tle genes was also identified at the K locus in the genome of isolate MAR24, and was designated as KL112.
KL106 and KL112 (Figure 2) share most of the genes present, differing only in the region containing wzx and wzy genes, with an additional gtr gene (gtr183) found in KL112. Both KL106 and KL112 share a portion of the central region (gtr27-gtr60-atr8-tle-gtr29-itrA3) with the previously reported KL83 gene cluster (Figure 2). Indeed, KL106 differs from KL83 only in the presence of a different wzx gene and the presence of an additional gtr in KL83. Thus, both K106 and K112 K-unit structures are predicted to include the same α-D-GlcpNAc-(1→2)-β-D-Glcp-(1→3)-α-L-6dTalp-(1→3)-β-D-GlcpNAc tetrasaccharide segment as K83 (Figure 1) that is generated by the shared genes. The structure of K106 is also likely to include an identical linkage between the K units due to a shared wzy gene, but would lack the Rha side chain found in K83. However, given the unique wzy gene and additional gtr183 gene in KL112, K112 is predicted to include an additional sugar residue with a different linkage between the K units.

Monosaccharide Composition of K106 and K112
Sugar analysis of the CPS preparations from strains 48-1789 and MAR24 by GLC of the acetylated alditols revealed 6dTal, Glc, and GlcNAc in the ratios ~1.0:0.8:3, or Rha, 6dTal, Glc, and GlcNAc in the ratios ~0.1:0.1:0.2:1, respectively (see Materials and Methods and Supplementary Materials, Figures S2 and S3). Neither the D- nor the L-enantiomer of 6-deoxytalose is a common monosaccharide, although both have been found in some bacterial polysaccharides (see additional References in Supplementary Materials). The CPSs were studied by NMR spectroscopy including one-dimensional 1 (Table 1). Spin systems for the same monosaccharides and, in addition, β-Rhap (unit E), were found in the CPS of MAR24 (


The K106 Structure
In the 1H,1H TOCSY spectrum of the CPS of 48-1789, there were correlations of H1 with H2-H6 for unit A, H1 with H2 for unit B, H1 with H2-H5 for unit C, and H1 with H2-H4 and H6 for unit D. The signals within each spin system were assigned using the 1H,1H COSY spectrum. Relatively large J1,2 coupling constants of 7-8 Hz indicated that units A and C were β-linked, whereas the α-linked unit D was characterized by a smaller J1,2 value (<4 Hz). Comparison of the 13C NMR chemical shifts of unit B with published data [20] showed that unit B was α-linked. Low-field positions at δ 77.2, 83.3, 76.4, and 80.7 of the signals for C2 of unit C, C3 of units A and B, and C4 of unit D, respectively, showed that the CPS is linear and revealed the glycosylation pattern of the monosaccharides.
The sequence of the monosaccharides was determined by the 1H,13C HMBC experiment, which showed correlations between the anomeric protons and the linked carbons of the neighboring sugar residues: H1 of unit A with C4 of unit D, H1 of unit B with C3 of unit A, H1 of unit C with C3 of unit B, and H1 of unit D with C2 of unit C. These data also confirmed the substitution pattern in the K unit.
Based on these data, it was concluded that the K106 CPS of A. baumannii 48-1789 is linear and has the structure shown in Figure 5.

The K112 Structure
Correlations for units A-D in the 1H,1H TOCSY spectrum of the K112 CPS were similar to those in the spectrum of the K106 CPS. In addition, there was a correlation between H1 and H2 of unit E, and there were no correlations of H1 with H5-H6 for unit A or of H1 with H6 for unit D. Comparison of the 13C NMR chemical shifts of units B and E with published data for the corresponding monosaccharides [20,21] showed that unit B was α-linked and unit E was β-linked.
Low-field positions at δ 77.3, 83.0, 76.3, 76.2, and 76.5 of the signals for C2 of unit C, C3 of units A, B, and D, and C4 of unit D, respectively, showed that the CPS is branched, with four monosaccharide residues (A-D) in the main chain, 3,4-disubstituted unit D at the branching point, and unit E attached as a side chain (Figure 5).
The order of the monosaccharides in the K112 CPS of A. baumannii MAR24 was determined and the substitution pattern in the K unit was confirmed by the 1H,13C HMBC experiment, which showed the following correlations of the anomeric protons with the linked carbons of the neighboring sugar residues: H1 of unit A with C3 of unit D, H1 of unit B with C3 of unit A, H1 of unit C with C3 of unit B, H1 of unit D with C2 of unit C, and H1 of unit E with C4 of unit D.
The CPS structures established by NMR spectroscopy were corroborated by Smith degradation followed by identification of the resulting oligosaccharides (OS1 for K106 and OS2 and OS3 for K112) by NMR spectroscopy as described above for the CPS (Tables 1 and 2) and high-resolution electrospray ionization mass spectrometry (HR ESI MS − m/z 715.2779). OS1 and OS2 were found to be the expected products containing glyceraldehyde (C') as an aglycone that was derived from the 2-substituted β-Glc residue (unit C) (Figure 5). OS3 was suggested to have a cyclic aglycone (C'') due to incomplete hydrolysis in the destroyed β-Glc residue (unit C).
Therefore, the K112 CPS of A. baumannii MAR24 has the structure presented in Figure 5.
In addition to a shared main chain, K112 includes an L-Rhap side branch, like K83, that is attached to the terminal D-GlcpNAc residue in the respective main chains. In K112, the remaining Gtr183K112 enzyme (GenPept accession number QNR01095.1) would add this sugar via a β-(1→4) linkage, whereas in K83, this sugar is linked by Gtr154K83 (GenPept accession number AHB32312.1) via an α-(1→3) linkage. Finally, the L-6dTalp residue in K83 is 2-O-acetylated but, although KL106 and KL112 carry the same atr8 gene as KL83, neither the K106 nor the K112 structure includes an O-acetyl group.
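The way the HMBC correlations translate into linkages and branch points can be sketched in a few lines of Python. This is only an illustration of the reasoning in the text, not the authors' analysis procedure: the `Correlation` type and the function names are hypothetical, and the data are the inter-residue H1/C correlations listed above for K106 and K112.

```python
from typing import Dict, List, NamedTuple


class Correlation(NamedTuple):
    """One inter-residue HMBC cross-peak (hypothetical helper type)."""
    donor: str      # unit whose anomeric proton (H1) shows the cross-peak
    acceptor: str   # neighboring unit carrying the linked carbon
    position: int   # carbon number on the acceptor unit


# Correlations as stated in the text for each CPS.
K106 = [Correlation("A", "D", 4), Correlation("B", "A", 3),
        Correlation("C", "B", 3), Correlation("D", "C", 2)]
K112 = [Correlation("A", "D", 3), Correlation("B", "A", 3),
        Correlation("C", "B", 3), Correlation("D", "C", 2),
        Correlation("E", "D", 4)]


def linkages(correlations: List[Correlation]) -> List[str]:
    """Write each correlation as a donor-(1→n)-acceptor linkage."""
    return [f"{c.donor}-(1→{c.position})-{c.acceptor}" for c in correlations]


def substitution_pattern(correlations: List[Correlation]) -> Dict[str, List[int]]:
    """Map each unit to the sorted list of its glycosylated positions."""
    pattern: Dict[str, List[int]] = {}
    for c in correlations:
        pattern.setdefault(c.acceptor, []).append(c.position)
    return {unit: sorted(pos) for unit, pos in sorted(pattern.items())}


def branch_points(correlations: List[Correlation]) -> List[str]:
    """Units substituted at more than one position."""
    return [u for u, pos in substitution_pattern(correlations).items()
            if len(pos) > 1]


if __name__ == "__main__":
    for name, data in (("K106", K106), ("K112", K112)):
        print(name, linkages(data), "branch points:", branch_points(data))
```

With these inputs, K106 has no unit substituted at more than one position, whereas in K112 unit D is substituted at positions 3 and 4, in line with the linear and branched structures described in the text.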

Discussion
The A. baumannii K106 and K112 structures elucidated in this study are closely related to the A. baumannii K11 and K83 structures reported previously [19]. An unusual feature of this group of related CPS structures is the presence of L-6dTalp, which is a rare sugar component of polysaccharides produced by bacterial species (Bacterial Carbohydrate Structure Database at http://csdb.glycoscience.ru/bacterial/ (accessed on 15 April 2021), see also References in Supplementary Materials). The corresponding CPS gene clusters share genes for synthesis of dTDP-L-Rhap and dTDP-L-6dTalp, as well as genes for the ItrA3 initial glycosylphosphotransferase, which builds the lipid-linked β-D-GlcpNAc (unit A)-PP-Und precursor, and for the glycosyltransferases required for the assembly of the D-B fragment. Previously, KL106 was also found to be 95% identical to the KL gene cluster from A. nosocomialis M2 [19], and deletion of gtr genes in this cluster confirmed the order of function [22], which is identical to the assignments made in this study. Hence, the M2 CPS structure, which was not determined previously, would likely be the same as the K106 structure. Though K11 includes the same α-D-GlcpNAc-(1→2)-β-D-Glcp-(1→3)-α-L-6dTalp-(1→3)-β-D-GlcpNAc tetrasaccharide fragment (Figure 1), the KL11 gene cluster includes gtr28-atr6 in place of the gtr60-atr8 that is present in KL106, KL112, and KL83. Gtr28 was previously reported to be 46% identical to Gtr60, and the two glycosyltransferases appear to catalyze formation of the same linkage.
An additional difference between structures in this related group is the presence or absence of an O-acetyl group at the α-L-6dTalp residue. Each of the four gene clusters includes an acetyltransferase gene adjacent to tle: atr6 in KL11 and atr8 in KL83, KL106, and KL112. Previously, Atr8 was predicted to be responsible for 2-O-acetylation of α-L-6dTalp in K83. However, as no acetyl group was found in the K11 structure, a role for Atr6 in CPS modification was not established [19]. In this study, both K106 and K112 lack O-acetylation, like K11, suggesting either that atr8 may be inactive in these strains or that the atr gene responsible for acetylation of K83 resides elsewhere. Further work will be needed to confirm the differential expression and/or activity of these acetyltransferases, or to determine whether an acetyltransferase is encoded elsewhere in the genome, as has been observed for other strains [12].
Previous pairwise comparison of gene clusters within this L-6dTalp-containing group revealed that the KL11 and KL29 gene clusters, and also the KL83 and KL105 gene clusters, are gene cluster 'pairs', differing only in the segment including either gtr28/atr6 or gtr60/atr8. Thus, the structures of the CPSs of these pairs are expected to be identical, with a potential difference only in the O-acetylation pattern present. However, due to the possible differential activity of Atr6 and Atr8, this study reinforces the necessity for elucidating structural data rather than forming conclusions about structure based on sequence alone.

Materials and Methods
Bacterial cells (~1 g) were extracted with 45% aqueous phenol; the extract was dialyzed against distilled water without layer separation and freed from insoluble contaminants by centrifugation. The resultant solution was treated with cold (4 °C) aq. 50% CCl3CO2H; after centrifugation, the supernatant was dialyzed against distilled water and freeze-dried. Crude CPS samples were heated with 2% aqueous AcOH at 100 °C for 2 h, and the purified high-molecular-mass CPS preparations (8 mg from 48-1789 and 20 mg from MAR24) were isolated by gel-permeation chromatography on a column (60 × 2.5 cm) of Sephadex G50 Superfine in 0.1% aqueous AcOH monitored using a differential refractometer (Knauer, Berlin, Germany).

Smith Degradation
CPS samples from 48-1789 (6 mg) and MAR24 (16.3 mg) were oxidized with aq. 0.05 M NaIO4 (1 mL) at 20 °C for 48 h in the dark and then reduced with NaBH4 (12 and 40 mg, respectively) at 20 °C for 16 h. The excess of NaBH4 was destroyed with concentrated AcOH, the solutions were evaporated, and the residues were evaporated with methanol (3 × 1 mL), dissolved in 0.5 mL water, and applied to a column (35 × 2 cm) of TSK HW-40.

NMR Spectroscopy
Samples were deuterium-exchanged by freeze-drying from 99.9% D2O and then examined as solutions in 99.95% D2O. NMR spectra were recorded on a Bruker Avance II 600 MHz spectrometer (Bremen, Germany) at 60 °C. Sodium 3-trimethylsilylpropanoate-2,2,3,3-d4 (δH 0, δC −1.6) was used as internal reference for calibration. Two-dimensional NMR spectra were obtained using standard Bruker software, and the Bruker TopSpin 2.1 program was used to acquire and process the NMR data. A 60-ms MLEV-17 spin-lock time and a 150-ms mixing time were used in TOCSY and ROESY experiments, respectively. A 60-ms delay was used for evolution of long-range couplings to optimize HMBC experiments for a coupling constant of JH,C 8 Hz. 1H and 13C NMR chemical shifts of the CPSs and OS1-OS3 are tabulated in Tables 1 and 2.
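As a quick check on the HMBC setting, the commonly used relation between the long-range evolution delay and the targeted coupling constant, delay = 1/(2J), can be evaluated directly. The sketch below is an illustration only: the function name is hypothetical, and the assumption that the reported 60-ms delay is a rounded form of this relation is ours, not a statement from the text.

```python
def long_range_delay_ms(j_hz: float) -> float:
    """Return the evolution delay 1/(2J) in milliseconds for a coupling J in Hz."""
    if j_hz <= 0:
        raise ValueError("coupling constant must be positive")
    return 1000.0 / (2.0 * j_hz)


if __name__ == "__main__":
    # Coupling constant targeted in the HMBC experiments described above.
    print(f"{long_range_delay_ms(8.0):.1f} ms")
```

For J = 8 Hz the relation gives 62.5 ms, which is close to the 60-ms delay reported above.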

Mass Spectrometry
High-resolution electrospray ionization (HR ESI) mass spectrometry was performed in positive and negative ion modes using micrOTOF II and maXis instruments (Bruker Daltonics, Bremen, Germany). Oligosaccharide samples (~50 ng L−1) were dissolved in a 1:1 (v/v) water-acetonitrile mixture and injected with a syringe at a flow rate of 3 µL min−1. Capillary entrance voltage was set at −4500 V (positive ion mode) or 4000 V (negative ion mode). Interface temperature was set at 180 or 200 °C. Nitrogen was used as a drying and nebulizing gas. Mass spectra were acquired in a range from m/z 50 to m/z 3000. Internal or external calibration was done with ESI Calibrant Solution (Agilent, Santa Clara, CA, USA).

Sequencing and Bioinformatic Analysis
Whole genome sequences for A. baumannii 48-1789 and MAR-303 isolates were obtained using a Nextera DNA library preparation kit (Illumina, San Diego, CA, USA) and MiSeq platform. Assembly of the short read sequence data was performed with SPAdes v. 3.10 [23]. The K locus sequence located between fkpA and lldP was extracted and then subjected to KL typing using the Kaptive search tool [14]. Sequence arrangements that could not be identified in the existing A. baumannii KL sequence database were assigned a KL number and annotated following the established nomenclature scheme for A. baumannii [13]. Fully annotated sequences of KL106 and KL112 were deposited to GenBank under accession numbers MK399430.1 and MT152376.1, respectively. Putative functions of encoded proteins were assigned using BLASTp, as well as searches with the Pfam and CAZy databases.
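The step of extracting the K locus between the flanking genes fkpA and lldP can be illustrated with a minimal Python sketch. It assumes that the coordinates of the two flanking genes on an assembled contig are already known (for example, from an annotation or a similarity search); the function name, the coordinate convention, and the toy data are hypothetical, and this is not how Kaptive itself is implemented.

```python
from typing import Dict, Tuple

# Gene coordinates on the contig: 0-based start (inclusive), end (exclusive).
Gene = Tuple[int, int]


def extract_k_locus(contig: str, genes: Dict[str, Gene],
                    left: str = "fkpA", right: str = "lldP") -> str:
    """Return the contig sequence lying between the two flanking genes.

    The sequence is returned in contig orientation; it is not
    reverse-complemented if the locus lies on the opposite strand.
    """
    first, second = sorted((genes[left], genes[right]))
    if first[1] > second[0]:
        raise ValueError("flanking genes overlap; no region between them")
    return contig[first[1]:second[0]]


if __name__ == "__main__":
    # Toy contig: 4 nt standing in for fkpA, 6 nt of locus, 4 nt for lldP.
    toy_contig = "AAAA" + "CCCCCC" + "GGGG"
    toy_genes = {"fkpA": (0, 4), "lldP": (10, 14)}
    print(extract_k_locus(toy_contig, toy_genes))
```

In practice, the extracted region would then be compared against the A. baumannii KL reference database, as done here with Kaptive [14].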