Multiple-stage Precursor Ion Separation and High Resolution Mass Spectrometry toward Structural Characterization of 2,3-Diacyltrehalose Family from Mycobacterium tuberculosis

Mass spectrometry (MS)-based precursor ion isolation, collision-induced dissociation (CID) fragmentation, and detection using linear ion-trap multiple-stage mass spectrometry (LIT MSn) in combination with high resolution mass spectrometry (HRMS) provides a unique tool for structural characterization of complex mixtures without chromatographic separation. This approach permits separation not only of various lipid families and their subfamilies, but also of isomers, thereby revealing the structural details. In this report, we describe the LIT MSn approach to unveil the structures of a 2,3-diacyl trehalose (DAT) family isolated from the cell envelope of Mycobacterium tuberculosis, in which more than 30 molecular species, each consisting of up to six isomeric structures, were found. LIT MSn performed on both the [M + Na]+ and [M + HCO2]− ions of DAT yields complementary structural information for near complete characterization of the molecules, including the location of the fatty acyl substituents on the trehalose backbone. This latter information is based on the differential losses of the two fatty acyl chains in the MS2 and MS3 spectra, while the product ion spectra from higher stage LIT MSn permit confirmation of the structural assignment.


Introduction
Tandem mass spectrometry is a powerful tool for structural analysis of unknown molecules. It consists of several sequential events, including formation and mass selection of the precursor ions, collision-induced dissociation (CID) with an inert target gas for fragment ion formation, and mass analysis and detection of the product ions. For tandem mass spectrometry with quadrupole (e.g., triple quadrupole, TSQ), sector, and hybrid Q-TOF instruments, these processes occur sequentially in separate regions of the instrument and the MS/MS process is tandem-in-space. For quadrupole ion-trap (QIT) and linear ion-trap (LIT) instruments, the precursor ion selection, CID, and product-ion analysis and detection events are all executed in the ion trap in a timed sequence, and the MS/MS process is tandem-in-time [1]. Linear ion-trap multiple-stage tandem mass spectrometry (LIT MSn) permits repetition of the precursor ion selection-CID-product-ion analysis process, and up to 10 cycles can be performed using modern commercial LIT mass spectrometers such as the Thermo LTQ Orbitrap. This instrument also features an extremely high resolving power, and a resolution of up to one million (at m/z 200) can be reached [2].
Due to its MSn capability with high resolution, the LIT/Orbitrap with MSn approach has been widely used in the structural characterization of a wide range of biomolecules [3-7]. LIT MSn with high resolution is also extremely useful for identification of complex lipid structures, in particular microbial lipids, permitting revelation of numerous isomeric molecules in lipid extracts. For example, we demonstrated that sulfolipid II, rather than the sulfolipid I family as previously reported, is the predominant sulfolipid family in Mycobacterium tuberculosis (M. tuberculosis) H37Rv cells and consists of hundreds of molecular species [8,9]. LIT MSn with high resolution mass spectrometry has also been successfully applied to delineate the structures of phosphatidylinositol mannosides (PIMs) [10,11] and phthiocerol dimycocerosates (PDIMs) [12] in the cell envelope of M. tuberculosis. The former lipid family is known to play an important role as M. tuberculosis adhesins that mediate attachment to nonphagocytic cells [13], while the latter was recently found to play a role in the drug resistance of M. tuberculosis [14].
In addition to the above complex lipid families, other glycolipids found in the mycobacterial cell wall include acylated trehaloses [15-18]. These trehalose-containing glycolipids consist of many families [18-22], of which the 2,3-di-O-acyltrehalose (DAT) family was previously defined as glycolipid B. DAT is a mycobacterial factor capable of modulating host immune responses [23] and can inhibit the proliferation of murine T cells [24]. DAT, along with pentaacyl trehalose (PAT), also plays an important role in pathogenesis and a structural role in the cell envelope, promoting the intracellular survival of the bacterium [25]. The DATs from M. tuberculosis and M. fortuitum have been shown to be antigenic [24,26,27], and their potential use in serodiagnosis has been postulated [28,29].
Besra and coworkers defined the structures of the acylated trehalose lipid family using gas chromatography-mass spectrometry in conjunction with normal/reversed phase TLC and one/two-dimensional 1H and 13C nuclear magnetic resonance spectroscopy [21]. However, there has been no report on the detailed structural assignment of the various molecular species, with their many isomeric structures, for the entire lipid family.
In this report, we applied multiple-stage precursor ion isolation and resonance excitation activation to generate distinct MSn spectra to explore the structural details of the 2,3-diacyl trehalose (DAT) lipid family found in M. tuberculosis. This study highlights the unique feature of LIT MSn for tandem-in-time precursor ion separation, which affords structural characterization of a complex lipid family; a similar structural analysis would be very difficult to perform using a tandem mass spectrometric approach with a sector, TSQ, or QTOF instrument.

Materials
All solvents (spectroscopic grade) and chemicals (ACS grade) were obtained from Sigma Chemical Co. (St. Louis, MO, USA).

Sample Preparation
M. tuberculosis strain H37Rv was grown, and total lipids were extracted and isolated as previously described [8]. Briefly, the total lipid was separated on a Phenomenex C18 Kinetex column (100 × 4.6 mm, pore size 100 Å, particle size 2.6 µm) at a flow rate of 300 µL/min with a gradient system as previously described [12]. The DAT fractions (eluted at 24.45-26.43 min) from three injections (~200 µg total lipid/injection) were collected and pooled, dried under a stream of nitrogen, and stored at −20 °C until use.

Mass Spectrometry
Both high-resolution (R = 100,000 at m/z 400) low-energy CID and higher-energy collisional dissociation (HCD) tandem mass spectrometric experiments were conducted on a Thermo Scientific (San Jose, CA, USA) LTQ Orbitrap Velos mass spectrometer (MS) with the Xcalibur operating system. Samples in CH3OH were infused (1.5 µL/min; ~10 pmol/µL) into the ESI source, where the skimmer was set at ground potential, the electrospray needle was set at 4.0 kV, and the temperature of the heated capillary was 300 °C. The automatic gain control of the ion trap was set to 5 × 10^4, with a maximum injection time of 100 ms. Helium was used as the buffer and collision gas at a pressure of 1 × 10^−3 mbar (0.75 mTorr). The MSn experiments were carried out with an optimized relative collision energy ranging from 25 to 35% and with an activation q value of 0.25. The activation time was set at 10 ms to leave a minimal residual abundance of the precursor ion (around 20%). For the HCD experiments, the collision energy was set at 50-55% and the mass was scanned from m/z 100 to the upper m/z value that covers the precursor ions. The mass selection window for the precursor ions was set at 1 Da to admit the monoisotopic peak to the ion trap for collision-induced dissociation (CID), followed by unit resolution detection in the ion trap or high resolution accurate mass detection in the Orbitrap mass analyzer. Mass spectra were accumulated in the profile mode, typically for 3-10 min for MSn spectra (n = 2, 3, 4).

Nomenclature
To facilitate data interpretation, the following abbreviations were adopted. A long-chain fatty acid, such as the stearic acid attached to the C2 position of the trehalose backbone, is designated by its carbon number and number of double bonds, e.g., 18:0. The multiple methyl-branched mycolipenic acid, for example, the 2,4,6-trimethyl-2-tetracosenoic acid attached to the C3 position, is designated as 27:1 to reflect the fact that the acid contains a C27 acyl chain with one double bond. Therefore, the DAT species consisting of 18:0- and 27:1-FA substituents at the C2 and C3 positions, respectively, is designated as 18:0/27:1-DAT.
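The shorthand above maps directly onto an elemental composition, so the adduct m/z values quoted in the following sections can be reproduced from a label alone. The snippet below is a minimal sketch, assuming standard monoisotopic masses and that each acyl group replaces one hydroxyl hydrogen of trehalose (C12H22O11); the function names are illustrative and do not come from any instrument software. The text quotes nominal m/z values, which correspond to the integer part of the computed m/z.

```python
import re

# Monoisotopic masses in u (standard values, rounded to 8 decimals).
MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "Na": 22.98976928}
ELECTRON = 0.00054858


def dat_composition(label):
    """Elemental composition of a DAT written as 'c2:d2/c3:d3-DAT'."""
    m = re.fullmatch(r"(\d+):(\d+)/(\d+):(\d+)-DAT", label)
    if m is None:
        raise ValueError(f"unrecognized DAT label: {label!r}")
    c2, d2, c3, d3 = map(int, m.groups())
    comp = {"C": 12, "H": 22, "O": 11}  # trehalose, C12H22O11
    for carbons, double_bonds in ((c2, d2), (c3, d3)):
        # Esterification replaces one hydroxyl H with an acyl group RCO-,
        # a net gain of C(n) H(2n - 2 - 2d) O per fatty acyl substituent.
        comp["C"] += carbons
        comp["H"] += 2 * carbons - 2 - 2 * double_bonds
        comp["O"] += 1
    return comp


def neutral_mass(label):
    """Monoisotopic mass of the neutral DAT molecule."""
    return sum(MASS[el] * n for el, n in dat_composition(label).items())


def sodium_adduct_mz(label):
    """[M + Na]+ : add Na and remove one electron."""
    return neutral_mass(label) + MASS["Na"] - ELECTRON


def formate_adduct_mz(label):
    """[M + HCO2]- : add HCO2 and one electron."""
    formate = MASS["C"] + MASS["H"] + 2 * MASS["O"]
    return neutral_mass(label) + formate + ELECTRON


for name in ("18:0/27:1-DAT", "18:0/24:1-DAT", "18:0/23:0-DAT"):
    print(name, round(sodium_adduct_mz(name), 4), round(formate_adduct_mz(name), 4))
```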

Results and Discussion
DAT formed [M + Alk]+ ions (Alk = NH4, Li, Na, etc.) in the positive-ion mode and [M + X]− ions (X = Cl, RCO2; R = H, CH3, C2H5, etc.) in the negative-ion mode when subjected to ESI in the presence of Alk+ and X−. For example, when DAT was dissolved in CH3OH in the presence of HCO2Na, [M + Na]+ adduct ions (Figure 1) were observed in the positive-ion mode and [M + HCO2]− ions (data not shown) in the negative-ion mode. The formation of these adduct ions was revealed by the elemental composition of the molecular species deduced by high resolution mass spectrometry (Table 1). Upon being subjected to CID in a linear ion trap, both the [M + Na]+ and [M + HCO2]− ions gave MSn (n = 2, 3, 4) spectra that contain rich structural information readily applicable for structural identification.

The Fragmentation Processes of the [M + Na]+ Ions of DAT Revealed by LIT MSn
The ability of LIT MSn to perform sequential precursor ion separation, CID, and acquisition of MSn spectra provides insight into not only the fragmentation processes but also the structural details of the molecules. For example, the LIT MS2 spectrum of the [M + Na]+ ions of 18:0/27:1-DAT at m/z 1021 contained the dominant ions of m/z 859 (Figure 2a) arising from loss of a glucose residue, along with the ions at m/z 737 and 613, arising from losses of the 18:0- and 27:1-fatty acid substituents, respectively (Scheme 1). The ions of m/z 859 represent the sodiated diacylglucose carrying both the 18:0- and 27:1-fatty acyl substituents. This notion is further supported by the MS3 spectrum of the ions of m/z 859 (1021 → 859, Figure 2b), which contained ions of m/z 575 (859 − 284) and 451 (859 − 408), arising from losses of the 18:0- and 27:1-fatty acid substituents, respectively. The results also suggest that the Na+ charge site most likely resides on the glucose ring bearing the two acyl groups (Glc 1). In contrast, the MS2 spectrum of the [M + Na]+ ions of the 6,6′-dioleoyltrehalose standard [30] at m/z 893 forms abundant ions at m/z 467, representing the sodiated oleoylglucose (data not shown), consistent with the fact that the two 18:1 fatty acyl substituents in this standard are located on separate Glc rings (i.e., Glc 1 and Glc 2). Further dissociation of the ions of m/z 737 (1021 → 737, Figure 2c) gave rise to the prominent ions of m/z 329 by loss of the 27:1-fatty acid substituent and to ions of m/z 575, arising from loss of the glucose residue (Glc 2), together with ions of m/z 431 representing a sodiated ion of 27:1-FA. These results further support the fragmentation processes proposed in Scheme 1. The formation of the ions of m/z 575 from m/z 859 by loss of the 18:0-FA residue at C2 may involve the participation of the hydrogen atom at C1 to form an enol, which undergoes enol-keto tautomerism to yield a stable sodiated ion of monoacyl (27:1) glucose in the keto form (Scheme 1). These fragmentation processes are further supported by MS4 on the ions of m/z 575 (1021 → 859 → 575, Figure 2d), which yielded ions of m/z 545, 515, and 475, likely arising from cross-ring cleavages of the glucose ring, suggesting that the 27:1-fatty acyl substituent is located at C3 (Scheme 1).
Similarly, MS4 on the ions of m/z 451 (1021 → 859 → 451, Figure 2e) gave rise to ions of m/z 421, 391, and 361 arising from similar rupture of the glucose ring, indicating that the 18:0-fatty acyl substituent is most likely located at C2 of the glucose ring. The preliminary loss of the 27:1-FA substituent may involve the participation of the adjacent hydrogen at C4 of Glc 1 to form an enol, which subsequently rearranges to the keto form via a similar enol-keto tautomerism mechanism.
The preferential formation of the ions of m/z 575 from loss of the 18:0-FA substituent over the ions of m/z 451 from similar loss of the 27:1-FA as seen in Figure 2a is readily applicable for locating the FA substituents on the trehalose backbone.
The MS3 spectrum of the ions of m/z 695 (979 → 695; Figure 3e) contained the ions of m/z 533 and 329 arising from further losses of the Glc and 24:1-FA residues (Figure 3a), respectively. The MS4 spectrum of the ions of m/z 533 (979 → 817 → 533; data not shown) gave ions of m/z 503, 473, and 443 from similar fragmentation processes that cleave the sugar ring (Scheme 1), confirming that the 24:1-FA substituent is located at C3. These results led to the assignment of the 18:0/24:1-DAT structure. Using this LIT MSn approach, a total of six isomeric structures were identified.
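The neutral-loss arithmetic behind these positive-ion assignments can be checked with a short script. What follows is a minimal sketch, assuming standard monoisotopic masses and treating each step as a simple subtraction (a glucose residue as C6H10O5, a fatty acid substituent as the free acid CnH(2n−2d)O2); the function and key names are illustrative only. The integer part of each value is compared with the nominal m/z quoted in the text.

```python
# Monoisotopic masses in u (standard values, rounded to 8 decimals).
C, H, O, NA = 12.0, 1.00782503, 15.99491462, 22.98976928
ELECTRON = 0.00054858

TREHALOSE = 12 * C + 22 * H + 11 * O      # C12H22O11
WATER = 2 * H + O                         # H2O
HEXOSE_RESIDUE = 6 * C + 10 * H + 5 * O   # C6H10O5, the "glucose residue" loss


def fatty_acid_mass(carbons, double_bonds):
    """Free fatty acid CnH(2n-2d)O2."""
    return carbons * C + (2 * carbons - 2 * double_bonds) * H + 2 * O


def sodiated_fragments(c2, d2, c3, d3):
    """Predicted m/z of [M + Na]+ and its main product ions for a DAT."""
    fa2 = fatty_acid_mass(c2, d2)
    fa3 = fatty_acid_mass(c3, d3)
    # Two ester bonds: trehalose + two fatty acids - two water molecules.
    precursor = TREHALOSE + fa2 + fa3 - 2 * WATER + NA - ELECTRON
    return {
        "[M+Na]+": precursor,
        "-Glc": precursor - HEXOSE_RESIDUE,
        "-FA(C2)": precursor - fa2,
        "-FA(C3)": precursor - fa3,
        "-Glc-FA(C2)": precursor - HEXOSE_RESIDUE - fa2,
        "-Glc-FA(C3)": precursor - HEXOSE_RESIDUE - fa3,
        "-FA(C2)-FA(C3)": precursor - fa2 - fa3,
        "[FA(C3)+Na]+": fa3 + NA - ELECTRON,
    }


for label, mz in sodiated_fragments(18, 0, 27, 1).items():
    print(f"{label:>16s}  {mz:10.4f}")
```

Calling `sodiated_fragments(18, 0, 24, 1)` gives the corresponding values for the 18:0/24:1 species discussed above.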

The Fragmentation Processes of the [M + HCO2]− Ions of DAT Revealed by LIT MSn
In the negative-ion mode in the presence of HCO2−, 18:0/27:1-DAT formed [M + HCO2]− ions of m/z 1043, which gave rise to the prominent ions of m/z 997 by loss of HCO2H, along with ions of m/z 713 and 589 by further losses of the 18:0- and 27:1-FA substituents, respectively (Figure 4a) (Scheme 2). This fragmentation process is further supported by the MS3 spectrum of the ions of m/z 997 (1043 → 997, Figure 4b), which are equivalent to the [M − H]− ions of 18:0/27:1-DAT. The ions of m/z 731 (Figure 4a), arising from loss of 18:0-FA as a ketene, are more prominent than the ions of m/z 607, arising from the analogous 27:1-ketene loss. This preferential formation of m/z 731, corresponding to loss of the FA-ketene at C2, over the ions of m/z 607 from the FA-ketene loss at C3 was seen in all the MS2 spectra of the [M + HCO2]− ions of DAT, providing useful information for distinguishing the FA substituents at C2 and C3. The ketene loss process probably involves the participation of HCO2−, which attracts the labile α-hydrogen on the fatty acyl group to eliminate FA-ketene and HCO2H simultaneously (Scheme 2). Therefore, the low abundance of the ions of m/z 607 arising from loss of the FA-ketene at C3 may reflect the fact that the 27:1-FA substituent at C3 contains an α-methyl side chain [19,21,22,31] and does not contain the labile α-hydrogen required for ketene loss. This is in contrast to the 18:0-FA substituent at C2, which possesses two α-hydrogens (Scheme 2). The ions at m/z 731 and 607 arising from losses of 18:0-ketene and 27:1-ketene, respectively, are absent in Figure 4b. This is consistent with the notion that the ketene loss requires the participation of HCO2−: the ketene loss pathway is no longer operative once the [M − H]− ions have been formed from [M + HCO2]− by loss of HCO2H.
The spectrum (Figure 4b) also contained the prominent ions of m/z 407, representing 27:1-FA carboxylate anions, and the ions of m/z 283, representing 18:0-FA carboxylate anions, along with ions of m/z 305 arising from losses of both the 18:0- and 27:1-FA substituents. The preferential formation of the ions of m/z 407 (from C3) over m/z 283 (from C2) is also a reflection of the location of the fatty acid substituents on the Glc ring, leading to the assignment of the 18:0/27:1-DAT structure.
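The negative-ion product ions described in this section follow from a handful of neutral losses, which the sketch below reproduces. It assumes standard monoisotopic masses, models the ketene loss as HCO2H plus the FA-ketene CnH(2n−2−2d)O leaving the formate adduct, and models the other losses as free fatty acids leaving [M − H]−; the names used are illustrative, and relative abundances (the basis of the C2/C3 distinction) are not modeled.

```python
# Monoisotopic masses in u (standard values, rounded to 8 decimals).
C, H, O = 12.0, 1.00782503, 15.99491462
ELECTRON = 0.00054858

TREHALOSE = 12 * C + 22 * H + 11 * O   # C12H22O11
WATER = 2 * H + O                      # H2O
FORMIC_ACID = C + 2 * H + 2 * O        # HCO2H


def fatty_acid_mass(carbons, double_bonds):
    """Free fatty acid CnH(2n-2d)O2."""
    return carbons * C + (2 * carbons - 2 * double_bonds) * H + 2 * O


def formate_adduct_fragments(c2, d2, c3, d3):
    """Predicted m/z of [M + HCO2]- and its main product ions for a DAT."""
    fa2 = fatty_acid_mass(c2, d2)
    fa3 = fatty_acid_mass(c3, d3)
    neutral = TREHALOSE + fa2 + fa3 - 2 * WATER
    adduct = neutral + (FORMIC_ACID - H) + ELECTRON   # [M + HCO2]-
    deprotonated = adduct - FORMIC_ACID               # [M - H]-
    return {
        "[M+HCO2]-": adduct,
        "[M-H]-": deprotonated,
        # A ketene is the fatty acid minus water.
        "-HCO2H-ketene(C2)": deprotonated - (fa2 - WATER),
        "-HCO2H-ketene(C3)": deprotonated - (fa3 - WATER),
        "[M-H]- -FA(C2)": deprotonated - fa2,
        "[M-H]- -FA(C3)": deprotonated - fa3,
        "[M-H]- -FA(C2)-FA(C3)": deprotonated - fa2 - fa3,
        "carboxylate(C2)": fa2 - H + ELECTRON,
        "carboxylate(C3)": fa3 - H + ELECTRON,
    }


for label, mz in formate_adduct_fragments(18, 0, 27, 1).items():
    print(f"{label:>24s}  {mz:10.4f}")
```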

Characterization of Minor Species Applying LIT MSn on the [M + HCO2]− Ions
Applying multiple-stage mass spectrometry (LIT MSn) for consecutive ion separation followed by CID is particularly useful for characterization of minor DAT species as [M + HCO2]− ions. For example, the MS2 spectrum of the [M + HCO2]− ions of the minor DAT at m/z 989 (Figure 6a) gave major [M − H]− fragment ions at m/z 943, but the spectrum also contained many unrelated fragment ions (e.g., ions of m/z 957, 930, 921, and 905) that complicate the structural identification. These fragment ions may arise from adjacent precursor ions admitted together with the desired DAT ions for CID, because the precursor ion selection window (1 Da) cannot sufficiently isolate the isobaric ions (the mass selection window and injection time govern the total ions admitted to the trap for CID, and a mass selection window of >1 Da is often required to maintain sensitivity). Thus, fragment ions unrelated to the targeted molecule were formed simultaneously, complicating the structural analysis. However, the MS3 spectrum of m/z 943 (Figure 6b) contained only the fragment ions related to the DAT species, because the [M − H]− ions still retain the complete structure but have been further segregated, and fragment ions unrelated to the structure have been filtered out by another stage of isolation (MS3). In this context, the MS4 spectrum of the ions of m/z 589 (989 → 943 → 589; Figure 6c), which were further "purified", becomes even more specific, because only the fragment ions from the DAT bearing a 23:0-FA substituent at C3 were subjected to further CID. Thus, the spectrum contained only ions of m/z 283, representing an 18:0-carboxylate anion, along with ions at m/z 323 and 305, representing the dehydrated trehalose anions. These results led to the specific definition of the 18:0/23:0-DAT structure. The spectrum (Figure 6b) also contained the m/z 269/367 and 255/381 ion pairs, representing the 17:0/24:0- and 16:0/25:0-FA carboxylate anion pairs, together with the m/z 673/575 and 687/561 ion pairs, arising from losses of the 17:0/24:0- and 16:0/25:0-FA substituents, respectively. These results readily led to the assignment of the 17:0/24:0- and 16:0/25:0-DAT isomeric structures.
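The three isomers assigned here share the same elemental composition and therefore the same [M − H]− ion; they are distinguished only by their product ions. The sketch below illustrates this for saturated chains, assuming standard monoisotopic masses and simple fatty acid losses from [M − H]−. The function name is hypothetical, and the values it predicts for ions that are not quoted in the text (for example, the 23:0 carboxylate anion) are calculations rather than reported observations.

```python
# Monoisotopic masses in u (standard values, rounded to 8 decimals).
C, H, O = 12.0, 1.00782503, 15.99491462
ELECTRON = 0.00054858

TREHALOSE = 12 * C + 22 * H + 11 * O   # C12H22O11
WATER = 2 * H + O                      # H2O


def saturated_fa_mass(carbons):
    """Free saturated fatty acid CnH2nO2."""
    return carbons * C + 2 * carbons * H + 2 * O


def diagnostic_ions(c2, c3):
    """Nominal m/z of the ions that distinguish a saturated c2:0/c3:0-DAT."""
    fa2, fa3 = saturated_fa_mass(c2), saturated_fa_mass(c3)
    m_minus_h = TREHALOSE + fa2 + fa3 - 2 * WATER - H + ELECTRON
    return {
        "[M-H]-": int(m_minus_h),
        # Carboxylate anions of the C2 and C3 substituents.
        "anions": (int(fa2 - H + ELECTRON), int(fa3 - H + ELECTRON)),
        # [M - H]- after loss of the C2 or the C3 fatty acid.
        "FA losses": (int(m_minus_h - fa2), int(m_minus_h - fa3)),
    }


# Three isomers with 41 acyl carbons in total.
for c2, c3 in ((18, 23), (17, 24), (16, 25)):
    print(f"{c2}:0/{c3}:0-DAT", diagnostic_ions(c2, c3))
```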

Conclusions
Sequential precursor ion isolation applying multiple-stage mass spectrometry (LIT MSn) adds another dimension of separation to the analysis, providing a powerful tool for structural identification of various compounds. Many isomeric structures of a molecule can thereby be unveiled in a very short period of time. By contrast, with conventional chromatographic separation combined with a TSQ or QTOF instrument, consecutive precursor ion isolation by MS is not achievable, and the separation of species relies on column separation alone. Thus, complete separation of a complex lipid mixture with a wide range of molecular species and many isomeric structures is often difficult. Compound separation by chromatographic means also requires significantly more time [33] than the LIT MSn approach, by which the separation-CID-detection process can be completed within a very short period of time.
LIT MSn permits ion isolation in a time-sequenced manner, and the separation of ions is flexible (i.e., in the types of ions selected and the mass selection window of the precursor ions). The selected ions become more specific, and the MSn spectrum provides more structurally specific information, as the MSn stage advances, resulting in a confident and detailed structural identification. The structures of minute ion species that are often difficult to define by other analytical methods can also be assigned (Table 1). However, the sensitivity declines as the MSn stage increases.
Another drawback is that complete structural information is not necessarily extractable by MSn. For example, the positions of the double bond and the methyl side chains of the fatty acid substituents at C3 have not been defined in this study.
The LIT MSn approach described here affords near complete structural characterization of a complex DAT lipid family, locating the fatty acyl groups on the trehalose backbone and recognizing many isomeric structures. A LIT MSn approach combined with chemical reaction modification [9,34] for locating the functional groups, including the methyl, hydroxyl, and double bond, on the fatty acid substituents is currently in progress in our laboratory.

Figure 1. The positive-ion ESI mass spectrum of the [M + Na]+ ions of the DAT species isolated from the cell envelope of M. tuberculosis.

Scheme 1. The structure of the [M + Na]+ ion of 18:0/27:1-DAT at m/z 1021 and the proposed LIT MSn fragmentation processes *. * All the ions represent the sodiated species. To simplify, the drawing of "Na+" is omitted from the scheme.

Figure 6. The negative-ion MS2 spectrum of the [M + HCO2]− ions of DAT at m/z 989 (a), the MS3 spectrum of the ions of m/z 943 (989 → 943) (b), and the MS4 spectrum of the ions of m/z 589 (989 → 943 → 589) (c). The ions of m/z 989 are minor species, and their MS2 spectrum contains several ions (shown in red in (a)) unrelated to the structure; these ions have been filtered out in the higher stage MSn spectra, as shown in (b) (MS3) and (c) (MS4).

Table 1. The high resolution mass measurements of the [M + Na]+ ions of DATs isolated from M. tuberculosis and the assigned structures (* structure not defined).