Potential Plasticity of the Mannoprotein Repertoire Associated to Mycobacterium tuberculosis Virulence Unveiled by Mass Spectrometry-Based Glycoproteomics

To date, Mycobacterium tuberculosis (Mtb) remains the world’s greatest infectious killer. The rise of multidrug-resistant strains stresses the need to identify new therapeutic targets to fight the epidemic. We previously demonstrated that bacterial protein-O-mannosylation is crucial for Mtb infectiousness, renewing the interest of the bacterial-secreted mannoproteins as potential drug-targetable virulence factors. The difficulty of inventorying the mannoprotein repertoire expressed by Mtb led us to design a stringent multi-step workflow for the reliable identification of glycosylated peptides by large-scale mass spectrometry-based proteomics. Applied to the differential analyses of glycoproteins secreted by the wild-type Mtb strain—and by its derived mutant invalidated for the protein-O-mannosylating enzyme PMTub—this approach led to the identification of not only most already known mannoproteins, but also of yet-unknown mannosylated proteins. In addition, analysis of the glycoproteome expressed by the isogenic recombinant Mtb strain overexpressing the PMTub gene revealed an unexpected mannosylation of proteins, with predicted or demonstrated functions in Mtb growth and interaction with the host cell. Since in parallel, a transient increased expression of the PMTub gene has been observed in the wild-type bacilli when infecting macrophages, our results strongly suggest that the Mtb mannoproteome may undergo adaptive regulation during infection of the host cells. Overall, our results provide deeper insights into the complexity of the repertoire of mannosylated proteins expressed by Mtb, and open the way to novel opportunities to search for still-unexploited potential therapeutic targets.


Introduction
Although curable, human pulmonary tuberculosis (TB) caused by Mycobacterium tuberculosis (Mtb) remains the deadliest bacterial infection, with more than one death every 30 s [1]. Today, the disease burden is worsened by the increasing incidence of co-infection with HIV and the threatening emergence of multi- or extensively drug-resistant Mtb strains. To face this alarming situation, efforts must intensify to foster the development of innovative protective and therapeutic strategies to fight TB [2]. Among these, the search for so-called "anti-virulence drugs" able to disarm the pathogen of its virulence without affecting its vital metabolic pathways is considered a promising approach to circumvent the aim, we set up an unbiased glycoproteomic approach to identify the most probable mannoproteins associated with Mtb virulence by comparing the glycoproteomes expressed by the wild-type H37Rv Mtb strain (Mtb WT) and its derived Mtb∆rv1002c mutant invalidated for the protein-O-mannosyl transferase PMTub.
We also applied this approach, which combines SDS-PAGE fractionation and MS-based proteomics with a stringent multi-parametric filtering strategy for high-confidence identification of glycopeptides, to explore, conversely, the impact of constitutive overexpression of the PMTub gene on the contour of the mannoprotein repertoire.
Owing to this, we were able to characterize more than one hundred glycopeptides that were not detected in the Mtb∆rv1002c mutant and can therefore be attributed to PMTub-modified mannoproteins. Most, though not all, of the glycopeptides from the known mannoproteins previously found in the Mtb CF were detected, either as described or with related sequences and/or different glycosylation patterns. More interestingly, besides these, most of the selected hits were found to correspond to new glycopeptides that fulfill a stringent validation procedure, providing reliable structural evidence of glycosylation for about twenty-five new Mtb proteins.
Finally, we tentatively addressed the consequences of the regulation of the PMTub gene expression on the variability of the mannoprotein repertoire expressed by Mtb during the infection. Our results will pave the way to a better understanding of the contribution of mannoproteins in Mtb virulence.

Mtb Culture Filtrate Proteome Coverage Using SDS-PAGE Separation and MS-Based Shotgun Proteomics
Mtb mannosylated proteins are secreted and accumulate in the culture filtrate (CF) throughout the growth of the bacteria. Hence, the MS-based proteomic analyses were performed on protein extracts from 6-week-old Mtb cultures grown on a glycerol-based synthetic medium, these conditions being considered optimal for recovering the highest amount of mannoproteins [7,8,9,10,12]. The complexity of the CF protein mixtures was resolved by SDS-PAGE, and the gel strip was sliced into 17 bands that were processed in parallel for in-gel digestion and peptide extraction. The individual peptide extracts were successively analyzed by nano LC-MS/MS using collision-induced dissociation (CID) as the activation method. This fragmentation mode maximizes the MS2 sampling for optimal analytical coverage of the peptide mixture [31], and also preserves the characteristic hexose neutral-loss ions that are the signature of peptide glycosylation [12,32,33]. The datasets of fragmentation spectra obtained for each of the 17 bands were individually searched against the annotated Mtb TubercuList reference database (release 27; 4031 entries). The resulting search outputs were merged for protein validation using the in-house developed MFPaQ software [34].
As an example, the analyses of the 17 SDS-PAGE gel slices from the fractionation of the Mtb WT culture filtrate (CF) resulted in the recording of 244,488 fragmentation spectra that were assigned to 13,602 unique peptide sequences mapping to 1232 proteins identified with a protein false discovery rate of 1% (FDR 1%; p > 0.004) (Table S1, Figure S1). This protein repertoire covers about 31% of the total annotated gene-coded proteome and ~40% of the expressed proteome [35] amenable to discovery using MS-based proteomics [36]. Most of the proteins detected herein (n = 1007; >81%) were previously reported by Albrethsen et al. (1362 proteins, [37]) from label-free 2D-DIGE proteomic analyses of the Mtb extracellular proteome (Table S1). The results obtained from these independent studies reveal a remarkable consistency of the Mtb culture filtrate proteome, composed of about a thousand proteins. It is noteworthy that only a limited number (95/1007; <10%) of the proteins identified are actually predicted to be secreted and hence likely to correspond to secreted mannoproteins [38].
Similar analyses of the secreted proteome of the Mtb PMTub mutant (∆Rv1002c) led to very comparable results, with the identification of 1187 proteins (FDR < 1%) present in the culture filtrate (Table S2), including 935 (78%) proteins shared with the dataset of Albrethsen et al. [37].
The substantial coverage of the Mtb secretome achieved with this pre-analytic SDS-PAGE fractionation approach supports its efficiency and relevance for maximizing the detection of low-abundance glycosylated peptides, as also reported by others [27]. Nevertheless, while suited to an in-depth qualitative exploration of the proteome, the analysis of multiple fractions is expected to yield less accurate differential quantitative results than single-fraction comparisons. Therefore, we performed a complementary quantitative analysis of the relative expression of the glycoproteins detected in the unfractionated extracts, to verify whether it could be affected by the interruption of PMTub-catalyzed mannosylation. Label-free quantification of the non-glycosylated prototypic peptides showed no statistically significant difference for any of the putative mannoproteins detected in the CFs of the Mtb WT and the Mtb∆rv1002c mutant (Figure S2). This result suggests that the disruption of the protein-O-mannosylation process does not affect the expression of the target proteins.

Identification of Glycopeptides Present in the Culture Filtrate of Mtb
In an initial attempt to identify the glycopeptides deriving from the already known mannoproteins present in the Mtb WT CF, we applied a targeted search strategy. Briefly, the Mascot search engine was configured to identify mono-, di- and tri-hexosyl modifications of Ser or Thr as variable labile post-translational modifications (PTMs). Hence, according to the difference between the experimental and the calculated mass of the peptide inferred from the fragment ion series of the non-glycosylated peptide sequence observed in the CID MS2 spectrum, about 1291 glycopeptides were identified, mapping to 577 putative O-glycosylated proteins (Figure S1). In the absence of an additional screening criterion for verifying each of these putative assignments individually, we focused first on the search for the mannosylated peptides previously reported in Mtb [7,8,9,11,12,39,40,41]. This targeted search led to the identification of 103 MS2 spectra assigned to glycopeptides matching 11 of the 17 Mtb mannoproteins biochemically confirmed to date (Table 1).

Improvement of MS 2 Data Processing for Mtb Glycoproteins Discovery
Molecules 2020, 25, 2348
However, about 98.5% of the 7779 MS2 spectra attributed to glycosylated peptides remained speculative and required further endorsement (Figure S1). Such a large number of spectra can hardly be processed manually. To overcome this hurdle, we set up an automated procedure to search for fragmentation spectra exhibiting the characteristic signature of glycosylation, which could strengthen the reliability of the putative assignments for the discovery of new glycopeptides. Indeed, CID fragmentation spectra of glycoconjugates bearing terminal hexoses are characterized by the presence of intense Y-type fragment ions resulting from the loss of the sugar ring(s) as neutral fragments [12,32,41]. Most CID-MS2 search engine algorithms can be configured to take advantage of this property for the recognition of glycopeptide fragmentation spectra. However, they generally fail to exploit the full discriminating value of these diagnostic ions because they disregard their relative intensity or rank in the fragmentation spectra. To improve the selectivity of the glycopeptide identification, we thus implemented a screening procedure to detect the MS2 spectra exhibiting consistent glycosylation signature ions clearly emerging from the background noise. Applying this filtering strategy, we found that only a very minor proportion (n = 223) of the MS2 spectra assigned to putative glycopeptides actually exhibited the expected signature (Figure S1, Table S2). It is noteworthy that 30% (n = 67) of these fragmentation spectra corresponded to previously reported glycopeptides of 8 known Mtb mannoproteins, while the remaining 156 "signed" spectra pointed to 120 previously unreported glycoprotein candidates.
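The idea of screening spectra by the rank of their hexose-loss ions can be illustrated with a short sketch. This is not the authors' procedure, whose exact criteria are not given here: the function name, the rank threshold (`max_rank`), the m/z tolerance (`tol`) and the example peak lists are all assumptions chosen for illustration. Only the hexose residue mass (162.0528 Da, monoisotopic) is a standard value.

```python
HEXOSE = 162.0528  # monoisotopic mass of one hexose residue (C6H10O5), Da

def has_neutral_loss_signature(peaks, precursor_mz, charge, n_hexoses,
                               tol=0.5, max_rank=5):
    """Return True if every expected hexose-loss ion ranks among the
    max_rank most intense peaks of the spectrum.

    peaks: list of (m/z, intensity) tuples from one CID MS2 spectrum.
    tol and max_rank are illustrative defaults, not published settings.
    """
    # Keep only the m/z values of the most intense peaks.
    ranked = sorted(peaks, key=lambda p: p[1], reverse=True)[:max_rank]
    top_mz = [mz for mz, _ in ranked]
    for k in range(1, n_hexoses + 1):
        # Losing k neutral hexoses leaves the precursor charge unchanged.
        expected = precursor_mz - k * HEXOSE / charge
        if not any(abs(mz - expected) <= tol for mz in top_mz):
            return False
    return True

# Hypothetical doubly charged precursor at m/z 900.0 carrying one hexose.
signed = [(818.97, 1000.0), (450.2, 120.0), (600.3, 80.0), (700.4, 60.0)]
noisy = [(818.97, 5.0), (450.2, 120.0), (600.3, 80.0),
         (700.4, 60.0), (300.1, 50.0), (350.2, 40.0)]
print(has_neutral_loss_signature(signed, 900.0, 2, 1))
print(has_neutral_loss_signature(noisy, 900.0, 2, 1))
```

In the first toy spectrum the loss ion dominates and the spectrum is retained; in the second the same m/z is present but buried in the noise, so a search engine that only checks for its presence would accept it while a rank-based screen rejects it.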

High Confidence Identification of Mtb Glycosylated Proteins
Although highly stringent, with less than 3% of the MS2 spectra validated, this filtering strategy may still let false-positive glycopeptide assignments through. To further increase the reliability of the glycopeptide identification, we therefore added two selection parameters. First, based on the observation that a large majority (>74%) of these assigned spectra correspond to single sequencing events (Table S2), and considering the stochastic character of the precursor ion selection for fragmentation [36], we deliberately chose to temporarily set apart the "orphan" spectra and to consider exclusively the assignments relying on several matching spectra. This resulted in the subsequent validation of a very limited number of precursor ion spectra (n = 40) that identify 10 potentially glycosylated peptides (Table 2).
Table 2. Identification of the novel glycopeptides detected in the Mtb WT CF by the improved "neutral loss" post-analytical MS2 data processing. (* and §: see Figure 1 and Figure S3 for detailed identification; for additional legend see Table 1).
Second, to gain further confidence in the assignment of these spectra, we took advantage of the well-established intrinsic micro-heterogeneity of the glycosylation process to select the most probable glycopeptide candidates. Indeed, like most glycosylation routes, protein glycosylation leads to the formation of structurally related glycoforms differing in their carbohydrate content (number and/or nature of the sugars). Accordingly, among the 13 precursor ions (Table 2) (Figure 1a,b). These assignments were further supported by the analysis of the relative LC-MS elution times of these glycoforms. Indeed, the lower chromatographic retention time of the expected most glycosylated form at M = 2051.9 is consistent with its higher polarity (Figure 1d). The consistency of the chromatographic behavior of these glycosyl-modified peptides [42], together with the presence in the sequence of a Ser- or Thr-rich cluster (Figure 1c) with a high glycosylation probability (Figure 1e), allows us to propose LpqR (Rv0838) as a novel glycosylated protein of Mtb.
In contrast, the identification of the remaining 8 glycopeptide candidates seems less reliable due to the lack of similar glycoforms. However, the assignments of 3 of these precursor ion masses, at M = 3615.7, M = 3544.7 and M = 1115.6, are supported in the same way by the presence of the related lower-mass ions at M = 3453.68, M = 3382.64 and M = 953.55. Indeed, compared to the former, the latter present a mass difference of 162 amu that is readily attributable to the absence of a hexose residue. Moreover, they exhibit fragmentation spectra and chromatographic behaviors consistent with the non-glycosylated forms of the corresponding peptides (Figure S3). Thus, the concomitant presence of both the non-modified tryptic peptides and their mono-glycosylated forms corroborated their respective assignments and was considered, as above, consistent proof of glycosylation of the hypothetical protein Rv0315 and of the virulence-associated lipoglycan transporter LprG.
Finally, the failure to detect the unsubstituted peptide or any related glycoforms for the 5 remaining precursor ion masses (M = 1658.7, M = 1632.8, M = 1497.7, M = 1705.8 and M = 3813.8 in Table 2) makes these assignments less reliable, even though they fulfill most of the usually accepted criteria for glycopeptide characterization. Therefore, although their glycosylation cannot be strictly ruled out in the absence of definite indications, these candidates were not considered in this study.
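The 162 amu argument can be checked directly on the three precursor pairs quoted above. The sketch below is only an illustration: the function name and the 0.05 Da tolerance are assumptions (the tolerance is loose because the masses are quoted to one or two decimals), and 162.0528 Da is the standard monoisotopic mass of a hexose residue.

```python
HEXOSE = 162.0528  # monoisotopic mass of one hexose residue (C6H10O5), Da

# (glycosylated mass, lower-mass partner) pairs as quoted in the text.
pairs = [(3615.7, 3453.68), (3544.7, 3382.64), (1115.6, 953.55)]

def hexose_shift(glyco_mass, bare_mass, tol=0.05):
    """Return the number of hexoses explaining the mass difference,
    or None if the difference is not a multiple of the hexose mass
    within tol (an illustrative tolerance, not a published setting)."""
    delta = glyco_mass - bare_mass
    n = round(delta / HEXOSE)
    if n >= 1 and abs(delta - n * HEXOSE) <= tol:
        return n
    return None

for glyco, bare in pairs:
    print(glyco, bare, round(glyco - bare, 2), hexose_shift(glyco, bare))
```

Each of the three differences (162.02, 162.06 and 162.05) falls within the chosen tolerance of a single hexose, which is consistent with the mono-glycosylated assignments discussed above.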

Assessment of the PMTub Associated Mannoprotein Repertoire
To verify whether the newly identified glycopeptides belong to the PMTub-associated mannoprotein repertoire, we analyzed in parallel a CF protein extract from the Mtb ∆rv1002c mutant, invalidated for the protein O-mannosyl transferase gene, using the same multi-parametric selection workflow. Under these conditions, we found 138 spectra assigned to putative glycosylated peptides that exhibit the neutral-loss signature (Table S3). A targeted search of the full MS2 dataset failed to detect any fragmentation spectra matching known glycopeptides. Moreover, none of these spectra corresponded to any of the expected glycopeptides of the newly identified Mtb mannosylated proteins described above. Finally, after the exclusion of the MS2 spectra corresponding to unique sequencing events (123/138), only 15 fragmentation spectra attributed to 4 putative glycopeptides remained. However, manual curation of these candidates failed to identify associated glycoforms that would confirm, as above, the glycosylation of these peptides. It is noteworthy that the absence of reliable identification of putative glycopeptides in the Mtb ∆rv1002c mutant suggests that the secreted hexosyl-modified proteins present in the Mtb culture filtrate are exclusively PMTub-modified mannoproteins.
In conclusion, using this original strategy, 42 glycopeptides detected in the culture filtrate of the Mtb WT, but not in that of the Mtb ∆rv1002c mutant, could be assigned with high confidence to Mtb mannoproteins. Of these, 37 match characterized glycosylated peptides from previously described Mtb mannoproteins, while 5 correspond to novel glycopeptides belonging to 3 proteins: the DC maturation-inducing antigen Rv0315, the Mtb virulence-associated Rv1411c LprG lipoprotein and the Rv0838 D-alanyl-D-alanine dipeptidase lipoglycoprotein LpqR (Tables 1 and 2). It is worth mentioning that while Rv0315 and LprG were both presumed to be mannosylated [10,40,43,44,45], our data provide the first convincing evidence of mannosylation of the Rv0838 LpqR C-terminal peptide.
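The exclusion of single sequencing events applied here (123 of the 138 signed spectra) amounts to counting how many spectra support each putative glycopeptide. A minimal sketch, assuming the assignments are available as (spectrum identifier, glycopeptide) pairs; the function name, the data layout and the toy identifiers are hypothetical and are not taken from the authors' software:

```python
from collections import Counter

def drop_orphans(assignments, min_spectra=2):
    """Keep only glycopeptides supported by at least min_spectra spectra.

    assignments: list of (spectrum_id, glycopeptide) pairs.
    Returns a dict mapping each retained glycopeptide to its spectrum count.
    """
    counts = Counter(peptide for _, peptide in assignments)
    return {pep: n for pep, n in counts.items() if n >= min_spectra}

# Hypothetical toy input: one peptide sequenced twice, one orphan.
toy = [
    ("scan_001", "PEPTIDE_A+Hex"),
    ("scan_002", "PEPTIDE_A+Hex"),
    ("scan_003", "PEPTIDE_B+Hex2"),
]
print(drop_orphans(toy))
```

Raising `min_spectra` makes the filter stricter at the cost of discarding genuine low-abundance glycopeptides, which is why the orphan spectra are described as set apart rather than rejected.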

Variability of the Mtb Mannoproteome Determination
It is striking that only 8 of the mannoproteins detected herein were also reported by Smith et al. [12]. Indeed, we did not detect the glycopeptides reported by these authors for the Rv1096, Rv1887, Rv2164c, Rv2394 and Rv2744c proteins (Figure 2). The original filtering strategy we used, as well as the culture conditions and the origin of the Mtb strain analyzed, could affect the output of our analyses and explain these differences. To verify this assumption, we duplicated our analysis using strictly the same Mtb culture conditions, pre-analytical treatment and data processing. Examination of the CF proteins of this second replicate led to a very comparable number of hits, with 13 mannoproteins identified (Table 3). However, the identity of the mannoproteins identified diverges substantially, with, respectively, 62% (8/13) and 42% (6/13) overlapping with our first biologic replicate or with the Smith et al. inventory (Figure 2). Specifically, on the one hand, proof of glycosylation is missing for the conserved Mce-associated membrane protein Rv0175 [12], LprF (Rv1368, [11]), Apa (Rv1860, [7]) and Rv2799 [12], as well as for LpqR (Rv0838) and the antigenic MPB83 (Rv2873, [8]), which were not detected at all in this second sample. On the other hand, compared with our initial analysis, 5 additional glycopeptides were detected, including 3 that were readily assigned to the previously reported glycopeptides P27-46 of the γ-glutamyl transferase GgtB (Rv2394) [12], P39-55 of LppX (Rv2945; [46]) and P27-51 of the 19 kDa antigen LpqH (Rv3763; [41]). Interestingly, the two remaining hits correspond to novel glycosylated peptides supporting the presumed mannosylation of the two proteins LpqI (Rv0237) and PstS3 (Rv0928) [10] (Figure S4).
Finally, of the 23 glycoproteins identified in total by Smith et al. [12] and by us, only 5 (<22%) were systematically detected in the 3 analyses, while 12 (52%) were present in at least 2 experiments and 11 (48%) were sample specific (detected in a single assay).
Although these differences are generally attributed in large part to the technical variability arising from the intrinsically stochastic selection of low-abundance parent ions for MS2 analysis [47], other factors, such as the likely inherent biologic variability of the glycoproteome expressed by the bacteria from culture to culture, may also contribute to these discrepancies. Indeed, even under highly defined and controlled cultivation conditions, subtle, ill-defined environmental variations or fluctuations of poorly understood cellular processes are known to provoke significant qualitative and quantitative alterations of protein expression and post-translational modification [48,49]. Hence, one can reasonably speculate that the observed disparity of the Mtb mannoprotein repertoires identified from sample to sample results from a stochastic behavior of the protein-O-mannosylation process that generates an intrinsic diversity of the secreted mannoproteins, as recently reported for the S-palmitoylation of membrane proteins [50].
Table 3. Glycosylated peptides identified in the independent analysis of the Mtb WT CF. (Newly identified sequences are in bold; *: see Figure S4 for detailed identification and Table 1 for additional legend).
Alongside these analyses, we also examined the presence of potential mannoproteins in the Mtb cell lysate. Indeed, since the seminal work of C. Espitia & R. Mancilla [5], and in agreement with the current model of the mycobacterial protein glycosylation pathway [51], all the studies dedicated to the identification and characterization of mycobacterial mannoproteins have focused exclusively on the soluble proteins present in the CF. However, it is noteworthy that many of the bona fide or suspected Mtb glycoproteins reported to date are also putative lipoproteins, which may remain more tightly associated with the bacterial cell wall through their lipidic anchor [41]. Therefore, we explored the cell-associated glycoprotein repertoire in protein extracts obtained from the disrupted bacterial cells of the two Mtb biologic replicates. In total, the glycopeptidomics analyses of these fractions afforded high-confidence indications of glycosylation for 44 peptidic sequences attributed to the different glycoforms of 25 putative glycopeptides mapping to 20 glycoproteins (Table 4). Of these, only 8 correspond to glycoproteins also detected in the CFs of Mtb, which were previously annotated as either putative lipoproteins or secreted proteins (Figure 3A). In contrast, the remaining 12 glycosylated proteins identified are novel Mtb mannoproteins considered in the literature to be intimately associated with the cell envelope [52]. However, unexpectedly, only 4 of them correspond to lipoproteins or peripheral membrane proteins as initially hypothesized, i.e., the extracellular disulfide bond forming protein DsbF (Rv1677), the LpqE (Rv3584) and LpqG (Rv3623) lipoproteins and the phage shock protein orthologue PspA (Rv2744c). Indeed, most of these mannosylated proteins are classified as integral polytopic membrane proteins that have never been considered, up to now, as potential targets for glycosylation. It is worth mentioning, however, that the topological analysis of the sequences of these eight proteins revealed that the glycosylated peptides systematically correspond to predicted extracellular domains flanked by hydrophobic transmembrane helices (Figure 3B). This observation is consistent with the expected membrane location of the protein glycosylation machinery and strongly supports a glycosylation process occurring co-translationally during protein export or insertion into the membrane via the Sec-dependent protein secretion system. Finally, it is noteworthy that, as for the CF, the comparison of the cell-associated protein extracts revealed significant differences between the 2 biologic replicates, with only half of the proteins shared between these two samples (10 of the 20 proteins identified).
Table 4. Peptide glycoforms detected in cell-lysate protein extracts of the Mtb WT and PMTub-overexpressing strain. Specific cell-associated glycosylated proteins are noted in bold and their occurrence in the Mtb WT and/or in the complemented strain overexpressing the PMTub gene (Cp) is reported in the right column. (See Table 1 for additional legend).

Potential Impact of the Mtb PMT Expression on the Mtb Glycoprotein Repertoire
Despite the critical impact of Mtb protein-O-mannosylation on the invasiveness and virulence of the pathogen, the regulation of this process remains unexplored. Moreover, the recent report of higher PMTub expression in the virulent W. Beijing strain than in Mtb strains of lower virulence [53] further raises the question of the correlation between the PMTub expression level and the contour of the repertoire of mannoproteins secreted by Mtb.
Therefore, we investigated whether and to what extent the Mtb mannoprotein repertoire was affected by an increased expression of PMTub. To evaluate exclusively the influence of PMTub overexpression on the Mtb glycoproteome, independently of the genetic background of the Beijing Mtb lineage, we used an isogenic Mtb complemented strain expressing the PMTub gene under the control of a pBlaF* promoter [4]. Indeed, originally selected as a strong mycobacterial promoter, pBlaF* is regularly used for high constitutive expression of the downstream gene [54]. This property was readily confirmed by quantitative PCR analysis, which showed a 2- to 3-fold increase in PMTub gene expression during both the exponential and stationary growth phases of the complemented strain compared to the Mtb WT (Figure 4A; see Table S4 for raw data).
The mannoproteome of the complemented strain was thus explored by SDS-PAGE fractionation and MS-based proteomics using experimental settings strictly identical to those used above for the parent Mtb WT. Interestingly, under these conditions, the number of proteins identified in the CF of the complemented strain was very comparable to that obtained with the Mtb WT (1223 versus 1232 proteins).
This result confirmed that the overexpression of the PMTub gene does not significantly affect the diversity of the repertoire of secreted proteins expressed by the complemented Mtb strain (Table S5). In addition, we verified, by complementary label-free quantification experiments on the unfractionated extract, whether the expression of the known PMTub target proteins was altered by the overexpression of the PMTub gene. Again, quantification of the non-glycosylated prototypic peptides of the detected mannoproteins confirmed the absence of statistically significant differences in the relative abundances of the PMTub target proteins between the parent Mtb WT strain and the derived strain overexpressing the PMTub gene (Figure S5). In contrast, the glycopeptidomics analyses reveal that the overexpression of the PMTub gene is associated with a significant increase (+45%) in the number of different glycopeptides detected in the CF extract as compared to the parent Mtb WT strain. Indeed, 61 different glycosylated peptides passing the stringent filtering steps of our multi-parametric identification workflow were detected in the CF of the PMTub-overexpressing strain (Table S5). Most of them (43/61) match glycopeptide sequences detected in the Mtb WT by us or others (Figure 4B). More interestingly, 23 of these glycopeptides, which were never detected by us or by others in the Mtb WT, were found to fit 10 new peptidic sequences with variable degrees of glycosylation. These new glycopeptides provide evidence for the unpredicted mannosylation of 8 additional Mtb proteins, including some that are considered essential, such as the TatA Sec-independent protein translocase Rv2094c [55], the GlnA1 glutamine synthetase Rv2220 [55] and the MtrAB two-component system-associated LpqB lipoprotein Rv3244c [56] (Table 5).
Again, it cannot be totally excluded that the failure to detect these glycopeptides in the Mtb WT is due to the stochastic character of the parent ion selection for fragmentation in MS/MS, which discriminates against low-abundance ions [47]. However, it is worth mentioning that these glycopeptides, which have never been reported to date in any of the previous Mtb WT glycoproteome explorations, are identified here with high confidence by multiple MS2 scans. This result points to PMTub overexpression as the cause of the increased mannosylation of these proteins in the complemented strain, up to a detectable level. These results thereby constitute strong evidence that the diversity of the repertoire of mannosylated proteins secreted by Mtb is affected by the expression level of PMTub.

Mtb PMTub Expression Is Increased in the Macrophage
Finally, to assess whether the Mtb mannoproteome may undergo modulation during infection, we analyzed the expression of the PMTub gene in infected macrophages. Indeed, several lines of evidence support a potential regulation of its expression in the host. First, rv1002c lies in the immediate vicinity of the in vivo expressed iVEGI genomic island (running from rv0960 to rv1001), which has been shown to be specifically activated during infection of mice [57]. Second, at least three binding sites for transcription regulatory factors, upstream of or within the rv1002c coding sequence, have been reported by independent Mtb gene-wide regulation network studies, suggesting a possible regulation of the gene transcription [58,59]. To support the physiological regulation of PMTub gene expression in vivo, we thus monitored the expression of the Mtb rv1002c gene after infection of murine alveolar macrophages. Interestingly, we observed that the hostile intracellular environment of the host cell induces a transient, significant increase in the expression of the gene, reaching its maximum 5 h post-infection (Figure 5; Table S6).
In contrast, the expression of the PMTub gene was not affected when placed under the control of the constitutive pBlaF* promoter in the isogenic complemented mutant. These results confirmed that the transient upregulation of the PMTub gene during the colonization of the host cell is specific and promoter-dependent. Whether this early increase in PMTub gene expression in the macrophage results in the mannosylation of subdominant glycoproteins that are specifically secreted during infection and essential for macrophage colonization and intracellular survival of virulent Mtb remains to be demonstrated. Unfortunately, evaluating the impact of such increased PMTub expression on the Mtb mannoprotein repertoire expressed in vivo remains highly difficult because of the very low amount of bacterial proteins recoverable from infected macrophages.

Materials and Methods
Bacterial culture-M. tuberculosis (H37Rv) wild-type and rv1002c-complemented ∆rv1002c mutant [4] strains were grown aerobically at 37 °C as pellicles for up to 6 weeks on glycerol-based Sauton medium for large-scale glycoproteomic analyses. For macrophage infection, cells were routinely grown to exponential phase in Middlebrook 7H9 medium supplemented with ADC (Becton Dickinson Microbiology System) and 0.05% Tween-80 at 37 °C under shaking conditions.
Protein extracts-M. tuberculosis culture filtrate (CF) protein extracts were obtained after cell harvesting by filtering the culture medium twice through 0.22-µm membranes and concentrating the filtrate to the protein concentration desired for proteomic analysis, using Vivaspin 5k ultrafiltration devices (Sartorius Stedim Biotech) of appropriate volume. Harvested cell pellets were suspended (1/1; v/v) in lysis buffer (pH 7.4; 50 mM Tris-HCl, 5 mM EDTA, 5 mM DTT, 1% SDS, Sigma P8340 protease inhibitor cocktail) and disrupted by bead beating using 0.1-mm glass beads (3/1; v/v, MerckEurolab, France) at maximum speed for 30 s followed by 1 min cooling on ice (repeated 5 times). The homogenate was centrifuged at 12,000 g for 20 min at 4 °C. The pellet was discarded and the supernatant was recovered as the "cell lysate protein extract". The final protein concentration was estimated with the Bradford Protein Assay Kit (Thermo Scientific) according to the manufacturer's guidelines. Protein quality was checked by SDS-PAGE.
Mass spectrometry analysis-The gel lane was cut into 17 homogeneous slices and treated as described [4]. The peptide mixtures were analyzed by nanoLC-MS/MS using an Ultimate3000 system (Dionex) coupled to an LTQ-Orbitrap Velos mass spectrometer (Thermo Fisher Scientific) operating in positive mode. The peptides were loaded on a 300-µm inner diameter × 5-mm PepMap C18 precolumn (LC Packings, Dionex) at 20 µL/min in 2% acetonitrile, 0.05% trifluoroacetic acid. After desalting for 5 min, peptides were separated online on a 75-µm inner diameter × 15-cm C18 column (packed in-house with Reprosil C18-AQ Pur 3-µm resin, Dr. Maisch; Proxeon Biosystems, Odense, Denmark). Peptides were eluted using a 5%-50% gradient of solvent B over 80 min at a flow rate of 300 nL/min. The LTQ-Orbitrap was operated in data-dependent acquisition mode with the XCalibur software. Survey MS scans were acquired in the Orbitrap over the 300-2000 m/z range with the resolution set to 60,000. The twenty most intense ions per survey scan were selected for collision-induced dissociation (CID) fragmentation, and the resulting fragments were analyzed in the linear ion trap (LTQ). The normalized collision energy was set to 35% and activation times to 10 ms and 150 ms. Dynamic exclusion was employed within 30 s to limit repetitive selection of the same peptide. MS data are available via ProteomeXchange with identifier PXD012018, and the open source "Neutral-Loss finder" software is downloadable from https://github.com/david-bouyssie/neutral-loss-finder/.
Post-analytical bioinformatics data processing
Database search and data validation-The Mascot Daemon software (version 2.6, Matrix Science, London) was used to perform database searches, using the Extract_msn.exe macro provided with Xcalibur (version 2.0 SR2, Thermo Fisher Scientific) to generate peak lists. The following parameters were set for creation of the peak lists: parent ions in the mass range 400-4500, no grouping of MS/MS scans, threshold at 1000.
A peak list was created for each analyzed fraction (i.e., gel slice) and individual Mascot (version 2.6) searches were performed for each fraction. Data were searched against the TubercuList reference database (release 27; 4031 entries). The following list of variable modifications was used: carbamidomethylation of cysteines, propionamidation of cysteines, oxidation of methionine, mono-glycosylation (Hexose) of serine/threonine and di-glycosylation (2 Hexoses) of serine/threonine. Searches were performed using semi-tryptic digestion mode, and the specificity of trypsin was set for cleavage after K or R with two missed trypsin cleavage sites allowed. The mass tolerances in MS and MS/MS were set to 10 ppm and 0.6 Da, respectively, and the instrument setting was specified as "ESI-TRAP". In order to calculate the False Discovery Rate (FDR), the search was performed using the "decoy" option in Mascot. Protein groups were validated based on the Mascot MudPIT score to obtain an FDR of 1% at the protein level: FDR = number of validated decoy hits / (number of validated target hits + number of validated decoy hits) × 100. The mass spectrometry proteomics data were deposited to the ProteomeXchange Consortium via the PRIDE [1] partner repository with the dataset identifier PXD012018.
Filtering of glycosylated MS/MS spectra-To isolate MS/MS spectra resulting from the fragmentation of glycosylated peptides, peak lists (i.e., MGF files) were filtered using a dedicated algorithm searching for consecutive neutral losses of hexoses. The script, written in the Perl language, is executed on each MGF file. The algorithm starts by filtering the top N peaks (N = 30 by default, and minimum relative intensity >10%) of a given MS/MS spectrum. Then, knowing the precursor ion m/z value and its charge (z), it computes the expected m/z value corresponding to the neutral loss of a given hexose: NLpeak = precursor_mz − (hexose_mass / z).
A neutral loss is considered as detected if the difference between the observed m/z value of the considered peak and the expected m/z value for this neutral loss is lower than a given tolerance (500 ppm in this study). If a peak is matched in the top N list, the algorithm then iteratively searches for further consecutive neutral losses of the same hexose until no further loss is observed or until the maximum number of losses is reached (8 in this study). Finally, information about the putative glycosylated peptides is reported in a tabulated text file containing the precursor m/z and charge state, the spectrum title and the list of detected neutral losses. The neutral-loss finding software can be downloaded from https://github.com/david-bouyssie/neutral-loss-finder/.
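The filtering procedure described above can be sketched in Python as follows. This is an illustrative re-expression under stated assumptions, not the authors' Perl script: the function and parameter names (`top_peaks`, `find_hexose_losses`, `tol_ppm`, `max_losses`) are hypothetical, the hexose residue mass (162.0528 Da) is the standard monoisotopic value rather than a number taken from the text, relative intensity is assumed to be measured against the most intense peak, and the intensity threshold is assumed to be applied before the top-N selection.

```python
# Illustrative sketch of the consecutive hexose neutral-loss search.
# Assumption: a spectrum is a list of (mz, intensity) tuples.

HEXOSE_MASS = 162.0528  # monoisotopic mass of a hexose residue (C6H10O5), in Da


def top_peaks(peaks, top_n=30, min_rel_intensity=0.10):
    """Keep the top_n most intense peaks above the relative-intensity threshold."""
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    kept = [(mz, i) for mz, i in peaks if i / base > min_rel_intensity]
    kept.sort(key=lambda p: p[1], reverse=True)
    return kept[:top_n]


def find_hexose_losses(precursor_mz, charge, peaks,
                       tol_ppm=500.0, max_losses=8,
                       top_n=30, min_rel_intensity=0.10):
    """Return [(n_losses, observed_mz), ...] for consecutive hexose losses."""
    candidates = top_peaks(peaks, top_n, min_rel_intensity)
    matched = []
    for n in range(1, max_losses + 1):
        # Expected m/z after n losses: precursor_mz - n * (hexose_mass / z)
        expected = precursor_mz - n * HEXOSE_MASS / charge
        best = min(candidates, key=lambda p: abs(p[0] - expected), default=None)
        if best is None:
            break
        error_ppm = abs(best[0] - expected) / expected * 1e6
        if error_ppm >= tol_ppm:
            break  # stop at the first loss that is not observed
        matched.append((n, best[0]))
    return matched


if __name__ == "__main__":
    # Hypothetical doubly charged precursor with two hexose losses
    spectrum = [(718.97, 100.0), (637.95, 50.0), (500.0, 5.0), (300.0, 80.0)]
    print(find_hexose_losses(800.0, 2, spectrum))  # [(1, 718.97), (2, 637.95)]
```

A spectrum would be retained as a putative glycopeptide spectrum when the returned list is non-empty; the precursor m/z, charge state, spectrum title and matched losses could then be written out as one row of the tabulated report.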
Infections of macrophages-MH-S cell line macrophage infections were performed as previously described [4]. Briefly, macrophages in RPMI 1640 were infected with either the H37Rv wild-type strain or the rv1002c-complemented ∆rv1002c mutant at a multiplicity of infection (MOI) of 10. Infection was allowed to proceed at 37 °C for 1 h, and the extracellular bacteria were removed by 3 successive washes with fresh medium. After the infection period, cells were harvested for RNA extraction at 0, 5, 8 and 24 h post-infection.
PCR amplification-Total cellular RNA was extracted from either in vitro cultured Mtb or infected MH-S macrophages with the RNeasy Total RNA kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. Reverse transcription was performed using the RevertAid reverse transcription kit from Thermo Fisher. Quantitative PCR was performed in 25-µL reactions using the Maxima SYBR Green kit and the following gene-specific primer pairs (sense and antisense) for the M. tuberculosis pmt rv1002c target gene and the rpoB rv0667 housekeeping reference gene: Rv1002c-RTF: 5′-GTCTATCTGGCCACCTACGCT-3′, Rv1002c-RTR: 5′-GATTCCCAAGGGTGGTAGTTG-3′, Rv0667-RTF: 5′-AGGAACGGCATGTCCTCAAC-3′, Rv0667-RTR: 5′-CGAATCCGGCAAGGTGAT-3′. Reactions were performed on the Applied Biosystems 7500, and the number of amplification rounds and the annealing temperatures were optimized for each primer pair. Analyses were performed using the manufacturer's software.
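The text states only that the qPCR analyses were performed with the instrument manufacturer's software, using rpoB (rv0667) as the reference gene. One common way to derive relative expression from such data is the comparative Ct (2^-ΔΔCt) method. The sketch below assumes that method and an amplification efficiency of 100%, neither of which is confirmed by the text; the function name and the Ct values are hypothetical and serve only to illustrate the arithmetic.

```python
# Illustrative comparative-Ct calculation (assumed method, not stated in the text).

def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Fold change of the target gene versus a calibrator sample (2^-ddCt)."""
    # Normalize the target gene (e.g. rv1002c) to the reference gene (e.g. rpoB)
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    # Compare the sample with the calibrator (e.g. the 0 h time point)
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: sample at 5 h post-infection versus 0 h calibrator
    fold = relative_expression(ct_target=24.0, ct_reference=18.0,
                               ct_target_calibrator=26.0,
                               ct_reference_calibrator=18.0)
    print(fold)  # 4.0
```

With these made-up values, the target gene is normalized to a ΔCt of 6 in the sample and 8 in the calibrator, giving a ΔΔCt of -2 and therefore a four-fold increase.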

Discussion
Since the pioneering works of Schultz et al. [60] and Espitia et al. [5], mycobacterial mannosylated proteins have aroused curiosity because of their potential contribution to the interactions between Mtb and its host [30]. However, definitive evidence of their crucial role in the host-pathogen interaction only came with the demonstration that the protein-O-mannosylating enzyme is essential for Mtb virulence [4]. Because protein O-mannosylation remains the sole post-translational protein glycosylation actually documented in Mtb [4,7-9,51], this finding renewed interest in these potential virulence factors as a source of alternative chemotherapy targets [61] or new immunodominant epitopes for molecular vaccine development [62,63]. Although mannoproteins are of promising interest, the biological functions of the protein glycan chains in Mtb remain elusive [30,64]. Most of our knowledge is inferred from studies dedicated to the immunodominant antigen Apa, an alanine- and proline-rich mannoprotein specifically secreted by live bacilli [26,65-67]. Several of Apa's major biological properties have been clearly ascribed to the presence of the mannosidic appendages. The mannosyl substituents are responsible for the binding of Apa to the host immune system C-type lectins DC-SIGN and SP-A, thereby contributing directly to the invasion and colonization of the host cell by the pathogen, as has since been reported for several other mycobacterial mannoproteins [68,69]. In addition, changes in the mannosylation pattern of the M. bovis BCG Apa have been shown to alter its ability to stimulate the CD4+ and CD8+ T-lymphocyte responses involved in the protective properties of the BCG vaccine against tuberculosis [70-72]. Alternatively, O-mannosylation has been suggested to modulate the subcellular localization and the antigenic processing of the Mtb LpqH lipoprotein (19 kDa) antigen [39,41,73].
In the case of the lipoglycan transporter lipoprotein LprG, conclusions are less straightforward to draw, since the glycan decorations were found to be dispensable for its transport function in Mtb [40], whereas they are essential for the MHC II-restricted activation of T lymphocytes from lepromatous patients [74].
However, in most cases, the role of the mannosyl decorations remains poorly understood because of the scarcity of the precise structural characterizations needed to determine structure-function relationships accurately. One reason for this is that, while it is easy to demonstrate the presence of mannose residues on a protein using specific lectins, it is much more challenging to localize the glycosyl substituents on the protein backbone. Such structural details are nevertheless essential to decipher the role of these appendages in protein function. Thus, to gain further structural insights into the O-mannosylation profiles of the Mtb proteins, we set up an unbiased strategy that combines a large-scale MS-based proteomic approach with an original search for multi-parametric signatures for the consistent characterization of glycopeptides from the LC-MS/MS data. The decision pipeline combines (i) the detection in the MS2 spectra of fragment ions resulting from the neutral loss of hexose residues from the parent peptides, (ii) the presence of several MS2 spectra corresponding to different glycosylated forms of the peptides and (iii) the consistent chromatographic behavior of the glycoforms.
Although we cannot rule out the improper exclusion of less convincing spectra that did not meet one of these criteria (false negative spectra), this stringent filtering nevertheless allowed the narrow selection of about a hundred of the most reliable glycopeptide candidates present in the protein extracts of the Mtb WT and the PMTub-overexpressing strains. Among the 40 Mtb mannoproteins that were identified with high confidence by their glycopeptides, 14 correspond to previously formally characterized mannoproteins, nine correspond to predicted ones (from ConA binding experiments) that are structurally confirmed herein, and 17 correspond to totally novel Mtb mannoproteins. Most of these (28/40) have no known function and are still considered as hypothetical or at best as "probable proteins". However, at least one third of these proteins are assumed to be essential for growth and/or for animal infection [75-77] (Figure 6). Here, we provide evidence of mannosylation for the first time for at least six of these "essential" proteins involved in the host-pathogen interaction: Rv0227c (involved in the invasion and infection of Mtb target cells [78]), the HtrA-like serine protease Rv1223 [79], the Sec-independent protein translocase TatA (Rv2094c [80]), the glutamine synthetase GlnA1 involved in the maturation and activation of DCs (Rv2220 [81]) and the disulfide oxidase DsbA-like enzyme Rv2969c involved in the formation of the protein disulfide bonds essential for protein folding [82].
Figure 6 (caption, partially recovered). The secreted mannoproteins detected in the culture filtrate (CF) are more likely to be involved in the interaction with the host immune response effectors, while those more tightly associated with the envelope and detected in the cell lysate (CL) are more likely to contribute to bacterial fitness and to the adaptive response to the host microbicidal processes. The stars highlight the detection in the culture filtrate of glycopeptides arising from mannoproteins annotated as membrane proteins. Reported phosphorylated proteins are labeled with a blue diamond, and with a filled blue diamond when the identified glycopeptide matches a phosphopeptide reported in the literature.
In addition, it is worth mentioning that some of the glycosylated proteins identified have also been reported to be phosphorylated on serine or threonine residues. Interestingly, comparison of our glycopeptide repertoire with the Mtb phosphopeptidomes reported in the literature [83-86] reveals for the first time that some peptides can be both phosphorylated and glycosylated (Figure 6). The available data do not permit us to determine whether the two PTMs may be present simultaneously on the protein or are instead mutually exclusive, as in the case of the glycolipoprotein LpqH (Rv3763) [41]. Nevertheless, this finding raises the intriguing question of the existence, in Mtb, of a functional interplay between protein mannosylation and phosphorylation analogous to the well-described regulatory site-occupancy competition between phosphorylation and O-GlcNAcylation in eukaryotes.
In addition, we found that the expression level of PMTub can alter the repertoire of mannoproteins expressed by Mtb. Indeed, in the strain overexpressing the PMTub gene, we observed a notable increase in the number of different glycopeptides detected, including glycopeptides derived from proteins never previously suspected of being glycosylated. Taken together with the transient intracellular increase in PMTub gene transcription detected in infected macrophages, these results strongly suggest that the mannoprotein repertoire may undergo an adaptive regulation within the host cell. In light of the reported PMTub overexpression in the highly pathogenic W. Beijing strain [53], one can readily suspect that, during infection, PMTub may mannosylate some specific but still unidentified secreted proteins able to alter the host-cell microbicidal response and to contribute to Mtb intracellular survival and multiplication.
In conclusion, although certainly not exhaustive, the present inventory of the Mtb mannoproteins, based on MS evidence of glycosylation, raises important new questions and constitutes an essential starting point to uncover the complexity of the multiple roles and contributions of protein-O-mannosylation in Mtb virulence. In addition, even though the contribution of mannosyl residues to the activities and functionalities of the Mtb mannoproteins is still poorly understood and needs to be explored on a case-by-case basis [40] (protection against proteolysis? folding? addressing?), the essentiality of this widespread PTM for Mtb virulence makes this process an ideal alternative target for the development of antagonists of high therapeutic potential. Indeed, the Mtb PMTub does not overlap with any of the current anti-tuberculous drug targets, and its chemical inhibition would deprive Mtb of several potential virulence factors while limiting the emergence of compensatory resistance by spreading the selective pressure over the multiple essential mannoproteins contributing to Mtb pathogenicity.
Supplementary Materials: Table S1: List of the proteins detected in the culture filtrate of the Mtb H37Rv, Table S2: List of the 223 Mtb culture filtrate putative glycopeptides, Table S3: List of the putative glycopeptides detected in the culture filtrate of the Mtb Rv1002c KO Mutant, Table S4: Lists of the validated proteins and glycopeptides detected in the culture filtrate of the Mtb strain overexpressing the PMTub, Table S5: Relative expression of Rv1002c in the Mtb strain overexpressing the PMTub gene, Table S6: qPCR raw data of the relative expression of the PMT Rv1002c gene during infection of MH-S macrophages.