Multiplatform Metabolomics Characterization Reveals Novel Metabolites and Phospholipid Compositional Rules of Haemophilus influenzae Rd KW20

Haemophilus influenzae is a gram-negative bacterium of relevant clinical interest. H. influenzae Rd KW20 was the first organism to be sequenced and for which a genome-scale metabolic model (GEM) was developed. However, current H. influenzae GEMs are unable to capture several aspects of metabolome nature related to metabolite pools. To directly and comprehensively characterize the endometabolome of H. influenzae Rd KW20, we performed a multiplatform MS-based metabolomics approach combining LC-MS, GC-MS and CE-MS. We obtained direct evidence of 15–20% of the endometabolome present in current H. influenzae GEMs and showed that polar metabolite pools are interconnected through correlating metabolite islands. Notably, we obtained high-quality evidence of 18 metabolites not previously included in H. influenzae GEMs, including the antimicrobial metabolite cyclo(Leu-Pro). Additionally, we comprehensively characterized and evaluated the quantitative composition of the phospholipidome of H. influenzae, revealing that the fatty acyl chain composition is largely independent of the lipid class, as well as that the probability distribution of phospholipids is mostly related to the conditional probability distribution of individual acyl chains. This finding enabled us to provide a rationale for the observed phospholipid profiles and estimate the abundance of low-level species, permitting the expansion of the phospholipidome characterization through predictive probabilistic modelling.


Introduction
Haemophilus influenzae is a gram-negative, pleomorphic bacillus that belongs to the Gammaproteobacteria class. Attending to the presence and type of its polysaccharide capsule, this species is classified into serotypes a-f and unencapsulated serotypes, which are also referred to as non-typeable H. influenzae (NTHi). All serotypes are capable of infecting humans and establishing disease, with serotype b (Hib) being the most virulent [1,2]. H. influenzae and, in particular, Hib cause a wide spectrum of clinical manifestations, especially in children. These complications range from otitis media and bronchitis to more severe invasive conditions, such as pneumonia, meningitis, and sepsis [3]. Conjugate vaccines against Hib have significantly reduced the mortality rates attributable to H. influenzae in the developed world. According to a 2015 estimate [4], Hib is responsible for an annual unsaturations mostly correspond to the presence of Z double bonds and cyclopropane rings within fatty acyl chains [40,41]. However, the presence of low-level fatty acyl chains and the degree of lipid chain asymmetry between sn-1 and sn-2 positions of GPs have not yet been qualitatively or quantitatively addressed in H. influenzae.
In recent years, several 'lipid atlases' of distinct biological matrices have been elucidated [42]. These atlases aimed to accurately describe and quantify the lipidome found in specimens of biological origin. However, as lipid atlases are inferred from LC-MS data, they often face limitations related to sensitivity that challenge the acquisition of information from the 'deep lipidome'. These limitations include the inability to match low-quality MS/MS spectra or determine lipid species whose abundance is under the limit of detection.
In this study, we performed a multiplatform mass-spectrometry approach that enabled us to qualitatively and quantitatively characterize the endometabolome and phospholipidome of H. influenzae. We identified 18 metabolite species and GPs not previously described in the reference metabolic model and elucidated the chemical structure of GPs present in H. influenzae to the species level. Notably, the evaluation of the phospholipidome compositional profile allowed us to develop a mathematical rationale that could be used to explain the reasons that some phospholipids predicted to occur were not observed in our lipidomic analysis and permitted the expansion of the phospholipidome by predicting the abundance order of magnitude of these low-level species.

Global Compositional Properties of the Experimentally Determined Metabolome of Haemophilus influenzae
A total of 257 chemical species with a wide variety of functional groups and polarities were annotated with varying degrees of confidence ( Figure 1a). These species included 96 polar metabolites (Table S1, File S1), lipids encompassed in 64 sum compositions (with evidence for the presence of at least 118 lipids, Tables S1 and S2), and 47 linear peptides with unknown functions (Figure 1a, Table S1). Remarkably, most polar metabolites were identified with an L1 level of confidence in the annotation, except for most peptides, which had L3 annotations ( Figure 1a, Table S1). Interestingly, lipid annotations showed a high level of confidence, being predominantly L2 (Figure 1a). Taken together, these annotations accounted for 15-20% of the small molecule compendium of H. influenzae, as captured in current GEMs (Figure 1b, Table S3). Although our study provided a modest metabolome coverage compared to the global metabolome present in current H. influenzae GEMs, these values were consistent with the current state-of-the art coverage of single resources in untargeted metabolomics [43]. Systematic chemical ontology analysis revealed the presence of several metabolite classes in the experimentally determined metabolome, with notable representation of peptides, lipids, and carbohydrates ( Figure 1c, Table S1) attributable to the combination of techniques used in this analytical approach. Correlation analysis of metabolite pools showed the presence of 13 self-correlating metabolite islands among polar metabolites, excluding peptides with unknown function (A-M, Figure 1d, Table S1). These self-correlating islands exhibited partial intercorrelation ( Figure 1d, Table S4). Very high correlation coefficients were observed for expected metabolite pairs with well-known biochemical associations, such as valine-isoleucine, glutathione disulfide-S-adenosylmethionine, or the branched-chain keto acid dehydrogenase products 3-methyl-2-oxovalerate-2-ketoisocaproate (Table S4), which demonstrate the usefulness of the correlation analysis employed. Additionally, very high self-correlations between peptides and correlations between specific lipid sum compositions and polar metabolites were observed (Table S5). correlations between specific lipid sum compositions and polar metabolites were observed (Table S5). metabolome coverage achieved via our multiplatform MS-based metabolomics approach and novel polar metabolites are annotated, both with regard to the filtered small molecule datasets obtained from iCS400 and iNL638 GEMs (each lipid type was considered to be a representative example, including the novel phospholipid subclass, LPE); (c) chemical ontology classification of annotated polar metabolites, excluding linear peptides of unknown function; (d) correlation heatmap of abundances in polar metabolites, not including linear peptides of unknown function. Self-correlating metabolite islands are denoted by letters. Intercorrelating islands are denoted by the product of letters. Asterisks (*) in the correlation heatmap specify that only part of the self-correlating island was involved in the intercorrelation.
To provide insights into the topology of H. influenzae metabolic pathways, we performed community clustering of the iNL638 periplasmic and cytoplasmic metabolite-reaction subnetwork (File S2). The clustering algorithm classified the corresponding network entities in 53 highly interconnected clusters of varying sizes and significantly enriched in different KEGG pathways (Figure 2, Tables 1 and S6). Among these examples, several highly sized clusters (clusters 1, 7, 13, 19, 22, Figure 2, Table S6) encompassed multiple metabolic pathways (Tables 1 and S6). Over-representation of network clusters in our experimentally determined metabolome was heterogeneous among clusters, with higher enrichment values found in clusters associated with general central carbon metabolism (cluster 1, Table 1), carbohydrate metabolism and transport (clusters 1, 3, 28, and 35, Table  1), amino acid metabolism and transport (clusters 1,7,9,12,15,16,18,19,22,23,30,34,49,50, and 53, Table 1), nucleotide metabolism (clusters 7, 10, 11, 27, 44, and 45, Table 1), and choline transport and incorporation into lipo-oligosaccharide (cluster 48, Table 1). The (b) metabolome coverage achieved via our multiplatform MS-based metabolomics approach and novel polar metabolites are annotated, both with regard to the filtered small molecule datasets obtained from iCS400 and iNL638 GEMs (each lipid type was considered to be a representative example, including the novel phospholipid subclass, LPE); (c) chemical ontology classification of annotated polar metabolites, excluding linear peptides of unknown function; (d) correlation heatmap of abundances in polar metabolites, not including linear peptides of unknown function. Self-correlating metabolite islands are denoted by letters. Intercorrelating islands are denoted by the product of letters. Asterisks (*) in the correlation heatmap specify that only part of the self-correlating island was involved in the intercorrelation.
To provide insights into the topology of H. influenzae metabolic pathways, we performed community clustering of the iNL638 periplasmic and cytoplasmic metabolitereaction subnetwork (File S2). The clustering algorithm classified the corresponding network entities in 53 highly interconnected clusters of varying sizes and significantly enriched in different KEGG pathways (Figure 2, Tables 1 and S6). Among these examples, several highly sized clusters (clusters 1, 7, 13, 19, 22, Figure 2, Table S6) encompassed multiple metabolic pathways (Tables 1 and S6). Over-representation of network clusters in our experimentally determined metabolome was heterogeneous among clusters, with higher enrichment values found in clusters associated with general central carbon metabolism (cluster 1, Table 1), carbohydrate metabolism and transport (clusters 1, 3, 28, and 35, Table 1), amino acid metabolism and transport (clusters 1,7,9,12,15,16,18,19,22,23,30,34,49,50, and 53, Table 1), nucleotide metabolism (clusters 7, 10, 11, 27, 44, and 45, Table 1), and choline transport and incorporation into lipo-oligosaccharide (cluster 48, Table 1). The absence of metabolite mapping in some large-and medium-sized clusters revealed areas of the H. influenzae metabolome that are poorly covered and warrant further metabolomic investigation (e.g., clusters 2, 5, 13, Figure 2, Table S6, File S2) [43]. Despite the high number of annotated lipids (Figure 1a, Tables S1 and S2), we did not observe significant enrichment in network clusters associated with lipid metabolism, as only a few of the species detected were included in previous GEMs and, therefore, could be mapped to the subnetwork metabolism (Table S3). These results highlight the need to update metabolic models that effectively capture the diversity of lipids in H. influenzae, as current GEMs have limited consideration for lipid composition in terms of chemical species [18,19,32]. Significant cluster enrichment of metabolite groups within the identified self-correlating islands was found (Figures 1d and 2, Table S6), suggesting a relationship between high interconnection within clusters and metabolite level correlation, especially in the large-sized cluster 1 (Table 1, Figure 2). Taken together, our results provide novel and global information on the means through which the metabolite pools are inter-related in a significant part of the metabolome of H. influenzae. cluster enrichment of metabolite groups within the identified self-correlating islands was found (Figures 1d and 2, Table S6), suggesting a relationship between high interconnection within clusters and metabolite level correlation, especially in the large-sized cluster 1 (Table 1, Figure 2). Taken together, our results provide novel and global information on the means through which the metabolite pools are inter-related in a significant part of the metabolome of H. influenzae. Figure 2. Topology of identified network clusters in the iNL638 metabolite-reaction subnetwork and mapping of metabolites associated with self-correlating islands. Light green nodes represent reactions; dark grey nodes represent unmapped species nodes and species not associated to correlating islands A-M; species that belong to identified metabolite self-correlation islands are represented according to the color legend. . Topology of identified network clusters in the iNL638 metabolite-reaction subnetwork and mapping of metabolites associated with self-correlating islands. Light green nodes represent reactions; dark grey nodes represent unmapped species nodes and species not associated to correlating islands A-M; species that belong to identified metabolite self-correlation islands are represented according to the color legend.

Experimental Polar Metabolome Characterization Reveals Novel Metabolites of H. influenzae
Metabolome characterization unveiled the presence of 18 polar metabolites not contemplated in iCS400 and iNL638 [18,19] (Tables 2 and S1). Notably, we identified the presence of a proline-based cyclic dipeptide not described previously in H. influenzae, cyclo(Leu-Pro) (Tables 2 and S1, File S1), which was reported in other Gammaproteobacteria, such as Pseudomonas spp. [44,45]. Cyclo(Leu-Pro) possesses significant biological functions, including antimicrobial activity, presumably through inhibition of quorum sensing [46,47] and biofilm formation [48], suppression of pro-inflammatory cytokines (IL-1β, TNF-α, IL-6) [44], and regulation of the composition of the oral microbial consortium [49]. Hence, the production of cyclo(Leu-Pro) by H. influenzae may play a significant regulatory role during H. influenzae-associated infections and commensalism in the oropharyngeal mucosa.
Additionally, we annotated with reliability the metabolite pseudouridine, which is a post-translational modification present in various forms of RNA, including tRNA, rRNA, and mRNA [50]. We found gene annotation evidence for pseudouridine biosynthesis via tRNA pseudouridine synthases A to D (UniProt IDs: P45291, P45142, P44197, and P44039, respectively), as well as a dual-specificity RNA pseudouridine synthase (UniProt ID: P44782). These results suggest that the observed pseudouridine pool is generated from RNA degradation, despite this compound potentially being a dead-end product in H. influenzae, as it lacks genes known to encode for enzymes, mediating its phosphorylation and the subsequent cleavage of pseudouridine-5 -phosphate into ribose-5 -phosphate and uracil [51,52]. Table 2. Identified polar metabolites absent in current developed GEMs.

Matching Experimental Evidence
Metabolite Name

L1
HRMS fragmentation spectra and retention times.
Nominal mass fragmentation spectra and NIST retention indices.
Ophtalmic acid Betaine 5'-Methylthioadenosine HRMS: high-resolution mass spectrometry; RMT: relative migration time Moreover, our metabolomic analysis revealed the presence of high abundances of ophthalmic acid in the metabolome of H. influenzae (Tables 2 and S1). Interestingly, ophthalmic acid is a poorly understood metabolite that is canonically generated via glutathione synthethases. However, these synthetases are absent in the H. influenzae genome, as indicated by the glutathione auxotrophy for this bacterium [53]. An additionally identified but poorly understood metabolite was N-methylalanine (Tables 2 and S1, File S1). While this metabolite is known to be biosynthesized from methylamine in other bacteria via N-methylalanine dehydrogenase [54], no ortholog genes putatively encoding for this enzyme are present in H. influenzae. Further studies are needed to ascertain the possible origin and function of ophthalmic acid and N-methylalanine in H. influenzae.
Another metabolite annotated with high confidence absent in current GEMs and not supported by gene annotation evidence was polyamine cadaverine. Intriguingly, the presence of cadaverine was previously determined in H. influenzae via HPLC analysis [55], and lysine decarboxylase activity has been observed for several H. influenzae clinical isolates [56]. We also identified the metabolite 5 -methylthioadenosine, which is involved in polyamine metabolism and supported by the presence of a purine nucleoside phosphorylase and a 5 -methylthioadenosine/S-adenosylhomocysteine nucleosidase (UniProt IDs: P44552 and P45113, respectively) (Tables 2 and S1). Another novel metabolite not supported through gene annotation and identified via metabolomics is γ-aminobutyric acid (GABA). Interestingly, the GABA subproduct succinate semialdehyde was included in iNL638. Thus, it may be of interest to ascertain if GABA can be biosynthesized by H. influenzae, despite no known orthologs putatively encoding for enzymes mediating its biosynthesis from glutamate or spermidine being found in its genome. We also detected significant abundances of the metabolite glycine betaine, which is a well-known osmoprotectant (Tables 2 and S1), despite the absence of enzymes known to biosynthesize betaine in H. influenzae. Therefore, betaine may act as an additional osmoprotectant for H. influenzae, which was presumably acquired from the bacterial environment absorbed via the high affinity choline transporter BetT [57].
We also identified the existence of free thymine and cytosine pools not described in current GEMs that may arise from the breakdown of other nucleosides and the incorporation of carbohydrate moieties [58] (Tables 2 and S1, File S1). We also identified two metabolite pools consisting of polyols (erythritol, ribitol, xylitol, sorbitol) and carbohydrates (trehalose, mannose, sucrose) (Tables 2 and S1). These results were unexpected since H. influenzae is not capable of fermenting such compounds [59,60] and, therefore, they may represent side reactions or simple absorption and accumulation of these compounds from the media. Lastly, we annotated with high confidence the metabolite threonic acid, which was previosuly described as a degradation product from carbohydrate-like compounds [61]. Taken together, our results suggest modest differences in metabolic pathway topology than were reported in previous GEMs, which are worth for further exploration and allowed us to find unexpected metabolome constituents, which, in turn, could not be inferred solely from current gene annotation evidence.

Global Compositional Properties of the Phospholipidome of H. influenzae
Using our lipidomics workflow and downstream extensive data curation, we determined that the phospholipidome of H. influenzae is composed of at least eight different lipid series, according to their polar head and unsaturation number (Figure 3a, Table S1). These lipid series corresponded to multiple GPs encompassed in 30-sum compositions of phosphatidylethanolamines (PEs), 25-sum compositions of phosphatidylglycerols (PG), and 7-sum compositions of lyso-phosphatidylethanolamines (LPE) (Tables S1 and S2). We observed that PEs accounted for the majority (~75%) of the phospholipid content, whilẽ 20% of the total phospholipid content was attributable to PGs ( Figure 3b, Table S7). These results were consistent with previous TLC studies [32] and proved the validity of our analysis. Interestingly, we identified LPE as a novel lipid subclass not previously described in H. influenzae, and it contributed to a very small fraction of the phospholipid content (Figure 3a,b, Table S7). Consistent with the literature, no other major lipid classes were identified in the phospholipid content of H. influenzae, demonstrating that this organism possesses one of the simplest phospholipidomes among gram-negative bacteria. However, lyso-phosphatidylglycerols and other phospholipids, such as phosphatidylserines, phos-phatidylglycerophosphates, phosphatidic acids, and their lyso-forms, may exist at pools with abundances below our instrumental limit of detection (Table S3) [18,32]. Regarding constituents of the lipidome other than phospholipids, we confidently annotated MG(18:0/0:0) and MG(0:0/18:0), as well as heptanoic acid (Table S1).
Evaluation of the unsaturation number distribution revealed that diunsaturated diacyl-GPs were minor components (Figure 3c), whereas the quantitative distribution of phospholipid-related sum compositions revealed that the carbon distribution in H. influenzae is qualitatively complex, quantitatively spanning at least~4 orders of magnitude ( Figure 3d,e, Table S7). Despite the relatively high number of sum compositions found for some lipid series (Figure 3e), it was determined that diacyl-GPs bearing 30 and 32 carbon units accounted for~80% of the total phospholipidome content (Figure 3d). In particular, PE(28:0), PE(30:0), PE(30:1), and PE(32:1) exhibited very high abundances (Figure 3e,  Table S7), which aligns with the total fatty acid composition determined via TLC, followed by GC-MS analysis of hydrolyzed phospholipid extracts [36]. However, the very high levels of PG(32:1) were not consistent with the fatty acyl compositions estimated for PGs, as no unsaturated acyl chains were reported [36] (Figure 3e). Notably, we observed a strong positive correlation between PE and PG logarithms of abundances in both saturated and monounsaturated phospholipids (Figure 3f, Table S1), which indicated that the processes governing the acyl chain distribution of PE and PG species in terms of order of magnitude are largely independent of the polar head in H. influenzae. ure 3a,b, Table S7). Consistent with the literature, no other major lipid classes were iden-tified in the phospholipid content of H. influenzae, demonstrating that this organism possesses one of the simplest phospholipidomes among gram-negative bacteria. However, lyso-phosphatidylglycerols and other phospholipids, such as phosphatidylserines, phosphatidylglycerophosphates, phosphatidic acids, and their lyso-forms, may exist at pools with abundances below our instrumental limit of detection (Table S3) [18,32]. Regarding constituents of the lipidome other than phospholipids, we confidently annotated MG(18:0/0:0) and MG(0:0/18:0), as well as heptanoic acid (Table S1).

Fragmentation Analysis of Phospholipid Coelutions Allow for a Detailed Estimation of the Fatty Acyl Composition in the H. influenzae Lipidome and Positional Information of Fatty Acyl Chains in Diacylglycerophospholipids
MS/MS analysis of molecular species coeluting with the same sum compositions allowed us to obtain the compositional information in fatty acyl chain content for~96.9%mol of the detected phospholipidome (100%,~99.5%, and~95.2% of the %mol attributable to LPEs, PGs, and PEs, respectively, as shown in Tables S1 and S2), as fatty acyl information of coelutions corresponding to low-abundant sum-compositions could not be obtained (Table S1). Within this subset of the total phospholipid content, spectral evidence of 61 PEs and 29 PGs was found ( Figure 4a, Table S2). Notably, it was found that the most abundant PEs and PGs contained C14:0, C16:0, C16:1, and C18:0 fatty acyl chains, as evidenced based on the estimated proportion of PE-and PG-bound fatty acids. In terms of consistency, only C14:0, C16:0, C16:1, and C18:0 were previously reported as phospholipid-bound fatty acyl residues in H. influenzae [36]. Here, our findings revealed the existence of a wide panel of low-abundant fatty acyl chains ranging from C10 to C20 and bearing a maximum of one unsaturation (Figure 4b), with the exception of C11:1, C19:0, and C19:1, suggesting that limitations in methodological sensitivity did not allow the detection of very low-level species in previous studies [36]. Interestingly, MS/MS fragmentation analysis of distant PE(32:2) peaks suggested a mixed profile of two diunsaturated PE types, as identical fatty acyl chains were observed at different RTs ( Figure S1). We propose that this effect can be attributed to the coexistence of a mixed population of PE(X:2) bearing fatty acyl chains with unsaturated Z bonds and cyclopropane rings, despite conventional MS/MS fragmentation analysis being unable to distinguish between these isomers ( Figure S1). Although diacyl-GP regioisomers were not separated via the analysis, we estimated the abundance-corrected carbon and unsaturation number of phospholipids by assuming that only the predominant regioisomer was present, an assumption that was in agreement with bacterial biological constraints for saturated and monounsaturated diacyl-GPs [40,41]. Under this assumption, it was observed that the sn-2 positions of saturated PEs and PGs bear notably shorter fatty acyl chains than those bound to the sn-1 position, while diunsaturated PEs and PGs exhibited the opposite effect (Figure 4c). Unsaturated fatty acyl chains were found to be preferentially bound to the sn-2 position of monounsaturated PEs and PGs (Figure 4d), which is consistent with the literature [40,41]. Notably, it was observed that these differences in fatty acyl chain binding to the sn-1 and sn-2 positions were principally attributable to the notable enrichment of C16:0 at the sn-1 position, as well as C16:1 at the sn-2 position (Figure 4e).  Through the examination of the associations between the different constituents of the phospholipidome and their distribution patterns, we observed that highly abundant LPEs were associated with highly abundant PE sum compositions, as these elements encom-

Probabilistic Rules Are a Major Factor Defining the Order of Magnitude of Lipid Species, Allowing the Prediction of Low-Level Species, Expansion and Refinement of the Phospholipidome
Through the examination of the associations between the different constituents of the phospholipidome and their distribution patterns, we observed that highly abundant LPEs were associated with highly abundant PE sum compositions, as these elements encompassed PE chain isomers bearing fatty acyl chains that were also found in highly abundant LPEs (Figure 3e). After formally testing of this relationship, we found high correlation coefficients between the logarithms of experimentally determined abundance fractions, which corresponded to sum compositions in PE(X:0) and PE(X:1) series, and the logarithms of the predicted abundance fractions obtained via summing the pairwise and compatible abundance products of LPEs (Figure 5a, Table S8). Although the predictability of PE(X:2) sum composition abundances determined via this approach was modest (Figure 5a), these results suggest that LPEs are generated via PE hydrolysis based on the putative action of phospholipases without any significant overall preference for fatty acyl chains. Furthermore, this result suggests that the probability distributions of PE(X:0) and PE(X:1) series could be inferred from LPE abundances in terms of order of magnitude ( Figure 5a). However, structural elucidation limited the inference of the probability distribution of LPEs from experimental PE abundance data using the opposite approach, as multiple PE chain isomers bearing fatty acyl chains not detected in LPE series coeluted and had equivalent PE sum compositions (Tables S1 and S3).
Based on the relationships between monoacyl-and diacyl-glycerophospholipids and between the observed experimental lipid chain isomer distribution (Figure 4a) and the total phospholipid-bound fatty acyl content (Figure 4b), we hypothesized that the distribution profile of PEs and PGs arises from the conditional probabilities associated with the unique probabilities of observing a fatty acyl chain in a given phospholipid. Inference of these probabilities from experimental data ( Table 3) allowed us to determine very high correlation coefficients between the experimental and predicted logarithms of the abundances corresponding to saturated PE and PG species, suggesting that positional independence largely dominates the PE(X:0) and PG(X:0) profiles (Figure 5b, Table S9). In addition, the utility of the predictions was further confirmed through adequate estimation of the logarithms of abundances corresponding to sum compositions that were not fragmented and, thus, not used for the calculation of probabilities (Figure 5c). Interestingly, most of the undetected X:0 sum compositions that were predicted fell below the limit of detection ( Figure 5b, Table S9). Thus, the model successfully predicted the order of magnitude of species that were undetectable via our MS method. In particular, we predicted the existence of 31 additional PE(X:0)s and 43 additional PG(X:0)s (Table S9). Analysis of the predictability of lipid chain isomer abundances in monounsaturated PEs and PGs revealed weaker correlation coefficients (~0.84 and~0.62, respectively) ( Figure 5b). Several possible explanations for this effect exist, including the following: (i) the calculation of fatty acyl chain probabilities in PX(Y:1) series was made via substitution taking scaled PX(Y:0) fatty acyl chain probabilities, as these figures could not be directly inferred from experimental data in PX(Y:1) due to the intrinsic nature of unsaturated diacyl-GPs; (ii) the partly resolved chromatographically species found in PX(Y:1) series may limit the accurate estimation of abundances for species within a definite sum composition; (iii) lipid chain isomers in PX(Y:1) bearing Z bonds or cyclopropane rings could follow distinct distribution probabilities; and (iv) PGs had significantly lower abundances, while given the high correlation found between PE-and PG-associated abundances (Figure 3f), individual fatty acyl chain probabilities calculated for PEs were used to predict PG abundances when these probabilities were unsolvable within PG data. This issue significantly constrained the number of fatty acyl chain probabilities available for PG(X:1) and, therefore, may restrict the predictability of the PG(X:1) series. Nonetheless, due to the relatively high correlation coefficient found between experimental and predicted PE(X:1) chain isomer abundance data (Figure 5b), we could predict the existence of 61 additional PE(X:1) chain isomers (Table S9). As expected (Figures 3f and 5a), we observed low predictive capabilities for the compositional profile of lipids belonging to the PE(X:2) series and insufficient data to evaluate the abundance predictability of PG(X:2) series (Table S9). The modest results observed when evaluating the PE(X:2) series (Table S9) may be attributed to the following potential causes: (i) the low number of detected PE(X:2) sum compositions, (ii) the low overall abundance for PE(X:2) sum compositions, and (iii) the putative presence of a mixed population of PE(X:2) with different compositional behaviors due to the presence of Z bonds and cyclopropane rings in fatty acyl chains (Figures 1A and S1). In sum, we considered the predictability of the compositional profile of PE(X:0), PG(X:0) and PE(X:1) chain isomers to be good enough to expand the number of diacyl-GP chain isomers by more than two-fold (135 predicted lipid chain isomers, in addition to the 90 chain isomers that were experimentally determined, as shown in Tables S2 and S9). Nevertheless, we suggest that deep lipidomics methodologies and computational optimization of probabilities may be employed to evaluate the positional independence and compositional properties of PG(X:1), PE(X:2), and PG(X:2) to generate more refined probabilistic models. Table 3. Estimated fatty acyl chain probabilities for each lipid series. Furthermore, this result suggests that the probability distributions of PE(X:0) and PE(X:1) series could be inferred from LPE abundances in terms of order of magnitude (Figure 5a). However, structural elucidation limited the inference of the probability distribution of LPEs from experimental PE abundance data using the opposite approach, as multiple PE chain isomers bearing fatty acyl chains not detected in LPE series coeluted and had equivalent PE sum compositions (Tables S1 and S3). , and PG(X:1) series. We note that most predicted abundances for sum compositions not measured in the analysis were below the lowest abundance corresponding to a sum composition within a lipid series, while predicted logarithm of abundance values in sum compositions absent in experimental data were additionally used in the y-axis to facilitate visualization. , and PG(X:1) series. We note that most predicted abundances for sum compositions not measured in the analysis were below the lowest abundance corresponding to a sum composition within a lipid series, while predicted logarithm of abundance values in sum compositions absent in experimental data were additionally used in the y-axis to facilitate visualization.

Reagents and Solutions
All solvents and reagents used in this study were of MS-grade quality. The polar extraction solution was 17.17 µM 4-chloro-phenylalanine (internal standard, Sigma-Aldrich, Steinheim, Germany) in Milli-Q water. The lipophilic extraction solution was prepared by adding methyl tert-butyl ether (Sigma-Aldrich, Steinheim, Germany) to MeOH (Thermo Fisher Scientific, Loughborough, UK) up to a 1:1 (v/v) ratio.

Bacterial Strains and Culture Conditions
H. influenzae RdKW20 (NC_000907.1) was cultured via continuous shaking (125 rpm) at 37 • C under microaerophilic conditions (5% CO 2 ) in Haemophilus Test Medium (HTM) broth (Francisco Soria Melguizo S.A., Madrid, Spain). Five biological replicate populations of H. influenzae Rd KW20 were evolved for 10 days (approximately 100 generations, designated as A to E). Every 24 h, the grown cultures were diluted at a 1:1000 ratio (v/v) in fresh media (2 mL HTM). Five biological replicates were generated from each evolution replicate (total n = 25). The bacteria were then incubated overnight under the above-mentioned conditions. Subsequently, the cultures were diluted at a ratio of 1:100 (v/v) with fresh HTM and cultured until OD 600 = 0.6. The bacterial pellets for mass spectrometry analysis were obtained via centrifugation (108× g; T = R.T.; t = 3 min), followed by washing with 800 µL PBS and another round of centrifugation (108× g; T = R.T.; t = 3 min). Next, 200 µL of cold methanol (−20 • C) was added to the pellets, followed by storage at (−80 • C) to minimize metabolism before quenching.

Cellular Lysis and Metabolite Extraction
To enhance metabolite recovery, double extraction was performed. The first polar extract was generated by adding 140 µL of polar extraction solution to the bacterial pellets preserved in cold MeOH (achieving a H 2 O:MeOH 1:1.43 (v/v) solvent mixture). Bacterial lysis was achieved through alternating ultrasonication cycles (20 pulses for 0.5 s each, amplitude = 60%) using a UP 200S ultrasonicator equipped with an S2 probe (Dr Hielscher GmbH, Stahnsdorf, Germany) with immersion in liquid N 2 (t = 50 s). This process was repeated 20 times for each sample. After metabolite extraction on ice (t = 15 min), the samples were centrifuged (12,600× g; T = 4 • C, t = 30 min). The supernatant was aliquoted for further CE-TOF/MS and GC-QTOF/MS analyses. After collecting the supernatant, 200 µL of lipophilic extraction solution was added to the remaining pellet, which was then resuspended through ultrasonication (3 pulses for 0.5 s each, amplitude = 20%). The samples were kept on ice to prevent overheating (t = 1 min), followed by vortexing (T = R.T., t = 30 min), and placed at RT (t = 15 min) for metabolite extraction. The samples were centrifuged (16,300× g, T = R.T., t = 30 min), and 100 µL of the supernatant was transferred to LC/MS vials for LC-QTOF/MS analysis.

LC-QTOF/MS Analysis and Data Processing
MeOH:MTBE extracts were analyzed following an adapted version of the Agilent MassHunter Workstation Lipid Annotator method for lipidomics analysis described by Agilent Technologies [62]. The analysis was performed using an HPLC system (1200 series, Agilent Technologies, Santa Clara, CA, USA) coupled to an Agilent 6545 QTOF/MS analyzer (Agilent Technologies, Santa Clara, CA, USA). The injection volumes were 2 µL and 5 µL for positive and negative electrospray ionization (ESI) modes, respectively. Metabolites were separated via an Agilent InfinityLab Poroshell 120 EC-C18 column (3.0 mm × 100 mm, 2.7 µm; Agilent Technologies, Santa Clara, CA, USA) equipped with an Agilent Infinity-Lab Poroshell 120 EC-C18 guard column (3.0 mm × 5 mm, 2.7 µm; Agilent Technologies, Santa Clara, CA, USA). The column temperature was 50 • C. A mobile phase flow rate of 0.6 mL·min −1 was maintained throughout the chromatographic gradient. Firstly, 70% B was held until 1 min. Secondly, 86% B was achieved at 3.5 min and held until 10 min. Next, 100% B was achieved at 11 min and held until 17 min, followed by 2 min of re-equilibration time at 70% B. The total method runtime was 19 min. Metabolites were ionized using an ESI source with a nebulizer at 50 psi, a drying gas temperature of 200 • C a drying gas flow rate of 10 L·min −1 , a sheath gas temperature of 300 • C, and a sheath gas flow rate of 12 L·min −1 . In both polarity modes (ESI+ and ESI−), the capillary, fragmentor, skimmer, and 18ctupole radiofrequency voltages were set to to 3500, 150, 65, and 750 V, respectively. Firstly, full MS was selected as the data acquisition mode at an acquisition rate of 3.5 spectra·s −1 over a mass range of m/z 50 to 3000. Mass correction was performed using m/z 121.0509 and 922.0098 for ESI+ and m/z 112.9856, 980.0164, and 1033.9881 for ESI-mode. Next, two iterative auto MS/MS analyses (under both polarity modes) were conducted under identical chromatographic conditions. MS/MS spectra were systematically acquired via iterative MS/MS mode over subsequent injections from a sample pool. Isolation width, ramped collision energy, and offset values were set at ≈1.3 Da, 3.8, and 4.6, respectively. Additional iterative MS/MS analyses were performed by increasing the injection volume in both ESI polarities to increase the spectral quality of low-abundant features (injection volumes of 7 µL and 10 µL for ESI+ and ESI-, respectively). Data deconvolution, chromatogram alignment, and compound integration were performed using Agilent MassHunter Workstation Software Profinder (B.10.00, Agilent Technologies, Santa Clara, CA, USA). Kendrick masses were calculated by adjusting for CH 2 , enabling the generation of Kendrick mass diagrams for the systematic search of lipid series. We considered as a lipid series at least three molecular features of similar Kendrick mass defect, which were spaced by 14 mass units. Initial lipid series annotation was performed by merging the manually curated Kendrick Mass output with the automated annotations obtained from processing MS/MS files with Agilent MassHunter Workstation Lipid Annotator (v. 1.0, Agilent Technologies, Santa Clara, CA, USA) (precursor and fragment ppm tolerance = 20). Unannotated molecular features putatively belonging to annotated lipid series were annotated using CEU Mass Mediator Batch search [63]. A subsequent metabolite annotation curation pipeline was performed for annotations obtained via non-targeted metabolomics data reprocessing using the following steps:  3 COO] − for PG series); (iii) annotations of PX(Y:Z) (PX(Y:Z) was a generic GP, and X was defined as a polar head, Y was defined as a sum composition carbon number, and Z was defined as a sum composition unsaturation number) were consistent with a logical elution order (RT PX(Y:0) > RT PX(Y:1) > RT PX(Y:2) at equal C number; RT PG(Y:Z) < RT PE(Y:Z) at equal C and unsaturation numbers; RT PX(Y1:Z) > RT PX(Y2:Z) if Y 1 > Y 2 , at equal unsaturation number); (iv) manual inspection of extracted ion chromatograms for all predicted m/z corresponding to all PG, PE, and LPE lipid series, which contained a range of fatty acyl chains of C10:0-C20:0 and C10:1-C20:1, as no m/z corresponding to fatty acyl chains with C number lower than 10 or higher than 20 were observed in initial MS/MS data curation, and fatty acyl chains contained in GPs were determined to bear a maximum of one unsaturation. The curated set of annotations was used to re-integrate the LC-ESI(-)-QTOF/MS data from the [M−H] − adducts of peaks that corresponded to the identified series LPE(X:0), LPE(X:1), PE(X:0), PE(X:1), PE(X:2), PG(X:0), PG(X:1), and PG(X:2) (X being a defined C number) using Agilent MassHunter Workstation Qualitative Analysis (B. 08.00, Agilent Technologies, Santa Clara, CA, USA) (m/z tolerance = 20 ppm). All annotations with more than one piece of orthogonal information (m/z, RT, MS/MS, adduct profile) were assigned as having an L2 confidence in the annotation. The chromatographic method was unable to resolve lipid chain isomers of saturated and isobaric diacyl-GPs and could only partially resolve lipid chain isomers of unsaturated and isobaric diacyl-GPs. Consequently, the area under the curve (AUC) of diacyl-GP lipid chain isomers and regioisomers (diacyl-GPs isomers bearing identical lipid chains but opposite binding to the sn-1 and sn-2 positions of their glycerol backbone) with identical sum composition that belonged to a specific phospholipid type (PE, PG, LPE) was summed for initial phospholipidome compositional analysis. Next, lipid chain isomers in each phospholipid coelution were determined based on the presence of compatible sums of m/z corresponding to fatty acyl chains in the collected MS/MS spectra. The predominant regioisomer in each lipid chain isomer was determined, as the relative intensity of the m/z generated from the fatty acyl chain bound to the sn-2 position of Pes and PGs was described as being higher with respect to that generated from the fatty acyl chain bound at sn-1 position [64]. LPE regioisomers with identical sum composition were also summed, as it was shown that artifactual interconversion occurs between these species [62,65]. Determination of the relative percentages of fatty acyl chains bound to GPs was performed using MS 1 -reprocessed data corresponding to LC-ESI(-)-QTOF/MS analyses. Firstly, raw phospholipid abundances of PGs and LPEs were normalized based on their relative response factor (Table S10)

GC-QTOF/MS Analysis and Data Processing
Samples were derivatized and analyzed following a protocol adapted from a previously described methodology [67][68][69]. Sample analysis was performed in an Agilent 7890B GC system coupled to an Agilent 7250 accurate mass Q/TOF analyzer (both from Agilent Technologies, Santa Clara, CA, USA). Next, 2 µL of sample was injected into a multimode inlet at 250 • C with a 3:1 split ratio. Separation of metabolites then occurred using a capillary column (Agilent DB-5MS, 30 m × 0.25 mm, 0.25 µm film thickness; 95% dimethyl-5% diphenylpolysiloxane and 10 m guard column; Agilent Technologies, Santa Clara, CA, USA). The carrier gas (He) flow rate was set at 0.85 mL·min −1 . The oven was allowed to stand at 60 • C for 1 min, and the temperature was gradually increased at 10 • C·min −1 until 325 • C, at which point it was held for 10 min. Metabolites were ionized using an electron ionization (EI) source. Full MS was selected as data acquisition mode, which used an acquisition rate of 6.67 spectra·s −1 over a mass range from m/z 45 to 650. D files acquired in profile mode were converted to the SureMass format to use the SureMass deconvolution algorithm from Agilent MassHunter Unknowns Analysis (B.10.00). The GC/MS fatty acid methyl ester described above was used to generate a correlation between Fiehn retention indexes (RI) and RTs. Similarly, the GC/MS n-alkane mix (C8-C40) was used to correlate NIST retention indexes with RTs. A spectral library search was initially performed for deconvoluted features using a combination of Exact Mass FiehnLib, Nominal Mass FiehnLib, and exact-mass in-house libraries of polyols, carbohydrates, and other polar compounds, all of which generated from data of pure standards analyzed via the same GC chromatographic method and using 70 eV in an EI source [67]. Hence, an L1 level of confidence in the annotation was assigned when match factor score > 70% or match factor < 70%, though the annotation was supported by acceptable RT error (RT %Error < 2%) and specific exact mass ions (File S1). An additional metabolite annotation was performed using the NIST Library (score cutoff > 80%), in which annotations were assigned an L2 confidence level if experimental RIs coming from experimental RTs using the n-alkane RI regression (RT %Error < 5%) were consistent with RIs collected in the NIST library. Metabolite annotations were manually curated, and qualifier and quantifier ions were selected based on their relative abundance and selectivity. Compound integration was performed using Agilent MassHunter Workstation Quantitative analysis for TOF (B.10.00, Agilent Technologies, Santa Clara, CA, USA). Additional recursive metabolite annotation over GC-MS data was performed by manually searching for the presence of metabolites contained in the H. influenzae metabolic models iCS400 and iNL638 in both Exact Mass and Nominal Mass FiehnLib. Annotations obtained from iCS400 and iNL638 were given an L1 level of confidence in the annotation when supported by RT tolerance (RT %Error < 2%) and characteristic exact mass ions (File S1). Qualifier and quantifier ions associated with these annotations were reintegrated using Agilent MassHunter Workstation Quantitative Analysis (B. 10.00, Agilent Technologies, Santa Clara, CA, USA).

CE-TOF/MS Analysis and Data Processing
Samples were treated and analyzed following the adaptation of a previously -described method [70]. In brief, 120 µL of H 2 O:MeOH extract was subjected to high vacuum on a HyperVAC VC2124 vacuum concentrator (Gyrozen, Daejeon, Republic of Korea) until complete dryness. Next, 60 µL of CE-TOF/MS sample buffer solution was added to each dried sample, which was vortexed for 1 min. After subsequent centrifugation (12,600× g; T = 4 • C; t = 15 min), the solution was transferred to CE/MS glass vials, which underwent centrifugation (4000× g; T = 4 • C; t = 20 min). Clear solutions were analyzed using a CE-TOF/MS platform composed of an Agilent 7100 CE system coupled to an Agilent 6224 TOF/MS analyzer (both from Agilent Technologies, Santa Clara, CA, USA). Samples were hydrodynamically injected at 50 mbar for 100 s and stacked by injecting background electrolyte at 100 mbar for 20 s. Metabolite separation was performed via a fused-silica capillary (70 cm, 50 µm i.d.; Agilent Technologies, Santa Clara, CA, USA). Separation conditions were +30 kV of capillary voltage (~2.4 mA) under normal polarity at 20 • C.
Metabolites were ionized under ESI+ mode, with a sheath liquid flow of 6 µL·min −1 , a nebulizer pressure of 10 psi, a drying gas temperature of 200 • C, and a flow rate of 10.0 L·min −1 . Voltages for capillary, fragmentor, skimmer, and octopole were set to 5500, 125, 65, and 750 V, respectively. Full MS was selected as data acquisition mode, which was set at an acquisition rate of 1.36 spectra·s −1 over a mass range of m/z 70 to 1000. Mass calibrators m/z 121.0509 and 922.0098 were used for online mass correction. Data deconvolution, electropherogram alignment, and molecular feature integration were performed using Agilent MassHunter Workstation Software Profinder (B.10.00, Agilent Technologies, Santa Clara, CA, USA). Molecular features obtained using non-targeted metabolomics data reprocessing were subjected to initial metabolite annotation using the CE-MS Experimental RMT search from CEU Mass Mediator [71] (m/z tolerance = 20 ppm; RMT tolerance = 5%, except 10% for regions near the EOF or at very low migration times [71]), which uses data from an in-house library of pure standards run under identical method conditions. These annotations were assigned an L1 confidence in the annotation. Annotations of compounds not annotated in GC-MS were further supported based on the presence of characteristic in-source fragments [71]. Subsequently, unannotated molecular features were subjected to additional annotation using the CEU Mass Mediator Batch Search [63]. An L3 confidence in the annotation level was assigned for these annotations if compatible migration times with protonation status shown at separation pH (~2) were determined. Additionally, a recursive metabolite annotation over CE-MS data was performed by manually searching for the presence of metabolites contained in the H. influenzae metabolic models iCS400 and iNL638, thus extracting the predicted m/z from [M+H] + adducts of metabolites using Agilent MassHunter Qualitative Analysis (B.08.00, Agilent Technologies, Santa Clara, CA, USA) (m/z tolerance = 20 ppm). Compatible features were subjected to the aforementioned annotation workflow for CE-TOF/MS and reintegrated using Agilent MassHunter Quantitative Analysis for TOF (B.10.00).

3.7.
Curation and Obtention of the Small Molecule Set of iCS400 and iNL638 Metabolic Models iCS400 [19] and iNL638 [18] H. influenzae GEMs were downloaded in the SMBL format from the BioModels repository (Model Identifiers MODEL1507180053 and MODEL2204040002). We did not use iJE296 [15], as this model was unavailable [18]. Next, metabolite IDs, names, and compartments for each model were extracted using Python (v. 3.10.1) and the COBRApy package (v. 0.26.3) [72], and they were exported into tabular format (File S3). KEGG Compound IDs and exact mass values for each compound were manually retrieved from the KEGG Compound database [73]. Both iCS400 and iNL638 were subjected to variable filtering, as only small molecules with defined mass <1500 Da were considered as 'truly metabolites' (entries corresponding to proteins, tRNAs, oligomers of ≥3 residues, and entries with MW ≥ 1500 Da were filtered out) (Table S3). To avoid biases in metabolite coverage estimation, only one representative example of each lipid type contained in models was retained for further global analysis of models (generic KEGG Compound IDs were assigned to these representative compounds) (Table S3).

Curation and Network Analysis of the Small Molecule Set of iCS400 and iNL638 Metabolic Models
The KEGG Compound IDs for annotated compounds found in GC-MS, LC-MS, and CE-MS were retrieved and curated manually from the KEGG Compound database [73]. Next, the information was crosschecked against the KEGG Compound IDs of the filtered small molecule datasets obtained from iCS400 and iNL638. Additionally, mapping of experimental metabolomics data was performed to a subnetwork of iNL638, which encompassed all periplasmic and cytoplasmic metabolites, as well as their associated reactions. This subnetwork was generated by importing iNL638 as a metabolite-reaction network using the cy3smbl App (v. 0.3.0) [74] in Cytoscape (v. 3.8.0) [75] and subsequently deleting the nodes not corresponding to metabolites or reactions, those corresponding to extracellular metabolites, and those of metabolic subproducts or cofactors with very high connectivities (H 2 O, CO 2 , AMP, ADP, ATP, NAD(P)(H), and H + ) (File S2). The subnetwork was clustered using the GLay community clustering algorithm from the clusterMaker 2 (v. 2.0) Cytoscape App [76]. Functional annotation of clusters was conducted using KEGG Enrichment over network clusters using KOBAS-i, selecting H. influenzae Rd KW20 as the KEGG organism [77] (Table S6). The metabolome coverage of clusters was evaluated by considering only the total number of chemical species present in the filtered small molecule dataset obtained from iNL638, as described above. Over-representation analysis of clusters using the experimental metabolomics data as a query was performed through a Fischer exact test in each cluster using python (v. 3.10.1) and the scipy, seaborn, and math packages (File S4).

Global Evaluation of Metabolite Experimental Data
The chemical ontology of metabolites detected in experimental metabolomics data was determined using the ClassyFire algorithm [78]. The abundances of these metabolites were subjected to correlation analysis using MetaboAnalyst 5.0 [79]. Spearman's rank correlation was performed over experimentally determined metabolites using default values, and correlation clusters with correlation coefficients of >0.6 and <−0.6 were subjected to further manual inspection. Enrichment of subnetwork clusters in features (variables) associated with self-correlating metabolite clusters was performed using python (v. 3.10.1) and the scipy, seaborn, and math packages (File S4).

Calculation of Experimental Lipidomic Properties
Response factor-corrected phospholipid abundances were converted to %mol. The total %mol content of each GP type (PE, PG, LPE) was determined for each measured sample extract. To evaluate the distribution of the total unsaturation number, we calculated the %mol of PX(Y:0), PX(Y:1), and PX(Y:2), respectively, which were summed for each sample extract. Similarly, the carbon number distribution of sum compositions was determined from C10 to C40, regardless of the unsaturation number. To calculate the fatty acyl average content bound to phospholipids, we first estimated the content of each given lipid chain isomer within an isobaric phospholipid coelution as the ratio between the sum of the intensity of the m/z associated with its two acyl chains and the total sum of intensities of combinations of m/z associated with fatty acyl chains compatible with the sum composition in the MS/MS spectra, as reported by Agilent MassHunter Lipid Annotator (v. 1.0, Agilent Technologies, Santa Clara, CA, USA), recognizing that the spectra were generated by combining multiple MS/MS spectra acquired via subsequent QC injections, followed by iterative auto MS/MS data acquisition, as described above for LC-ESI(-)-QTOF/MS. The entries corresponding to diacyl-GPs (Table S2) were selectively duplicated, and each fatty acyl chain was assigned to each of the generated entries. We then estimated the %mol corresponding to each bound fatty acyl chain. Metrics for sn-1 and sn-2 positions of PE and PG series were estimated by assuming that only the predominant regioisomer was present. To quantitatively assess the differences between the numbers of carbon atom and unsaturations, two parameters were calculated for the sn-1 and sn-2 position of each determined diacyl-GP chain isomer. These were named as 'abundance-corrected average number of carbon atoms (Equation (1)), and 'abundance-corrected average unsaturation number' (Equation (2)), as shown in the corresponding equations, where a represents the sum of abundances of lipid chain isomers containing a defined carbon atom number (Equation (1)) or a defined unsaturation number (Equation (2)) (m) from 1 to n, c represents a defined atom carbon number, and u represents a defined unsaturation number.

Calculation of Predictive Lipidomic Properties
We hypothesized that the probability of observing a defined PE or PG with X i and X j chains (p(X i _X j )) in H. influenzae is equal to the conditional probability of a PE or PG having specific acyl chains (p(X i _X j )) = p(X i |X j ) = (p(X i ) · p(X j )), as previously modelized for other biological matrices [80]. For the prediction of probability distributions of PEs from LPE composition, we assumed that p(PE(X i _X j )) = (p(LPE(0:0/X i )) + p(LPE(X i /0:0))) · (p(LPE(0:0/X j )) + p(LPE(X j /0:0))). To generate the predicted sum composition distribution, we then calculated p(PE(X i _X j )) for all possible pairwise combinations of fatty acyl chains contained in detected LPEs and summed all p(PE(X i _X j )) corresponding to lipid species with identical sum compositions (Table S8). For the prediction of PE and PG sum compositions and lipid chain isomers below the limit of detection from experimentally determined PE and PG data, we first considered the calculated p(X i _X j ) for each experimentally determined PE or PG lipid chain isomer. Subsequently, we estimated p(X i ) (X = C10:0 to C20:0) in saturated PE and PG lipid series by first taking p(X i ) as the square root of p(X i _X i ) (for chain isomers with identical fatty acyl chains). Next, p(X i ) that were not determinable via this method were solved via substitution that considered the p(X i _X j ) of the lipid chain isomer of the same lipid series with the possible highest abundance and an already solved p(X j ) other than p(X i ). Given the high correlation observed between PE and PG abundances that corresponded to identical sum compositions and the lower abundance observed for PG(X i _X j ) compared to equivalent PE(X i _X j ), we used p(X i ) values that were solved for PE series as p(X i ) for fatty acyl chains when not determinable in the equivalent PG series. The p(X i ) values were scaled for both saturated PE and PG series, meaning that ∑ n i=m p(X i )= 1. For the prediction of p(X i ) in monounsaturated PE and PG series, we used p(X i ) obtained from saturated species and scaled them such as that ∑ n i=m p(X i , saturated) = 0.5, as the probability of randomly observing a saturated fatty acyl chain in a monounsaturated diacyl-GP was 0.5. Next, the p(X i , unsaturated) of monounsaturated fatty acyl chains were estimated through substitution. Lastly, probabilities from diunsaturated species were estimated using p(X i ) values of monounsaturated PE and PG series, as only few diunsaturated PEs and PGs were observed, which did not allow a direct calculation of p(X i ) from p(X i _X i ) for low-level fatty acyl chains. All of the probabilities associated with each fatty acyl chain in each PE and PG series were used to generate all possible phospholipid chain isomer predicted abundances (Table S9). Predicted abundances of sum compositions were generated by summing all predicted abundances of lipid chain isomers that corresponded to the same sum composition (Table S9).

Conclusions
This study is the result of a combination of experimental findings originating from multiplatform metabolomics based on high-resolution analytical techniques, along with the development and application of fruitful mathematical models based on data and scientific principles. Consequently, we extended the direct evidence of the presence metabolome constituents of H. influenzae. Despite the relatively limited coverage of the metabolites predicted via GEMs, our established workflow for metabolome characterization enabled not only the relative quantitation, but also the identification, of lipid molecular species and polar metabolites not reported to date, of which the potential role of cyclo-(Leu-Pro) in H. influenzae pathogenesis is worth further investigation. We propose that analyzing correlating pools of metabolites could be a valuable resource for third-generation metabolic models that use metabolite-level information. Current metabolic models may be additionally benefit from the inclusion of the phospholipid composition that was elucidated through extensive study of the H. influenzae lipidome. Given the relative simplicity of the H. influenzae phospholipidome and its well-characterized genome, we foresee potential utility of H. influenzae as a model organism for exploring the factors that contribute to emergent membrane properties attributable to changes in phospholipid composition. Importantly, through the elucidation of the processes governing membrane composition in this relatively simple lipidome, we generated the first experimental in silico characterization of the phos-pholipidome. Further studies are needed to address the rules governing the compositional properties of membranes with higher lipid class diversity. Moreover, our study provides a mathematical rationale for the selective observation of specific lipid species, even though other species remain below the current instrumental limits of detection. We present the application to H. influenzae lipidome and beyond, proposing that revealing these processes could lead to more realistic membrane models and help in guiding the deep lipidome characterization of other biological specimens. Funding: This research was funded by the Ministry of Science and Innovation of Spain, grant number PID2021-122490NB-I00/AEI/10.13039/501100011033, and cofunded by the European Regional Development Fund ERDF, "A way of making Europe". The APC was funded by the European Regional Development Fund ERDF, "A way of making Europe", grant number PID2021-122490NB-I00.

Data Availability Statement:
Mass spectrometry data were uploaded to the Metabolomics Workbench Server (Dataset Identifier: ST002763).