Robust Moiety Model Selection Using Mass Spectrometry Measured Isotopologues

Stable isotope resolved metabolomics (SIRM) experiments use stable isotope tracers to provide superior metabolomics datasets for metabolic flux analysis and metabolic modeling. Since assumptions of model correctness can seriously compromise interpretation of metabolic flux results, we have developed a metabolic modeling software package specifically designed for moiety model comparison and selection based on the metabolomics data provided. Here, we tested the effectiveness of model selection with two time-series mass spectrometry (MS) isotopologue datasets for uridine diphosphate N-acetyl-d-glucosamine (UDP-GlcNAc) generated from different platforms utilizing direct infusion nanoelectrospray and liquid chromatography. Analysis results demonstrate the robustness of our model selection methods by the successful selection of the optimal model from over 40 models provided. Moreover, the effects of specific optimization methods, degree of optimization, selection criteria, and specific objective functions on model selection are illustrated. Overall, these results indicate that over-optimization can lead to model selection failure, but combining multiple datasets can help control this overfitting effect. The implication is that SIRM datasets in public repositories of reasonable quality can be combined with newly acquired datasets to improve model selection. Furthermore, curation efforts of public metabolomics repositories to maintain high data quality could have a huge impact on future metabolic modeling efforts.


Introduction
While the first observations of metabolic alterations in cancer were made about a century ago [1], metabolomics is a relatively new field of 'omics' technology aiming to systematically characterize metabolites being created and/or utilized in cells, tissues, organisms, and ecosystems [2]. This combined consumption and biosynthesis of metabolites can be represented as flux through specific metabolic paths within cellular metabolism, reflecting specific physiological and pathological states in biomedically useful detail and in ways that are distinct and often more sensitive than other omics methods. It is increasingly recognized that metabolomics biomarkers have great utility in characterizing and monitoring diseases with significant metabolic reprogramming like cancer [3]. Therefore, better regulatory understanding of specific metabolic flux phenotypes of metabolic diseases will aid in developing new therapeutic strategies.
Stable isotope resolved metabolomics (SIRM) experiments utilize stable isotopes from a labeling source to isotopically enrich detected metabolite analytical features, providing more complex but data-rich metabolomics datasets for metabolic flux analysis. Advances in mass spectrometry (MS) and nuclear magnetic resonance spectroscopy (NMR) greatly contribute to the generation of high-quality SIRM datasets [4]. However, computational methods are required to gain biologically meaningful interpretation from such complex datasets, especially in terms of metabolic flux through specific metabolic paths in cellular metabolism. Most current metabolic flux analysis methods heavily depend on a predetermined metabolic network and are mostly focused on the analysis of 13 C tracer experiments [5][6][7][8]. However, large numbers of 'unknown' metabolites in the metabolomics datasets strongly indicate that current metabolic networks are far from complete, especially for secondary metabolism and central metabolism of non-model organisms [9][10][11]. Without an accurate and reasonably defined metabolic network, it is challenging to conduct meaningful metabolic flux analyses. Even worse, assuming that a metabolic model is accurate compromises the scientific rigor of the metabolic modeling and can lead to misinterpretation of results [12].
Our newly developed moiety deconvolution package called moiety_modeling is a novel method for analyzing time series SIRM MS isotopologue profiles that can involve single or multiple isotope tracers [13]. This package integrates facilities for moiety (i.e., biochemical functional group) model and data representation, model (parameter) optimization, analysis of optimization results, and model selection under a single moiety modeling framework. A typical data analysis workflow for this moiety modeling framework is shown in Figure 1. Moiety modeling deconvolutes isotopologue intensity data of a metabolite into pseudo-isotopomers based on a given moiety description of the metabolite. Moiety modeling is an early step in certain metabolic flux analysis approaches that can allow the comparison of different moiety models for model selection. First, plausible and hypothetical moiety models of an interesting metabolite are provided by a user based on a relevant metabolic network. After the optimization of each moiety model during isotopologue deconvolution, the best model provided can be selected based on the optimized results of model parameters, which can be directly used for downstream metabolic flux analysis and interpretation.
Metabolites 2020, 10, x FOR PEER REVIEW 2 of 15 and nuclear magnetic resonance spectroscopy (NMR) greatly contribute to the generation of highquality SIRM datasets [4]. However, computational methods are required to gain biologically meaningful interpretation from such complex datasets, especially in terms of metabolic flux through specific metabolic paths in cellular metabolism. Most current metabolic flux analysis methods heavily depend on a predetermined metabolic network and are mostly focused on the analysis of 13 C tracer experiments [5][6][7][8]. However, large numbers of 'unknown' metabolites in the metabolomics datasets strongly indicate that current metabolic networks are far from complete, especially for secondary metabolism and central metabolism of non-model organisms [9][10][11]. Without an accurate and reasonably defined metabolic network, it is challenging to conduct meaningful metabolic flux analyses. Even worse, assuming that a metabolic model is accurate compromises the scientific rigor of the metabolic modeling and can lead to misinterpretation of results [12]. Our newly developed moiety deconvolution package called moiety_modeling is a novel method for analyzing time series SIRM MS isotopologue profiles that can involve single or multiple isotope tracers [13]. This package integrates facilities for moiety (i.e., biochemical functional group) model and data representation, model (parameter) optimization, analysis of optimization results, and model selection under a single moiety modeling framework. A typical data analysis workflow for this moiety modeling framework is shown in Figure 1. Moiety modeling deconvolutes isotopologue intensity data of a metabolite into pseudo-isotopomers based on a given moiety description of the metabolite. Moiety modeling is an early step in certain metabolic flux analysis approaches that can allow the comparison of different moiety models for model selection. First, plausible and hypothetical moiety models of an interesting metabolite are provided by a user based on a relevant metabolic network. After the optimization of each moiety model during isotopologue deconvolution, the best model provided can be selected based on the optimized results of model parameters, which can be directly used for downstream metabolic flux analysis and interpretation. In this paper, we used this moiety modeling framework to investigate the effects of the optimization method, optimizing degree, objective function, and selection criterion on model selection to identify modeling criteria that promote robust model selection. To our knowledge, this is the first attempt to investigate how all of these factors can affect model selection in metabolic modeling.

UDP-GlcNAc Moiety Model Construction
UDP-GlcNAc can be divided into four distinct moieties: glucose, ribose, acetyl, and uracil, in which isotopes incorporate through a metabolic network from an isotope labeling source. The expected (expert-derived) moiety model of 13 C isotope incorporation from 13 C-labeled glucose to In this paper, we used this moiety modeling framework to investigate the effects of the optimization method, optimizing degree, objective function, and selection criterion on model selection to identify modeling criteria that promote robust model selection. To our knowledge, this is the first attempt to investigate how all of these factors can affect model selection in metabolic modeling.

UDP-GlcNAc Moiety Model Construction
UDP-GlcNAc can be divided into four distinct moieties: glucose, ribose, acetyl, and uracil, in which isotopes incorporate through a metabolic network from an isotope labeling source. The expected (expert-derived) moiety model of 13 C isotope incorporation from 13 C-labeled glucose to UDP-GlcNAc (see Figure 2B) is built based on well studied human central metabolism pathways that converge in UDP-GlcNAc biosynthesis, which is corroborated with NMR data [14]. This expert-derived model is labeled as 6_G1R1A1U3, representing six optimizable parameters, one for the glucose moiety (G1), one for the ribose moiety (R1), one for the acetyl moiety (A1), and 3 for the uracil moiety (U3), for each moiety state equation representing the fractional 13 C incorporation for each moiety. For example, the g6 state represents the incorporation of 13 C 6 into the glucose moiety, whereas the g0 state represents no incorporation of 13 C. Since both g0 and g6 must sum to 1, there is only one parameter that needs to be optimized for this moiety state equation. The set of isotopologue intensity equations are derived using the moiety model parameters and Equation (1), as illustrated for the expert-derived model in Figure 2B. Figure 2C shows an alternative hypothetical moiety model 7_G0R3A1U3_g3R2R3_g6r5_r4 along with the isotopologue intensity equations generated from the model.
We also manually crafted 40 hypothetical moiety models to capture isotope flow from [U-13 C]-glucose into each moiety. This set of models provides a mechanism for testing how robustly the expert-derived model can be selected from all the other models provided.
Metabolites 2020, 10, x FOR PEER REVIEW 3 of 15 UDP-GlcNAc (see Figure 2B) is built based on well studied human central metabolism pathways that converge in UDP-GlcNAc biosynthesis, which is corroborated with NMR data [14]. This expertderived model is labeled as 6_G1R1A1U3, representing six optimizable parameters, one for the glucose moiety (G1), one for the ribose moiety (R1), one for the acetyl moiety (A1), and 3 for the uracil moiety (U3), for each moiety state equation representing the fractional 13 C incorporation for each moiety. For example, the g6 state represents the incorporation of 13 C6 into the glucose moiety, whereas the g0 state represents no incorporation of 13 C. Since both g0 and g6 must sum to 1, there is only one parameter that needs to be optimized for this moiety state equation. The set of isotopologue intensity equations are derived using the moiety model parameters and Equation 1, as illustrated for the expert-derived model in Figure 2B. Figure 2C shows an alternative hypothetical moiety model 7_G0R3A1U3_g3R2R3_g6r5_r4 along with the isotopologue intensity equations generated from the model.
We also manually crafted 40 hypothetical moiety models to capture isotope flow from [U-13 C]-

A Simple Comparison of Two Moiety Models
Model optimization aims to minimize an objective function that compares calculated isotopologues based on moiety state parameters from the model to the directly observed, experimentally-derived isotopologues. Figure 3A shows the comparison of optimized model parameters between the expert-derived moiety model (6_G1R1A1U3) and the hypothetical moiety model (7_G0R3A1U3_g3r2r3_r4) for three time points of isotopologue intensity data, i.e., three sets of isotopologue intensities. In these model optimizations, the SAGA-optimize method and absolute difference objective function were used and DS0, DS1, and DS2 correspond to the 34 h, 48 h, and 72 h time points in the FT-ICR-MS UDP-GlcNAc dataset. We can easily tell that the relative intensity of the corresponding model parameters between these two models are quite different, suggesting that the moiety-specific 13 C isotopic incorporation derived from the same MS isotopologue profile varies from one model to another. Furthermore, experiment-derived and model parameter-calculated isotopologue profiles are shown in Figure 3B, illustrating how much better the expert-derived model vs. an inaccurate model is able to reflect the observed data.
Metabolites 2020, 10, x FOR PEER REVIEW 4 of 15 model (7_G0R3A1U3_g3r2r3_r4) for three time points of isotopologue intensity data, i.e., three sets of isotopologue intensities. In these model optimizations, the SAGA-optimize method and absolute difference objective function were used and DS0, DS1, and DS2 correspond to the 34h, 48h, and 72h time points in the FT-ICR-MS UDP-GlcNAc dataset. We can easily tell that the relative intensity of the corresponding model parameters between these two models are quite different, suggesting that the moiety-specific 13 C isotopic incorporation derived from the same MS isotopologue profile varies from one model to another. Furthermore, experiment-derived and model parameter-calculated isotopologue profiles are shown in Figure 3B, illustrating how much better the expert-derived model vs. an inaccurate model is able to reflect the observed data.
Normalized Intensity  Observed isotopologue data were compared with the mean of calculated isotoplogue data with standard deviation from the optimized parameters for each model.

Effects of Optimization Method on Model Selection
The first question we were interested in was whether the optimization method could affect the model selection results. As in the previous analysis, we used three time points from the FT-ICR-MS dataset, the AICc criterion, and an absolute difference objective function in the initial trial. The optimization for each model was conducted 100 times, and we used the average of the 100 optimization results in the analysis (see Table 1). Most optimization methods can select the expertderived model except for 'SLSQP'. What interested us most was that the 'SLSQP' method failed in model selection with the lowest loss value (value returned from objective function) and is generally considered to be the fastest converging of the optimization methods we tested.
We repeated the experiment with the 'SLSQP' method 10 times and found that model selection fails when the loss value approaches 0.3 (Supplementary Figure S1), suggesting strong instability of Normalized Intensity

UDP-GlcNAc Isotopologue
Normalized Intensity Observed isotopologue data were compared with the mean of calculated isotoplogue data with standard deviation from the optimized parameters for each model.

Effects of Optimization Method on Model Selection
The first question we were interested in was whether the optimization method could affect the model selection results. As in the previous analysis, we used three time points from the FT-ICR-MS dataset, the AICc criterion, and an absolute difference objective function in the initial trial. The optimization for each model was conducted 100 times, and we used the average of the 100 optimization results in the analysis (see Table 1). Most optimization methods can select the expert-derived model except for 'SLSQP'. What interested us most was that the 'SLSQP' method failed in model selection with the lowest loss value (value returned from objective function) and is generally considered to be the fastest converging of the optimization methods we tested.
We repeated the experiment with the 'SLSQP' method 10 times and found that model selection fails when the loss value approaches 0.3 (Supplementary Figure S1), suggesting strong instability of model selection at a critical point. Model optimization aims to minimize the objective function, which is actually a non-linear inverse problem and one inherited issue in solving a non-linear inverse problem is overfitting (i.e., fitting to error in the data). Therefore, we developed the hypothesis that over-optimization of model parameters can lead to failure in model selection.

Over-optimization Leads to Failure in Model Selection
To test the above hypothesis, we first tried to increase the stop criterion of the 'SLSQP' method to control over-optimization. The results are shown in Table 2. When optimization stops earlier, the expert-derived model can be selected, which supports our hypothesis.
The SAGA-optimize method is more flexible in controlling the degree of optimization simply by adjusting the number of optimization steps. The more steps, the lower the average loss value reached by the optimization. Next, we performed a set of experiments using the SAGA-optimize method with increasing number of optimization steps to further validate the hypothesis. The results are summarized in Table 3. We can see that the loss value decreases as optimization step increases. When the loss value reaches a certain critical point, the expert-derived model cannot be selected, further supporting the hypothesis that over-optimization can lead to failure in model selection. Furthermore, the selected model can change with increasing degrees of over-optimization.  Based on the above results, we conclude that it is not the optimization method but the degree of optimization that affects model selection, which is explained by overfitting to error in the data when solving a non-linear inverse problem. When optimization reaches a certain critical point, successful model selection cannot be guaranteed. Therefore, proper control of the degree of optimization is of great importance in model selection.

Effects of Selection Criterion on Model Selection
Next, we investigated whether selection criterion could affect model selection. We compared the model selection results generated by SAGA-optimize with different model selection criteria (see Table 4, see Supplementary Table S1 for complete results). From these results, we can see that the rank of top models is quite consistent across different selection criteria, suggesting that these model selection criteria have little effect on robust model selection, at least under this model selection context. Since our previous experiments used AICc as the selection criterion, we will stick with AICc in the following experiments.

Effects of Selection Criterion on Model Selection
Considering that the dominant type of error existing in metabolomics datasets may vary from dataset to dataset, different forms of an objective function may affect model optimization and then influence the results of model selection. Here, we tested the effects of four objective functions in the context of model selection: absolute difference, absolute difference of logs, square difference, and difference of AIC. To speed up optimization, we first split the FT-ICR-MS dataset based on time point (34 h, 48 h, 72 h) into separate model optimizations executed on their own CPU core, and then combine the optimization results for the model selection. This functionality is provided by the moiety_modeling package. We set a series of experiments for each objective function with the SAGA-optimize method. The results are shown in Tables 5-8. In comparing Table 5 to Table 3, the number of optimizations per time point provides roughly the same degree of optimization as three times the number of optimization steps used on a combined optimization.    From these tables, we can see that optimization with the absolute difference of logs objective function is less likely to fail (>250,000 steps) in the model selection compared to the other three objective functions (10,000-20,000 steps). One interpretation from these results is that the FT-ICR-MS dataset is dominated by proportional error instead of additive error. However, the AICc produced with the absolute difference of logs is significantly higher (less negative) than that produced by the other objective functions. Therefore, this objective function may simply be hindering efficient optimization, especially if the dataset is dominated by an error structure that is not as compatible with this objective function. From this alternative viewpoint, additive error may actually dominate this dataset. We used a graphical method to visualize errors in both FT-ICR-MS and LC-MS datasets (Figures 4 and 5). For the plots of FT-ICR-MS datasets, we used another dataset generated from the same procedure, which included two replicates at 0, 3 h, 6 h, 11 h, 24 h, 34 h, and 48 h time points. For two replicates with proportional error, a scatter plot of each replicate against the other will show an increasing spread of values with increasing signal and the log-transformed data will collapse into a line. Plot of two replicates with additive error can be viewed as uniformly deviated from the line of identity, but once log-transformed will show an increasing spread of values with decreasing signal. The original plots of raw data indicate existence of proportional error in both FT-ICR-MS and LC-MS datasets (Figures 4A and 5A). However, the original plots of normalized data almost collapsed to a straight line ( Figures 4C and 5C), suggesting that normalization somehow removes the proportional error in the raw data. In addition, from the log-transformed plots, we can see that additive error does not exist in the normalized FT-ICR-MS datasets ( Figure 4D) but does exist in the normalized LC-MS datasets ( Figure 5D). The replicate plots of all time points (Supplementary Figures S1 and S2) show similar tendency with selected optimized datasets. Based on the above results, the absolute difference of logs objective function can hinder efficient optimization in FT-ICR-MS datasets. We also compared four objective functions in the context of model selection with LC-MS datasets (Supplementary Tables S2-S5). From these tables, we can see that model selection fails earlier with absolute difference of logs objective function compared to other objective functions, also suggesting that additive error may dominate in the normalized LC-MS datasets. Based on the above results, the objective function clearly affects model selection and the selection of certain objective functions for model optimization is able to increase resistance to failure in model selection caused by over-optimization; however, this is likely due to less efficient model optimization caused by the selection of an objective function not appropriate for the type of error in the data.  Figure 4A and 5A). However, the original plots of normalized data almost collapsed to a straight line ( Figure 4C and 5C), suggesting that normalization somehow removes the proportional error in the raw data. In addition, from the log-transformed plots, we can see that additive error does not exist in the normalized FT-ICR-MS datasets ( Figure 4D) but does exist in the normalized LC-MS datasets ( Figure 5D). The replicate plots of all time points (Supplementary Figures S1 and S2) show similar tendency with selected optimized datasets. Based on the above results, the absolute difference of logs objective function can hinder efficient optimization in FT-ICR-MS datasets. We also compared four objective functions in the context of model selection with LC-MS datasets (Supplementary Tables  S2-S5). From these tables, we can see that model selection fails earlier with absolute difference of logs objective function compared to other objective functions, also suggesting that additive error may dominate in the normalized LC-MS datasets. Based on the above results, the objective function clearly affects model selection and the selection of certain objective functions for model optimization is able to increase resistance to failure in model selection caused by over-optimization; however, this is likely due to less efficient model optimization caused by the selection of an objective function not appropriate for the type of error in the data. metabolite. When the 0 and 6h time points are removed, the selection results improved significantly. Likewise, when information-rich time points are removed, the model selection robustness decreases as well. Similar results were obtained when testing the FT-ICR-MS dataset ( Figure 6B). Taken together, the addition of information-rich data contributes to successful model selection while the addition of information-poor data detracts from successful model selection.

Effects of Information Quantity on Model Selection
From the above experiments, we found that over-optimization is a primary cause for failure in model selection and this is affected by the objective function used. The next question is whether the quantity of information affects model selection. One basic approach is to utilize more datasets in order to overcome the effects of over-optimization. In the following experiments, we repeated single model optimization 10 times in order to pragmatically finish these computational experiments. Every experiment was conducted 10 times using the AICc criterion and the absolute difference objective function. We used the SAGA-optimize method to test where model selection starts to fail.
First, we used decreasing number of time points of the LC-MS dataset to test whether data quantity affects model selection ( Figure 6A, Supplementary Table S6). However, model selection failed with few optimization steps when all five time points were included and when only one time point was included, with the most robust model selection occurring with 3 time points. Initially, these results were not expected, until we realized that the relative isotopologue intensity of the 0 and 6 h time points is concentrated within the 13 C 0 isotopologue with zero 13 C tracer. Thus, these datasets are less informative with respect to capturing the isotope flow from labeling source to each moiety in the metabolite. When the 0 and 6 h time points are removed, the selection results improved significantly. Likewise, when information-rich time points are removed, the model selection robustness decreases as well. Similar results were obtained when testing the FT-ICR-MS dataset ( Figure 6B). Taken together, the addition of information-rich data contributes to successful model selection while the addition of information-poor data detracts from successful model selection.
To further test this concept, we investigated whether combining FT-ICR-MS (34 h, 48 h, 72 h) and LC-MS (12 h, 24 h, 36 h) datasets can prevent failure in model selection ( Figure 6C). From the comparison, we can see that combining information-rich FT-ICR-MS and LC-MS datasets is much more resistant to failure of model selection than just using the information-rich FT-ICR-MS or LC-MS dataset, strongly supporting our previous conclusions that utilizing more information-rich datasets can prevent failure in model selection. Similar results were obtained with absolute difference of logs objective function (Supplementary Figure S3). These datasets were collected at different times, on very different mass spectrometry platforms. One used chromatographic separation while the other utilized direct infusion. However, the really surprising part is that the datasets were derived from different human cell cultures: LnCaP-LN3 human prostate cancer cells and human umbilical vein endothelial cells.
Metabolites 2020, 10, x FOR PEER REVIEW 11 of 15 more resistant to failure of model selection than just using the information-rich FT-ICR-MS or LC-MS dataset, strongly supporting our previous conclusions that utilizing more information-rich datasets can prevent failure in model selection. Similar results were obtained with absolute difference of logs objective function (Supplementary Figure S3). These datasets were collected at different times, on very different mass spectrometry platforms. One used chromatographic separation while the other utilized direct infusion. However, the really surprising part is that the datasets were derived from different human cell cultures: LnCaP-LN3 human prostate cancer cells and human umbilical vein endothelial cells.

Discussion
Here, we discuss the importance of model selection in isotopic flux analysis as a proxy for metabolic flux analysis and factors that affect robust model selection. We found that it is not the optimization method per se, but the degree of optimization that influences model selection, due to the effects of over-optimization, i.e., fitting of model parameters based on the error in the data. Overfitting is a known problem typically due to the ill-conditioning of the nonlinear inverse problem that is partially ill-posed. Moreover, the objective function in model optimization is also of great importance in model selection. Proper selection of an objective function can help increase resistance to failure in model selection. This may mean that different objective functions should be used for model selection versus parameter optimization for flux interpretation. Most SIRM experimental datasets have few collected replicates and time points due to the cost and effort required to acquire these datasets. The lack of replicates makes it impractical to directly estimate error in many of these datasets. Moreover, the presence of different types of systematic error like ion suppression can limit the overall effectiveness of replicate-based error analysis. With our moiety modeling framework, we are able to conduct a set of gradient experiments with varying amounts of optimization (i.e, number of optimization steps) using the SAGA-optimize method to estimate the failure point in model selection caused by overfitting. Furthermore, we found that incorporation of less informative datasets can hinder successful model selection, since they lack appreciable signal representing the incorporation of isotopes simulated by moiety models but with the full amount of error of the measured isotopologues. This lowers the overall isotope incorporation signal to noise ratio, which can lead to increased error in model selection. On the other hand, combining informative datasets (i.e., time points with significant isotope incorporation) can help control failure in model selection, which suggests that informative datasets in public metabolomics repositories can be combined to facilitate robust model selection. Moreover, these datasets do not need to come from identical biological systems, just biological systems that utilize the same part of metabolism being measured and modeled. The implication is that SIRM datasets in public repositories of reasonable quality can be combined with newly acquired datasets to improve model selection. Furthermore, curation efforts of public metabolomics repositories to maintain high data quality and provide metrics of measurement error could have a huge impact on future metabolic modeling efforts.

UDP-GlcNAc Time Course MS Isotopologue Datasets
Two UDP-GlcNAc time course MS isotopologue datasets were used to test the robustness of model selection mechanism. The first is a direct infusion Fourier transform MS (FTMS) UDPGlcNAc 13 C isotopologue dataset derived from LnCaP-LN3 human prostate cancer cells with [U-13 C]-glucose as isotope labeling source and collected on an Advion Nanomate nanoelectrospray inline connected to a Thermo 7T LTQ Fourier transform ion cyclotron resonance MS (FT-ICR-MS). This dataset includes 3 time points: 34 h, 48 h, and 72 h [14]. The second is a liquid chromatography-MS (LC-MS) UDP-GlcNAc 13 C isotopologue dataset derived from human umbilical vein endothelial cells with [U-13 C]-glucose as the isotope labeling source and collected on a ThermoFisher Dionex UltiMate 3000 LC System in-line connected to a ThermoFisher Q-Exactive Orbitrap MS. This dataset has 5 time points: 0 h, 6 h, 12 h, 24 h and 36 h [15]. Look to the reference associated with each dataset for more details on their experimental design, implementation, and rationale.

Objective Functions
We used four distinct forms of the objective function ( Table 9) that compares the observed isotopologues and corresponding calculated isotopologues derived from model parameters obtained from model optimization. The first is a summation of absolute differences between observed and calculated isotopologues, which is generally expected to work well with data where the dominant type of error is additive. The second is a summation of the absolute differences between the log of observed and calculated isotopologues, which is generally expected to work well with data where the dominant type of error is proportional. The third is a summation of square of differences between observed and calculated isotopologues. The fourth one tries to mimic the effect of model selection criteria. Table 9. Objective functions.

Objective Function Equation
Absolute difference Σ|I n,obs -I n,calc | Absolute difference of logs Σ|log(I n,obs ) -log(I n,calc )| Square difference Σ(I n,obs -I n,calc ) 2 Difference of AIC 2k + nln(RSS/n) k is the number of parameters; n is the number of data points; RSS is the residual sum of squares:

Optimization Methods
From a mathematics perspective, model optimization is actually a non-linear inverse problem. Several different optimization methods were used to solve this problem, including the SAGA-optimize method [13], and three other optimization methods ('TNC' [16], 'SLSQP' [17], and 'L-BFGS-B' [18]) available in the scipy.optimize Python module. The SAGA-optimize method is a combination of simulated annealing and genetic algorithm optimization methods that utilizes the population and crossover concepts from genetic algorithm to improve the optimization speed and consistency over older more traditional implementations of simulated annealing, allowing SAGA-optimize to produce better quality optimization results more efficiently (i.e., with fewer overall number of optimization steps). The 'TNC' method is designed for optimizing non-linear functions with large numbers of independent variables [16]. The SLSQP method uses Sequential Least Squares Programming, which is an iterative method for constrained nonlinear optimization [17]. 'L-BFGS-B' is a limited-memory algorithm for solving large nonlinear optimization problems subject to simple bounds on the variables [18].

Model Selection Estimators
We used three different quality estimators (Table 10) in model selection: the Akaike Information Criterion (AIC) [19], the sample size corrected Akaike Information Criterion (AICc) [20], and the Bayesian Information Criterion (BIC) [21]. The Akaike information criterion (AIC) is biased to select models with more parameters when the sample size is small, which can lead to overfitting [19]. The sample size corrected AIC (AICc) was developed to handle this bias and prevent overfitting [20]. The Bayesian information criterion (BIC) is another criterion commonly used in model selection [21].

Selection Criterion Equation
Akaike Information Criterion (AIC) 2k + nln(RSS/n) Sample size corrected AIC (AICc) AIC + 2k 2 + 2k /(n − k − 1) Bayesian Information Criterion (BIC) nln(RSS/n) + kln(n) k is the number of parameters; n is the number of data points; RSS is the residual sum of squares: RSS = n i=1 I obs,i − I calc,i 2 .

Supplementary Materials:
The following are available online at http://www.mdpi.com/2218-1989/10/3/118/s1, Figure S1: Error analysis in FT-ICR-MS datasets., Figure S2: Error analysis in LC-MS datasets, Figure S3: Comparison of the log of optimization steps where model selection with different datasets begins to fail with absolute difference of logs objective function, Table S1: Comparison of mode rank based on different model selection criteria, Table S2: Model selection test with absolute difference objective function, Table S3: Model selection test with square difference objective function, Table S4: Model selection test with absolute difference of logs objective function, Table S5: Model selection test with difference of AIC objective function, Table S6: Inclusion of less informative dataset can lead to failure in model selection.