Multivariate curve resolution (MCR) with constrained alternating least squares (ALS), as described by Tauler et al. [1], is a powerful method to deconvolve overlapping spectral signals from chemical and biological reaction systems. The intended purpose is commonly the estimation of the concentrations C of individual components or the identification of unknown spectral profiles S in complex aqueous solutions; in general, MCR can estimate both simultaneously from a data matrix X. The specific feature of MCR is the decomposition of X in a physically or chemically meaningful way. Besides MCR and ALS, other bilinear modelling methods and different algorithms can be used for the decomposition of X, with various resolution performances and limitations [2]. Because of its flexibility and popularity [3], MCR with the ALS algorithm is used in this work. A tutorial for the application of MCR to the analysis of multicomponent systems, with special focus on the ALS algorithm, is given in [4]. It describes the main aspects to be considered, such as data set configurations, initial estimates and applicable constraints. A central issue of bilinear decomposition is the impact that particular constraints, initial estimates and the applied algorithm may have on the uniqueness of solutions in the presence of rotational ambiguities [5]. In the present work, the initial MCR settings and constraints for the analysis of FTIR spectra from an E. coli fed-batch bioprocess are described in detail.
Compared with other established chemometric analysis methods, MCR has the potential of simultaneous resolution and quantitation of all mixture components without their chemical or physical separation [5]. Besides the recovery of qualitative and quantitative information about analytes, the identification of unknown interferents is possible [7]. In comparison to other multivariate calibration methods, the calibration effort of MCR can be decreased significantly by applying appropriate constraints [8]. This paper demonstrates that, given a suitable set of constraints, single measurements of the pure dissolved analytes suffice to perform quantitative MCR analysis of the respective fermentation data.
MCR-ALS has been employed for years in different research fields, especially for chemical reaction processes monitored by spectroscopic techniques such as X-ray absorption [9], fluorescence, nuclear magnetic resonance, Raman, near-infrared and FTIR [10]. In biochemical and biophysical processes, MCR has been used to analyse protein and nucleic acid systems with regard to denaturation processes, protonation equilibria or complexation processes [11]. Among other chemometric methods, MCR has been used in single biological cell analysis to unmix information from hyperspectral images [12]. With regard to fermentation processes, several applications of MCR-ALS have been published, e.g., in monitoring alcoholic fermentations with S. cerevisiae
], milk lactic acid fermentations with Streptococcus
] and the quantification of penicillin V in bioprocesses with Penicillium chrysogenum
]. In this work, MCR-ALS with tailored constraints is applied to estimate metabolism-relevant concentrations in high cell density cultivation (HCDC) fed-batch processes with E. coli BL21 (DE3) pET28a. The cultivated organism produces the recombinant, pharmaceutically usable enzyme cytochrome P450 after induction. To evaluate process kinetics and optimise the growth of microorganisms, it is useful to estimate the quantitative changes of carbon, nitrogen and phosphate sources as well as of metabolic products such as acetic acid in the fermentation broth over the process runtime. The aim of this study is the resolution of these substances by a tailored MCR-ALS algorithm.
To monitor the composition of fermentation media, ATR-FTIR spectroscopy is employed as an in-line ex situ analyser. Spectral information from the fermentation process is provided online by automatic cell-free sampling of fermentation broth through an ATR flow cell. Because of the continuous sterile sampling, the constraint of invariance of the total concentration (closure) is applicable, as described below. An automated in-line flow system can suffer from CO2 and air bubbles as well as from biofilms on the ATR surface [17]. Biofilms are therefore inhibited by cell-free sampling and by initial cleaning of the flow system with ethanol. As a technical measure against gas bubbles, only gas-tight polytetrafluoroethylene (PTFE) tubes are used. In principle, however, the problem of gas bubbles can be handled mathematically by the MCR algorithm, as shown in this study.
FTIR spectroscopy is an established technique in bioprocess monitoring and, like other vibrational spectroscopy techniques such as near-infrared (NIR) and Raman, it combines the advantages of non-invasiveness and fast simultaneous measurement of multiple dissolved substances [18]. An overview of the advantages and disadvantages of different spectroscopic techniques in bioreactor monitoring is given in [19]. Most excitations of fundamental molecular vibrations are found in the mid-infrared (MIR) region covered by FTIR. The fingerprint region (1500–500 cm−1) in particular exhibits specific patterns of media compounds [20]. By contrast, the peaks in the NIR spectrum consist of overtones and combinations of the primary MIR signals and are less distinctive. Compared with NIR, the MIR region exhibits a higher selectivity, which allows a better detection of overlapping component spectra in complex aqueous mixtures. The main advantages of Raman spectroscopy in bioprocess monitoring are its insensitivity to water and the small peak widths of dissolved components [21]. However, Raman scattering is generally weaker than the FTIR signal, so higher concentrations of the target analytes are required. A comparison of FTIR, NIR and FT-Raman spectroscopy for a lactic acid fermentation shows the best prediction performance for FTIR [22].
Some studies have described the analysis of glucose, acetate, ammonium and phosphate concentrations in bioprocesses using ATR-FTIR spectroscopy. Among other substances, ammonium and glucose were analysed in a complex antibiotic fermentation by at-line measurements on a horizontal attenuated total reflectance (HATR) crystal using a partial least squares (PLS) calibration model [23]. Gluconacetobacter xylinus fed-batch cultures were monitored with an in situ ATR probe for the online PLS analysis of acetate, phosphate and ammonium [24]. The results of these PLS predictions are compared below with the MCR-ALS predictions of glucose, acetate, ammonium and phosphate.
As the cited references show, FTIR monitoring of bioprocesses and MCR analysis of complex mixtures such as fermentation broths promise many advantages for the simultaneous collection of process information. The proposed use of in situ and ex situ online sensor data to calculate carbon mass balance-constrained MCR predictions of several analytes underlines the relevance of combining ATR-FTIR and MCR-ALS in fermentation analysis.
The monitored substances are related to bacterial metabolism. Glucose, ammonia and phosphate are substrates, whereas acetate is a by-product of overflow metabolism [25]. The recombinant cytochrome P450 is not considered because it remains intracellular and is therefore not accessible in the cell-free fermentation broth. To predict the concentrations of the observed analytes, only four calibration measurements of pure components are required, provided that adequate constraints are applied during the alternating least squares procedure. In addition to the required analyte spectra, it is useful to include estimated artefact spectra and further known components of the fermentation medium. As mentioned above, collecting samples from the fermentation broth with a peristaltic pump through a tube system can introduce air bubbles into the ATR flow cell, which affect the measured spectra. If the water spectrum is removed from each mixture spectrum beforehand, the air bubble disturbance has its own spectral signature and can be handled like any pure component in multivariate curve resolution. The shape of this artefact is easy to determine, and including it in MCR improves the resolution of the primary signals, as shown below. All initial estimates of the pure spectral components S expected in the mixture are also implemented as soft constraints during alternating least squares. In the following, a soft constraint denotes an allowed solution region within a range set by inequality constraints of the optimisation problem. If spectral bands shift physically in the mixture, a certain flexibility during the iterative identification of pure components helps to avoid over-restriction.
In the concentration estimation step, besides the non-negativity constraint, online process data such as inlet and exhaust gas flow, fermenter mass, liquid supply from the feed reservoir and from pH control, and turbidity are used to calculate elemental carbon mass balance constraints. Mass balance constraints (closure) applied to reaction systems have been described [1]. Closure constraints require invariance of the total concentration, which is ensured here by the sterile online sampling system and by including the total reactor input and output mass flow in the constraint calculation. To our knowledge, the presented application of the carbon mass balance constraint for MCR to the analysis of a fed-batch fermentation process is a new use of the popular closure constraint. The carbon mass balance constraint for a fed-batch fermentation requires extensive prior calculations, such as different conversion steps and soft-sensor approaches. In addition, the dynamic input and output carbon mass flows in the gas and liquid phases as well as continuous and discrete sampling need to be taken into account. The presented algorithm is able to deal with these requirements. Calculating carbon mass balances or recovery rates is an established approach in bioprocess engineering to check the integrity of observed process data [28]. The referenced literature has already shown the application of carbon mass balances to Escherichia coli high-cell-density fed-batch culture and recombinant protein production. During the process runtime, carbon recovery rates should take on values of about 1. In the present study, this condition is used as an MCR-ALS constraint for the estimation of carbon sources and metabolites such as glucose and acetate in fermentation media. In so doing, the explorative decomposition of measured mixture spectra is coupled with analytical knowledge to form a new hybrid multivariate modelling approach.
In summary, the objectives of this paper are:
to interpret the MCR-ALS closure constraint as a carbon mass balance constraint for fed-batch fermentation processes;
to demonstrate that moderate gas bubble disturbances on the ATR crystal can be handled computationally, without any need for technical prevention measures;
to show that MCR-ALS with the carbon mass balance constraint is capable of simultaneously predicting four analyte concentrations from FTIR spectra of fermentation media samples with minor calibration effort.
2. Material and Methods
2.1. Spectra Acquisition, Sampling and Spectra Processing
The MIR spectra are recorded with a Thermo Scientific Nicolet™ iS™ 10 and the extension unit Nicolet iZ™ 10. Specac’s Gateway™ ATR Accessory Kit and a ZnSe ATR crystal with six reflections are mounted as a flow cell in that unit. The flow cell is connected to the bioreactor via PTFE tubes (inner diameter 1.1 mm) and a Flownamics® probe with a rapid flow membrane for cell-free sampling. A peristaltic pump, controlled by an Arduino microcontroller and a driver board, delivers the sample liquid continuously to the FTIR flow cell. A background spectrum with pure water in the flow cell is recorded before the FTIR is used for bioprocess analysis. During a running sampling process, spectra are recorded in cycles of 10 min. During spectrum acquisition by the Thermo Scientific™ OMNIC™ software [29], the sampling pump remains inactive. The acquisition time for scanning 32 spectra and releasing the mean spectrum for the current sample is about 1 min. OMNIC and the microcontroller are both triggered by a C# program that observes and synchronises the sample supply and measurement steps. Before each fermentation trial, the tubing and flow cell are treated with 70% ethanol solution to minimise the risk of microbial activity in the sampling section. The initial spectra of known substances are standardised to unit concentration. No further pre-processing steps such as normalisation or differentiation are applied to the mixture spectra, in order to preserve their natural physical properties. After the fermentation run, the MCR-ALS analysis is performed for all collected ex situ online-monitored FTIR spectra.
2.2. Reference Analysis
To validate the MCR-ALS results, reference values for glucose, acetate, ammonia and total phosphate concentrations were measured in the cell-free sample drain after it had passed the FTIR flow cell. Glucose was analysed with a YSI 2700 SELECT Biochemistry Analyzer (Yellow Springs, OH, USA). Acetate was determined by high-performance liquid chromatography (HPLC) using a Reprogel H+ column (Dr. Maisch GmbH, Ammerbuch, Germany). Total phosphate and ammonia were determined photometrically following the procedures described in DIN EN 1189, DVGW W 504 and DIN 38406 E5.
2.3. Bioreactor System and Online Measurement Equipment
Fermentations are conducted in a prototype of Bioengineering’s 5 L rounded-bottom autoclavable laboratory fermenter (RALF), controlled and monitored with the software BioSCADA Lab (Bioengineering AG, Wald, Switzerland). A supply tower with intelligent front modules (IFM) directs the input and output of control and measurement values. All data exchanged between the IFMs and SCADA pass through a structured query language (SQL) database, which serves as the central data hub. From there, the data needed for calculating MCR constraints or for advanced measurement and control strategies can be acquired with MySQL and MATLAB.
The current work utilises the following bioprocess online measurement instrumentation: turbidity probe ASD19-N and optek-converter FC10 (optek-Danulat GmbH, Essen, Germany); exhaust gas analyser BlueInOne Ferm (BlueSens GmbH, Herten, Germany); thermal mass flow controller Red-Y Smart for inflow oxygen (0.01, …, 5 lpm) and air/nitrogen (0.1, …, 10 lpm) control (Vögtlin Instruments AG, Aesch, Switzerland); balances for online weight/volume observation of the fermenter (DE 35K5D, Kern & Sohn GmbH, Balingen, Germany), the acid/base reservoir (EW6000-1M, Kern & Sohn GmbH, Balingen, Germany) and the feed reservoir (BL6100, Sartorius, Göttingen, Germany).
2.4. Fermentation Strategy
The HCDC process is conducted in three phases: an initial batch phase, a feeding phase for biomass growth and an induction phase for product expression. Substrate and inducer are fed with an exponential feeding strategy to control the cell-specific growth rate µ, similar to [30]. Because of the risk of overflow metabolism and protein folding errors at high growth rates, µ is controlled to defensive setpoints of 0.1 h−1 (feed phase/biomass production) and 0.05 h−1
2.5. Strain and Fermentation Medium
E. coli BL21 (DE3) pET28a is stored as a glycerol cryo-culture at −76 °C. The pre-culture is incubated overnight in 500 mL baffled flasks at 37 °C in a shaker at 200 rpm. The 300 mL of pre-culture are divided equally between two shake flasks. After 24 h of pre-culture incubation, 2.7 L of sterilised batch medium in the reactor is inoculated with the culture, giving a total start volume of 3 L.
The media are modified mineral media based on [31]. The pre-culture and batch medium contain per litre: Glucose*H2O, 16.5 g; KH2PO4, 13.3 g; (NH4)2HPO4, 4 g; citric acid, 1.7 g; MgSO4*7H2O, 0.72 g; Fe(II)SO4*7H2O, 113.5 mg; CoCl2*6H2O, 10.5 mg; MnCl2*4H2O, 15 mg; CuCl2, 1.2 mg; H3BO3, 3 mg; Na2MoO4*2H2O, 2.5 mg; thiamine*HCl, 4.5 mg; trisodium citrate dihydrate, 75 mg; Na2-EDTA, 9.6 mg.
The feeding solution is composed of Glucose*H2O, 544.4 g; MgSO4*7H2O, 12 g; Fe(II)SO4*7H2O, 43.3 mg; CoCl2*6H2O, 21.4 mg; MnCl2*4H2O, 23.5 mg; CuCl2, 2.5 mg; H3BO3, 5 mg; Na2MoO4*2H2O, 4 mg; trisodium citrate dihydrate, 116 mg; Na2-EDTA, 14.8 mg.
3. Theory and Calculation
Matrices: Uppercase bold letters
Vectors: Lowercase bold letters
Scalars: Lowercase letters
3.2. Multivariate Curve Resolution and Its Physical Interpretation
The bilinear model of multivariate curve resolution [1] for FTIR data can be deduced from the Lambert–Beer law, which describes the attenuation of light travelling through a material. The absorbance x of a material is given as

$$x = \log\left(\frac{I_0}{I_1}\right) = \varepsilon\, c\, d$$

The logarithm of the incident radiant intensity (I0) divided by the transmitted radiant intensity (I1) is equal to the product of the substance concentration (c), the molar attenuation coefficient (ε) and the pathlength (d). In this work, the technique of attenuated total reflection is used, so d is the penetration depth of the evanescent wave into the sample on the ATR crystal. The material- and wavelength-dependent factors ε and d can be pooled into s, which consolidates the optical properties of a substance:

$$x = c\, s, \qquad s = \varepsilon\, d$$

For mixtures of several substances k = 1, …, Ω, each absorbance value xij at wavenumber j = 1, …, n of the spectrum of sample i = 1, …, m is calculated as

$$x_{ij} = \sum_{k=1}^{\Omega} c_{ik}\, s_{kj}$$

In chemometrics, it is usual to term i = 1, …, m the objects or samples of a dataset, whereas j counts the n features or variables. Here, the m objects are samples of fermentation broth over the process runtime and the n features are absorbance values over the wavenumbers of the FTIR spectra.
According to the previous sum equation, the decomposition of absorbance values over sample and wavenumber can be organised in matrices

$$\begin{pmatrix} x_{11} & \cdots & x_{1n} \\ \vdots & \ddots & \vdots \\ x_{m1} & \cdots & x_{mn} \end{pmatrix} = \begin{pmatrix} c_{11} & \cdots & c_{1\Omega} \\ \vdots & \ddots & \vdots \\ c_{m1} & \cdots & c_{m\Omega} \end{pmatrix} \begin{pmatrix} s_{11} & \cdots & s_{1n} \\ \vdots & \ddots & \vdots \\ s_{\Omega 1} & \cdots & s_{\Omega n} \end{pmatrix}$$

In matrix representation, we get the simplified description:

$$\mathbf{X} = \mathbf{C}\,\mathbf{S}^{\mathrm{T}}$$

This is the decomposition of absorbance spectra performed by multivariate curve resolution, assuming the data matrix X is bilinear.
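As an illustration of the bilinear model, the following minimal Python sketch builds a small mixture matrix from concentration profiles and pure spectra. The function name `mix_spectra` and all numerical values are assumptions made for this example and are not taken from the measured data; the pure spectra are stored row-wise, one row per component.

```python
def mix_spectra(C, S_rows):
    """Return X with X[i][j] = sum_k C[i][k] * S_rows[k][j].

    C      : m samples x n_comp concentrations
    S_rows : n_comp pure spectra (unit concentration) x n wavenumbers
    """
    n_comp = len(S_rows)
    n_wavenumbers = len(S_rows[0])
    return [
        [sum(row[k] * S_rows[k][j] for k in range(n_comp))
         for j in range(n_wavenumbers)]
        for row in C
    ]


# Two hypothetical pure spectra over three wavenumbers
S_rows = [[1.0, 0.5, 0.0],
          [0.0, 0.5, 1.0]]
# Two hypothetical samples with different concentrations of both components
C = [[2.0, 1.0],
     [0.0, 3.0]]

X = mix_spectra(C, S_rows)
for spectrum in X:
    print(spectrum)
```

Each row of `X` is one mixture spectrum. MCR addresses the inverse problem of recovering `C` and the pure spectra when only `X` and some prior knowledge are available.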
3.3. An Implementation of the Alternating Least Squares Algorithm
With an initial estimate of the concentration matrix Ĉ or of the pure components Ŝ, and the existing data X, the ALS algorithm can run and perform multivariate curve resolution iteratively [1]. Assuming that the chemical rank of the observed data matrix has been estimated and that one initial estimate is available for each expected spectrally independent component, the ALS procedure can start with an initial pure component matrix S0. Thus, in the first iteration, the estimated unconstrained concentration matrix Ĉ is obtained by

$$\hat{\mathbf{C}} = \mathbf{X}\,\left(\mathbf{S}_0^{\mathrm{T}}\right)^{+}$$

whereby the superscripted + indicates the pseudoinverse.
In the next step, Ŝ is estimated in an unconstrained way by

$$\hat{\mathbf{S}}^{\mathrm{T}} = \hat{\mathbf{C}}^{+}\,\mathbf{X}$$

With that pure component estimate, a new concentration matrix can be calculated. This loop is repeated until a termination criterion is met.
Because of rotational and intensity ambiguities, it is necessary to constrain the solutions for Ĉ and Ŝ to obtain a physically meaningful separation of the mixture components.
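The unconstrained loop described above can be sketched in a few lines of pure Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the least-squares steps are solved through the normal equations (equivalent to the pseudoinverse only when the matrices have full rank), the loop stops after a fixed number of iterations instead of a termination criterion, and all function names and numerical values are hypothetical.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def solve(A, B):
    """Solve A Z = B (A square, non-singular) by Gauss-Jordan elimination."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))  # partial pivoting
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]


def estimate_C(X, St):
    """Least-squares C for X ~ C St, with St holding one pure spectrum per row."""
    G = matmul(St, transpose(St))
    return transpose(solve(G, matmul(St, transpose(X))))


def estimate_St(X, C):
    """Least-squares St for X ~ C St."""
    Ct = transpose(C)
    return solve(matmul(Ct, C), matmul(Ct, X))


def als(X, St0, n_iter=5):
    St = St0
    for _ in range(n_iter):
        C = estimate_C(X, St)    # concentration step
        St = estimate_St(X, C)   # pure spectra step
    return C, St


def residual_norm(X, C, St):
    R = matmul(C, St)
    return sum((x - r) ** 2 for rx, rr in zip(X, R) for x, r in zip(rx, rr)) ** 0.5


# Hypothetical noise-free two-component data set
St_true = [[1.0, 0.5, 0.0, 0.2], [0.0, 0.5, 1.0, 0.3]]
C_true = [[2.0, 1.0], [0.0, 3.0], [1.0, 1.0]]
X = matmul(C_true, St_true)
St0 = [[0.9, 0.6, 0.1, 0.2], [0.1, 0.4, 0.9, 0.3]]  # perturbed initial spectra

C_hat, St_hat = als(X, St0)
print(residual_norm(X, C_hat, St_hat))
```

The residual of this noise-free example is close to zero although `C_hat` and `St_hat` generally differ from `C_true` and `St_true`. This is the rotational ambiguity mentioned in the text: the fit alone does not single out the physically meaningful factors, which is why constraints are needed.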
To calculate constrained linear least-squares solutions in this work, the lsqlin function with the active-set algorithm from the MATLAB Optimization Toolbox is applied [32]. lsqlin makes use of mathematically rigorous methods of applying equality and inequality constraints, with better numerical stability than the approximate methods commonly used in chemometrics. The approximate methods are easy to use and code, but they exhibit poor least-squares behaviour and in some cases increase the magnitude of the residuals [33]. lsqlin solves linear least-squares curve fitting problems of the form

$$\min_{x}\ \tfrac{1}{2}\,\lVert C x - d \rVert_2^2 \quad \text{such that} \quad A x \le b,\quad A_{eq}\, x = b_{eq},\quad lb \le x \le ub$$

where C and d denote the generic coefficient matrix and target vector of lsqlin, not the concentration matrix.
Hence, the MCR-ALS algorithm using lsqlin is implemented as shown in Figure 1 to solve the present problem of resolving X in a hybrid modelling way, with the aim of reducing the ambiguities of the least-squares solutions. To bring a priori knowledge about pure spectra and the bioprocess into the ALS solutions, linear inequality constraint vectors (e.g., b
) and matrices (e.g., A
) are applied. Further, the non-negativity constraint for concentrations is set by using lower bounds (l
3.4. Constraints for Pure Spectral Component Estimation
Pure spectra of components that are known and expected in the mixture are constrained based on measured and normalised spectra of the respective pure substances. The same spectra that are used as initial estimates therefore also serve as base values of the inequality constraints for calculating Ŝ. Assuming that the shapes of the pure spectra in the mixture closely resemble the measured pure spectra, in each iteration the associated pure component estimates may only vary inside defined ranges relative to the measured spectra. Depending on the amount of deviation expected in the mixture, the range for the respective component can be adapted. In so doing, over-restriction can be avoided, e.g., in the case of small band shifts or of differences in signal-to-noise ratio between the measurement of a highly concentrated pure substance and lower concentrations in the mixture.
Regarding the inequality constraint for tuning Ŝ in Figure 1 (upper box), D is composed vertically of the positive and negative identity matrices I, both of dimension (Ω, Ω).

$$\mathbf{D} = \begin{pmatrix} \mathbf{I} \\ -\mathbf{I} \end{pmatrix}$$

The positive part is associated with the upper bounds and the negative part with the lower bounds, both represented in ej for all components at each wavenumber. The allowed upper deviation u and lower deviation l are relative to the total range between the minimal and maximal values of the initial pure spectrum of each component.
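A minimal Python sketch of how such range-relative bounds and the stacked constraint could be assembled is given here. It assumes that the bounds at each wavenumber are the initial spectrum value plus or minus a fraction of the spectrum's total range, and that the constraint has the form D s ≤ e; the function names, the tolerance handling and the example spectra are hypothetical and only illustrate the construction.

```python
def spectral_bounds(s0, upper=0.10, lower=0.10):
    """Lower and upper bounds around an initial pure spectrum s0.

    The tolerances are fractions of the total range max(s0) - min(s0).
    """
    span = max(s0) - min(s0)
    ub = [v + upper * span for v in s0]
    lb = [v - lower * span for v in s0]
    return lb, ub


def stacked_constraint(ub_j, lb_j):
    """Build D = [I; -I] and e_j = [ub; -lb] so that D s <= e_j means lb <= s <= ub."""
    n = len(ub_j)
    identity = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    D = identity + [[-v for v in row] for row in identity]
    e_j = list(ub_j) + [-v for v in lb_j]
    return D, e_j


def is_feasible(D, s_j, e_j, tol=1e-12):
    """Check the inequality D s_j <= e_j row by row."""
    return all(sum(d * v for d, v in zip(row, s_j)) <= e + tol
               for row, e in zip(D, e_j))


# Two hypothetical initial pure spectra over four wavenumbers
S0 = [[0.0, 0.2, 1.0, 0.4],
      [0.5, 0.5, 0.0, 0.1]]
bounds = [spectral_bounds(s) for s in S0]

j = 2  # wavenumber index for which the constraint is assembled
lb_j = [b[0][j] for b in bounds]
ub_j = [b[1][j] for b in bounds]
D, e_j = stacked_constraint(ub_j, lb_j)

print(is_feasible(D, [1.05, 0.02], e_j))  # inside both tolerance bands
print(is_feasible(D, [1.20, 0.00], e_j))  # component 1 exceeds its upper bound
```

In a constrained least-squares solver, `D` and `e_j` would be passed as the linear inequality constraint for the spectral estimation step at wavenumber `j`; unconstrained components would simply be left out of the stacked matrix.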
In our application, the chemical rank of the mixtures X was estimated by principal component analysis (PCA) at 12 significant spectroscopically independent components. The PCA loadings were manually evaluated for the presence of spectra-like structure, which is strongly present in the first principal components and decreases for higher factors. These significant spectroscopically independent components comprised spectra of known media components, expected metabolic products, artefacts (such as air bubbles) and unknown components. Only components evaluated as certainly present in the spectral mixture X
are constrained, notably the pure spectra of wanted substances: glucose, ammonia, total phosphate (H2
) and acetate. The estimates of those pure components may vary within an interval of ±10% of the range of each pure component spectrum around the initial spectrum (see Table 1).
Because of water background subtraction on each taken spectrum xi, air bubbles in the flow cell have the shape of inverted water spectra. Moreover, a pure water spectrum is also initialised for the case of air bubble presence during the background recording. Both estimations are not constrained and can vary depending on the actual mixture content.
To demonstrate the validity of the assumption for the spectral air bubble model, a simple aqueous solution containing glucose (15 g L−1), ammonium (0.7 g L−1) and phosphate (8 g L−1) was prepared. A first FTIR spectrum was acquired with this solution covering the entire ATR crystal, whereas a second spectrum was acquired with the same solution covering only about half of the crystal surface. In this way, part of the IR beam reflections interacts with the aqueous solution on the ATR crystal, while another part interacts only with the air on the crystal surface. The latter liquid-free part of the surface simulates a large air bubble on the crystal in the ATR flow cell. In this test without ALS iterations, the mixture matrix of the known solution is multiplied once with a simple pseudoinverse of the estimated initial pure components matrix. In a first evaluation, S0 contains only the pure measurements of glucose, ammonium and phosphate, each standardised to unit concentration. In a second evaluation, S0 additionally contains an inverted water spectrum. The differences in the concentration estimates are shown in Figure 2, and the actual concentration values are listed in Table 2. Including the air bubble model improves the prediction results when air bubbles are present.
3.5. Constraints for Concentration Estimation
In bioprocess engineering, the carbon balance and recovery rate of a fermentation process are commonly used to check the integrity of process monitoring and sensors as well as to assess the release of outer membrane components. Carbon balances in a fed-batch culture are based on the mass of carbon in the total fermenter volume. The recovery rate rC(t) is the ratio of the recovered carbon mC,rec(t) to the carbon brought into the bioreactor mC,in(t) over the process runtime t.

$$r_C(t) = \frac{m_{C,rec}(t)}{m_{C,in}(t)}$$

If all carbon compounds are determinable and measurement errors are negligible, rC(t) is equal to 1 for all t. Because of measurement errors and unidentified soluble organic carbon compounds, a tolerance range must be assumed. The carbon recovery considering biomass, CO2, glucose and acetate is assumed to be about 90% [28].
The recovered carbon is the sum of the carbon masses in the reactor liquid phase L, the gas phase g and the sample liquid phase, which is divided into cell-free sampling scf and cell-containing sampling scc.

$$m_{C,rec}(t) = m_{C,L}(t) + m_{C,g}(t) + m_{C,scf}(t) + m_{C,scc}(t)$$

The brought-in carbon is the sum of the initial carbon mass in the fermentation medium at process start time (t = 0) and the carbon mass mC,r supplied from the feed reservoir r.

$$m_{C,in}(t) = m_{C,L}(0) + m_{C,r}(t)$$
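The recovery rate can be illustrated with a short Python sketch. The function name, the argument names and the carbon masses below are hypothetical and serve only to show how the recovered and brought-in terms are combined; they are not values from the fermentations described here.

```python
def carbon_recovery(m_liquid, m_gas, m_sample_cell_free, m_sample_cells,
                    m_initial, m_feed):
    """Carbon recovery rate: recovered carbon mass divided by brought-in carbon mass.

    All arguments are carbon masses in the same unit (e.g., g).
    """
    recovered = m_liquid + m_gas + m_sample_cell_free + m_sample_cells
    brought_in = m_initial + m_feed
    if brought_in <= 0:
        raise ValueError("brought-in carbon mass must be positive")
    return recovered / brought_in


# Hypothetical carbon masses in g at one point of the process runtime
r_c = carbon_recovery(m_liquid=40.0, m_gas=45.0,
                      m_sample_cell_free=3.0, m_sample_cells=2.0,
                      m_initial=20.0, m_feed=80.0)
print(r_c)
```

With these example numbers, 90 g of the 100 g of supplied carbon are recovered, which corresponds to a recovery rate of 0.9.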
If the integrity of the measurement equipment and data observation has already been proved, the carbon mass balance can be applied as an MCR-ALS constraint for the estimation of glucose and acetate from the spectrum of each observation i over the process runtime. For that, several non-spectroscopic measurements and assumptions must be used to calculate the carbon balance for each FTIR measurement.
The carbon in the reactor liquid phase is located in biomass and in dissolved CO2, each with its respective carbon mass fraction. The fraction of carbon in biomass is an assumption based on an analysis of the elemental biomass composition of E. coli with an elemental analyser, taken from the literature [34]. Further carbon is located in glucose (glc) and acetate (ace), whose concentrations c in the reactor volume VL are to be determined by FTIR/MCR-ALS. For MCR execution, the liquid-phase carbon mass must be split into one term containing the required concentrations from the FTIR ex situ online measurement (ex) and another term containing carbon compounds with concentrations accessible by in situ online measurement (on). For the calculation of the online term, the biomass concentration is observed by turbidity measurement and a calibrated exponential model. The dissolved carbon dioxide concentration is estimated by a soft sensor based on Henry’s law and the CO2 mole fraction measured by a gas sensor in the exhaust gas flow [35].
The ex situ online term ex, which contains the concentrations to be estimated by MCR, must be brought into the form of the left side of the inequality constraint. Thus, the (2, Ω) matrix A contains in its first row, at the positions associated with the concentrations of glucose and acetate, the carbon mass fractions of glucose and acetate multiplied by the reactor volume. The first row is associated with the upper bound of the constraint. The second row is the negative of the first row and is associated with the lower bound.
All other brought-in and recovery terms that are directly or indirectly accessible by online process sensors and soft-sensors, but not by FTIR/MCR-ALS, are used to form the bi vector.
The carbon in the exhaust gas phase is calculated by the CO2 removal rate QCO2, which in turn is calculated based on measurements of CO2 mass flow at gas phase entry and of inert gas balance to estimate exit mass flow.
The calculation of carbon in samples of fermentation media starts at the first FTIR observation i = 1 with the known initial media concentrations and the sample volumes taken before the first FTIR measurement. A certain error in the sample carbon mass calculation must be accepted, since the current concentration values are unknown at the time of constraint calculation. Hence, for i > 1, the results of the previous MCR step i − 1 are used. Considering the comparatively slow bioprocess kinetics and the higher sampling frequency, this is a reasonable approximation.
The brought-in carbon is the sum of the carbon fractions of glucose, acetate and cell mass in the initial medium as well as the supplied glucose from the feed reservoir.
With this information, the vector bi can be calculated.
The settings of the upper and lower tolerance bounds bu and bl of the carbon balance constraint are based on different considerations. Recovery rates higher than 1 are caused only by measurement errors, while values below 1 are caused by both measurement errors and unidentified by-products. Therefore, the upper bound can be set tighter than the lower bound, with values of bu in the interval (0.9, 1.1), depending on the process phase. An upper bound lower than 1 may be suitable if it is evident that the carbon compounds which are considered in the constraint calculation but lie outside the optimisation of Ĉ (e.g., biomass) are underestimated. An upper bound higher than 1 is indicated if these external carbon compounds seem to be overestimated. Thus, by tuning bu, it is possible to compensate for measurement errors in the online sensor equipment. As mentioned above, the recovery rate can reach approximately 90% at the end of the E. coli process, although it is based only on glucose, acetate, CO2 and biomass. To take the formation of carbon compounds that are not considered into account, the lower bound is set defensively to bl = 0.8 to avoid over-restriction.
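One possible reading of the carbon balance constraint for a single observation is sketched here in Python. It is an assumption-laden illustration, not the algorithm of Figure 1: the inequality is taken as A c ≤ b with b = [bu·m_in − m_known, −(bl·m_in − m_known)], where m_known collects all carbon terms obtained from online sensors and soft sensors; the carbon mass fraction of 0.40 is the stoichiometric value for glucose (C6H12O6) and acetic acid (C2H4O2); all names and numbers are hypothetical.

```python
F_C_GLC = 0.40  # carbon mass fraction of glucose, C6H12O6
F_C_ACE = 0.40  # carbon mass fraction of acetic acid, C2H4O2 (assumed form of acetate)


def carbon_balance_constraint(n_comp, idx_glc, idx_ace, v_l,
                              m_c_in, m_c_known, b_u=1.1, b_l=0.8):
    """Return the (2, n_comp) matrix A and the vector b of A c <= b.

    v_l       : reactor liquid volume in L
    m_c_in    : carbon mass brought into the reactor in g
    m_c_known : recovered carbon mass known from online sensors in g
    """
    row = [0.0] * n_comp
    row[idx_glc] = F_C_GLC * v_l
    row[idx_ace] = F_C_ACE * v_l
    A = [row, [-v for v in row]]            # upper-bound row, lower-bound row
    b = [b_u * m_c_in - m_c_known,
         -(b_l * m_c_in - m_c_known)]
    return A, b


def satisfies(A, c, b, tol=1e-9):
    """Check A c <= b for a concentration vector c in g/L."""
    return all(sum(a * x for a, x in zip(row, c)) <= rhs + tol
               for row, rhs in zip(A, b))


# Hypothetical observation: 4 components, glucose at index 0, acetate at index 3
A, b = carbon_balance_constraint(4, 0, 3, v_l=3.0, m_c_in=30.0, m_c_known=12.0)

plausible = satisfies(A, [10.0, 1.0, 5.0, 2.0], b)  # 14.4 g C in glucose + acetate
too_low = satisfies(A, [5.0, 1.0, 5.0, 1.0], b)     # only 7.2 g C, recovery below b_l
print(plausible, too_low)
```

In this example the first concentration vector keeps the carbon recovery between the lower and upper tolerance, whereas the second one accounts for too little carbon and is rejected by the lower-bound row.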
Furthermore, the non-negativity constraint is set for all concentration values, and at each new curve resolution step, the start value for lsqlin optimisation is set to the last estimation result. The initial concentration values for glucose, acetate, total phosphate and ammonia at i = 1 are set to the known batch medium concentration and may vary in a range of ±10%.
Some online measurements, such as fermenter weight, gas analysis and turbidity, have a higher noise level and have to be filtered before further processing. Biomass estimated from turbidity is smoothed with an exponential smoothing filter with a smoothing factor alpha of 0.05. The online signal of the fermenter weight is prone to disturbances in the form of high needle peaks, often caused by manual contact with the reactor, e.g., while taking an offline sample. These disturbances can easily be removed automatically by a threshold filter that detects differences between consecutive measurement points that exceed a threshold value, e.g., 1.5 L, since offline samples usually have values below this threshold and the actual rate of change of the fermenter volume is much smaller. The few subsequent values above the threshold are overwritten by the last value below the threshold. In this way, measurement errors can be reduced significantly, since all concentration values depend on the reactor volume. Outliers in the online gas analysis are treated correspondingly.
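Both filters can be sketched in a few lines of Python. The sketch assumes the common recursive form of exponential smoothing, y_t = alpha·x_t + (1 − alpha)·y_{t−1}, and a threshold filter that compares each new value with the last accepted value; the function names and the example signal are hypothetical.

```python
def exponential_smoothing(values, alpha=0.05):
    """Recursive exponential smoothing, started at the first measured value."""
    if not values:
        return []
    smoothed = [values[0]]
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def remove_needle_peaks(values, threshold=1.5):
    """Overwrite values that jump by more than `threshold` from the last accepted value."""
    if not values:
        return []
    cleaned = [values[0]]
    for x in values[1:]:
        last_good = cleaned[-1]
        cleaned.append(last_good if abs(x - last_good) > threshold else x)
    return cleaned


# Hypothetical fermenter volume signal in L with a two-point needle peak
volume = [3.0, 3.01, 7.5, 7.4, 3.02, 3.03]
cleaned = remove_needle_peaks(volume, threshold=1.5)
print(cleaned)

# Hypothetical step in a turbidity-based biomass signal
smoothed = exponential_smoothing([0.0, 1.0, 1.0], alpha=0.05)
print(smoothed)
```

A design caveat of this simple threshold filter is that a genuine step change larger than the threshold would be suppressed permanently, so the threshold has to stay well above the real rate of change of the signal, as argued in the text.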
4. Results and Discussion
The carbon balance constraint algorithm with appropriate initial pure spectra estimates results in physically reasonable MCR solutions. By setting suitable start values and tolerance bounds, the rotational and intensity ambiguities are reduced significantly. As a consequence, the concentration profiles of the substrates glucose, ammonia and total phosphate and of the expected metabolic by-product acetate can be resolved from the spectral mixture matrix X with minor manual measurement effort. An overview of the entire process spectra is displayed in Figure 3.
The FTIR spectra show negative values because of the water background subtraction. The downward inflexions at the left and right borders of the display are caused by air bubbles in the flow cell. These artefacts can be handled by MCR. Before the spectral air bubble model was integrated into the MCR-ALS algorithm, its underlying assumption was verified by predicting the concentrations of a simple aqueous solution containing glucose, acetate and phosphate. The solution was measured by FTIR with and without air on the ATR crystal surface. The prediction was carried out by multiplying the measured spectra X with the pseudoinverse of S0, where S0 is composed of the pure spectra of the known mixture components. In one experiment, S0 includes an estimated spectral model for air bubbles; in the other, only the pure spectra of the dissolved components are combined. As evident from Figure 2 and Table 2, integrating the estimated air bubble signature improves the prediction significantly.
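The pseudoinverse prediction can be illustrated with a small numerical sketch. Everything in it is assumed for illustration: the Gaussian bands, the V-shaped bubble signature and all names are hypothetical and synthetic, and S0 is taken to hold one pure spectrum per row, so that X ≈ C·S0.

```python
import numpy as np

def predict_concentrations(X, S0):
    """Least-squares concentration estimate C = X @ pinv(S0).

    X  : (n_samples, n_wavenumbers) measured spectra
    S0 : (n_components, n_wavenumbers) pure spectra, one per row
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    S0 = np.asarray(S0, dtype=float)
    return X @ np.linalg.pinv(S0)

# Synthetic example (hypothetical shapes, not measured data).
axis = np.linspace(0.0, 1.0, 50)

def band(center, width=0.05):
    """Gaussian absorption band on the normalised axis."""
    return np.exp(-0.5 * ((axis - center) / width) ** 2)

S_pure = np.vstack([band(0.25), band(0.50), band(0.75)])  # three solutes
bubble = -2.0 * np.abs(axis - 0.5)       # assumed signature, dips at both borders
c_true = np.array([2.0, 0.5, 1.0])
x = c_true @ S_pure + 0.8 * bubble       # spectrum disturbed by an air bubble

c_without = predict_concentrations(x, S_pure)[0]
c_with = predict_concentrations(x, np.vstack([S_pure, bubble]))[0]
print("without bubble model:", c_without)
print("with bubble model:   ", c_with[:3], "bubble contribution:", c_with[3])
```

In this noise-free example the extended S0 absorbs the bubble contribution into its own column, whereas the prediction without the bubble row distributes it over the solute concentrations.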
The results of the MCR-ALS concentration prediction based on the 264 measured process spectra are shown in Figure 4. The elapsed calculation time for 300 ALS iterations was about 10 min on an Intel Core i7-4790 @3.6 GHz (4 cores). It should be noted that, besides the constraints described above, only a single manually measured spectrum for each estimated pure component is used to achieve the resolution. Likewise, some of the pure component start values are merely vectors of uniformly distributed random numbers. After 300 alternating least squares iterations, satisfactory approximations of the process dynamics are obtained. Glucose and phosphate are present in higher concentrations, so their resolution succeeds nearly without artefacts. At concentrations close to zero, more artefacts and noise are obtained, as expected. Accordingly, the less concentrated ammonium and acetate show a higher proportion of disturbances.
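For orientation, the alternating structure of the iterations can be sketched in a few lines. This is a deliberately reduced illustration and not the algorithm of this work: it applies only a non-negativity constraint to the concentrations (an assumption made here) and omits the carbon balance constraint, the air bubble model and all other constraints described above. The spectra are left unconstrained because background-subtracted spectra may be negative. The function name and arguments are hypothetical.

```python
import numpy as np

def mcr_als(X, S_init, n_iter=300):
    """Minimal alternating least squares for X ≈ C @ S.

    X      : (n_samples, n_wavenumbers) data matrix
    S_init : (n_components, n_wavenumbers) initial pure spectra estimates
    Returns the final concentration matrix C and spectra matrix S.
    """
    X = np.asarray(X, dtype=float)
    S = np.asarray(S_init, dtype=float).copy()
    for _ in range(n_iter):
        C = X @ np.linalg.pinv(S)    # least-squares update of C for fixed S
        C = np.clip(C, 0.0, None)    # assumed constraint: non-negative concentrations
        S = np.linalg.pinv(C) @ X    # least-squares update of S for fixed C
    return C, S
```

A call such as `C, S = mcr_als(X, S0, n_iter=300)` returns both factors; with so few constraints the result remains subject to the rotational ambiguities discussed in this paper.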
The errors are evaluated by comparing the FTIR/MCR-ALS concentration measurements with reference measurements. From the residuals, the root mean squared errors (RMSE) are calculated; they are shown in Table 3.
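The RMSE follows the usual definition, the square root of the mean squared residual. A minimal sketch, assuming that the predicted and reference values have already been aligned to the same sampling times (the function name is hypothetical):

```python
import numpy as np

def rmse(predicted, reference):
    """Root mean squared error between aligned prediction and reference values."""
    p = np.asarray(predicted, dtype=float)
    r = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((p - r) ** 2)))
```

The result carries the unit of the concentrations, which makes it directly comparable with the accuracies reported for other calibration methods.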
The prediction results of the proposed MCR-ALS algorithm can be compared with the prediction performance of PLS models. Acetate, ammonium and phosphate concentrations of a Gluconacetobacter xylinus fed-batch culture were predicted from in situ ATR-FTIR spectra by a PLS model with accuracies of 0.2, 0.17 and 0.24 g L⁻¹, respectively [24]. The validation errors for offline samples of the same process were 0.22 g L⁻¹ (acetate), 0.24 g L⁻¹ (ammonium) and 0.18 g L⁻¹ (phosphate). The applied PLS regression model is based on 56 mixture solutions used as calibration standards. The accuracies of the MCR-ALS estimation for ammonium and acetate are similar to the PLS errors of the referenced paper. The absolute error of the phosphate prediction is higher for the MCR-ALS approach than for the described PLS method, but the measurement range is also about twice as large. Considering the minor calibration effort of the proposed MCR-ALS approach, the results are impressive. Furthermore, the PLS glucose prediction accuracy of 0.56 g L⁻¹ reported for at-line ATR-FTIR monitoring of an antibiotic fermentation process is similar to the present prediction by MCR-ALS [23]. That PLS calibration model for glucose is based on 70 filtered fermentation samples. Here, too, the reduction in calibration effort achieved by effective use of online sensor data is evident when compared to PLS.
The estimated pure component spectra are displayed in Figure 5. As already noted for the associated concentrations, the higher noise level of the less concentrated ammonium and acetate is also apparent in the pure component spectra.
By way of comparison, Figure 6 shows the results for glucose and acetate concentrations obtained without the carbon balance constraint but with the same constraints as used for the pure spectra estimation (see above). Between hours 10 and 15, a significant artefact is observable in the glucose concentration profile. The concentration estimate is too high, which is also discernible from carbon recovery rates approaching 1.2 in this process phase. A second drift in the glucose concentration occurs around t = 35 h; because the deviation is smaller, it is not obviously reflected in the carbon balance. In any case, without the carbon balance constraint the solution space of MCR is enlarged, which also increases the risk of ambiguities that can cause physically nonsensical solutions. For the same reason, the acetate profile in Figure 6 gives the impression of increasing concentrations that are not actually present. Nevertheless, the shapes of the concentration profiles actually present in the batch phase are more or less recognised; the artefacts increase mostly in the respective zero-concentration phases.
The carbon recovery at the end of the MCR-ALS procedure is shown in Figure 7. Without the carbon balance constraint, the recovery rate exceeds the upper bound of 1.09 twice, once at the beginning and once at the end of the fed-batch phase. The lower bound of 0.8 is also slightly undershot at the beginning of the process. Around t = 25 h, near the end of the feeding phase, the recovery rate with the carbon balance constraint enabled touches the upper bound of 1.09. For the final estimated concentrations, the lower bound of 0.8 is not reached.
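Such excursions beyond the tolerance bounds can also be counted programmatically. The sketch below assumes that a carbon recovery time series is already available; how the recovery itself is computed is not reproduced here, and the function names are hypothetical. Only the bounds 0.8 and 1.09 are taken from the text.

```python
import numpy as np

def count_excursions(mask):
    """Number of contiguous runs of True values in a boolean sequence."""
    m = np.asarray(mask, dtype=bool)
    if m.size == 0:
        return 0
    previous = np.concatenate(([False], m[:-1]))
    return int(np.sum(m & ~previous))    # a run starts where True follows False

def recovery_excursions(recovery, lower=0.8, upper=1.09):
    """Count separate episodes in which the carbon recovery leaves [lower, upper]."""
    r = np.asarray(recovery, dtype=float)
    return {"below": count_excursions(r < lower),
            "above": count_excursions(r > upper)}
```

Applied to the unconstrained result, such a check would be expected to report the episodes described above, whereas a profile that only touches a bound without crossing it yields zero.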