Microalgae Monitoring in Microscale Photobioreactors via Multivariate Image Analysis

: Microscale photobioreactors for microalgae growth represent an interesting technology for fast data production and biomass characterization; however, the small scale poses severe monitoring challenges, as traditional methods cannot be used. Non-invasive techniques are therefore needed to quantify biomass concentration and other culture properties, for example, pigment com-position. To this purpose, a soft sensing approach based on multivariate image regression is proposed to exploit RGB images and/or PAM-imaging chlorophyll fluorescence. Different PLS (Partial Least Squares) regression models are used to estimate: (a) biomass concentration from the features extracted by RGB indices and/or PAM-imaging chlorophyll fluorescence measurements; and (b) Chlorophyll a content per cell from the features extracted by RGB indices and biomass concentration measurements. Every single model is aimed at characterizing the microalgae culture at different light intensities during batch growth. Results show that the proposed monitoring approach is as accurate as traditional measurement approaches and may represent a promising methodology for fast and inexpensive monitoring of microscale photobioreactors.


Introduction
The development and application of microscale platforms, also called micro-photobioreactors (µPBR), for lab-scale microalgae cultivation has gained significant interest to overcome some limitations of traditional cultivation techniques, which are often timeconsuming, high cost, labor-intensive and low throughput [1,2]. Microscale technologies are characterized by very small volumes (pL, nL, µL) and represent a powerful tool to enhance data quality and data production, because they allow for tight and rapid control of key operating variables, being also cheaper and compatible with high-throughput analyses. Recently proposed microscale technologies for microalgae were used for strain selection, size characterization and cell viability assessment [2]. On the other hand, a relevant drawback of microscale techniques is that the small cultivation volume of the device is not compatible with traditional lab-scale measurement protocols, thus requiring the development of ad hoc techniques. Even though lab-on-a-chip solutions may address this issue [3], they require a significant design effort and may adopt complex methods or expensive instrumentation. A monitoring technique suitable for a microscaled device must be online and noninvasive to avoid perturbing the system. Literature examples for the monitoring of µPBR typically rely on cell autofluorescence at a specific wavelength [4] or Chlorophyll fluorescence measurements [5,6]. However, chlorophyll fluorescence may lead to calibration problems due to significant measurement noise and self-absorbance and saturation issues. Image analysis can also be used as a monitoring technique, typically by coupling a microscope with an EMCCD camera, for example, to develop a single-cell tracking algorithm in µPBR built for single-layer cultivation [7], or to infer the growth rate through the surface area occupied by cells [8].
Alternatively, a simpler and faster approach is to utilize digital RGB images, in which each pixel consists of an array of red, green and blue intensity, in the range . This monitoring method exploits the fact that color intensity is strictly linked to biomass concentration. However, as reviewed by Liu et al. [9], RGB images are rarely used to quantitatively describe microalgae and are mostly employed in larger-scale applications than µPBRs. Furthermore, constant light intensity is generally assumed. Usually, RGB images of microalgae samples are converted to grayscale and then correlated with biomass concentration [10,11]. A similar approach was used to estimate concentration via images of light distribution in a tubular photobioreactor [12]. It was also shown that the normalized intensity of the green channel can be correlated with the areal biomass concentration in the biofilm [13]. Interestingly, Su et al. [14] were able to estimate Chlorophyll a (Chl a) and lipids by establishing a correlation with RGB brightness; the methodology was validated in a real batch PBR.
In this work, a simple and fast soft sensing methodology based on multivariate image analysis [15] is proposed, to estimate the biomass concentration or Chl a content in a µPBR. A PLS (Partial Least Square) multivariate regression model was used to correlate color indices (mean and variance of each RGB channel) and/or chlorophyll fluorescence indices with biomass concentration in the µPBR. Adopting a similar approach, Chl a content was also estimated. Results from the cross-validated models showed that the methods represent a promising technique for the estimation of biomass concentration and chlorophyll content, thus paving the way towards an efficient, easy and inexpensive online monitoring of microalgae in microscale devices.

Materials and Methods
The cultivation of Scenedesmus obliquus was carried out in flat-plate batch PBRs at different light intensities. The biomass was then characterized (see Section 2.1.1) and transferred to a µPBR platform where images were collected. This is done to produce enough biomass to characterize microalgal concentration via traditional protocols (which otherwise would not be possible by using the µPBR platform only).
In the following, the cultivation methods, the assays and the mathematical models used to process the images will be described.

Microalgae Strain and Cultivation Method and Assays
A preinoculum of Scenedesmus obliquus 276.7 (obtained from SAG-Goettingen, Germany) was maintained in an Erlenmeyer flask with a BG11 medium. The preinoculum was kept at constant illumination of 80 μmol·m −2 ·s −1 and at a constant temperature of 23 ± 1 °C in a refrigerated incubator, in the flask was continuously injected with 5% CO2 enriched air. The medium BG11 was modified by adding HEPES 10 mM to prevent acidification due to continuous bubbling of 5% v/v CO2 enriched air [16]. A sample in the exponential phase was used as preinoculum and transferred in a polycarbonate flat-panel reactor, with a working volume of 150 mL to have an initial concentration of 5×10 6 cells·mL −1 and working batch-wise. The optical path of the culture was 1.2 cm. At the bottom of the reactor, a sparger was placed through which 5% v/v CO2 enriched air was bubbled. Additional mixing was provided by a magnetic stirrer placed on the opposite side wall with respect to the illumination system. All reactors and media were previously autoclaved for 20 min at 121 °C to avoid any contamination. The flat-plate reactor was exposed to a constant continuous illumination 0 of 35 (low light, LL), 150 (medium light, ML) and 950 μmol·m −2 ·s −1 (high light, HL) in biological duplicates (A and B) for each light condition with LED lamps. Incident light intensities were measured with an HD2012.1 photoradiometer (Delta Ohm, Italy), which measures the photosynthetically active radiation (PAR, 400-700 nm). Microalgae were cultivated in such reactors for five days in two parallel PBRs at a time.

Analytical Procedures
Biomass concentration was measured daily in doubles by using a Bürker chamber (HGB, Germany) while pH was measured with the HI9124 pH meter (Hanna Instruments, Italy) to check that the pH was between 7 and 8, the optimal range for cell viability. To extract the pigments, the biomass was ground with quartz powder and heated at 65 °C for 15 min with DMSO (dimethyl sulfoxide) [16]. The absorbance of the extracts was measured at 480, 649 and 665 nm by a UV-1900 UV-visible Spectrophotometer (Shimadzu, Japan), and concentrations of Chl a, Chl b and carotenoids were calculated with the following equations [17]: Chl a = 12.19 · A665-3.45·A649 (1) Car = (1000 · A480-2.14 · Chl a-70.16·Chl b)/220 At the end of the batch curve, dry weight (DW) was measured by filtering a known volume of culture on pre-dried (15 min at 100 °C) cellulose acetate filters (0.45 µm, Whatman ® , U.S.A.). The filtered biomass was then dried in the oven for 2 h at 100 °C.
Growth rates were calculated as the slope of the concentration profile in the logarithmic scale over time, during the exponential phase.
Photosynthetic efficiency is defined as the percentage of light used for microalgae growth with respect to the total received light [18]. It was calculated using the following equation: where Δ (g·L −1 ) is the increase in the concentration during the batch experiment, (L) is the reactor volume, LHV (22 KJ·g −1 ) is the lower heating value for S. obliquus [19], (m 2 ) is the illuminated area of the reactor, (s) is the experiment duration and is the average energy of a mole of photons.
was calculated from the photon spectrum of the lamp following the method described by Norton et al. [20] and is equal to 0.2139 J·µmol −1 .

Sample Preparation
The culture from the flat-plate PBR was transferred into the µPBR sketched in Figure  1a. The µPBR was obtained with replica molding by curing PDMS (poly-di-methyl-siloxane) in a stereolitography-printed mold [6]. Eventually, the cured polymer was removed from the mold, obtaining the final µPBR device, fully compatible with biological applications [21]. Depending on the daily concentration in the flat-plate PBR, the cultivation broth (at concentration D0) was diluted to five different concentrations (namely D1, D2, D3, D4, D5). Concentrations ranged between D0 and 2.5×10 6 cells·mL −1 . Each concentration was placed in four different wells of the µPBR, whose position was chosen randomly for each experiment to increase the robustness of the model. The filled µPBR was eventually sealed with a 2 mm PDMS layer.

Digital Image Acquisition
The filled µPBR was placed in a black-painted wall box with a LED stripe, thus allowing a homogeneous illumination shading the device from environmental light noise. A digital camera (LUMIX DMC-TZ57, Panasonic, Japan) equipped with a 16 megapixel MOS sensor was mounted on a support to keep it in a fixed and stable position, in order to take pictures of the µPBR inside the black box. Length and white references were placed on the side of the µPBR and included in the image scene in such a way as to measure the pixel dimension and calibrate the white balance. The camera parameters, kept constant throughout all experiments, were as follows: One image was collected for each reactor every day. A raw image captured with the aforementioned protocol is shown in Figure 1b.

Fluorescence Measurement Acquisition
The same µPBR used for digital image acquisition was then used for chlorophyll fluorescence measurements with the PAM imaging OpenFluorCam FC 800 (Photon Systems Instruments, Drásov, Czech Republic) provided with its software FluorCam7. A dark adaption time of 20 min was required prior to performing the measurements to open all reaction centers [22,23]. The PAM-imaging instrument required to set the optimal parameters for the acquisition apparatus, which were: • shutter = 1, corresponding to an aperture of 20 μs; • sensitivity = 5, in a range 0-100; • actinic light intensity (Act2) = 45, corresponding to 1150 μmol·m −2 ·s −1 ; • pulse intensity (Super) = 35, corresponding to 1300 μmol·m −2 ·s −1 .
The protocol, called Quenching Act2, required about 3 min and can be summarized as follows [24]: • in the first 5 s of the protocol, minimum fluorescence 0 is measured five times by using a measuring light; • then, a saturating pulse (duration 800 ms) is used to calculate the maximum fluorescence and the variable fluorescence , where = − 0 ; • a dark phase of 17 s is used to relax all photosystems; • actinic light is turned on for 70 s and after this, three saturating pulses (800 ms each) are produced at intervals of 10 s and another two pulses at an interval of 20 s. This allows the identification of the local maxima fluorescence due to progressive light acclimation, which is useful to describe NPQ dynamic (obtaining at different lights and , which is the steady-state value); • actinic light is then turned off and this dark phase lasts until the end of the experiment. During this dark phase, three saturating pulses (800 ms each) are used at intervals of 30 s, identifying local fluorescence maxima allowing to calculate the relaxation of NPQ.

Mathematical Background
In this section, the mathematical methods used in this work are described.

PLS Regression Fundamentals
Partial Least Square (PLS) is one of the most used techniques in data analytics [25,26] to analyze chemical data for process control, fault detection and diagnosis [27]. This technique can also be applied to images for analysis, classification, segmentation and prediction [15]. For example, it was successfully used in the detection of food frauds [28], to classify inhomogeneous food matrices [29], in quality monitoring [30] and in process control [31]. In general, if [ × ] is an autoscaled (i.e., each column has zero mean and unit variance) matrix of observations of predictors, and [ × ] is the response matrix, PLS aims at finding a reduced number of latent variables, LV, that capture the part of the variance that best predicts . In particular, the two matrices can be rewritten in the form: where A is the number of LV, To consider only pixels related to the culture and to avoid pixels at the border of the wells, which may be affected by edge effects, an inner ROI (region of interest) with a 50-pixel radius was selected, for a final image size of about 7500 pixels. For each ROI, it was possible to calculate the mean ( , , ) and variance ( , , ) of the light intensities in each channel of the RGB image, thus characterizing each well. To summarize, for each well the following features were extracted: Chl a content obtained with the spectrophotometric assay: Chl a.
A preliminary analysis showed that including the data from the first two days of the batch was detrimental for model performance (in the first two days photoacclimation phenomena strongly affect cultivation, see Section 3.1). For this reason, only the last three days of the batch campaigns were considered, for a total number of wells (e.g., observations) of = 72. Data from different batches were organized by rows, similar to the feature-wise unfolding approach [32]. Since all the measurements were performed in double, this allowed increasing the number of observations up to 144 to increase model robustness. We can define two matrices that constitute the predictor matrix of the model: one containing the features extracted from the RGB images, namely a color matrix [144 × 6], and one containing fluorescence indices [144 × 7]. The predictive models will be indicated as M[ | ‖ ], where M is the model ID, while in the squared brackets are listed the concatenated matrices (hence, the features) used in the predictor matrix and the predicted variable .

PLS for Biomass Concentration Estimation
The models built in this section aim at estimating biomass concentration in the µPBR wells, regardless of the incident light used in the experiment, or the day in which the image was collected. In this case, only indices obtained from the wells at D0 were used (e.g., the original culture sampled from the PBR as in Figure 1a). The matrix structure is represented in Figure 2. In particular, the following models were tested:

PLS for Chlorophyll Content Estimation
The models built in this section aim at estimating Chl a content (pg·cell −1 ), regardless of the incident light used in the experiment, or the day in which the image was collected. Only the wells at D0 were used. The matrix structure is shown in Figure 3. For this purpose, the biomass concentration was evaluated as an additional predictor, organized in the vector [144 × 1]. In particular, the following models were tested:

Assessing Model Performance
Cross-validation was performed by iterating 100 times the validation procedure in which 90% of the observations were randomly selected for calibration, while the remaining observations were used for validation. Model performance was assessed by calculating the average relative and average absolute errors on validation: where is the experimental value of the biomass concentration (or pigment content) and ̂ the predicted value; is the total number of validated samples. To assess the fitting performance of the different models, the average SSR (sum of squared residuals) was calculated.

Batch Culture Evolution
Here, the batch cultures used to set up the models are discussed. Growth curves at different impinging light intensities are reported in Figure 4, in which a typical batch growth trend can be observed, with an initial lag phase, an exponential phase and eventually a stationary phase for HL experiments only. The calculated growth rates (Table 1) were increasing with light as expected; however, the increase of growth rate was not linear, suggesting possible saturation under HL. This is further confirmed by the photosynthetic efficiency, which drastically decreased under HL, confirming a less efficient conversion of light into biomass.  As reported in Figure 5, during the initial days of the batches, an adjustment of Chl a was observed due to photoacclimation from the preinoculum condition to the new growth conditions in the flat-plate PBR. Eventually, Chl a reached a pseudo-stationary condition starting from day 2. While LL and ML experiments reached a similar value, HL experiments had a reduced Chl a content per cell. As a consequence, HL experiments had a higher ratio Car/Chl a, which is a common photoprotective response [33]. A similar conclusion can be drawn by looking at the / trends in Figure 6: the LL and ML experiments show values above 0.6 for all the samples (for healthy unstressed cells this value is usually in the range 0.6-0.8 [34]), while for cultures exposed at high light intensity a lower value was measured, suggesting possible photoinhibition. To observe variability fluorescence measurements, 0 was measured for a sample and its dilution. The sample was collected at the end of the exponential phase and transferred to the µPBR together with its five dilutions as described in the previous section ( Figure 7). Linearity in − 0 was found limited for a very small and narrow concentration range and a certain variability was always present. The range is strictly dependent on the specific PAM-imaging settings and protocol used [6], thus making the usage of such measurements for concentration estimation a complex task. In addition, at higher concentrations, self-absorbance and saturation issues may arise [5], which pose additional challenges to correct calibration.

Results of the Biomass Concentration Estimation
The models described in Section 2.2.3 were used to predict biomass concentration. Note that the same model is used to predict biomass concentration at three different light intensities and at three different times (days) in the dynamic batch cultivation curve.
Detailed results are reported in Table 2. At first, it can be seen that using the logarithmic model C[ ‖ ln(Cx)] helps reducing the average relative error with respect to C[ ‖Cx], while the average absolute error remains almost unchanged. Comparison between C[ ‖ln(Cx)] and C[ ‖ln(Cx)] confirmed that using only fluorescence measurements as predictors may lead to large errors in the prediction of biomass concentration. For the sake of completeness, it must be noticed that the biomass concentration range is quite wide (10-90×10 6 cell·mL −1 ) with respect to ranges usually adopted in literature for the application of fluorescence measurements, and saturation phenomena may be present. Model C[ | ‖ln(Cx)] allowed for the smallest average relative error by using all the available predictors, namely CI and FI. Parity plots for all models are reported in Figure  8. Here the effect of the logarithm transformation in the predictions can be noticed. In fact, the logarithm slightly improves predictions at lower concentrations, while at higher concentrations predictions are more scattered, suggesting that a linear dependence would be preferable at these values. Both relative and absolute error distributions for model C[ | ‖ln(Cx)] can be observed in Figure 9. Only a small subset of samples has a relative error larger than 20% and an absolute error larger than 15×10 6 cell·mL −1 , while the average relative error is 18.74% and the average absolute error is 9.51×10 6 cell·mL −1 (comparable to experimental measurements as also reported in the literature [35]). However, the rapidity of the response that could be obtained via image analysis makes it a promising and effective technique for fast estimation of a sample concentration.  In view of the results above, where only wells at D0 were used for calibration and validation, an additional model was built where the number of variables was increased by using the features extracted from diluted wells ( Figure 10). The best prediction performances were obtained by using three dilutions, namely D1, D2 and D3, for example, removing noisy dilutions. Matrix was therefore increased to (6 + 7)·4 = 52 predictors. The target concentration was again the logarithm of wells at D0 and the performance of this model D[ | ‖ln(Cx)] are summarized in Table 3. We can see in Figure 11 that errors are significantly smaller with respect to the ones obtained with C[ | ‖ln(Cx)]. In addition, by looking at the parity plot in Figure 12, predictions at the highest concentrations were improved with respect to Figure 8d. In general, this model was very effective at characterizing biomass concentration in a wide range, although the need to provide several dilutions of the culture makes its application on µPBR potentially problematic (although it can represent a viable solution at a larger scale).

Results of the Chl a Estimation
In this section, the goal is predicting the Chl a content [pg·cell −1 ]. Also, in this case, the model aim at predicting pigment content at 3 different light intensities and at 3 days in the dynamic batch cultivation curve. However, it is necessary to acknowledge that the technique based on pigments extraction and spectrophotometric measurements presents a significant variability (Figure 5a). In fact, the standard deviation of pigment assays performed in doubles ranges from 0.01 to 0.12 pg·cell −1 with an average of 0.05 pg·cell −1 . Model P[ ‖Chla] with only color indices seems underperforming as its average relative error is quite high. Biomass concentration was then added to take into account the different chlorophyll content depending on the concentration in a well. In fact, model P[ | ‖Chla] allowed decreasing the average relative error significantly. It can be observed that model P[ | | ‖Chla] in which FI were added, did not perform significantly better. In conclusion, P[ | ‖Chla] is able to represent adequately all the observations with a reasonable error, which is comparable to the traditional assay error ( Figure  5). By looking at the error distributions ( Figure 13), we can see that most samples are characterized by very low errors. From the parity plot ( Figure 14) it is possible to notice that the pigment duplicate measurements present a significant variability (points are quite spread along x-axis), affecting the prediction accuracy of the model. Detailed results concerning the model performances in estimating Chl a are reported in Table 4.

Conclusions
Multivariate image analysis has been assessed aimed at estimating biomass concentration or Chlorophyll a content in µPBRs.
Models built to predict biomass concentration led to promising results as the combined exploitation of color and fluorescence indices allows one to obtain an average relative error of about 18%, which is comparable to experimental measurements. A more complex but accurate model was built by including data based on diluted samples of the original culture. Although this methodology may not be suitable for a direct application on µPBR, it may represent a promising approach to reduce the operator effort and to avoid microscope counting.
Models built to predict Chl a also led to interesting results. In this case, however, information on biomass concentration may have to be combined with color indices in order to reach acceptable accuracy in the estimates of pigment content in a µPBR. The approach may represent a valuable substitute for traditional lab assays, which are typically time-consuming and labor-intensive.
These preliminary results also suggest that a larger dataset of images of culture at different illumination or conditions would be beneficial to increase model robustness and precision. Future work will aim at exploring different color spaces, such as HSL (Hue, Saturation, Lightness), CIELAB or RG chromaticity to assess if they may lead to an improvement in the monitoring performance.
Funding: This research received no external funding. Data Availability Statement: Data available on request due to restrictions.

Conflicts of Interest:
The authors declare no conflict of interest.