Rapid Characterization of Fatty Acids in Oleaginous Microalgae by Near-Infrared Spectroscopy

The key properties of microalgal biodiesel are largely determined by the composition of its fatty acid methyl esters (FAMEs). Gas chromatography (GC) based techniques for fatty acid analysis involve energy-intensive and time-consuming procedures and are thus less suitable for high-throughput screening applications. In the present study, a novel quantification method for microalgal fatty acids was established based on near-infrared spectroscopy (NIRS). Lyophilized cells of oleaginous Chlorella with different lipid contents were scanned by NIRS, and their fatty acid profiles were determined by GC-MS. NIRS models were developed based on the chemometric correlation of the near-infrared spectra with the fatty acid profiles of the algal biomass. The optimized NIRS models showed excellent performance in predicting the contents of total fatty acids, C16:0, C18:0, C18:1 and C18:3, with coefficients of determination (R²) of 0.998, 0.997, 0.989, 0.991 and 0.997, respectively. Taken together, the NIRS method established here bypasses cell disruption, oil extraction and transesterification; it is rapid, reliable, and of great potential for high-throughput applications, and will facilitate the screening of microalgal mutants and the optimization of their growth conditions for biodiesel production.


Introduction
Fossil-derived fuels still serve as the main energy sources [1,2]. However, ever-increasing energy demand, depleting fossil fuel reserves, and environmental concerns have urged the exploration of alternative energy sources that are green, renewable and sustainable [3]. Biodiesel, a mixture of fatty acid methyl esters (FAMEs) produced by transesterification of oils, has attracted much attention because it is renewable, carbon neutral and portable for transport use [4].
Microalgae are fast-growing photosynthetic organisms able to accumulate high lipid contents, up to 70% of cell dry weight under certain growth conditions [5]. They have been considered better than oil crops for biodiesel production [1,6,7]. Among the oleaginous microalgae, Chlorella spp. are thought to be promising candidates as biodiesel feedstocks because they grow robustly to high cell density, produce high levels of triacylglycerol, and serve as an ideal source for making biodiesel [8][9][10][11].
The key properties of biodiesel, such as cetane number, kinematic viscosity, oxidative stability, cloud point and cold filter plugging point, are largely determined by the composition of fatty acid methyl ester (FAME) [12][13][14][15][16]. Therefore, when evaluating the feasibility of biodiesel feedstocks, their fatty acid composition should be considered as an important indicator [10,17,18].
Gas chromatography with flame ionization detection (GC-FID) and gas chromatography-mass spectrometry (GC-MS) are the typical techniques for analyzing fatty acid profiles. Generally, these methods involve energy-intensive and time-consuming procedures such as cell disruption, lipid extraction and transesterification and are thus less suitable for high-throughput screening applications [19,20]. Therefore, alternative techniques that are easier to conduct, but without significant loss of accuracy, are sought for fatty acid analysis.
Near-infrared spectroscopy (NIRS) is such a technique; it is rapid, cost-effective, reliable, and of great potential for high-throughput applications. Fatty acids varying in chain length and unsaturation level possess different near-infrared spectra [21,22]. There have been several reports of employing NIRS for predicting individual fatty acids, such as C16:0, C18:0, C18:1 and C18:2, in pig adipose, lamb meat, chicken meat, milk powder and almond flour [23][24][25][26][27]. Recently, NIRS also demonstrated its applications in microalgae, but restricted to the quantification of lipid, carbohydrate, protein, and ash content [28][29][30][31][32][33]. The use of NIRS for individual fatty acid analysis in microalgae has not been reported, to the best of our knowledge. The aim of the present study was to establish a feasible NIRS method for the rapid analysis of microalgal fatty acid composition. With our optimized NIRS method, the microalgal fatty acid content and composition could be determined based on the NIR spectrum of a microalgal sample. Our work represents the first effort to develop a NIRS based method for the characterization of fatty acids in microalgae, which has great potential in high-throughput applications, in particular for the screening of microalgal mutants and optimization of their growth conditions for biodiesel production.

Algal Samples and Near-Infrared (NIR) Spectra
All 159 samples were obtained by growing the algae in media with a series of C/N ratios [34,35]. The average NIR spectra of the 3 Chlorella species are given in Figure 1 in the form of absorption spectra. The major NIR absorption bands of lipids (Figure 1) were centered at 1195-1215 nm for the second overtone of the CH stretch of CH3 and CH2, at 1704-1780 nm for the first overtone of the CH stretch of CH3 and CH2, and at 2300-2370 nm for the CH stretch in combination with the CC stretch [36][37][38]. The absorption bands from 2100 to 2170 nm and the absorptions around 1680 nm were contributed by the CH stretch of -CH=CH- and can be used to quantify unsaturated fatty acids [39]. In general, samples with high total fatty acid (TFA) contents had high absorption values in the wavelength ranges for the CH stretch (Figure 1).

NIRS Models Based on C. vulgaris Data
Forty-five samples of C. vulgaris were randomly assigned to the calibration set, and the remaining 15 were assigned to the validation set. The calibration set was used to create the NIRS model and the validation set was used to validate it. The means, maximum values, minimum values and standard deviations of the total fatty acid (TFA), palmitic acid (C16:0), stearic acid (C18:0), oleic acid (C18:1), linoleic acid (C18:2) and linolenic acid (C18:3) contents of the 60 samples were determined by GC-MS and are shown in Table 1. These five fatty acids are the common components of biodiesel [40]. To obtain a NIRS model suitable for predicting a fatty acid in unknown samples of C. vulgaris, the content range of that fatty acid in the calibration and validation sets should be as wide as possible [41]. To meet this need, the 60 samples were collected under different culture conditions and contained very wide concentration ranges of TFA, C16:0, C18:0, C18:1 and C18:3 (Table 1). Myristic acid (C14:0), palmitoleic acid (C16:1), hexadecadienoic acid (C16:2) and hexadecatrienoic acid (C16:3) were present in trace amounts (in total less than 5% of TFA in each sample) and thus were not considered here. The NIR spectra of C. vulgaris (wavelength range of 1000-2499 nm, WR I) and the fatty acid contents determined by GC-MS were related by partial least squares 1 (PLS 1) regression with leave-one-out cross-validation. The resulting NIRS models for fatty acid quantification in C. vulgaris were named CV-NIRS-WR I and are shown in Table 2.
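The PLS 1 regression with leave-one-out cross-validation mentioned above can be illustrated with a minimal pure-Python sketch. It is not the Unscrambler implementation used in this study; it assumes the textbook NIPALS form of PLS 1 for a single response, and the function names (`pls1_fit`, `pls1_predict`, `rmsecv_loo`) and the number of components are hypothetical choices made for illustration.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_components):
    """Fit a PLS 1 model (NIPALS, one response). X is a list of spectra."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    # Mean-center the spectra and the reference values.
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    f = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        # Weight vector: covariance direction between residual X and residual y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(_dot(w, w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [_dot(row, w) for row in E]  # scores
        tt = _dot(t, t)
        if tt < 1e-12:
            break
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = _dot(f, t) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict the response for one spectrum by sequential deflation."""
    x_mean, y_mean, components = model
    e = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = _dot(e, w)
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, p)]
    return y_hat


def rmsecv_loo(X, y, n_components):
    """Root mean square error of leave-one-out cross-validation."""
    squared = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        squared += (pls1_predict(model, X[i]) - y[i]) ** 2
    return math.sqrt(squared / len(X))
```

In practice the number of components would typically be chosen as the one that minimizes the cross-validation error; how it was selected in this study is not stated.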
The model (CV-NIRS-WR I) performed well in predicting TFA content, with root mean square error of calibration (RMSEC, mg/g cell), multiple coefficient of determination (R²), root mean square error of cross-validation (RMSECV, mg/g cell), standard error of performance (SEP, mg/g cell), coefficient of determination (r²), and ratio of the standard deviation of the validation set to the standard error of prediction (RPD) being 5.81, 0.997, 7.15, 7.23, 0.994, and 10.83, respectively. The high RPD value suggested the feasibility of this model for broad applications, such as screening, quality control, and process control. For the prediction of C16:0, C18:1 and C18:3, the models had RPD values of over 6 and were therefore feasible for quality control use. For C18:0, the RPD value of the model was 3.76, indicating possible screening use. In contrast, the model might be unsuitable for the prediction of C18:2, as the RPD value was less than 2. The poor prediction of C18:2 by the CV-NIRS-WR I model may be attributed to the narrow range of C18:2 contents in the C. vulgaris samples used for model development (Table 1) [30]. It is well known that proteins (C-N and C=O bonds), polysaccharides (C-O bonds), and water (O-H bonds) absorb in the wavelength range of 1880-2499 nm, which may interfere with the performance of NIR spectra for fatty acid analysis [42]. To minimize the interference caused by these compounds, we developed additional CV-NIRS models based on the data obtained from the wavelength ranges of 1030-1500 and 1600-1880 nm (WR II), where fatty acids show dominant absorbance over other compounds. The resulting model (CV-NIRS-WR II) had an excellent performance for the prediction of TFA content, with RMSEC (mg/g cell), R², RMSECV (mg/g cell), SEP (mg/g cell), r², and RPD being 4.41, 0.998, 5.28, 6.47, 0.997, and 14.68, respectively.
Moreover, the RPD values of the CV-NIRS-WR II models were higher than those of CV-NIRS-WR I for the prediction of TFA and individual fatty acids (Table 2). Compared with CV-NIRS-WR I, the RMSEC, RMSECV and SEP of the CV-NIRS-WR II models decreased significantly for most fatty acid contents, indicating that the precision and accuracy of prediction increased. Therefore, CV-NIRS-WR II was more suitable for rapid fatty acid composition analysis in C. vulgaris.
Notes to Table 2: CV-NIRS-WR II: the models based on the spectra of C. vulgaris in the wavelength ranges of 1030-1500 and 1600-1880 nm; R²: multiple coefficient of determination of calibration models; r²: coefficient of determination of regression models tested with validation sets.
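The restriction of the spectra to WR II can be sketched as a simple wavelength mask. This is an illustrative snippet only; the function name `select_ranges` is hypothetical, and treating the range endpoints as inclusive is an assumption, since the text does not say how the boundaries were handled.

```python
# Wavelength ranges (nm) as given in the text.
WR_I = [(1000, 2499)]
WR_II = [(1030, 1500), (1600, 1880)]


def select_ranges(wavelengths, spectrum, ranges):
    """Keep only the points whose wavelength falls inside one of the ranges.

    Endpoints are treated as inclusive (an assumption).
    """
    kept = [
        (wl, value)
        for wl, value in zip(wavelengths, spectrum)
        if any(low <= wl <= high for low, high in ranges)
    ]
    return [wl for wl, _ in kept], [value for _, value in kept]
```

With a spectrum recorded from 1000 to 2499 nm at 1 nm steps, this selection keeps 752 of the 1500 points.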

NIRS Models Suitable for Three Species of Chlorella Simultaneously
The CV-NIRS-WR II model, however, performed poorly when predicting fatty acid composition in C. protothecoides and C. zofingiensis. This may indicate that a model built on samples from a single strain is not suitable for other algal strains. We therefore built new models by adding NIR spectra from 30 samples of C. zofingiensis and 69 samples of C. protothecoides. Briefly, 119 samples were randomly assigned to the calibration set, and the remaining 40 samples were assigned to the validation set. The means, maximum values, minimum values and standard deviations of the TFA, C16:0, C18:0, C18:1, C18:2 and C18:3 contents of the 159 samples determined by GC-MS are shown in Table 3. As before, other fatty acids present in trace amounts were not considered. Based on the WR II spectra (wavelength ranges of 1030-1500 and 1600-1880 nm) and the fatty acid composition measured by GC-MS of all 159 samples, a series of new NIRS models, named CVPZ-NIRS-WR II, were created. Their calibration and validation performances are shown in Table 4. Among these models, the one for the prediction of TFA content had the best performance, with RMSEC (mg/g cell), R², RMSECV (mg/g cell), SEP (mg/g cell), r², and RPD being 14.68, 0.988, 18.81, 24.16, 0.964, and 4.98, respectively. Although the CVPZ-NIRS-WR II models had lower RPD values and higher RMSEC, RMSECV and SEP than the CV-NIRS-WR II models for predicting C16:0, C18:0, C18:1 and C18:3 contents (Table 4), they were suitable for predicting fatty acids in the three Chlorella species with the same NIRS models for possible screening purposes.
NIRS models for fatty acid composition prediction in C. protothecoides and C. zofingiensis were also developed separately, based on the 69 samples of C. protothecoides and the 30 samples of C. zofingiensis, respectively. For C. protothecoides, the models for TFA, C16:0 and C18:1 had R² values of 0.992, 0.985 and 0.979 and RPD values of 7.45, 4.56 and 2.80, respectively. For C. zofingiensis, the corresponding R² values were 0.998, 0.997 and 0.989 and the RPD values were 9.58, 9.81 and 5.61. Although the models based on the 3 Chlorella species are not as good as those based on individual species, they are feasible for mutant screening use.
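The random assignment of the 159 samples to a calibration set of 119 and a validation set of 40 can be sketched as follows. The function name `split_samples` and the fixed seed are hypothetical; the text does not describe how the randomization was performed or whether it was stratified by species.

```python
import random


def split_samples(sample_ids, n_calibration, seed=0):
    """Randomly split sample identifiers into calibration and validation sets.

    The seed only makes this illustration reproducible; it is not taken
    from the study.
    """
    ids = list(sample_ids)
    if not 0 < n_calibration < len(ids):
        raise ValueError("n_calibration must be between 1 and len(sample_ids) - 1")
    random.Random(seed).shuffle(ids)
    return ids[:n_calibration], ids[n_calibration:]
```

For example, `split_samples(range(159), 119)` returns two disjoint lists of 119 and 40 sample indices.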

Discussion
Near-infrared (NIR) spectra consist of complex overtones and combinations of molecular vibrations [29]. In contrast to the sharp absorption peaks in the infrared region, there is no strong and unique band associated with a specific chemical bond in the NIR spectrum [39]. However, correlations can be established between the information in the NIR spectra and measured values by using chemometric methods, such as PLS 1 regression [43,44]. With appropriate NIRS models developed, the components of unknown samples, including carbohydrate, protein, ash and lipid contents, can be determined rapidly from their NIR spectra [28][29][30][31][32][33].
Recently, NIRS has also been applied to the characterization of fatty acids in meat and food powder samples [23][24][25][26][27]. However, the models developed in these reports need improvement for better prediction of some fatty acids. For example, in the study of Fernandez-Cuesta et al. [24], only C18:1 and C18:2 showed good prediction, with R² being 0.97 and 0.98, and RPD being 5.37 and 7.35, respectively; in contrast, the prediction performance for C16:0 (R² = 0.54, RPD = 1.41) and C18:0 (R² = 0.51, RPD = 1.44) was far less acceptable. Fatty acids varying in chain length and unsaturation level possess different near-infrared spectra [22]. Within the wavelength range of 1000-2499 nm, NIR spectra also contain strong signals contributed by other compounds, including proteins (C-N and C=O bonds), polysaccharides (C-O bonds), and water (O-H bonds) [38,42]. To minimize the interference caused by these compounds, we developed the CV-NIRS-WR II models by selecting the NIR spectra within the wavelength ranges of 1030-1500 and 1600-1880 nm, where fatty acids show dominant absorbance. These models demonstrated excellent performance in predicting the contents of TFA, C16:0, C18:0, C18:1 and C18:3 in microalgae, with RMSECV, R² and RPD being 1.62-5.28 mg/g cell, 0.991-0.998 and 7.31-14.68, respectively (Table 2), superior to the previous reports mentioned above.
Microalgal biodiesel has been considered a promising alternative to fossil fuels, but challenges remain to be addressed to improve its production economics [1]. Efforts have been made to search for an ideal algal strain as the biodiesel feedstock, which is expected to have not only a fast growth rate and high lipid content but also a favorable fatty acid composition, as the key properties of a biodiesel are largely determined by the composition of its fatty acid methyl esters (FAMEs) [4,45]. Currently, the determination of fatty acid profiles is mainly based on GC-FID/GC-MS, which is energy-intensive and time-consuming and thus less suitable for high-throughput screening purposes. Our work, for the first time, established a novel NIRS technique for the rapid determination of fatty acids in microalgae. Unlike GC-FID/GC-MS, the NIRS based fatty acid determination is free of cell disruption, oil extraction and transesterification, can be done in a few seconds, and has great potential in high-throughput applications of algae screening for better biodiesel production.

Fatty Acid Analysis
Twenty milligrams of lyophilized algal cells were incubated in a solvent mixture (1 mL toluene, 2 mL 1% sulfuric acid in methanol (v/v) and 0.8 mg heptadecanoic acid in 0.8 mL hexane as the internal standard) overnight at 50 °C for transesterification to form fatty acid methyl esters (FAMEs). FAMEs were then extracted three times with hexane in a reciprocating shaker. The FAMEs were analyzed by using a GC-MS-QP 2010 SE (Electron Ionization type) gas chromatograph-mass spectrometer (SHIMADZU, Kyoto, Japan) and a Stabilwax-DA capillary column (30 m × 0.25 mm × 0.25 μm) (SHIMADZU, Kyoto, Japan). Helium was used as the carrier gas. The injection temperature, ion temperature and interface temperature were set at 250, 200 and 260 °C, respectively. The initial column temperature was set at 150 °C. The column temperature subsequently rose to 200 °C at 10 °C/min and then to 220 °C at 6 °C/min, followed by a hold at 220 °C for 10 min. FAMEs were identified by NIST 11 mass spectral library (NIST/EPA/NIH mass spectral library, 2011 edition). The quantities of individual FAMEs were calculated by the peak areas according to the total ion chromatogram (TIC) using heptadecanoic acid as the internal standard.
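The internal-standard calculation described above can be written out as a short sketch. It uses the quantities stated in the text (0.8 mg heptadecanoic acid, 20 mg biomass) as defaults; the function names are hypothetical, and the unit response factor relative to the internal standard is an assumption, since no response factors are reported.

```python
def fame_content_mg_per_g(peak_area, is_peak_area, is_mass_mg=0.8,
                          biomass_mg=20.0, response_factor=1.0):
    """Content of one fatty acid (mg per g dry cells) from TIC peak areas.

    response_factor=1.0 assumes equal detector response for the analyte
    and the internal standard (an assumption, not stated in the text).
    """
    fame_mass_mg = response_factor * (peak_area / is_peak_area) * is_mass_mg
    return fame_mass_mg / (biomass_mg / 1000.0)


def total_fatty_acids(peak_areas, is_peak_area, **kwargs):
    """Sum the contents of all quantified fatty acids (TFA, mg/g cell)."""
    return sum(
        fame_content_mg_per_g(area, is_peak_area, **kwargs)
        for area in peak_areas.values()
    )
```

Under these assumptions, a FAME peak with the same area as the internal standard corresponds to 0.8 mg in 20 mg of biomass, that is, 40 mg/g cell.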

NIR Spectra Collection
NIR spectra were collected with a portable NIRS analyzer (SupNIRS 1550, Focused Photonics Inc., Hangzhou, China). Temperature and relative humidity during scanning ranged from 22 to 26 °C and from 35% to 45%, respectively. About 200 mg of biomass from each sample was packed into a 1.5-mL Eppendorf tube for NIR spectra collection. Diffusely reflected radiation was detected with an optical fiber probe from 1000 to 2499 nm at a 1 nm resolution. The NIR spectrum of each sample was obtained by averaging 5 parallel spectra.
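Averaging the parallel scans of one sample is a point-wise mean over the wavelength axis. A minimal sketch follows; the function name `average_spectra` is hypothetical, and the instrument software may perform this step internally.

```python
def average_spectra(replicates):
    """Point-wise mean of replicate spectra recorded on the same wavelength grid."""
    if not replicates:
        raise ValueError("at least one spectrum is required")
    n_points = len(replicates[0])
    if any(len(r) != n_points for r in replicates):
        raise ValueError("all replicate spectra must have the same length")
    return [
        sum(r[j] for r in replicates) / len(replicates)
        for j in range(n_points)
    ]
```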

Regression Model Development
The spectral data of all samples were converted and imported into the chemometrics software The Unscrambler, version 9.7 (CAMO, Trondheim, Norway). First, the spectra were pretreated with a Savitzky-Golay smoothing filter to reduce noise while preserving the features of the spectral distribution. Second, the first-order derivatives were computed by the convolution (Savitzky-Golay) method to reduce peak overlap and eliminate baseline shift [46]. Third, multiplicative scatter correction (MSC) was used for polynomial baseline correction to remove the multiplicative interference of scatter and particle size. The last preprocessing step was mean centering, which translated the collected data to the origin of the multivariate space in which the analysis was performed. We developed NIRS models to predict the contents of the various fatty acids using PLS 1 regression with leave-one-out cross-validation. Every PLS 1 model was developed with a calibration set and a validation set, composed of three quarters and one quarter of the pretreated spectra, respectively.
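The four pretreatment steps can be sketched in pure Python in the order given above. This is an illustration, not the Unscrambler procedure: the 5-point quadratic Savitzky-Golay window, the dropping of edge points, and the use of the mean spectrum as the MSC reference are all assumptions, because the text does not report these settings. All function names are hypothetical.

```python
# Standard 5-point quadratic Savitzky-Golay coefficients (window size assumed).
SG_SMOOTH_5 = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]
SG_DERIV_5 = [-0.2, -0.1, 0.0, 0.1, 0.2]  # first derivative, per 1 nm step


def sg_filter(spectrum, coeffs):
    """Apply Savitzky-Golay coefficients; edge points are dropped."""
    half = len(coeffs) // 2
    return [
        sum(c * spectrum[i + k - half] for k, c in enumerate(coeffs))
        for i in range(half, len(spectrum) - half)
    ]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum."""
    n, m = len(spectra), len(spectra[0])
    ref = [sum(s[j] for s in spectra) / n for j in range(m)]
    ref_mean = sum(ref) / m
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    if sxx == 0:
        raise ValueError("reference spectrum is constant")
    corrected = []
    for s in spectra:
        s_mean = sum(s) / m
        # Least-squares fit s = intercept + slope * ref, then invert it.
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(ref, s)) / sxx
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


def mean_center(spectra):
    """Subtract the mean over samples at every wavelength."""
    n = len(spectra)
    means = [sum(s[j] for s in spectra) / n for j in range(len(spectra[0]))]
    return [[v - mu for v, mu in zip(s, means)] for s in spectra]


def preprocess(spectra):
    """Smoothing, first derivative, MSC, then mean centering."""
    smoothed = [sg_filter(s, SG_SMOOTH_5) for s in spectra]
    derivative = [sg_filter(s, SG_DERIV_5) for s in smoothed]
    return mean_center(msc(derivative))
```

Each filter pass shortens a spectrum by four points with this window, so the wavelength axis has to be trimmed accordingly before the ranges of interest are selected.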

Calibration Performance
In general, the calibration performance of each regression model was evaluated by RMSECV, RMSEC, and R². The regression models were tested with the validation sets, and three parameters, namely SEP, r² and RPD, were calculated to assess their predictability. Notably, within the same concentration range, the accuracy of the prediction increases with the RPD value. RPD values of <2 indicate that the prediction by the model is unacceptable; RPD values of 2-5 indicate that the model is suitable for screening; RPD values of >5 indicate that the model is suitable for quality control and even process control; and RPD values of >8 indicate that the model is suitable for all possible applications [39]. In addition, the closer R² or r² is to 1, the more accurate the NIRS model is. For RMSECV, RMSEC, and SEP, smaller values are better.
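The validation statistics can be sketched as follows. The exact formulas used by the software are not given in the text, so this snippet assumes common chemometric conventions: SEP as the bias-corrected standard deviation of the prediction errors, r² as the squared Pearson correlation between reference and predicted values, and RPD as the sample standard deviation of the reference values divided by SEP. The function names and the handling of values exactly at the thresholds are hypothetical.

```python
import math


def rmse(reference, predicted):
    """Root mean square error (RMSEC or RMSECV, depending on the predictions)."""
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for r, p in zip(reference, predicted)) / n)


def sep(reference, predicted):
    """Bias-corrected standard error of performance (assumed definition)."""
    errors = [p - r for r, p in zip(reference, predicted)]
    bias = sum(errors) / len(errors)
    return math.sqrt(sum((e - bias) ** 2 for e in errors) / (len(errors) - 1))


def sample_sd(values):
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


def r_squared(reference, predicted):
    """Squared Pearson correlation between reference and predicted values."""
    mr = sum(reference) / len(reference)
    mp = sum(predicted) / len(predicted)
    cov = sum((r - mr) * (p - mp) for r, p in zip(reference, predicted))
    var_r = sum((r - mr) ** 2 for r in reference)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov * cov / (var_r * var_p)


def rpd(reference, predicted):
    """Ratio of the SD of the validation references to SEP."""
    return sample_sd(reference) / sep(reference, predicted)


def classify_rpd(value):
    """Map an RPD value to the application classes listed in the text."""
    if value < 2:
        return "unacceptable"
    if value <= 5:
        return "screening"
    if value <= 8:
        return "quality control"
    return "all applications"
```

Applied to the values reported for C. vulgaris, `classify_rpd(10.83)` gives "all applications" and `classify_rpd(3.76)` gives "screening", in line with the interpretation in the results.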

Conclusions
The key properties of biodiesel are largely determined by its fatty acid methyl ester profile. Therefore, when evaluating the feasibility of biodiesel feedstocks, the fatty acid composition should be considered an important indicator. The present study developed a novel near-infrared spectroscopy (NIRS) technique for the rapid and reliable analysis of fatty acids in microalgae, which can be done within a few seconds and requires only a small amount of sample. The optimized NIRS method performed well in quantifying fatty acids across Chlorella species. In summary, compared with traditional GC-based analyses, the NIRS technique described here bypasses cell disruption, oil extraction and transesterification and is thus easier to conduct and more environmentally friendly. It has great potential for screening purposes, in particular the high-throughput screening of oleaginous microalgae by fatty acid composition for biodiesel uses.

Acknowledgments
This study was partially supported by the 985 Project of Peking University, the 863 Plan of Ministry of Science and Technology of China (2012AA023107) and National Natural Science Foundation of China (31471717).

Author Contributions
Bin Liu, Jin Liu and Feng Chen conceived the study, performed data analysis and wrote the manuscript; Tianpeng Chen and Bo Yang performed the NIR spectra collection and fatty acid analysis by GC-MS; Yue Jiang and Dong Wei contributed with valuable discussions. All authors read and approved the final manuscript.

Conflicts of Interest
The authors declare no conflict of interest.