Rapid Determination of Nutritional Parameters of Pasta/Sauce Blends by Handheld Near-Infrared Spectroscopy

Near-infrared (NIR) spectroscopy has seen rapid progress in miniaturization (instruments weighing < 100 g are now available), and the price of handheld systems has reached the < $500 level for large lot sizes. Thus, the stage is set for NIR spectroscopy to become the technique of choice for food and beverage testing, not only in industry but also as a consumer application. However, contrary to the (in our opinion) exaggerated claims of some direct-to-consumer companies regarding the performance of their “food scanners” with “cloud evaluation of big data”, the present publication demonstrates realistic analytical data derived from the development of partial least squares (PLS) calibration models for six different nutritional parameters (energy, protein, fat, carbohydrates, sugar, and fiber) based on the NIR spectra of a broad range of different pasta/sauce blends recorded with a handheld instrument. The prediction performance of the PLS calibration models for the individual parameters was checked by both cross-validation (CV) and test-set validation. The results obtained suggest that in the near future consumers will be able to predict the nutritional parameters of their meals by using handheld NIR spectroscopy under everyday conditions.


Introduction
The miniaturization of vibrational spectrometers started more than two decades ago, but only within the last decade have real handheld Raman, mid-infrared (MIR) and near-infrared (NIR) scanning spectrometers become commercially available and been utilized for a broad range of analytical applications [1][2][3][4][5][6]. While the weight of the majority of Raman and MIR spectrometers is still in the 1 kg range, the miniaturization of NIR spectrometers has advanced down to the < 100 g level, and developments are under way to integrate them into mobile phones [7,8]. Furthermore, most of the Raman and MIR handheld spectrometers are still in the price range of several tens of thousands of US$, whereas miniaturized NIR systems have reached the < 500 US$ level. In view of the high price level of Raman and MIR instruments, only the acquisition of NIR systems can be taken into consideration for private use in the near future, whereas handheld Raman and MIR spectrometers will be restricted to industrial, military and homeland security applications, as well as public use by first responders, customs or environmental institutions.
Molecules 2019, 24, x FOR PEER REVIEW 2 of 10
[...] years, primarily handheld near-infrared spectroscopy has demonstrated an immense potential in this respect for different purposes such as authentication [12][13][14], classification [15][16][17], quality control [18][19][20][21], the detection of adulteration [22][23][24], and the determination of food parameters [25], including the preliminary investigations of pasta/sauce mixtures [26,27].
Over the last years, public health awareness has grown strongly, and the control of the nutritional parameters of everyday food is just one aspect of this issue. Beyond body weight control, nutritional parameters are directly related to quality of life and to the control of conditions such as obesity, high cholesterol, gastritis, diabetes and high blood pressure. Thus, in the present study the quantitative analysis of nutritional parameters by handheld NIR spectroscopy is demonstrated in detail for different pasta/sauce blends in combination with a chemometric data evaluation. The objective of these investigations is to assess how feasible it will be for consumers in the near future to predict the nutritional parameters of their meals by using handheld NIR spectroscopy [8].

Experimental Set-Up
For each pasta/sauce-type blend, five different combinations (ranging from a 0% to 100% (w/w) sauce addition) were investigated. Each pasta/sauce mixture was prepared "ready-to-eat" on a plate, and the NIR spectra were recorded at room temperature (22 ± 1 °C) at a distance of 1-2 mm above the sample surface at five different positions of the plate in order to compensate for inevitable compositional and surface heterogeneities (Figure 1). Previous investigations have shown that the effective pathlength of NIR radiation for diffuse reflection measurements varies (depending on wavelength and material) from several hundred micrometers to millimeters [28][29][30].

Instrumentation
Near-infrared spectra were measured in diffuse reflection with a Viavi MicroNIR 1700 (formerly JDSU, Santa Rosa, CA, USA) handheld spectrometer, based on a linear variable filter (LVF) monochromator.
The five replicate spectra were recorded with an integration time of 8.8 ms by averaging 1000 scans in the wavelength range of 908-1676 nm with an uncooled 128 pixel InGaAs array detector at a spectral resolution of 12.5 nm at 1000 nm. The S/N ratio derived from the 100% line, recorded with the parameters given above, was 5067:1. As reference, a 99% Spectralon reflectance standard (Labsphere Inc., North Sutton, NH, USA) was used.

Materials
Five different commercial pastas (Farfalle-Edeka, Italy; Tortiglioni-Birkel, Germany; Penne-GutBio, Germany; Fusilli de lentilles corail-Barilla, Italy; Casarecce de pois chiches-Barilla, Italy) and five different commercial tomato sauces (Ricotta-Barilla, Italy; Gorgonzola-Barilla, Italy; Zucchini & Aubergine-Barilla, Italy; Siciliana-Bertolli, Italy; Kräuter-Knorr, Germany) were used for the preparation of the samples. Both the pastas and the sauces were carefully selected to represent a large variation of nutritional parameters and morphologies, in order to develop representative chemometric PLS [31,32] models for the individual parameters of energy, fat, protein, carbohydrates, sugar and fiber. The nutritional parameter values of the calibration mixtures were calculated from the package labels of the pastas and sauces according to the mixture compositions and are summarized in Table 1. The mutual assignment of the five sauces to the five pastas established 25 basic combinations, and for each combination five different proportions of pasta and sauce were prepared by mixing 75 g of dry pasta with five different weights of sauce (0.00 g, 18.75 g, 37.50 g, 56.25 g and 75.00 g). These proportions correspond to pasta/sauce blend ratios (%(w/w)) of 100/0, 100/25, 100/50, 100/75, and 100/100. Before mixing, the dry pastas were cooked by boiling in water for 10 min, and after draining for a defined time period of 5 min they were put on the plate, and the sauces were added and mixed with the pastas. Thus, 125 plates in total were prepared, and five replicate spectra were measured for each plate, yielding 625 NIR spectra for further processing and analysis.
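The reference values and the size of the experimental design described above follow from simple arithmetic. The following is a minimal sketch, assuming that label values are given per 100 g of product and that the per-serving content is the mass-weighted sum of the dry pasta and sauce contributions. The label numbers in `PASTA_LABEL` and `SAUCE_LABEL` are hypothetical placeholders (they are not taken from Table 1), and the function names are illustrative only.

```python
# Hypothetical label values per 100 g of product (NOT taken from Table 1).
PASTA_LABEL = {"energy_kcal": 350.0, "protein_g": 12.0, "fat_g": 2.0}
SAUCE_LABEL = {"energy_kcal": 80.0, "protein_g": 2.0, "fat_g": 5.0}

DRY_PASTA_G = 75.0  # dry pasta weight per plate, from the text
SAUCE_WEIGHTS_G = [0.00, 18.75, 37.50, 56.25, 75.00]  # sauce weights, from the text


def blend_reference(pasta_label, sauce_label, pasta_g, sauce_g):
    """Mass-weighted label content of one plate (absolute amount per serving)."""
    return {
        key: pasta_label[key] * pasta_g / 100.0 + sauce_label[key] * sauce_g / 100.0
        for key in pasta_label
    }


def design_size(n_pastas=5, n_sauces=5, n_levels=5, n_replicates=5):
    """Number of plates and of recorded spectra in the full design."""
    plates = n_pastas * n_sauces * n_levels
    return plates, plates * n_replicates


# One reference dictionary per sauce level for a single pasta/sauce combination.
references = [
    blend_reference(PASTA_LABEL, SAUCE_LABEL, DRY_PASTA_G, sauce_g)
    for sauce_g in SAUCE_WEIGHTS_G
]
```

With the defaults, `design_size()` reproduces the 125 plates and 625 spectra stated in the text.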

Spectral Preprocessing Treatment
Figure 2 illustrates the sample preparation and spectra acquisition scheme, using the pasta-1/sauce-1 blends as an example. In a first step, the average spectrum of the five replicate measurements was calculated for each plate, and the resulting 125 average spectra were concatenated in a matrix. Savitzky-Golay (SG) smoothing [33] with a window size of 7 and a 2nd-degree polynomial was then applied to this matrix, followed by an extended multiplicative scatter correction (EMSC) [34][35][36]. Finally, the spectral range was truncated to 950-1350 nm. The effects of the successive pretreatment steps on the original 625 raw spectra are shown in detail in Figure 3.
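Part of this pretreatment chain can be sketched in a few lines of plain Python. The example below covers replicate averaging, 7-point quadratic Savitzky-Golay smoothing and the truncation to 950-1350 nm; the EMSC step is deliberately left out of this sketch. Two assumptions are made for illustration: the 128 detector pixels are taken to be linearly spaced over 908-1676 nm, and the three points at each spectrum edge are copied unchanged rather than fitted separately. The function names are illustrative and do not correspond to the software used in the study.

```python
SG_COEFFS = [-2, 3, 6, 7, 6, 3, -2]  # 7-point quadratic smoothing weights
SG_NORM = 21  # sum of the weights


def average_replicates(replicates):
    """Point-wise mean of the replicate spectra of one plate."""
    n = len(replicates)
    return [sum(values) / n for values in zip(*replicates)]


def savgol_smooth7(spectrum):
    """Savitzky-Golay smoothing (window 7, 2nd-degree polynomial).

    Assumption of this sketch: the three points at each edge are copied
    unchanged instead of being fitted separately.
    """
    half = len(SG_COEFFS) // 2
    out = list(spectrum)
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half : i + half + 1]
        out[i] = sum(c * v for c, v in zip(SG_COEFFS, window)) / SG_NORM
    return out


def truncate(wavelengths, spectrum, low=950.0, high=1350.0):
    """Keep only the points with low <= wavelength <= high."""
    kept = [(w, v) for w, v in zip(wavelengths, spectrum) if low <= w <= high]
    return [w for w, _ in kept], [v for _, v in kept]


# Assumed wavelength axis: 128 pixels spaced linearly over 908-1676 nm.
WAVELENGTHS = [908.0 + i * (1676.0 - 908.0) / 127 for i in range(128)]
```

Because the unsmoothed edge points lie outside 950-1350 nm on this axis, the simplified edge handling does not affect the truncated spectra.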

Chemometric Data Analysis
Individual PLS calibrations with mean centering and leave-one-out cross-validation (CV) were developed for the different nutritional parameters with MATLAB software (version R2016a, The MathWorks, Inc., Natick, MA, USA) and the PLS toolbox (version 8.6, Eigenvector Inc., Manson, WA, USA).
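To make the calibration step concrete, the following is a minimal sketch of a single-response PLS model (PLS1, NIPALS formulation) with mean centering and leave-one-out cross-validation, written in plain Python. It is not the PLS toolbox implementation used in the study; the function names `pls1_fit`, `pls1_predict` and `rmsecv_loo` are illustrative, and spectra are assumed to be given as lists of rows.

```python
import math


def _center(rows):
    """Column means of a list of rows and the mean-centered copy."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return means, [[v - m for v, m in zip(row, means)] for row in rows]


def pls1_fit(X, y, n_factors):
    """PLS1 (NIPALS) with mean centering; returns the model as a dict."""
    x_mean, E = _center(X)
    y_mean = sum(y) / len(y)
    f = [v - y_mean for v in y]
    factors = []
    for _ in range(n_factors):
        # Weight vector: covariance direction between residual X and residual y.
        w = [sum(row[j] * fi for row, fi in zip(E, f)) for j in range(len(x_mean))]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(a * b for a, b in zip(row, w)) for row in E]  # scores
        tt = sum(v * v for v in t)
        if tt < 1e-12:
            break
        p = [sum(row[j] * ti for row, ti in zip(E, t)) / tt for j in range(len(w))]
        q = sum(fi * ti for fi, ti in zip(f, t)) / tt
        # Deflate X and y before extracting the next factor.
        E = [[e - ti * pj for e, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        factors.append((w, p, q))
    return {"x_mean": x_mean, "y_mean": y_mean, "factors": factors}


def pls1_predict(model, x):
    """Predict one sample by deflating it factor by factor."""
    e = [v - m for v, m in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, p, q in model["factors"]:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, p)]
    return y_hat


def rmsecv_loo(X, y, n_factors):
    """Leave-one-out cross-validation error of a PLS1 model."""
    squared = 0.0
    for i in range(len(y)):
        model = pls1_fit(X[:i] + X[i + 1 :], y[:i] + y[i + 1 :], n_factors)
        squared += (pls1_predict(model, X[i]) - y[i]) ** 2
    return math.sqrt(squared / len(y))
```

Evaluating `rmsecv_loo` for an increasing number of factors yields the kind of RMSECV curve that is used to select the model complexity.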
For the separation of the available pasta/sauce mixtures into calibration and test samples for the different nutritional parameters, the 125 samples were arranged in increasing order of the respective parameter, and one sample was removed randomly from each consecutive group of five samples. The 100 remaining samples were used as the calibration set, whereas the 25 removed samples were used as the test set. The test set samples were finally used for an additional validation step and to demonstrate the predictive capability for "unknown" samples.
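The splitting procedure described above can be sketched as follows. This is an illustrative plain-Python version; the function name `split_calibration_test` and the fixed random seed are assumptions made for reproducibility of the example, not details from the study.

```python
import random


def split_calibration_test(reference_values, group_size=5, seed=0):
    """Return (calibration_indices, test_indices) into reference_values.

    Samples are ranked by their reference value, and one sample is drawn
    at random from each consecutive group of `group_size` ranked samples.
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    order = sorted(range(len(reference_values)), key=lambda i: reference_values[i])
    calibration, test = [], []
    for start in range(0, len(order), group_size):
        group = order[start : start + group_size]
        picked = rng.choice(group)
        test.append(picked)
        calibration.extend(i for i in group if i != picked)
    return calibration, test
```

Because one test sample is taken from every group of five ranked samples, the test set spans the full content range of the respective parameter.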

Results and Discussion
The choice of the number of latent variables (factors) is a critical point in PLS model development and should be based on its relation to other statistical parameters such as RMSEC and RMSECV [37]. Figure 4 shows plots of the RMSEC/RMSECV values versus the number of latent variables for the individual calibrations of the nutritional parameters. Basically, the selection is a compromise between the magnitude of the error, the robustness of the calibration and the risk of overfitting. In the present case, eight factors were chosen for each of energy, carbohydrate, sugar, fiber and protein, and only seven factors for fat, because the graphs of the RMSEs versus the number of latent variables flatten out beyond these numbers (red arrows in Figure 4). The comparatively high number of factors can be readily explained by the complexity of the samples under investigation. Apart from the fact that six parameters are determined, the samples were prepared with five different types of pasta with varying morphologies and with sauces showing considerable variations of ingredients (vegetables, cheese, etc.). Furthermore, residual amounts of water lead to hydrogen bonding interactions with carbohydrates, sugars, fibers, and proteins.
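The "flattening out" criterion was applied visually in the study. As a hedged illustration only, one possible way to express such a criterion numerically is to pick the smallest number of factors after which the relative RMSECV improvement drops below a threshold. The function name, the 2% threshold and the example curve below are hypothetical and are not taken from Figure 4.

```python
def pick_latent_variables(rmsecv, min_relative_gain=0.02):
    """Smallest number of latent variables after which RMSECV flattens.

    rmsecv[k] is the cross-validation error of the model with k + 1 factors.
    The threshold is an arbitrary choice for this sketch.
    """
    for k in range(len(rmsecv) - 1):
        gain = (rmsecv[k] - rmsecv[k + 1]) / rmsecv[k]
        if gain < min_relative_gain:
            return k + 1
    return len(rmsecv)


# Hypothetical RMSECV curve for 1 to 10 factors (not data from the study).
example_curve = [10.0, 7.0, 5.0, 4.0, 3.5, 3.2, 3.0, 2.8, 2.79, 2.78]
chosen = pick_latent_variables(example_curve)
```

Such a rule is only a starting point; the final choice still has to balance error magnitude, robustness and overfitting, as discussed in the text.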
Table 2 summarizes the content ranges and selected calibration parameters such as the root mean square error of calibration (RMSEC), the root mean square error of cross-validation (RMSECV), the root mean square error of prediction (RMSEP), bias, slope, offset and correlation. The residual predictive deviation (RPD) was also included to estimate how well the calibration model can predict the compositional data [37,38]. Generally, the RMSEs and RPDs shown in Table 2 furnish evidence that, at best, medium-quality calibrations have been achieved, which can be used for screening the nutritional parameters under investigation.
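For readers less familiar with these figures of merit, the sketch below shows how they are commonly computed. RMSEC, RMSECV and RMSEP share the same formula and differ only in which predictions are used (calibration fit, cross-validation or test set). The RPD is taken here as the standard deviation of the reference values divided by the RMSE, which is one common convention; the exact definitions used for Table 2 are not spelled out in the text, so this should be read as an assumption.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between reference and predicted values."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def bias(actual, predicted):
    """Mean signed deviation of the predictions from the reference values."""
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)


def rpd(actual, predicted):
    """Sample standard deviation of the reference values divided by the RMSE."""
    mean = sum(actual) / len(actual)
    sd = math.sqrt(sum((a - mean) ** 2 for a in actual) / (len(actual) - 1))
    return sd / rmse(actual, predicted)
```

A larger RPD means that the prediction error is small relative to the natural spread of the reference values.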
In Figure 5, the predicted versus actual concentration graphs are shown for the calibration and test set samples for all nutritional parameters, with a linear regression fit.
As an additional feature, this figure also reflects two classes of calibration samples for the parameters of carbohydrate, protein and fiber.
Figure 5. Graphs of the predicted versus actual content of the respective nutritional parameter per serving, showing the calibration fit, the prediction fit, the calibration samples and the predicted test set samples.
An overview of the prediction results for the test set samples is provided in Tables 3 and 4. The predictions for energy and carbohydrate of the test set samples were obtained with R²(Cal) values of 0.85 and 0.89, respectively, and average relative prediction errors of 2.7 and 6.4 %(w/w), respectively. Protein had an R²(Cal) of 0.87 and an average relative prediction error of 8.3 %(w/w). The calibrations for sugar and fat led to R²(Cal) values of 0.86 and 0.91, respectively, and average relative prediction errors of 11.4 and 16.1 %(w/w), respectively. With an R²(Cal) of 0.89, the largest average relative prediction error of 18.2 %(w/w) was obtained for the fiber calibration model. The comparatively large relative prediction errors for fat, sugar and fiber are not really unexpected and are partly due to the much lower content of these components and, for sugar and fiber, they are a consequence of the structural similarity with the main component carbohydrate. A comparison of the regression vectors of carbohydrate and sugar (not shown here), for example, highlighted an almost identical pattern of important wavelength variables for their calibration models.
However, although the NIR spectra contain overlapping features, the PLS method takes into account both the spectral information and the reference nutritional values when building the quantification models. Thus, despite the addressed structural similarity, it is still possible to reasonably quantify the sugar and fiber parameters, as shown in Tables 2-4.
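The average relative prediction errors quoted for the test set can be illustrated with a short calculation. The sketch assumes that the quantity is the mean of the absolute deviations divided by the reference values, expressed in percent; this definition is an assumption, since the text does not state the formula, and the numbers in the example are hypothetical.

```python
def average_relative_error(actual, predicted):
    """Mean of |predicted - actual| / actual over all samples, in percent."""
    return 100.0 * sum(abs(p - a) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical reference and predicted values for two test samples.
example_error = average_relative_error([100.0, 200.0], [110.0, 190.0])
```

Under this definition, the same absolute deviation weighs more heavily for low-content parameters, which is consistent with the larger relative errors reported for fat, sugar and fiber.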

Conclusions
In combination with chemometric evaluation routines, NIR spectroscopy has proved to be a powerful analytical tool for authentication, adulteration detection and quality control in food science. The presented method, using a miniaturized spectrometer and PLS calibration models to quantify nutritional parameters of pasta/sauce mixtures, is simple, fast and non-destructive. The achieved calibration results provide an overview of the prediction accuracy that can realistically be expected when quantifying energy, carbohydrate, fat, fiber, protein and sugar with handheld instruments. However, the results also demonstrate that the "cloud-derived" concentration data reported by several direct-to-consumer companies in commercial videos and advertising papers are beyond any realistic accuracy that is achievable with their relatively simple food scanners.