Portable through Bottle SORS for the Authentication of Extra Virgin Olive Oil

The authenticity of olive oil has been a significant long-term challenge. Extra virgin olive oil (EVOO) is the most desirable of these products and commands a high price, thus unscrupulous individuals often alter its quality by adulteration with a lower grade oil. Most analytical methods employed for the detection of food adulteration require sample collection and transportation to a central laboratory for analysis. We explore the use of portable conventional Raman and spatially-offset Raman spectroscopy (SORS) technologies as non-destructive approaches to assess the adulteration status of EVOO quantitatively and, for SORS, directly through the original container, which means that after analysis the bottle is intact and the oil would still be fit for use. Three sample sets were generated, each with a different adulterant and varying levels of chemical similarity to EVOO. These included EVOO mixed with sunflower oil, pomace olive oil, or refined olive oil. Authentic EVOO samples were stretched/diluted from 0% to 100% with these adulterants and measured using two handheld Raman spectrometers (excitation at 785 or 1064 nm) and handheld SORS (830 nm). The PCA scores plots displayed clear trends which could be related to the level of adulteration for all three mixtures. Conventional Raman (at 785 or 1064 nm) and SORS (at 830 nm with a single spatial offset) conducted in sample vial mode resulted in prediction errors for the test set data ranging from 1.9–4.2% for sunflower oil, 6.5–10.7% for pomace olive oil and 8.0–12.8% for refined olive oil, with the limit of detection (LOD) typically being 3–12% of the adulterant. Container analysis using SORS produced very similar results: 1.4% for sunflower, 4.9% for pomace, and 10.1% for refined olive oil, with similar LODs ranging from 2–14%.
It can be concluded that Raman spectroscopy, including through-container analysis using SORS, has signiﬁcant potential as a rapid and accurate analytical method for the non-destructive detection of adulteration of extra virgin olive oil.


Introduction
Olive oil is obtained from the fruit of the olive tree (Olea europaea), and extra virgin olive oil (EVOO) has been prized throughout the ages for its unique qualities, including its medicinal, cosmetic and nutritional properties, its taste, and even its ceremonial value. As a result, it is one of the most expensive, and also one of the most adulterated, food products in recorded history [1]. Researchers at the University of California, Davis released the findings of a now famous 2010 study [2], showing that more than two-thirds of the EVOO sold in California is neither extra-virgin nor, in some cases, even olive oil, as the real oil is often adulterated with cheaper, more readily available oils. In 1981, more than 600 people died because of adulterated olive oil that contained rapeseed oil denatured with aniline [3], causing a huge scandal known as the "toxic oil syndrome" [4,5].
Extra virgin olive oil is typically produced by pressing the olive fruits into a paste followed by centrifugation [6]. The process of extracting oil from olives is slow and differs from that used for seed oils, and this in part results in EVOO being a high-cost commodity with a price that is typically 3-5 times that of other edible oils [7]. Despite its premium price, olive oil is in very high demand due to its superior quality over other edible oils. Therefore, adulteration of EVOO with low-cost or poor-quality edible oils for economic benefit is of considerable concern to the olive oil industry. Trade organisations such as the International Olive Council (IOC) have laid out clear definitions for the different categories of olive oil and olive-pomace oil, as well as clear trade standards and approved testing methods for their identification. These include fatty acid composition analysis by gas chromatography, stigmastadiene content by liquid chromatography [8], as well as organoleptic tests. Although these methods are very powerful and provide very low detection limits for the adulterant, they are time consuming and expensive, requiring dedicated laboratories and trained professionals, and as such rely on the sample being taken to centralised laboratories for testing. Hence, there is an increasing demand to develop a quick, sensitive, portable, easy-to-use and cost-effective analytical approach for quantifying either the purity or adulteration of EVOO and potentially any additives that may be present in the oil [9]. In addition to the IOC approved approaches, several analytical techniques have been applied to assess the authenticity of EVOO, and these include mass spectrometry, gas chromatography, high-performance liquid chromatography and nuclear magnetic resonance (NMR) spectroscopy [5,10-14].
Vibrational spectroscopic techniques such as infrared in the near (NIR) or mid infrared (MIR) ranges and Raman offer alternative approaches for rapid screening and detection of adulteration in EVOO. The potential of Raman spectroscopy combined with multivariate statistics has been investigated for evaluating the authenticity (or purity) and concentration of EVOO [15], as well as for the detection and quantification of specific adulterants [16]. Recently, spatially offset Raman spectroscopy (SORS) has been developed as a powerful technique that enables measurement at surface positions laterally offset from the excitation laser [17,18], allowing analysis of materials within container packaging [19]; more recent advances in SORS have led to the development of a handheld device [20]. Though the main applications of portable SORS relate to pharmaceutical analysis [21], security, and chemical detection [22], there have also been developments in food adulteration analysis with through-container detection of chemical markers associated with counterfeit alcohol [23], detection of adulterated butter [24] and sugar [25], subsurface analysis of salmon through skin [26], and assessment of freshness of intact prawns [27].
In the present study, we explored the potential of Raman spectroscopy combined with chemometric approaches for exploratory analysis and multivariate statistical regression methods to identify adulterants and quantify their concentration in EVOO. Sunflower oil, pomace olive oil and refined olive oil were chosen as stretching agents as they are likely adulterant choices and provide varying degrees of difference in chemical composition when compared to EVOO. Sunflower oil was expected to be the easiest to detect, followed by pomace oil, and finally refined olive oil, which is the most chemically similar to EVOO and therefore the most challenging.

Oils
Sunflower oil was purchased from a local grocery store. Authentic extra virgin olive oil (EVOO), pomace olive oil and refined olive oil were provided by the Campden BRI Group.

Sample Preparation for Vial-Mode Analysis
Mixtures of EVOO with volume percentage ranging from 0% to 100% were prepared in 5% intervals using the three adulterants: sunflower oil, pomace olive oil and refined olive oil. This resulted in 63 mixtures (21 per adulterant) in glass vials (15 mm × 26 mm glass vials from Metrohm), each containing 1.5 mL of an EVOO-adulterant mixture. These were then analysed by Raman spectroscopy in vial mode (see below and Figure 1) using two handheld conventional Raman spectrometers and one portable SORS instrument.
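The dilution series described above can be tabulated in a few lines of Python. This is only an illustrative sketch: the study itself used Matlab, the function name `mixture_table` is hypothetical, and the code assumes that each vial holds 1.5 mL of mixture in total.

```python
def mixture_table(total_ml=1.5, step_pct=5):
    """Return (adulterant %, EVOO mL, adulterant mL) rows for 0-100% in fixed steps.

    Assumption: total_ml is the combined volume of EVOO plus adulterant per vial.
    """
    rows = []
    for pct in range(0, 101, step_pct):
        adulterant_ml = total_ml * pct / 100
        evoo_ml = total_ml - adulterant_ml
        rows.append((pct, round(evoo_ml, 3), round(adulterant_ml, 3)))
    return rows


adulterants = ["sunflower oil", "pomace olive oil", "refined olive oil"]
design = {name: mixture_table() for name in adulterants}

# 21 adulteration levels for each of 3 adulterants
n_mixtures = sum(len(rows) for rows in design.values())
print(n_mixtures)
```

Each adulterant contributes 21 levels (0%, 5%, ..., 100%), which accounts for the 63 mixtures reported.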

Sample Preparation for Through-Barrier Analysis
A commercially available clear glass bottle of EVOO was purchased from a local shop and used as the sample container for analysing the mixtures throughout this study. The volume of the bottle was 125 mL and the dimensions were 4.2 cm × 4.2 cm × 15.5 cm (L × W × H). A photo of the bottle is included in Figure 1. Due to the size of the bottle and the availability of the oils, the bottle was filled sequentially; in general, mixtures of EVOO with sunflower oil, pomace olive oil and refined olive oil ranged from 10% to 90% (vol/vol) purity and were prepared in 10% increments. In addition, the EVOO-sunflower mixtures were prepared at maximum volumes of both 90 mL (range 9-90 mL) and 28 mL (range 6-28 mL) to assess variability in predictions due to sample volume. Fortunately, no volume-related variability in the results was observed (see below), which was important because the EVOO-pomace OO and EVOO-refined OO mixtures were limited to a total volume of 28 mL. A table with details of the final mixture volumes is provided in the SI (Table S1).

Raman Spectroscopy
Three handheld Raman spectroscopy instruments were employed for the collection of Raman spectra. These are briefly detailed below:

• CBEx handheld Raman spectrometer from Snowy Range (Laramie, WY, USA), equipped with a 785 nm excitation laser with a power of 70 mW on the sample, was used to collect spectra over the 400-2300 cm⁻¹ range with 12-14 cm⁻¹ spectral resolution. The acquisition time was 2 s for vial-mode analysis;
• CBEx handheld Raman spectrometer (Snowy Range) operating at 1064 nm, with a laser power of 300 mW on the sample, was employed to collect data in the 400-2300 cm⁻¹ range with 12-14 cm⁻¹ spectral resolution. The acquisition time was 15 s for vial-mode analysis; and
• Resolve (Cobalt Light System, Oxfordshire, UK, now part of Agilent) handheld SORS system, equipped with an 830 nm excitation laser with a power of 450 mW on the sample, was employed to collect spectra in the 350-2000 cm⁻¹ range with 3 cm⁻¹ spectral resolution. The acquisition time was less than 2 min for vial-mode analysis and 2 min for through-barrier analysis.

Raman spectra were collected in vial mode using all three instruments. For these measurements, the scans were performed by placing the glass vial containing each sample into the sample holder housed within the spectrometer, where the light is focused into the sample. Five technical replicates were collected from each sample. This resulted in 105 spectra being collected for each series of mixtures (EVOO-sunflower oil, EVOO-pomace olive oil and EVOO-refined olive oil), with each series containing 21 samples from 0-100% in 5% steps. Prior to Raman measurements, the oil samples were gently vortexed to ensure complete homogenization and left to rest for 5 min in order to free any trapped air bubbles. For the SORS instrument, vial mode was performed by placing a vial containing the oil into the instrument, and so only a single spatial offset Raman measurement was made.
The final series of experiments was performed through the container. SORS spectra were collected with the handheld Resolve device using the through-barrier measurement setting and the standard SORS nose cone adapter, following the manufacturer's recommended protocol. In order to keep the distance between the laser nozzle and the sample bottle consistent, a wooden holder was designed and constructed, and this is shown in Figure S1. Four technical replicates were analysed from each mixture in these series. Again, prior to Raman measurements, the oil samples were gently vortexed to ensure complete homogenization and left to rest for 5 min in order to free any trapped air bubbles.

Data Analysis
The overall experimental design and data analysis pipeline of this study is briefly illustrated in Figure 1. Following data collection, all spectral analysis was carried out in Matlab R2014b (The Mathworks, Natick, MA, USA). Prior to data analysis, the spectra were truncated to the 700-1800 cm⁻¹ range. Our preliminary analysis (data not shown) of the entire range suggested that the 400-700 and 1800-2300 cm⁻¹ regions for the CBEx 785 and 1064 nm instruments, and the 350-700 and 1800-2000 cm⁻¹ regions for the SORS data from the Resolve, made no significant discriminatory contribution to the model; these regions were therefore excluded from analysis.
All Raman spectra were then scaled using standard normal variate (SNV) [28]. In preliminary studies, we also assessed extended multiplicative scatter correction (EMSC) [29], but this resulted in poorer predictions (data not shown), so we chose to use SNV. As each instrument had built-in automatic baseline correction algorithms, further correction prior to scaling was not necessary. Principal component analysis (PCA) was applied for exploratory analysis [30]. PCA is an unsupervised method used to visualize the sample distribution in multivariate space, reduce the dimensionality of a data set, check for any trends and clusters that could be linked to the percentage of adulterants in the samples, and identify possible outliers [31-33].
Partial least-squares regression (PLSR) [34,35] is a popular multivariate regression method that is employed for quantitative calibration, and we used this to predict the level of adulterant from the Raman spectra of the oils. In order to validate the model statistically, bootstrapping [1,36] of the PLSR models was employed, which involved resampling of the spectral data with replacement. This process was used to generate two data sets randomly: one is termed the training set and is used to calibrate the PLSR model; the second is the test set, which is employed to test the predictive ability of the model [37]. Plots of known adulterant levels versus PLSR predictions were generated and used to assess prediction accuracy. In addition, we compiled statistical metrics in terms of accuracy of measurements, linearity of predictions and the limit of detection (LOD) from these PLS models using methods described in [38,39].
With SORS, the data analysis is more complex, and the company that manufactures the instrument has optimised the data processing for categorical analysis only; that is to say, the instrument is optimised for predicting whether a substance belongs to one class or another. This is fundamentally different from quantification, and we found that it affected our ability to quantify adulterant levels. Thus, for quantitative analysis using SORS, we collaborated with the manufacturers to access the raw data and then applied the chemometrics methods discussed above; this is visualised in Figure 1.
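The pre-processing chain described above (truncation to 700-1800 cm⁻¹, SNV scaling, then PCA) can be sketched as follows. This is a minimal NumPy illustration rather than the authors' Matlab code: the function names are hypothetical, the spectra are synthetic stand-ins with made-up band heights, and PCA is computed here by singular value decomposition of the column-centred data.

```python
import numpy as np


def truncate(shift, spectra, lo=700.0, hi=1800.0):
    """Keep only the channels whose Raman shift lies in [lo, hi] cm^-1."""
    keep = (shift >= lo) & (shift <= hi)
    return shift[keep], spectra[:, keep]


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) individually."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def pca_scores(spectra, n_components=2):
    """PCA via SVD; returns scores and the % variance explained per component."""
    centred = spectra - spectra.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = 100 * s**2 / np.sum(s**2)
    return scores, explained[:n_components]


# Synthetic stand-in for measured spectra (NOT data from the study).
rng = np.random.default_rng(0)
shift = np.linspace(400, 2300, 500)


def band(centre, width=15.0):
    return np.exp(-0.5 * ((shift - centre) / width) ** 2)


pure_a = band(1441) + 0.5 * band(1657)
pure_b = band(1441) + 0.9 * band(1657) + 0.4 * band(1265)
fraction = np.repeat(np.linspace(0, 1, 21), 5)  # 21 levels x 5 replicates
raw = np.outer(1 - fraction, pure_a) + np.outer(fraction, pure_b)
# Random intensity scaling per spectrum plus additive noise
raw = raw * rng.uniform(0.8, 1.2, (105, 1)) + rng.normal(0, 0.01, raw.shape)

shift_cut, cut = truncate(shift, raw)
scores, explained = pca_scores(snv(cut))
```

Because SNV divides each spectrum by its own standard deviation, the random intensity scaling applied to the synthetic spectra is removed before PCA, so the leading component tracks composition rather than overall signal level.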

Visible Inspection of the Raman and SORS Spectra
There is increasing interest in methodologies that allow for rapid detection of food adulteration, and these methods would ideally be portable so that these 'capable guardians' can be used directly within the food chain [40]. During the past two decades, the application of handheld Raman spectroscopy has attracted a lot of attention. In the present study we have employed three handheld Raman instruments to investigate whether it is possible to develop models that allow for the quantification of specific adulterants in EVOO. These three handheld instruments were intentionally selected as they are equipped with three different laser wavelengths at 785, 830 and 1064 nm in the near infrared (NIR) region, and thus any potential fluorescence interference should be minimal. Indeed, we found no broad spectral features that may be due to fluorescence. In addition, three adulterants were used in this study as these provide varying degrees of difference in chemical composition when compared to EVOO. Sunflower oil was expected to be the easiest to detect, followed by pomace oil (a solvent extract from the olive pulp after the generation of EVOO), and finally refined olive oil (a virgin oil that has been refined without alteration of the oil's glyceridic structure), which is thus chemically very similar to EVOO.
Figure 2 shows representative Raman spectra of EVOO, pomace OO, refined OO and sunflower oil collected directly through a commercially available olive oil bottle using a SORS instrument with excitation at 830 nm. The mixtures of sunflower oil with EVOO were measured at two different sample volumes in order to test whether the volume of oil inside the container made any difference. These two oils were chosen because they were in abundant supply, unlike pomace OO and refined OO; as discussed below, the lower volume (28 mL) provided quantitative models very similar to those generated using the larger volume (90 mL).
Inspection of the Raman spectra from all oils (conventional Raman (Figure S2) and SORS (Figure 2)) showed very similar features, with only subtle quantitative differences between them. However, the mixture of sunflower oil with EVOO displayed a more intense peak at 920 cm⁻¹, which clearly originated from components in the sunflower oil (assigned to C-H bending from lipids). This and the assignments of the main Raman active vibrations are also highlighted in Figure 2. When all the spectra were viewed (Figure S3), increasing levels of sunflower oil in the mixtures resulted in an increase in the intensity of two strong peaks at 1265 cm⁻¹ (C-H deformation) and 1657 cm⁻¹ (C=C stretch), corresponding to changes in unsaturated fatty acid composition, and a decrease in the intensity of peaks at 1300 and 1441 cm⁻¹ [42], which were also seen in PCA loadings plots (Figure S4). In the mixtures of EVOO with pomace OO and refined OO, the pattern is similar but the changes in peak intensities are much smaller, which is reasonable because these olive oils have a very similar chemical composition to EVOO, as they are also derived from olives [42].
[Figure 2 caption, partially recovered: "... in the y-axis so that the features can be more easily observed. Also provided are the origins of the main bands in these oils with assignments taken from [41]. Also highlighted are bands not used within the data processing; these are seen in Supplementary Information."]

Principal Components Analysis (PCA) of the Raman and SORS Spectra
As simple visible inspection did not allow for the assessment of the quantitative levels of the three adulterants, PCA was used to analyse the three binary oil mixtures on the four analytical methods (vials for all three platforms and through the bottle for SORS) in order to look for any trends in the PCA ordination scores space that may be related to changing levels of the oils.
For brevity we highlight here Figure 3, which is a PCA scores plot of EVOO and pomace OO where Raman data were collected in vial mode using a CBEx 1064 nm instrument. It is clear from this plot that there was a concentration-dependent gradient along the PC1 axis, which accounted for ~29% of the total explained variance (TEV). Inspection of this plot revealed that pure EVOO is located on the most negative part of principal component 1 (PC1) (the left-hand side of the plot) and that increasing the pomace OO content caused the spectra to follow a trend towards the most positive side of the PC1 axis (the right-hand side). The corresponding PCA loadings plots (data not shown) highlighted that the main peaks involved in these trends were at 1265 and 1657 cm⁻¹, as also seen for sunflower-EVOO mixtures (Figure S4). These vibrations are from unsaturated fatty acids: the 1265 cm⁻¹ band arises from =C-H deformation of cis R-HC=CH-R and the 1657 cm⁻¹ band from the C=C stretch of cis R-HC=CH-R.
Similar trends were seen in all of these multivariate analyses; the PCA scores plots of the Raman spectra collected from mixtures of EVOO with sunflower oil, pomace OO and refined OO using all three handheld instruments are provided in the Supplementary Materials (Figures S5-S7). The exception was the PCA scores plots of the spectral data collected using the SORS instrument in vial mode, which did not show any clear separation for mixtures of EVOO with pomace and refined oil in the first PC; instead, the concentration gradient appeared along the PC2 axis, with a TEV of 15.5% and 10.9% for EVOO-pomace OO (Figure S6C) and EVOO-refined OO (Figure S6C), respectively.

Quantification of the Level of the Adulterant Using Partial Least Squares Regression (PLSR)
Having established with PCA that there were indeed consistent trends which could be largely related to the level of adulteration in EVOO using Raman spectroscopy and SORS, we aimed to quantify the level of adulteration. PLSR is a powerful multivariate regression algorithm that can be used to associate input data (Raman and SORS spectra) with the level of adulteration. This supervised learning algorithm requires a calibration (learning) phase before it can be tested for its predictive ability.
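To make the calibration and prediction phases concrete, the sketch below fits a single-response PLS model using the NIPALS algorithm in NumPy. It is an illustration only and not the Matlab implementation used in this study: the function names `pls1_fit` and `pls1_predict`, the synthetic spectra, the split into calibration and test levels, and the choice of two PLS factors are all assumptions made for the example.

```python
import numpy as np


def pls1_fit(X, y, n_factors):
    """Calibrate a single-response PLS model (NIPALS); returns (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean  # residuals, deflated after each factor
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)       # weight vector
        t = Xr @ w                   # scores
        tt = t @ t
        p = Xr.T @ t / tt            # X loadings
        c = yr @ t / tt              # y loading
        Xr = Xr - np.outer(t, p)
        yr = yr - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)  # regression vector in spectral space
    return coef, x_mean, y_mean


def pls1_predict(model, X):
    coef, x_mean, y_mean = model
    return (X - x_mean) @ coef + y_mean


# Synthetic two-component mixtures (NOT data from the study).
rng = np.random.default_rng(1)
shift = np.linspace(700, 1800, 200)


def band(centre, width=15.0):
    return np.exp(-0.5 * ((shift - centre) / width) ** 2)


evoo = band(1441) + 0.5 * band(1657) + 0.3 * band(1300)
adulterant = band(1441) + 0.9 * band(1657) + 0.4 * band(1265)
pct = np.repeat(np.arange(0, 101, 5), 5).astype(float)  # 21 levels x 5 replicates
X = (np.outer(100 - pct, evoo) + np.outer(pct, adulterant)) / 100
X = X + rng.normal(0, 0.005, X.shape)

train = pct % 10 == 0  # calibrate on every other level, test on the rest
model = pls1_fit(X[train], pct[train], n_factors=2)
predicted = pls1_predict(model, X[~train])
rmsep = float(np.sqrt(np.mean((predicted - pct[~train]) ** 2)))
```

In practice the number of factors would be chosen by cross-validation rather than fixed in advance.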
We again use pomace OO as the example adulterant in EVOO, with data collected using a handheld 1064 nm Raman spectrometer. This data set comprised 0-100% pomace OO in 100-0% EVOO in 5% steps (i.e., 21 mixtures), and five replicate spectra were collected from each mixture (105 spectra in total). As detailed above in the data analysis section, for PLSR model calibration we used a resampling technique based on bootstrapping. In this sampling process, the training set is built by choosing a sample randomly from the total data set (21 samples × five replicates) and placing it in the training set along with all five of its replicates. The sample is then returned to the total data set, and selection proceeds until there are the same number of samples in the training data as there are in the whole data set (i.e., 105 Raman spectra (x-data) with their corresponding % pomace OO (y-data)). This leads to the generation of a series of training sets, which on average include 63.2% of all samples, and a series of test sets, which include the remaining 36.8% of samples [36]. For this and all studies, PLSR was conducted using n = 1000 bootstrap selections, with an internal cross-validation step to optimise the number of latent variables (PLS factors) used for PLSR. All 1000 models were then challenged using the 1000 test sets, and we report the data from these training set and test set combinations in terms of linearity, where the predictions are regressed against the known adulterant levels and summarised using coefficients of determination for the training (R²) and test sets (Q²). This is in addition to the root mean squared error of calibration (RMSEC) for the training set and the RMSE of prediction (RMSEP) for the test set, which are expressed as % RMSE. Finally, we also calculated the limit of detection (LOD) from these PLS models.
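The sample-wise bootstrap described above can be sketched as follows. This is a hedged illustration in Python rather than the authors' Matlab code, and the function name `bootstrap_split` is hypothetical. Mixtures are drawn with replacement, all replicates of a drawn mixture enter the training set, and mixtures that were never drawn form the test set. Note that 63.2% is the large-sample limit 1 − 1/e; with only 21 distinct mixtures the expected fraction of mixtures appearing in a training set is 1 − (20/21)²¹ ≈ 64.1%.

```python
import numpy as np


def bootstrap_split(sample_ids, rng):
    """Draw samples with replacement; replicates always follow their parent sample."""
    unique = np.unique(sample_ids)
    drawn = rng.choice(unique, size=unique.size, replace=True)
    # A sample drawn k times contributes its replicates k times to the training set
    train_idx = np.concatenate([np.flatnonzero(sample_ids == s) for s in drawn])
    # Samples never drawn form the independent test set
    test_samples = np.setdiff1d(unique, drawn)
    test_idx = np.flatnonzero(np.isin(sample_ids, test_samples))
    return train_idx, test_idx


rng = np.random.default_rng(42)
sample_ids = np.repeat(np.arange(21), 5)  # 21 mixtures x 5 replicate spectra

fractions = []
for _ in range(1000):
    train_idx, test_idx = bootstrap_split(sample_ids, rng)
    fractions.append(np.unique(sample_ids[train_idx]).size / 21)

mean_in_training = float(np.mean(fractions))
```

Keeping replicates together means that no spectrum in a test set comes from a mixture that the corresponding model was calibrated on.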
The results of the PLSR prediction for pomace OO adulteration are shown in Figure 4, where we have plotted the results from the 1000 test sets only and show the average and standard deviations of these predictions. It is clear from this plot that the predictions are excellent, as they are very close to the expected y = x line for a perfect model; the best fit line for these models is y = −0.7054 + 1.0272x. The linearity is excellent, with an R² of 0.97 and Q² of 0.95 (a perfect correlation would be 1), and the error is also low, with RMSEC and RMSEP calculated as just 5.32% and 6.50% for the calibration and test sets, respectively. Finally, the LOD was 11.50%. These figures of merit are shown in Table 1 along with all the PLSR analyses where data were collected within vials with the three different excitation wavelengths. Additional PLSR results plots for these models are provided in the Supplementary Materials (Figures S8-S10).
Notes to Table 1: The spectral range for all analyses was 700-1800 cm⁻¹. Standard normal variate (SNV) was used to pre-process these data. "# PLS factors" is the number of latent variables used in each PLSR model. R² and Q² represent the goodness of the model's fit to the dataset for the 1000 training sets and the 1000 test sets, respectively. RMSEC and RMSEP denote the root mean squared errors for the training sets and test sets, respectively. LOD is the limit of detection calculated from the PLSR models. * For these results, the processing was detrimental to the model; therefore the analysis was done directly on the raw data.
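For readers who wish to reproduce figures of merit of this kind, a minimal Python sketch is given below. It assumes that the coefficient of determination is taken as the squared correlation from regressing predictions on known levels, as the text describes; the exact formulas used in the study, and the LOD calculation from [38,39], are not reproduced here. The function names and the example values are made up for illustration.

```python
import numpy as np


def rmse(known, predicted):
    """Root mean squared error, in the same units as the inputs (% adulterant)."""
    known, predicted = np.asarray(known), np.asarray(predicted)
    return float(np.sqrt(np.mean((predicted - known) ** 2)))


def linearity(known, predicted):
    """Slope, intercept and squared correlation of predicted regressed on known."""
    slope, intercept = np.polyfit(known, predicted, 1)
    r = np.corrcoef(known, predicted)[0, 1]
    return float(slope), float(intercept), float(r**2)


# Made-up example values, not results from the study
known = np.array([0.0, 25.0, 50.0, 75.0, 100.0])
predicted = np.array([1.0, 24.0, 52.0, 74.0, 101.0])

error = rmse(known, predicted)
slope, intercept, r_squared = linearity(known, predicted)
```

Applied to training-set predictions these functions give RMSEC and R²; applied to test-set predictions they give RMSEP and Q².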

Quantification of Adulterants Using SORS and PLSR
Having established that these models were excellent for data collected from olive oil contained in vials, we then proceeded to analyse the data from SORS, which had been collected in through-container mode where the mixtures of oils were contained within a commercially available olive oil bottle. In this case we set the instrument to SORS mode (so called "Thick, coloured or opaque" mode), where two Raman spectra were taken at different layers (at the surface of the bottle and from within the bottle) and the spectrum from the surface layer was then subtracted from the Raman spectrum collected from within the bottle containing the oils.
As previously stated, we first tested the volume of oil that was needed to obtain satisfactory results. As the sunflower oil and EVOO were in abundant supply, we used mixtures of these two oils and prepared these in maximum volumes of 90 and 28 mL, as detailed in Table S1. The raw SORS spectra looked very similar for both volumes (Figure 2). As SORS requires spectral subtraction to provide SORS spectra from different layers, we obtained the raw output files from the Resolve and used SNV for normalisation of the spectra prior to PLSR; the results for the test sets were then compared. The results of 1000 PLSR models from bootstrapping are shown in Table 2, where the prediction of % sunflower oil in 28 mL was actually better than in 90 mL. We therefore proceeded to use 28 mL for all through-container analyses.
The PLSR results for the SORS approach are shown visually in Figures S8D-S10D, and the figures of merit for linearity, error and LOD are provided in Table 2. We can see that the predictions are best for quantifying sunflower oil, followed by pomace OO, with refined OO being the worst. The LOD is 2.24% for sunflower oil, compared to 10.64% and 13.60% for pomace OO and refined OO, respectively. Similar trends were also observed in terms of the linearity (Q² = 1.00, 0.97, 0.84) and error (RMSEP = 1.37, 4.91, 10.15) in the test sets, which confirmed that, due to its chemical similarity to EVOO, refined OO was the most difficult to predict.
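The idea of subtracting the surface-layer spectrum from the subsurface spectrum can be illustrated with a simple scaled subtraction. The sketch below is purely hypothetical: the instrument's own processing is not described in this paper, and the weighting rule used here (a least-squares scale factor estimated over a band assumed to come only from the container) is an assumption chosen for clarity. The spectra are synthetic.

```python
import numpy as np


def scaled_subtraction(offset_spectrum, zero_spectrum, container_band):
    """Subtract the surface (zero-offset) spectrum from the offset spectrum.

    The weight is the least-squares scale factor over `container_band`, a boolean
    mask of channels ASSUMED to contain container signal only.
    """
    z = zero_spectrum[container_band]
    o = offset_spectrum[container_band]
    weight = float((o @ z) / (z @ z))
    return offset_spectrum - weight * zero_spectrum, weight


# Synthetic example (NOT instrument data): a glass band near 500 cm^-1 and an
# oil band near 1441 cm^-1 that do not overlap.
shift = np.linspace(350, 2000, 400)
glass = np.exp(-0.5 * ((shift - 500) / 20) ** 2)
oil = np.exp(-0.5 * ((shift - 1441) / 15) ** 2)

zero_offset = 1.0 * glass + 0.2 * oil     # dominated by the container surface
spatial_offset = 0.3 * glass + 0.6 * oil  # richer in signal from the contents

recovered, weight = scaled_subtraction(spatial_offset, zero_offset, shift < 600)
```

In this idealised case the container contribution cancels and the result is proportional to the spectrum of the contents alone.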

Conclusions
Extra virgin olive oil (EVOO) is a popular, high-cost commodity which requires methods to assess its authenticity.These can include deliberate adulteration as well as mislabelling which can be misleading in terms of the origin of the product.While most of this is economically motivated adulteration (EMA), this can also be a cause for significant health concerns if the adulterant is allergenic-as was the case when hazelnut oil was used to adulterate EVOO in the 1990s due to its chemical similarity to olive oil [43].In terms of economic fraud, a recent study from the University of California, Davis and the Australian Oils Research Laboratory found that around 69% of imported European extra virgin olive oil was adulterated EVOO or mislabelled and was therefore not authentic EVOO [2].
The level of adulteration that is economically worthwhile for fraudsters to use is hard to calculate.The cost of food crime requires knowledge of the cost of the premium product, as well as the lower-grade adulterants, the scale of the volumes of oils used and the labour involved in performing substitutions [44].Even an article this year on the emerging trends within olive oil fraud did not provide levels at which EMA is worthwhile [45], and these authors stress that importance of adulteration is set at different levels depending on the type of adulterant.Unintentional carry over of oil can also occur if incomplete cleaning of presses and centrifuges does not take place.
In this study we have successfully demonstrated the application of a range of handheld Raman spectroscopy instruments combined with PLSR modelling that can be used to quantify adulteration of EVOO with sunflower, pomace, and refined oils. The low adulteration detection limits of these techniques, along with their ability to analyse samples through their original container, make Raman and SORS highly beneficial for the food industry, as there is potential for these Capable Guardians to be deployed on sites throughout the food chain.
Our present study has highlighted this approach with a single source of EVOO. Because Raman and SORS measure the whole (bio)chemistry of the sample, developing these approaches further as generic tools for authenticity testing will require databases to be constructed that account for subtle chemical differences between olive oil varieties. These databases would need to capture geographical as well as seasonal variation, as the oil itself is a product of the genetics of the olive tree, where it is located and the weather patterns during a production season. In addition, in SORS, two Raman spectra are generated: one from the surface of the container and the other from within the bottle. SORS then uses a weighted subtraction to remove any contribution from the bottle, revealing only the olive oil components contained within it. While we only demonstrated this with a single commercially available glass olive oil bottle, extensive work in security testing in airports has shown that this approach works robustly for many different types of containers [46][47][48][49].
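The weighted subtraction described above can be sketched on synthetic data. This is an illustration only and not the algorithm implemented in the instrument: it assumes that a band arising solely from the container is available for estimating the weight, and all band positions, mixing coefficients and variable names are hypothetical.

```python
import numpy as np

def gaussian(x, centre, width):
    """Simple Gaussian band used to build synthetic spectra."""
    return np.exp(-0.5 * ((x - centre) / width) ** 2)

shift = np.linspace(600, 1800, 1201)  # Raman shift axis, cm^-1

# Hypothetical pure-component spectra (not measured data).
container = gaussian(shift, 700, 15)  # band assumed to come only from the bottle
contents = gaussian(shift, 1440, 12) + 0.8 * gaussian(shift, 1655, 10)

# The zero-offset spectrum is dominated by the container surface;
# the spatially offset spectrum carries relatively more signal from the contents.
zero_offset = 1.0 * container + 0.2 * contents
offset = 0.4 * container + 0.6 * contents

# Estimate the weight by least squares in a window containing only the container band.
window = (shift > 670) & (shift < 730)
scale = np.sum(offset[window] * zero_offset[window]) / np.sum(zero_offset[window] ** 2)

# Weighted subtraction: the container contribution cancels, leaving the contents.
recovered = offset - scale * zero_offset
```

In this toy case the recovered spectrum is a scaled copy of the contents spectrum, which is sufficient for multivariate modelling once the spectra are normalised.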
We believe that in the future, working with regulatory authorities, these field-portable point-and-shoot methods can be used on site from farm to fork: in olive oil production facilities, in supply depots and in retail settings. This is especially the case for SORS, which we have shown for the first time provides exciting potential for rapid through-container screening, with the PLSR modelling providing results in accordance with international validation guidelines [50].

Figure 1. General flow diagram showing the overall data analysis pipeline for the handheld Raman methods in vial mode, as well as handheld SORS for through-container analysis. The Raman spectrometers are not to scale. Abbreviations used: ALS, asymmetric least squares; PCA, principal components analysis; PLS-R, partial least squares regression; SNV, standard normal variate. For bootstrapping validation, n = 1000 resamplings were made.

Figure 2. Raman spectra of olive oil collected using SORS directly through bottles. The different authentic extra virgin olive oil-adulterant mixtures are shown in different colours. Spectra are offset in the y-axis so that the features can be more easily observed. Also provided are the origins of the main bands in these oils, with assignments taken from [41]. Also highlighted are bands not used within the data processing; these are shown in the Supplementary Information.

Figure 3. PCA scores plots of Raman spectral data collected using a handheld instrument with 1064 nm excitation showing the adulteration of authentic EVOO with pomace olive oil. The adulteration has been performed in increments of 5% and a rainbow colour gradient from blue for pure EVOO to red for pure pomace OO is shown. The values in parentheses on the axis labels are the percentage of the total explained variance (TEV) for each principal component (PC).

Figure 4. PLSR prediction model plots comparing the predicted concentrations of pomace olive oil against the known level of this adulterant in extra virgin olive oil (EVOO). The data used in this model are from a handheld 1064 nm Raman spectrometer. PLSR was conducted using bootstrap validation (n = 1000) and this figure shows the results from the 1000 test sets only (i.e., not the training data used for model construction). The symbols depict the means of the test set predictions with standard deviation error bars. The yellow solid line is the line of best fit from these models (y = −0.7054 + 1.0272x) and the blue dotted line is the expected y = x line that a perfect model would give.

:
Pictures of the Heath Robinson constructed wooden holder with a Resolve and commercially available clear glass bottle in situ, Figure S2: Conventional Raman spectra of olive oil and the adulterants, Figure S3: Raman and SORS spectra showing the adulteration of authentic EVOO with sunflower oil, Figure S4: PCA loadings plots on Raman spectral data collected showing the adulteration of authentic EVOO with sunflower oil, Figure S5: PCA scores plots on Raman spectral data collected showing the adulteration of authentic EVOO with sunflower oil, Figure S6: PCA scores plots on Raman spectral data collected showing the adulteration of authentic EVOO with pomace olive oil, Figure S7: PCA scores plots on Raman spectral data collected showing the adulteration of authentic EVOO with refined olive oil, Figure S8: PLSR prediction model plots comparing the predicted actual concentrations for the sunflower oil against the known level of this adulterant in EVOO, Figure S9: PLSR prediction model plots comparing the predicted actual concentrations for the pomace olive oil against the known level of this adulterant in EVOO, Figure S10: PLSR prediction model plots comparing the predicted actual concentrations for the refined olive oil against the known level of this adulterant in EVOO.

Table 1. PLSR results for the bootstrap models of the adulterated oils in EVOO with data collected in vial mode.

Table 2. PLSR results for the bootstrap models of the adulterated oils in EVOO using SORS for through-container analysis. The spectral range for all analyses was 596–1420 cm⁻¹. Standard normal variate (SNV) was used to pre-process the data. #PLS factors represents the number of latent variables used in each PLSR model. R² and Q² represent the goodness of the model's fit to the dataset for the 1000 training sets and the 1000 test sets, respectively. RMSEC and RMSEP denote root mean squared error results pertaining to the training sets and test sets, respectively. LOD is the limit of detection calculated from the PLSR models.
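The SNV pre-processing and the bootstrap resampling named in this caption can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes that each training set is drawn with replacement and that the samples never drawn form the corresponding test set, and the function names, the number of mixtures and the random spectra are all hypothetical.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) individually."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def bootstrap_splits(n_samples, n_resamples=1000, seed=0):
    """Yield (train_idx, test_idx) pairs.

    Training indices are drawn with replacement; the test indices are the
    samples that were not drawn in that resampling (an assumption here).
    """
    rng = np.random.default_rng(seed)
    all_idx = np.arange(n_samples)
    for _ in range(n_resamples):
        train_idx = rng.integers(0, n_samples, size=n_samples)
        test_idx = np.setdiff1d(all_idx, train_idx)
        yield train_idx, test_idx

# Hypothetical data: 21 mixtures (e.g., 0-100% in 5% steps) x 50 spectral channels.
rng = np.random.default_rng(1)
spectra = rng.random((21, 50)) + 5.0
scaled = snv(spectra)
splits = list(bootstrap_splits(len(spectra), n_resamples=1000))
```

A regression model would then be fitted on `scaled[train_idx]` and evaluated on `scaled[test_idx]` for each of the 1000 splits, giving the training-set (R², RMSEC) and test-set (Q², RMSEP) statistics.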