Exploration of Spanish Olive Oil Quality with a Miniaturized Low-Cost Fluorescence Sensor and Machine Learning Techniques

Venturini, Francesca; Sperti, Michela; Michelucci, Umberto; Herzig, Ivo; Baumgartner, Michael; Caballero, Josep Palau; Jimenez, Arturo; Deriu, Marco Agostino

doi:10.3390/foods10051010

by Francesca Venturini^{1,2,*}, Michela Sperti^{3}, Umberto Michelucci^{2,4}, Ivo Herzig^{1}, Michael Baumgartner^{1}, Josep Palau Caballero^{5}, Arturo Jimenez^{5} and Marco Agostino Deriu^{3,*}

^{1} Institute of Applied Mathematics and Physics, Zurich University of Applied Sciences, Technikumstrasse 9, 8401 Winterthur, Switzerland
^{2} TOELT LLC, Birchlenstr. 25, 8600 Dübendorf, Switzerland
^{3} Polito BIO Med Lab., Department of Mechanical and Aerospace Engineering, Politecnico di Torino, 10129 Turin, Italy
^{4} School of Computing, University of Portsmouth, Portsmouth PO1 3HE, UK
^{5} SCA San Sebastián Puente del Ventorro, s/n, 18566 Benalua de las Villas, Spain
^{*} Authors to whom correspondence should be addressed.

Foods 2021, 10(5), 1010; https://doi.org/10.3390/foods10051010

Submission received: 9 April 2021 / Revised: 29 April 2021 / Accepted: 30 April 2021 / Published: 6 May 2021

(This article belongs to the Special Issue Advanced Analysis Methods for Food Safety, Authenticity and Traceability Assessment)

Abstract

Extra virgin olive oil (EVOO) is the highest quality of olive oil and is characterized by highly beneficial nutritional properties. The large increase in both consumption and fraud, for example through adulteration, creates new challenges and an increasing demand for developing new quality assessment methodologies that are easier and cheaper to perform. As of today, the determination of olive oil quality is performed by producers through chemical analysis and organoleptic evaluation. The chemical analysis requires advanced equipment and chemical knowledge of certified laboratories, and has therefore limited accessibility. In this work a minimalist, portable, and low-cost sensor is presented, which can perform olive oil quality assessment using fluorescence spectroscopy. The potential of the proposed technology is explored by analyzing several olive oils of different quality levels, EVOO, virgin olive oil (VOO), and lampante olive oil (LOO). The spectral data were analyzed using a large number of machine learning methods, including artificial neural networks. The analysis performed in this work demonstrates the possibility of performing the classification of olive oil in the three mentioned classes with an accuracy of 100%. These results confirm that this minimalist low-cost sensor has the potential to substitute expensive and complex chemical analysis.

Keywords: fluorescence spectroscopy; fluorescence sensor; olive oil; machine learning; artificial neural networks; quality control

1. Introduction

Olive oil is an important commodity in the world, and its demand has grown substantially in recent years. Interest in the highest quality grade, extra virgin olive oil (EVOO), is due to its high nutritional value, its richness in bioactive molecules, and its importance to our health due to its content of anti-inflammatory and antioxidant substances [1]. The increased demand has led, however, to an increase in fraudulent activities like adulteration. As a result, edible olive oil quality assessment has become increasingly important. To develop a trusted means of control, the European Economic Community (EEC) has created regulations that define the categorization of olive oils according to several chemical properties, obtainable by accredited laboratories, and organoleptic evaluation, obtainable by accredited panels, to guarantee its quality [2]. For these reasons, quality control is complex, costly, and cannot be carried out easily at any desired moment during the product life cycle. An inexpensive tool for an accessible analysis will boost consumers’ trust in the product and dramatically decrease production costs, while at the same time reducing the possibilities for fraudulent activities.

Fluorescence spectroscopy has attracted a lot of interest in recent years as a fast, cost-efficient, and at the same time sensitive method to study the properties of vegetable oils, particularly olive oils [3,4]. This is because olive oils contain several naturally fluorescent molecules: pigments, such as chlorophyll and beta-carotene, phenolic compounds, such as tocopherol, and their oxidation products. The most frequently used techniques are either the acquisition of excitation emission matrices (EEMs) or the use of synchronous scanning [5]. Both take advantage of the multidimensional characteristic of fluorescence spectroscopy to create a fingerprint to uniquely identify and characterize virgin olive oils [6,7]. Applications of those methods include discrimination of different quality grades of olive oils [8,9], detection of adulteration [10,11,12], monitoring of the oxidation processes [13,14,15], shelf-life monitoring [16], and geographical origin authentication [17,18,19].

The extraction of information of interest from the spectral data can be a difficult task depending on the type of data acquired, which may range from a single spectrum to the more complex EEMs, and on the specificity of the application. Several multivariate analysis techniques and classification methods have been successfully employed, like for example, principal component analysis (PCA), partial least square regression (PLS) and PLS discriminant analysis (PLS-DA), linear discriminant analysis (LDA), K-nearest neighbors (k-NN), and random forest (RF), to mention only the most widely used. More recently the use of artificial neural networks (ANN) has proven to be a useful tool, particularly because it does not require the pre-processing of data or a dimensionality reduction [20]. Complete overviews of the mentioned statistical and machine learning methods, including ANN, can be found in [4,21,22,23,24].

The acquisition of high-quality data, particularly of EEMs, and the necessary data post-processing require special instrumentation and knowledge, thus limiting the accessibility of these methods. To the best of the authors’ knowledge, no portable and low-cost sensors for fluorescence spectroscopy for quality assessment of olive oils are available so far. This work presents a minimalist sensor for olive oil quality assessment based on fluorescence spectroscopy and shows how it can be used with several machine learning methods to perform classification without any sample preparation and without any pre-processing of the acquired data.

This paper makes three main contributions. Firstly, a new miniaturized and low-cost fluorescence sensor is presented. The sensor is demonstrated by using it to produce data (fluorescence spectra) that can be used to successfully classify olive oil samples into three quality classes. Secondly, eight different machine learning methods are applied to the data acquired with the sensor, to demonstrate that the data are extremely effective in allowing machine learning models to learn to predict olive oil’s quality almost perfectly. A detailed comparison of the models used is discussed. Finally, the performance of ANNs is analyzed in detail. The study of ANNs’ performance is an important contribution since ANNs allow the application of explainability techniques to better understand how olive oil quality is linked to its chemical properties. This has the potential of completely superseding the classical chemical analysis.

2. Materials and Methods

2.1. Olive Oil Samples

All samples were obtained from the 2019–2020 harvest and provided by the producer Conde de Benalúa, Granada, Spain. In total, 27 olive oil samples were analyzed in this study, divided into 12 EVOO, 8 VOO, and 7 LOO (Table 1). The quality assessment of all the olive oils was performed by the producer according to the current European regulation for the commercial classification into EVOO, VOO, and LOO categories [2]. The quality is determined by both chemical parameters, such as acidity and peroxide index, and sensory parameters, such as the fruity median and the median defect.

All oils were stored in the dark and at 20 °C during the entire time of the measurements. For data acquisition, the samples were placed into commercial transparent 4-mL glass vials, taking care that no headspace was present to reduce oxidation [25]. For a few selected oils, two samples were prepared from the same olive oil bottle to check the variability of the samples and no difference was observed.

All the measurements in this work were done on undiluted samples. It is well known that fluorescence in olive oil is subject to the inner filter effect [5], which includes both the attenuation of the excitation light due to the strong absorption from the sample and the re-absorption of the fluorescence light by the sample itself, due to the overlap of the excitation and emission spectra. However, for the technology described in this work, this effect does not pose a problem. In fact, the fluorescence is intense enough that the strong absorption does not influence the signal-to-noise ratio, and possible sample-dependent effects are learned and compensated by the machine learning models.

2.2. Miniaturized Low-Cost Fluorescence Sensor

The sensor was designed with as few elements as possible, to minimize complexity and cost. For the first time, the sensor itself contains no optical filters, lenses, or other optical components of the kind typical in fluorescence spectroscopy [26]. The minimalist sensor is shown schematically in Figure 1.

The excitation light was provided by a UV LED with emission at 395 nm (Kingbright Electronic Co, New Taipei City, Taiwan), driven by a current driver (MIC4801, Micrel Inc., San Jose, CA, USA) which allows regulating the current and, therefore, the illumination intensity. This excitation wavelength is advantageous because it is close to an absorption maximum in the absorption band of the different pigments present in olive oil, mainly chlorophylls and carotenoids [27,28,29]. The fluorescence was collected by a miniature spectrometer (STS-Vis, Ocean Optics, Dunedin, FL, USA) with a 1024-element CCD array which acquires the entire spectrum in one single measurement. The resolution of the spectrometer was 16 nm. The spectrometer was placed at 90° with respect to the LED to prevent LED light transmitted through the sample from reaching the spectrometer. Both the LED driver and the spectrometer are controlled by a Raspberry Pi. The optomechanics of the sensor is designed to minimize the amount of stray light from the excitation LED that is collected by the spectrometer. The LED current was chosen to give a good signal-to-noise ratio for a single spectrum with a short integration time while avoiding heating of the sample by the LED light. The sensor has a recess where standard 4-mL clear glass vials can be inserted. The sensor has a very small footprint of 12.5 cm × 12.5 cm × 5 cm and is shown in Figure 2.

2.3. Dataset Preparation and Description

All measurements were taken under ambient conditions in a single day so that differences in the aging of the olive oils would not influence the results. The description of the samples is reported in Section 2.1. For each olive oil sample, 20 measurements were performed. A total of 27 samples × 20 measurements produced 540 spectra. The dataset, therefore, consists of 540 arrays, each having 1024 values (the number of pixels), whose elements are the measured intensities at the different pixel positions after background subtraction, normalized to have an average of zero and a standard deviation of one. This normalization is very common with neural networks, as it makes the input data small enough to avoid numerical problems during the training phase [20]. The dataset contains one row for each acquisition repetition of each oil, with the spectrum points as features and the corresponding label for the quality classification.
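The layout and normalization of the dataset can be illustrated with a minimal sketch. It assumes that the normalization is applied to each spectrum individually (the text does not say whether it is applied per spectrum or per pixel), and it uses random numbers as a stand-in for the measured intensities. The function name `standardize_spectra` is hypothetical.

```python
import numpy as np

# Counts taken from the text: 27 oils, 20 repetitions each, 1024 pixels.
N_SAMPLES, N_REPEATS, N_PIXELS = 27, 20, 1024


def standardize_spectra(spectra):
    """Scale each spectrum (row) to zero mean and unit standard deviation.

    Assumption: the normalization is done per spectrum, not per pixel.
    """
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std


# Synthetic stand-in for background-subtracted spectra (not real measurements).
rng = np.random.default_rng(0)
raw = rng.uniform(0.0, 4000.0, size=(N_SAMPLES * N_REPEATS, N_PIXELS))
X = standardize_spectra(raw)

# One label per spectrum: 12 EVOO, 8 VOO, and 7 LOO oils, 20 repetitions each.
labels = np.repeat(
    ["EVOO", "VOO", "LOO"],
    [12 * N_REPEATS, 8 * N_REPEATS, 7 * N_REPEATS],
)
```

Each row of `X` is then one feature vector of 1024 values, paired with one entry of `labels`.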

2.4. Machine Learning Classifiers

The quality of the data acquired with the sensor and the feasibility of using them for quality control were tested by applying different machine learning methods. The goal was to classify the oils into three categories: EVOO, VOO, and LOO. The performance of the sensor as a tool for quality control can be defined as its ability to generate data that allow a classification with an accuracy as close to 100% as possible. The following eight machine learning algorithms were tested: support vector machines (SVM), naïve Bayes (NB), multinomial logistic regression (MLR), PCA combined with LDA, decision tree (DT), ANN, RF, and k-NN. The implementation parameters and the references describing the methods are listed in Table 2. The methods have been implemented using the Python library scikit-learn [30]. A detailed description of each algorithm goes beyond the scope of this paper, and the interested reader is referred to the listed references. The details of the ANN implementation are described in Section 2.5.
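As a rough illustration of how such a comparison could be set up with scikit-learn, the sketch below builds one estimator per method and scores each on a single held-out split. All hyperparameters are library defaults or illustrative assumptions, since the values of Table 2 are not reproduced here. The data are synthetic, and `MLPClassifier` is only a stand-in for the ANN, which the paper implements with TensorFlow.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic three-class data standing in for the spectra (not real measurements).
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 40)
X = rng.normal(size=(120, 64)) + 0.5 * y[:, None]

# One estimator per method named in the text; settings are assumptions.
classifiers = {
    "SVM": SVC(),
    "NB": GaussianNB(),
    "MLR": LogisticRegression(max_iter=1000),
    "PCA+LDA": make_pipeline(PCA(n_components=10), LinearDiscriminantAnalysis()),
    "DT": DecisionTreeClassifier(random_state=0),
    "ANN": MLPClassifier(hidden_layer_sizes=(32, 32, 32), max_iter=500, random_state=0),
    "RF": RandomForestClassifier(random_state=0),
    "k-NN": KNeighborsClassifier(),
}

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

# Accuracy of each method on the held-out 20%.
scores = {
    name: clf.fit(X_train, y_train).score(X_val, y_val)
    for name, clf in classifiers.items()
}
```

Keeping the estimators in one dictionary makes it straightforward to run the same evaluation loop over all methods.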

2.5. Artificial Neural Network-Based Classifiers

For oil classification, a feed-forward neural network architecture was used. To find the best parameters of the neural network’s model (NNM), namely the number of layers, the number of neurons in each layer and the number of epochs, a hyperparameter optimization was performed with a grid-search approach [20]. The number of layers varied from one to three, the number of neurons in each layer from 2 to 32 and the number of epochs tested were 350, 600, and 1000. The activation function of the hidden layer’s neurons is the rectified linear unit (ReLU) [20]:

\mathrm{ReLU}(x) \equiv \max\{0, x\}    (1)

while for the output layer the softmax [20] function was used. The loss function used is the cross-entropy [20]:

L = -\sum_{i} \sum_{j=1}^{3} \left( y_{i}^{[j]} \log \hat{y}_{i}^{[j]} + (1 - y_{i}^{[j]}) \log (1 - \hat{y}_{i}^{[j]}) \right)    (2)

where the sum over i is performed over all the observations in the mini-batch extracted from the training dataset used for the weight update, y_{i}^{[j]} assumes the value of one if observation i is of class j and zero otherwise, and \hat{y}_{i}^{[j]} is the predicted probability that observation i belongs to the j-th class. The index j = 1, 2, 3 indicates the three expected classes: EVOO, VOO, and LOO.
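Equations (1) and (2), together with the softmax output, can be written out in a few lines of NumPy. This is only a sketch for illustration: the function names, the clipping constant `eps`, and the example logits are assumptions, not part of the original implementation.

```python
import numpy as np


def relu(x):
    """Equation (1): element-wise max{0, x}."""
    return np.maximum(0.0, x)


def softmax(z):
    """Row-wise softmax; the row maximum is subtracted for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Equation (2): sum over the mini-batch (i) and the three classes (j)."""
    p = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.sum(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))


# Two observations with one-hot labels (EVOO and LOO) and made-up logits.
y_true = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
y_pred = softmax(logits)
loss = cross_entropy(y_true, y_pred)
```

The loss approaches zero as the predicted probabilities approach the one-hot labels.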

The NNM was trained using the optimizer Adaptive Moment Estimation (Adam) [51] with a mini-batch size of 32. The implementation was performed using the TensorFlow™ Python library. As will be discussed in the results section, the NNM that gave the best performance had three layers with 32 neurons in each layer and was trained for 1000 epochs.
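To give a sense of the size of the search space and of the resulting networks, the following sketch enumerates a hyperparameter grid and counts the trainable parameters of a fully connected architecture. The intermediate neuron counts in `NEURONS` are an assumption (the text only gives the range from 2 to 32), and the helper `count_parameters` is hypothetical.

```python
from itertools import product

N_INPUTS, N_CLASSES = 1024, 3    # spectrum pixels and quality classes (from the text)
LAYERS = (1, 2, 3)               # from the text
NEURONS = (2, 4, 8, 16, 32)      # assumed grid points; only the range 2 to 32 is given
EPOCHS = (350, 600, 1000)        # from the text


def count_parameters(n_layers, n_neurons):
    """Weights plus biases of a dense network with equal-width hidden layers."""
    sizes = [N_INPUTS] + [n_neurons] * n_layers + [N_CLASSES]
    return sum((fan_in + 1) * fan_out for fan_in, fan_out in zip(sizes[:-1], sizes[1:]))


# Every (layers, neurons, epochs) combination that a grid search would visit.
grid = list(product(LAYERS, NEURONS, EPOCHS))

# Size of the best-performing architecture: three layers of 32 neurons.
best_size = count_parameters(3, 32)
```

Under these assumptions the best architecture has 35,011 trainable parameters, almost all of them in the first layer, which connects the 1024 input pixels to 32 neurons.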

Model performance was measured by the accuracy, calculated as the number of correctly classified oils divided by the total number of oils. All the NNMs were trained with backpropagation.

2.6. External Validation of Models

To assess the performance of the machine learning models, these need to be applied to data not used during training and the resulting prediction tested against the expected results. For this purpose, the dataset was split into 80% used as training dataset, and 20% used for validation [20,32]. All the results reported in this work were obtained on the validation portion of the dataset. The accuracy is defined as the percentage of the olive oils of the validation dataset which are correctly classified. Since variation in the accuracy may arise from the specific split which was performed, the split and train process needs to be repeated several times [52]. In this work, the split and train process was repeated 100 times for all algorithms. Then, for all the methods, the average and standard deviation of the accuracy over 100 splits were calculated. These are the results described in Section 3.2.
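The repeated split-and-train procedure can be sketched as follows. The use of scikit-learn's `train_test_split`, the choice of a k-NN classifier, the use of the repetition index as random seed, and the synthetic data are all assumptions made for illustration. The function name `repeated_holdout` is hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def repeated_holdout(model, X, y, n_repeats=100, test_size=0.2):
    """Mean and standard deviation of validation accuracy over random splits."""
    accuracies = []
    for seed in range(n_repeats):
        # A fresh random 80/20 split for every repetition.
        X_tr, X_va, y_tr, y_va = train_test_split(
            X, y, test_size=test_size, random_state=seed
        )
        accuracies.append(model.fit(X_tr, y_tr).score(X_va, y_va))
    return float(np.mean(accuracies)), float(np.std(accuracies))


# Synthetic three-class data standing in for the spectra (not real measurements).
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 50)
X = rng.normal(size=(150, 32)) + 0.5 * y[:, None]

mean_acc, std_acc = repeated_holdout(KNeighborsClassifier(), X, y)
```

Reporting both values shows how strongly the accuracy depends on the particular split.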

3. Results and Discussion

In this section, firstly, the results of the measurements are presented. Then, the results of the classification using the different techniques are reported.

3.1. Spectral Response of the Olive Oils

The raw fluorescence spectra of selected EVOOs, VOOs, and LOOs are shown in Figure 3. In all the figures the curves are just one single spectrum with the background subtracted, without averaging or smoothing. The integration time is 1 second.

Figure 3A shows the fluorescence spectra of EVOOs. For clarity, the spectra of only 5 of the 12 oils are plotted. The spectra are characterized by a strong signal in the region between 650 nm and 750 nm, with an intense peak at ca. 678 nm and a weaker broader one at ca. 720 nm, typical of chlorophyll and pheophytins [13,14,15,53]. The stronger peak does not always have the same spectral position and intensity, while the broader one varies only weakly between the samples. These observations are consistent with those previously reported and are attributed to the inner filter effect [28].

Notably, the spectra below 650 nm do not show any significant fluorescence intensity. This spectral region is usually attributed to underlying chemical constituents, such as vitamin E, hydrolysis, and oxidation products [3,15]. The lack of significant fluorescence signal in this region is due to the choice of the excitation wavelength. These compounds absorb in the UV, well below the 395 nm excitation peak used here. Depending on the sensor purpose, the inclusion of an additional UV LED to also acquire the UV fluorescence contributions to the spectrum could increase the performance by providing additional specific information. For the problem studied in this work, the strong fluorescence contribution between 650 nm and 750 nm proved to be enough to achieve 100% classification accuracy.

For comparison, the fluorescence spectra of VOOs and LOOs are shown in Figure 3B,C. The VOOs show emission spectra which are similar to those of the EVOOs, with a stronger variability, particularly in the intensity of the broader shoulder at 720 nm. The variability between the spectra increases further in the LOOs. The fluorescence from EVOO and VOO samples is generally stronger than that from LOO samples, which is consistent with previously reported observations for LOOs obtained with synchronous fluorescence spectroscopy [9].

3.2. Classification with Machine Learning Methods

The results of the classification with all the machine learning methods are summarized in Table 3. The results are given as the average of the accuracy \bar{a} over 100 different splits and the standard deviation of the accuracy, as described in Section 2.6.

As seen from Table 3, several methods reach an average accuracy above 99% without any pre-processing, namely DT, ANN, RF, k-NN, and PCA combined with LDA. These results are better than previously reported for the classification between VOO and LOO with Hierarchical Cluster Analysis (HCA) on EEMs and similar to what is obtained with PCA [9]. Unsurprisingly, the results obtained with SVM are poorer than those obtained with the other methods, as pre-processing is typically a key part of the analysis with those algorithms. In fact, in previous work, SVM was applied after pre-processing the data, for example with PCA, to obtain a good accuracy [54]. PCA with LDA was studied using an increasing number of PCA components: 2, 3, 4, 5, 10, 15, 20, 25, and 30. By using only 10 PCA components, LDA was able to reach an accuracy over 90%. With 30 components the accuracy was over 99%. It is important to note that each spectrum (input to the PCA) consists of 1024 values (the pixels of the CCD of the spectrometer), thus using 30 PCA components is equivalent to using only 2.9% of the number of features in the original spectra.
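The sweep over the number of PCA components can be sketched with a scikit-learn pipeline. The data below are synthetic and smaller than the real spectra, so the resulting accuracies are not comparable with Table 3. The single fixed split and the variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Component counts listed in the text.
COMPONENTS = (2, 3, 4, 5, 10, 15, 20, 25, 30)

# Synthetic three-class data standing in for the spectra (not real measurements).
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 100)
X = rng.normal(size=(300, 128)) + 0.3 * y[:, None]
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.2, random_state=0)

# Validation accuracy of PCA followed by LDA for each number of components.
accuracy = {}
for n in COMPONENTS:
    model = make_pipeline(PCA(n_components=n), LinearDiscriminantAnalysis())
    accuracy[n] = model.fit(X_tr, y_tr).score(X_va, y_va)

# Share of the original 1024 features retained with 30 components (about 2.9%).
feature_fraction = 30 / 1024
```

Fitting the PCA inside the pipeline ensures that the projection is learned from the training portion only.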

To find the optimal architecture for the ANN, hyperparameter tuning was performed as described in Section 2.5. The evolution of the average of accuracy and its standard deviation with increasing ANN complexity is shown in Figure 4. The vertical bar indicates the standard deviation calculated from the 100 different splits. Only the results obtained with 1000 epochs are shown. The effect of increasing the number of epochs from 350 to 1000 was to improve the accuracy and reduce the standard deviation of the accuracy’s average. At above 1000 epochs, the performance increase is smaller than what is obtained by changing the number of layers. Since for moderately complex networks the accuracy was 100% with 1000 epochs, the training was not performed for a larger number of epochs.

For very simple architectures, with only two neurons, the accuracy is below 60%. The use of eight neurons already improves the accuracy to above 80%. When using 32 neurons the accuracy is always above 90%, and increases to above 99% when using two layers. The increase from two to three layers does not affect the results significantly, as the accuracy is already at approximately 100%. This means that the ANN can always correctly identify the three classes of olive oil quality (EVOO, VOO, and LOO).

The goal of this work is to demonstrate that the fluorescence sensor is able to generate data that can be used without any pre-processing or manual feature engineering to make the classification process as easy and automatic as possible. As seen from Table 3 this is the case. These results indicate without any doubt that the data acquired with this very simple and low-cost spectrometer contain sufficient information to allow the correct discrimination between the three quality classes with almost perfect accuracy.

4. Conclusions

The current work presented a new type of compact and low-cost fluorescence sensor which allows high-quality data acquisition that can be reliably used for data-processing or inference for classification purposes. The sensor is simple and conceived to minimize size and costs so as to allow portability. The results demonstrate that a minimalist optical sensor based on fluorescence spectroscopy, combined with machine learning methods, can reliably distinguish between different qualities of olive oil: EVOO, VOO, and LOO. This new sensor has the advantage of being a portable, easy-to-use, and low-cost device, which works with undiluted samples, without any handling of olive oils, like dilution, and without any pre-processing of data, thus simplifying the analysis to the maximum degree possible. Problems like strong absorption and the inner filter effect do not affect performance because they are learnt and compensated by the machine learning methods. Among the methods, the use of ANN is particularly important because it does not require pre-processing of data and allows the use of flexible explainability techniques to better optimize and understand the classification process.

The problem investigated here is just one example of the many possible applications. The sensor can be used to solve other classification and regression problems. The details of the machine learning models are expected to be specific to the problem to be addressed.

Author Contributions

Conceptualization and methodology F.V., U.M. and M.A.D.; software U.M. and M.S.; data curation M.S.; hardware M.B. and I.H.; writing—original draft preparation F.V.; writing—review and editing F.V., U.M. and M.S.; samples J.P.C. and A.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially funded by Innosuisse, Swiss Innovation Agency, Grant no. 36761.1 INNO-LS and the Virtuous project, funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie-RISE Grant Agreement No 872181.

Conflicts of Interest

All the authors declare no conflict of interest. The company TOELT LLC declares no conflict of interest, economic or of any other form, with this work or with the affiliations of the other authors. Additionally the company did not receive any funding, royalties, or any other advantages of any form from this work or from any parties involved in this research work. The company Benalua de las Villas provided the samples and declares no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

EVOO	Extra Virgin Olive Oil
VOO	Virgin Olive Oil
LOO	Lampante Olive Oil
EEC	European Economic Community
EEM	Excitation Emission Matrix
PCA	Principal Component Analysis
PLS	Partial Least Square Regression
PLS-DA	Partial Least Square Discriminant Analysis
LDA	Linear Discriminant Analysis
k-NN	K-Nearest Neighbors
RF	Random Forest
ANN	Artificial Neural Network
NNM	Neural Network Model
UV	Ultraviolet
LED	Light Emitting Diode
CCD	Charge-Coupled Device
SVM	Support Vector Machine
NB	Naïve Bayes
DT	Decision Tree
MLR	Multinomial Logistic Regression
ReLU	Rectified Linear Unit
Adam	Adaptive Moment Estimation
HCA	Hierarchical Cluster Analysis

References

Georgiou, C.A.; Danezis, G.P. Food Authentication: Management, Analysis and Regulation; John Wiley & Sons: Hoboken, NJ, USA, 2017. [Google Scholar]
Regulation, H. Commission Regulation (EEC) No. 2568/91 of 11 July 1991 on the characteristics of olive oil and olive-residue oil and on the relevant methods of analysis Official Journal L 248, 5 September 1991. Offic. JL 1991, 248, 1–83. [Google Scholar]
Kongbonga, Y.G.M.; Ghalila, H.; Onana, M.B.; Majdi, Y.; Lakhdar, Z.B.; Mezlini, H.; Sevestre-Ghalila, S. Characterization of vegetable oils by fluorescence spectroscopy. Food Nutr. Sci. 2011, 2, 692–699. [Google Scholar] [CrossRef]
Sikorska, E.; Khmelinskii, I.; Sikorski, M. Analysis of olive oils by fluorescence spectroscopy: Methods and applications. In Olive Oil-Constituents, Quality, Health Properties and Bioconversions; IntechOpen: Rijeka, Croatia, 2012; pp. 63–88. [Google Scholar]
Skoog, D.A.; Holler, F.J.; Crouch, S.R. Principles of Instrumental Analysis; Cengage Learning: Boston, MA, USA, 2017. [Google Scholar]
Guzmán, E.; Baeten, V.; Pierna, J.A.F.; García-Mesa, J.A. Evaluation of the overall quality of olive oil using fluorescence spectroscopy. Food Chem. 2015, 173, 927–934. [Google Scholar] [CrossRef] [PubMed]
Merás, I.D.; Manzano, J.D.; Rodríguez, D.A.; de la Peña, A.M. Detection and quantification of extra virgin olive oil adulteration by means of autofluorescence excitation-emission profiles combined with multi-way classification. Talanta 2018, 178, 751–762. [Google Scholar] [CrossRef] [PubMed]
Guimet, F.; Boqué, R.; Ferré, J. Cluster analysis applied to the exploratory analysis of commercial Spanish olive oils by means of excitation- emission fluorescence spectroscopy. J. Agric. Food Chem. 2004, 52, 6673–6679. [Google Scholar] [CrossRef]
Poulli, K.I.; Mousdis, G.A.; Georgiou, C.A. Classification of edible and lampante virgin olive oil based on synchronous fluorescence and total luminescence spectroscopy. Anal. Chim. Acta 2005, 542, 151–156. [Google Scholar] [CrossRef]
Sayago, A.; Morales, M.; Aparicio, R. Detection of hazelnut oil in virgin olive oil by a spectrofluorimetric method. Eur. Food Res. Technol. 2004, 218, 480–483. [Google Scholar] [CrossRef]
Poulli, K.I.; Mousdis, G.A.; Georgiou, C.A. Rapid synchronous fluorescence method for virgin olive oil adulteration assessment. Food Chem. 2007, 105, 369–375. [Google Scholar] [CrossRef]
Ali, H.; Saleem, M.; Anser, M.R.; Khan, S.; Ullah, R.; Bilal, M. Validation of fluorescence spectroscopy to detect adulteration of edible oil in extra virgin olive oil (EVOO) by applying chemometrics. Appl. Spectrosc. 2018, 72, 1371–1379. [Google Scholar] [CrossRef]
Hernández-Sánchez, N.; Lleó, L.; Ammari, F.; Cuadrado, T.R.; Roger, J.M. Fast fluorescence spectroscopy methodology to monitor the evolution of extra virgin olive oils under illumination. Food Bioprocess Technol. 2017, 10, 949–961. [Google Scholar] [CrossRef]
Mishra, P.; Lleó, L.; Cuadrado, T.; Ruiz-Altisent, M.; Hernández-Sánchez, N. Monitoring oxidation changes in commercial extra virgin olive oils with fluorescence spectroscopy-based prototype. Eur. Food Res. Technol. 2018, 244, 565–575. [Google Scholar] [CrossRef]
Baltazar, P.; Hernández-Sánchez, N.; Diezma, B.; Lleó, L. Development of rapid extra virgin olive oil quality assessment procedures based on spectroscopic techniques. Agronomy 2020, 10, 41. [Google Scholar] [CrossRef]
Lobo-Prieto, A.; Tena, N.; Aparicio-Ruiz, R.; García-González, D.L.; Sikorska, E. Monitoring Virgin Olive Oil Shelf-Life by Fluorescence Spectroscopy and Sensory Characteristics: A Multidimensional Study Carried Out under Simulated Market Conditions. Foods 2020, 9, 1846. [Google Scholar] [CrossRef] [PubMed]
Dupuy, N.; Le Dréau, Y.; Ollivier, D.; Artaud, J.; Pinatel, C.; Kister, J. Origin of French virgin olive oil registered designation of origins predicted by chemometric analysis of synchronous excitation- emission fluorescence spectra. J. Agric. Food Chem. 2005, 53, 9361–9368. [Google Scholar] [CrossRef] [PubMed]
Jiménez-Carvelo, A.M.; Lozano, V.A.; Olivieri, A.C. Comparative chemometric analysis of fluorescence and near infrared spectroscopies for authenticity confirmation and geographical origin of Argentinean extra virgin olive oils. Food Control 2019, 96, 22–28. [Google Scholar] [CrossRef]
Al Riza, D.F.; Kondo, N.; Rotich, V.K.; Perone, C.; Giametta, F. Cultivar and geographical origin authentication of Italian extra virgin olive oil using front-face fluorescence spectroscopy and chemometrics. Food Control 2021, 121, 107604. [Google Scholar] [CrossRef]
Michelucci, U. Applied Deep Learning—A Case-Based Approach to Understanding Deep Neural Networks; APRESS Media, LLC: New York, NY, USA, 2018. [Google Scholar]
Sikorska, E.; Khmelinskii, I.; Sikorski, M. Vibrational and electronic spectroscopy and chemometrics in analysis of edible oils. In Methods in Food Analysis; Cruz, R.M.S., Khmelinskii, I., Vieira, M., Eds.; CRC Press: Boca Raton, FL, USA, 2014; pp. 201–234.
Zaroual, H.; Chénè, C.; El Hadrami, E.M.; Karoui, R. Application of new emerging techniques in combination with classical methods for the determination of the quality and authenticity of olive oil: A review. Crit. Rev. Food Sci. Nutr. 2021, 1–24.
Meenu, M.; Cai, Q.; Xu, B. A critical review on analytical techniques to detect adulteration of extra virgin olive oil. Trends Food Sci. Technol. 2019, 91, 391–408.
Gonzalez-Fernandez, I.; Iglesias-Otero, M.; Esteki, M.; Moldes, O.; Mejuto, J.; Simal-Gandara, J. A critical review on the use of artificial neural networks in olive oil production, characterization and authentication. Crit. Rev. Food Sci. Nutr. 2019, 59, 1913–1926.
Iqdiam, B.M.; Welt, B.A.; Goodrich-Schneider, R.; Sims, C.A.; Baker IV, G.L.; Marshall, M.R. Influence of headspace oxygen on quality and shelf life of extra virgin olive oil during storage. Food Packag. Shelf Life 2020, 23, 100433.
Lakowicz, J.R. Principles of Fluorescence Spectroscopy; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013.
Ferreiro-González, M.; Barbero, G.F.; Álvarez, J.A.; Ruiz, A.; Palma, M.; Ayuso, J. Authentication of virgin olive oil by a novel curve resolution approach combined with visible spectroscopy. Food Chem. 2017, 220, 331–336.
Torreblanca-Zanca, A.; Aroca-Santos, R.; Lastra-Mejias, M.; Izquierdo, M.; Cancilla, J.C.; Torrecilla, J.S. Laser diode induced excitation of PDO extra virgin olive oils for cognitive authentication and fraud detection. Sens. Actuators B Chem. 2019, 280, 1–9.
Borello, E.; Domenici, V. Determination of pigments in virgin and extra-virgin olive oils: A comparison between two near UV-Vis spectroscopic techniques. Foods 2019, 8, 18.
Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
Hearst, M.A.; Dumais, S.T.; Osuna, E.; Platt, J.; Scholkopf, B. Support vector machines. IEEE Intell. Syst. Their Appl. 1998, 13, 18–28.
Huang, J.Z. An Introduction to Statistical Learning: With Applications in R By Gareth James, Trevor Hastie, Robert Tibshirani, Daniela Witten; Springer: New York, NY, USA, 2014.
Platt, J.C. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In Advances in Large Margin Classifiers; MIT Press: Cambridge, MA, USA, 1999; pp. 61–74.
Rish, I. An empirical study of the naive Bayes classifier. In Proceedings of the IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence, Seattle, WA, USA, 4–10 August 2001; Volume 3, pp. 41–46.
Islam, M.J.; Wu, Q.J.; Ahmadi, M.; Sid-Ahmed, M.A. Investigating the performance of naive-bayes classifiers and k-nearest neighbor classifiers. In Proceedings of the 2007 International Conference on Convergence Information Technology (ICCIT 2007), Gwangju, Korea, 21–23 November 2007; pp. 1541–1546.
Berrar, D. Bayes’ theorem and naive Bayes classifier. In Encyclopedia of Bioinformatics and Computational Biology: ABC of Bioinformatics; Elsevier Science Publisher: Amsterdam, The Netherlands, 2018; pp. 403–412.
Bewick, V.; Cheek, L.; Ball, J. Statistics review 14: Logistic regression. Crit. Care 2005, 9, 1–7.
Lemeshow, S.; Hosmer, D.W., Jr. A review of goodness of fit statistics for use in the development of logistic regression models. Am. J. Epidemiol. 1982, 115, 92–106.
Van Houwelingen, J.; Le Cessie, S. Logistic regression, a review. Stat. Neerl. 1988, 42, 215–232.
Martinez, A.M.; Kak, A.C. PCA versus LDA. IEEE Trans. Pattern Anal. Mach. Intell. 2001, 23, 228–233.
Tominaga, Y. Comparative study of class data analysis with PCA-LDA, SIMCA, PLS, ANNs, and k-NN. Chemom. Intell. Lab. Syst. 1999, 49, 105–115.
Rokach, L.; Maimon, O. Top-down induction of decision trees classifiers—A survey. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 2005, 35, 476–487.
Safavian, S.R.; Landgrebe, D. A survey of decision tree classifier methodology. IEEE Trans. Syst. Man, Cybern. 1991, 21, 660–674.
Song, Y.Y.; Ying, L. Decision tree methods: Applications for classification and prediction. Shanghai Arch. Psychiatry 2015, 27, 130.
Swain, P.H.; Hauska, H. The decision tree classifier: Design and potential. IEEE Trans. Geosci. Electron. 1977, 15, 142–147.
Ho, T.K. Random decision forests. In Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, Canada, 14–16 August 1995; Volume 1, pp. 278–282.
Ho, T.K. The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. 1998, 20, 832–844.
Amit, Y.; Geman, D. Shape Quantization and Recognition with Randomized Trees. Neural Comput. 1997, 9, 1545–1588.
Hodges, J.L. Discriminatory Analysis; Number 11; USAF School of Aviation Medicine: Dayton, OH, USA, 1950.
Altman, N.S. An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression. Am. Stat. 1992, 46, 175–185.
Kingma, D.P.; Ba, J.A. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, 7–9 May 2015; pp. 1–15.
Michelucci, U.; Venturini, F. Estimating Neural Network’s Performance with Bootstrap: A Tutorial. Mach. Learn. Knowl. Extr. 2021, 3, 357–373.
Galeano Díaz, T.; Durán Merás, I.; Correa, C.A.; Roldán, B.; Rodríguez Cáceres, M.I. Simultaneous fluorometric determination of chlorophylls a and b and pheophytins a and b in olive oil by partial least-squares calibration. J. Agric. Food Chem. 2003, 51, 6934–6940.
El Orche, A.; Bouatia, M.; Mbarki, M. Rapid analytical method to characterize the freshness of olive oils using fluorescence spectroscopy and chemometric algorithms. J. Anal. Methods Chem. 2020, 2020, 8860161.

Figure 1. Schematics of the minimalist fluorescence sensor. Blue: Excitation light, red: Fluorescence light.

Figure 2. Photo of the minimalist fluorescence sensor with olive oil samples in the glass vials and a bottle of olive oil.

Figure 3. Fluorescence emission spectra of selected olive oils. Panel (A) five EVOOs, panel (B) five VOOs, and panel (C) five LOOs. Each curve shows a single spectrum without averaging or smoothing.

Figure 4. Evolution of the average of the accuracy and its standard deviation with increasing ANN complexity. For each architecture the points indicate the average of the accuracy of 100 split and train runs, and the error lines indicate the standard deviation.

Table 1. Number of olive oil samples in each quality class. EVOO: Extra virgin olive oil, VOO: Virgin olive oil, and LOO: Lampante olive oil.

Quality	Number of Samples
EVOO	12
VOO	8
LOO	7
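
The class counts in Table 1 are not balanced, which matters when reading accuracy values. The short sketch below works out the class proportions and the accuracy that a trivial "always predict the largest class" rule would reach. It assumes, purely for illustration, that every oil contributes equally to the dataset; the names `oil_counts` and `majority_baseline` are hypothetical.

```python
# Number of oils per quality class, copied from Table 1.
oil_counts = {"EVOO": 12, "VOO": 8, "LOO": 7}

total_oils = sum(oil_counts.values())

# Fraction of the oils that belongs to each class.
class_proportions = {name: n / total_oils for name, n in oil_counts.items()}

# Accuracy of a rule that always predicts the most frequent class
# (assumption: each oil carries the same weight in the dataset).
majority_class = max(oil_counts, key=oil_counts.get)
majority_baseline = oil_counts[majority_class] / total_oils

for name, share in class_proportions.items():
    print(f"{name}: {share:.3f}")
print(f"majority class: {majority_class}, baseline accuracy: {majority_baseline:.3f}")
```

Under this assumption the baseline is 12/27, so a three-class accuracy close to 0.44 would carry little information.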

Table 2. List of the machine learning methods used, with implementation parameters and references to the method descriptions: Support vector machine (SVM), naïve Bayes (NB), multinomial logistic regression (MLR), principal component analysis (PCA) and linear discriminant analysis (LDA), decision tree (DT), random forest (RF), and k-nearest neighbor (k-NN).

Algorithm	Implementation Details	References
SVM	Regularization parameter $C = 1.0$ , kernel = radial basis function	[31,32,33]
NB	None	[32,34,35,36]
MLR	Regularization penalty = l2, solver algorithm = Newton conjugate gradient	[32,37,38,39]
PCA + LDA	Number of components used with LDA: 2, 3, 4, 5, 10, 15, 20, 25, and 30	[32,40,41]
DT	Split quality criterion used = Gini impurity	[32,42,43,44,45]
RF	Number of trees = 100, split quality criterion used = Gini impurity	[32,42,46,47,48]
k-NN	Number of neighbors $k = 3$	[32,49,50]
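
The settings in Table 2 can be written down as estimator objects. The following is a minimal sketch that assumes the scikit-learn library is used; the helper name `build_classifiers`, the choice of the Gaussian variant for naïve Bayes, and the default of 10 PCA components are assumptions made for illustration and are not taken from the table.

```python
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_classifiers(n_pca_components=10):
    """Return one estimator per row of Table 2 (hypothetical helper)."""
    return {
        # C = 1.0 with a radial basis function kernel.
        "SVM": SVC(C=1.0, kernel="rbf"),
        # Table 2 lists no parameters; the Gaussian variant is an assumption.
        "NB": GaussianNB(),
        # Newton conjugate gradient solver; L2 regularization is the
        # scikit-learn default, so it is not passed explicitly.
        "MLR": LogisticRegression(solver="newton-cg"),
        # PCA for dimensionality reduction followed by LDA.
        "PCA + LDA": make_pipeline(
            PCA(n_components=n_pca_components),
            LinearDiscriminantAnalysis(),
        ),
        "DT": DecisionTreeClassifier(criterion="gini"),
        "RF": RandomForestClassifier(n_estimators=100, criterion="gini"),
        "k-NN": KNeighborsClassifier(n_neighbors=3),
    }


classifiers = build_classifiers()
```

Calling `build_classifiers(n_pca_components=30)` would give the second PCA + LDA configuration listed in the table; each estimator is then trained with `fit(X, y)` on the spectra and their quality labels.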

Table 3. Summary of results of the classification given by the average of the accuracy $\bar{a}$ and its standard deviation $σ$; machine learning methods: Support vector machine (SVM), naïve Bayes (NB), multinomial logistic regression (MLR), principal component analysis (PCA) and linear discriminant analysis (LDA), decision tree (DT), random forest (RF), and k-nearest neighbor (k-NN).

Algorithm	Average Accuracy $\bar{a}$	Standard Deviation $σ$
SVM	0.51	0.07
NB	0.64	0.05
MLR	0.88	0.03
PCA + LDA (10 PCA components)	0.93	0.02
DT	0.99	0.01
ANN	0.99	0.04
PCA + LDA (30 PCA components)	0.999	0.006
RF	1.0	0.0
k-NN	1.0	0.0
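
The two columns of Table 3 are a mean and a standard deviation of the accuracy over repeated runs. The sketch below illustrates one way such numbers can be obtained: the data are split at random several times, a classifier is trained and evaluated on each split, and the accuracies are summarized. The number of runs, the test fraction, the use of the population standard deviation, the toy one-dimensional data, and the 1-nearest-neighbor stand-in classifier are all assumptions for illustration, not the protocol of this work.

```python
import random
import statistics


def repeated_split_accuracy(samples, labels, fit_predict,
                            n_runs=100, test_fraction=0.2, seed=0):
    """Mean and standard deviation of the accuracy over random splits."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    n_test = max(1, round(test_fraction * len(samples)))
    accuracies = []
    for _ in range(n_runs):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        predictions = fit_predict(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [samples[i] for i in test_idx],
        )
        n_correct = sum(p == labels[i] for p, i in zip(predictions, test_idx))
        accuracies.append(n_correct / n_test)
    # Population standard deviation; whether the sample or population
    # form applies to Table 3 is not stated, so this is an assumption.
    return statistics.mean(accuracies), statistics.pstdev(accuracies)


def nearest_neighbor(train_x, train_y, test_x):
    """Toy 1-nearest-neighbor classifier on scalar features."""
    return [
        train_y[min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))]
        for x in test_x
    ]


# Synthetic, well-separated toy data (not measured spectra).
toy_samples = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3, 10.0, 10.1, 10.2, 10.3]
toy_labels = ["EVOO"] * 4 + ["VOO"] * 4 + ["LOO"] * 4

mean_accuracy, std_accuracy = repeated_split_accuracy(
    toy_samples, toy_labels, nearest_neighbor
)
print(f"mean accuracy = {mean_accuracy:.3f}, std = {std_accuracy:.3f}")
```

Any function with the same `fit_predict(train_x, train_y, test_x)` signature can replace the toy classifier, for example a thin wrapper around a library estimator.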

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

