Caffeine Content Prediction in Coffee Beans Using Hyperspectral Reflectance and Machine Learning
Abstract
1. Introduction
2. Materials and Methods
2.1. Experiment
2.2. Spectral Analysis
2.3. Determination of Caffeine
2.4. Machine Learning Analysis
3. Results
3.1. Basic Information on Caffeine
3.2. Caffeine Prediction Using Machine Learning
3.3. Comparison of Different Methods
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| ML | machine learning |
| CG | spectral information of the bean |
| CG+C | spectral information of the bean with additional clone information |
| CGRG | spectral information of the bean after roasting and grinding |
| CGRG+C | spectral information of the bean after roasting and grinding with additional clone information |
| ANN | artificial neural network |
| DT | decision tree |
| LR | linear regression |
| RF | random forest |
| SCA | Specialty Coffee Association |
| UPLC | ultra-performance liquid chromatography |
| F.V. | source of variation |
| G.L. | degrees of freedom |
| C.V. | coefficient of variation |
| r | Pearson correlation coefficient |
| MAE | mean absolute error |
| RMSE | root mean square error |
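
The three accuracy metrics abbreviated above (r, MAE, RMSE) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the observed/predicted values are hypothetical and do not come from the study.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def mae(observed, predicted):
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error: square root of the mean squared error."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Hypothetical values, for illustration only (not data from the article).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print(f"r    = {pearson_r(observed, predicted):.4f}")
print(f"MAE  = {mae(observed, predicted):.4f}")
print(f"RMSE = {rmse(observed, predicted):.4f}")
```

A higher r together with lower MAE and RMSE indicates closer agreement between predicted and measured values; RMSE penalises large individual errors more heavily than MAE does.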

| Depth (m) | pH (H₂O) | Ca (cmolc dm⁻³) | Mg (cmolc dm⁻³) | Al (cmolc dm⁻³) | H + Al (cmolc dm⁻³) | K (cmolc dm⁻³) | SB (cmolc dm⁻³) | T (cmolc dm⁻³) | P (mg dm⁻³) | V (%) | M (%) | MO (g kg⁻¹) | Sand (g kg⁻¹) | Silt (g kg⁻¹) | Clay (g kg⁻¹) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0–0.20 | 6.1 | 3.51 | 0.55 | 0.00 | 2.31 | 0.25 | 4.31 | 6.62 | 24.91 | 65.1 | 0.0 | 15.3 | 851 | 5 | 144 |
| 0.20–0.40 | 4.9 | 0.56 | 0.15 | 0.10 | 1.82 | 0.05 | 0.76 | 2.57 | 4.23 | 29.5 | 11.6 | 4.0 | 823 | 9 | 167 |
| 0.40–0.60 | 4.9 | 0.37 | 0.13 | 0.10 | 4.62 | 0.05 | 0.54 | 5.16 | 1.90 | 10.5 | 15.6 | 1.4 | 770 | 18 | 213 |
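
The derived columns of the soil table (SB, T, V, M) appear to follow the conventional soil-fertility relations, although the source does not state them. The sketch below assumes SB = Ca + Mg + K, T = SB + (H + Al), V = 100 · SB / T and M = 100 · Al / (SB + Al), and checks them against the 0–0.20 m row. The function name is hypothetical.

```python
def soil_indices(ca, mg, k, al, h_al):
    """Derived fertility indices under the assumed conventional relations.

    Inputs are in cmolc dm^-3; returns (SB, T, V %, M %).
    """
    sb = ca + mg + k                 # sum of bases
    t = sb + h_al                    # cation exchange capacity (assumed definition)
    v = 100.0 * sb / t               # base saturation, %
    # Aluminium saturation, %; guarded against a zero denominator.
    m = 100.0 * al / (sb + al) if (sb + al) > 0 else 0.0
    return sb, t, v, m


# Values copied from the 0-0.20 m row of the table.
sb, t, v, m = soil_indices(ca=3.51, mg=0.55, k=0.25, al=0.00, h_al=2.31)
print(f"SB = {sb:.2f}, T = {t:.2f}, V = {v:.1f} %, M = {m:.1f} %")
```

Under these assumed relations the two deeper layers also reproduce the tabulated values to within 0.01–0.02, which is consistent with rounding of the inputs.
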

| F.V. | G.L. | r | MAE | RMSE |
|---|---|---|---|---|
| ML | 5 | 0.814 * | 0.000 * | 0.0004 * |
| Input | 3 | 1.002 * | 0.000 * | 0.0001 * |
| ML × input | 15 | 0.133 * | 0.000 * | 0.0000 * |
| Residual | 216 | 0.007 | 0.000 | 0.0000 |
| Total | 239 | 0.045 | 0.000 | 0.0000 |
| C.V. (%) | | 36.4 | 11.79 | 10.28 |
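
The degrees of freedom in the table above are consistent with a balanced two-factor factorial layout. The sketch below assumes six ML algorithms, four input configurations (CG, CG+C, CGRG, CGRG+C) and ten repetitions per combination. The counts of six and ten are inferred from the tabulated degrees of freedom and are not stated in the table itself; the function name is hypothetical.

```python
def factorial_df(levels_a, levels_b, replicates):
    """Degrees of freedom for a balanced two-factor factorial ANOVA."""
    n_total = levels_a * levels_b * replicates
    return {
        "A": levels_a - 1,
        "B": levels_b - 1,
        "A x B": (levels_a - 1) * (levels_b - 1),
        "Residual": levels_a * levels_b * (replicates - 1),
        "Total": n_total - 1,
    }


# Assumed layout: 6 ML algorithms x 4 inputs x 10 repetitions (inferred).
df = factorial_df(levels_a=6, levels_b=4, replicates=10)
for source, value in df.items():
    print(f"{source:>8}: {value}")
```

The component degrees of freedom sum to the total (5 + 3 + 15 + 216 = 239), matching the table's rows for ML, input, their interaction, the error term with 216 degrees of freedom, and the total.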
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Citation: Santana, D.C.; Ratke, R.F.; Zanatta, F.L.; Campos, C.N.S.; Seron, A.C.d.S.C.; Teodoro, L.P.R.; Silva, N.P.d.; Oliveira, G.S.; Santos, R.G.d.; Alvarez, R.d.C.F.; Junior, C.A.d.S.; Blanco, M.; Teodoro, P.E. Caffeine Content Prediction in Coffee Beans Using Hyperspectral Reflectance and Machine Learning. AgriEngineering 2024, 6, 4480–4492. https://doi.org/10.3390/agriengineering6040255