Detection of Chocolate Properties Using Near-Infrared Spectrophotometry

Knowing the chemical composition of a substance provides valuable information about it, which is why numerous techniques have been developed to determine it. One of them is near-infrared (NIR) spectrometry, a non-destructive technique that analyzes the electromagnetic spectrum at specific wavelengths. The aim of this project is to combine this technology with machine learning techniques to detect the presence of milk, as well as the level of cocoa, in an ounce of chocolate. Satisfactory results were obtained in both cases, which suggests that the combination of these techniques offers great possibilities.


Introduction
Near-infrared spectrophotometry [1] is a technique with which the chemical composition of substances can be measured; its main object of measurement is carbon. Because the technique is non-destructive, the analysis can be repeated on the same sample.
The devices used for this analysis are usually very large and are housed in dedicated laboratories. For this reason, portable spectrophotometers were developed, which are smaller but also less accurate. Such a device can be moved, so measurements can be taken in the field.
Chocolate [2] is a food made by combining various ingredients, which can range from nuts to different additives. Among the possible ingredients, two main ones stand out: milk and cocoa.
The aim of this work is to detect different properties in chocolate, such as the presence of milk or the level of cocoa contained in an ounce, using machine learning [3].

Related Work
Many articles analyze different types of food, including chocolate, using NIR technology. They try to determine relevant properties of chocolate, such as its cocoa content. It should be noted that the measurements in these works were carried out with laboratory spectrophotometers, which have a higher precision and a wider range than the device used for data collection in this work.
The authors of [4] determine the sugar content of chocolate using NIR technology. In that case, the sample was heated to 40 °C and transferred to a spectrophotometry cell prior to analysis.
Also noteworthy are [5,6], which try to detect characteristics such as sugar, dairy products or moisture in chocolate, among others, using the Fourier transform. The work of [7], which tries to identify whether the lard contained in the sample is adulterated, also stands out in this field.

Materials and Methods
The collected dataset has been deposited in the Mendeley Data repository and can be accessed at: https://data.mendeley.com/v1/datasets/xr4cct5fc7/draft?preview=1 (accessed on 28 July 2021).
The data obtained were then analyzed. Differences can be seen in the three variables provided by the spectrophotometer: intensity, absorbance and reflectance. It is therefore considered feasible to address the proposed problem using artificial neural networks (ANNs) [8].
In this work, the detection of milk in each measurement is addressed first. The problem of the cocoa level is tackled next. For this purpose, the tablets were divided into four classes according to their cocoa content: low (less than 50%), medium-low (between 50% and 75%), medium-high (between 75% and 90%) and high (above 90%).
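The class boundaries above can be sketched in code. This is a minimal illustration, not the implementation used in this work: the function name `cocoa_level` is hypothetical, and the text does not state which class the boundary values 50%, 75% and 90% belong to. The sketch assumes that 50% and 75% fall in the upper class and that 90% falls in the medium-high class, so that "above 90%" is strict.

```python
def cocoa_level(cocoa_percent: float) -> str:
    """Map a cocoa percentage (0-100) to one of the four classes in the text.

    Assumption: boundary handling is not specified in the text. Here 50 and 75
    belong to the upper class, and 90 belongs to "medium-high".
    """
    if not 0 <= cocoa_percent <= 100:
        raise ValueError("cocoa_percent must be between 0 and 100")
    if cocoa_percent < 50:
        return "low"
    if cocoa_percent < 75:
        return "medium-low"
    if cocoa_percent <= 90:
        return "medium-high"
    return "high"
```

With this labeling, the cocoa problem becomes a four-class classification task, which is why accuracy can be reported for it in the same way as for milk detection.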

Results
The results for each problem are presented below, focusing on the main models. For this purpose, 34 neural network models with different configurations were developed and trained 50 times each, with cross-validation [9]. The average accuracy over these 50 trainings and its standard deviation are used as objective measures. Tables 1 and 2 show relevant information about the main models, such as the network architecture, the features used for training and the optimizer, in addition to the performance metrics.
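The evaluation measure described above, the mean accuracy and standard deviation over repeated trainings, can be sketched as follows. This is an illustrative sketch only: `summarize_runs` is a hypothetical helper, the accuracy values are randomly generated placeholders rather than results from this work, and the use of the sample (rather than population) standard deviation is an assumption.

```python
import random
import statistics


def summarize_runs(accuracies):
    """Return (mean, sample standard deviation) of per-run accuracies."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)


# Placeholder accuracies standing in for 50 training runs (NOT real results).
rng = random.Random(0)
run_accuracies = [0.90 + rng.uniform(-0.02, 0.02) for _ in range(50)]

mean_acc, std_acc = summarize_runs(run_accuracies)
print(f"mean accuracy = {mean_acc:.3f}, std = {std_acc:.3f}")
```

Reporting the spread alongside the mean indicates how sensitive a configuration is to random initialization and to the data split.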

Detection of Milk in Chocolate Ounces
The first problem is the detection of milk in the different chocolate samples. The best-performing models are listed in Table 1. The model with 128 neurons in the first hidden layer and 64 in the second, trained with the Adamax optimizer [10], stands out: it obtains an accuracy of over 92% with a standard deviation of 0.1.
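For readers unfamiliar with Adamax, the following sketch shows its update rule as introduced by Kingma and Ba alongside Adam: the first moment is an exponential moving average of the gradient, and the step is scaled by an exponentially weighted infinity norm of past gradients. The function name `adamax_step`, the small `eps` term (added here only to avoid division by zero) and the toy quadratic objective are assumptions for illustration; in practice the optimizer provided by a deep learning framework would be used.

```python
def adamax_step(params, grads, m, u, t, lr=0.002, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adamax update in place and return the params list.

    m: running first-moment estimate, one entry per parameter
    u: running infinity-norm estimate, one entry per parameter
    t: 1-based step counter, used for first-moment bias correction
    """
    for i, g in enumerate(grads):
        m[i] = beta1 * m[i] + (1 - beta1) * g
        u[i] = max(beta2 * u[i], abs(g))
        params[i] -= (lr / (1 - beta1 ** t)) * m[i] / (u[i] + eps)
    return params


# Toy usage: minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x, m, u = [0.0], [0.0], [0.0]
for t in range(1, 501):
    grad = [2 * (x[0] - 3.0)]
    adamax_step(x, grad, m, u, t, lr=0.1)
```

Because the step is divided by the largest recent gradient magnitude, each update is bounded by roughly the learning rate, which tends to make training less sensitive to gradient scale.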

Detection of Cocoa Level in Ounces of Chocolate
The second problem is the detection of the cocoa level in the different chocolate samples. The best models are listed in Table 2. In this case, the model with 64 neurons in the first hidden layer and 32 in the second, trained with the Adam optimizer [10], stands out: it obtains an accuracy of over 81% with a standard deviation of 0.1.
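The shape of such a network can be illustrated with a plain-Python forward pass through two hidden layers of 64 and 32 neurons, followed by a four-way softmax over the cocoa classes. Only the layer sizes and the number of classes come from the text. The number of input features (`N_FEATURES`), the ReLU activations, the softmax output and the random, untrained weights are assumptions made for this sketch.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    """Numerically stable softmax: subtract the max before exponentiating."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained placeholders)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


N_FEATURES = 16  # hypothetical number of spectral features per measurement
CLASSES = ["low", "medium-low", "medium-high", "high"]

rng = random.Random(0)
w1, b1 = init_layer(N_FEATURES, 64, rng)
w2, b2 = init_layer(64, 32, rng)
w3, b3 = init_layer(32, len(CLASSES), rng)


def predict_proba(spectrum):
    """Return one probability per cocoa class for a single measurement."""
    hidden1 = relu(dense(spectrum, w1, b1))
    hidden2 = relu(dense(hidden1, w2, b2))
    return softmax(dense(hidden2, w3, b3))


probs = predict_proba([rng.random() for _ in range(N_FEATURES)])
predicted = CLASSES[probs.index(max(probs))]
```

The predicted class is the one with the highest probability; with untrained weights the output is meaningless and only demonstrates the data flow.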

Discussion
Using NIR technology to solve the two proposed problems seems appropriate. The technology provides good-quality data and valuable information that can be exploited by machine learning techniques.
ANNs are a useful resource when working with this type of signal. Although better results could probably be obtained in both problems once the data are analyzed in greater depth, this is considered a good first approach.

Conclusions
Due to the high accuracy it provides, NIR technology is considered a very useful resource in the food sector, offering a simple and fast way to discover different properties of various foods, such as chocolate in this case. Good results were obtained in both classification problems, exceeding 90% accuracy in milk detection and 80% in the classification of cocoa levels.

Future Work
Numerous lines of future work are possible in this area. Examples include applying regression to estimate the percentage of cocoa, detecting the amount of sugar or detecting the presence of palm oil. It should be noted that a larger volume of measurements would be necessary for the application of regression.