Estimation of the Alcoholic Degree in Beers through Near Infrared Spectrometry Using Machine Learning †

Abstract: Non-destructive measurement technologies have gained considerable attention in recent years. Among them, near-infrared (NIR) spectroscopy analyzes the electromagnetic spectrum for carbon-bond interactions. It covers the band between 700 nm and 2500 nm, which lies just beyond the visible spectrum. The instruments traditionally used for these measurements are expensive and bulky, which is why this project focused on a portable spectrophotometer. This device is smaller and cheaper than a conventional spectrophotometer, at the cost of a lower resolution. In this work, the device was combined with machine learning to determine whether a beer contains alcohol or can be labeled as a non-alcoholic drink.


Introduction
Spectroscopy is one of the main techniques used when a non-destructive analysis of the composition of a sample is required. However, such analyses usually have to be performed with expensive equipment hosted only in specialized laboratories. Recent advances have made it possible to reduce that equipment to something portable. These devices allow samples to be analyzed in field conditions, but what is now lacking is the expertise to interpret the collected spectra. This is where machine learning can help. The main aim of this work is to classify lager beers according to their alcohol level into "Non-alcohol" and "With alcohol". To accomplish this objective, a portable NIR scanner from Texas Instruments ([1]) was used. Because the idea behind this development was to perform the measurements without destroying the sample, the liquid only has to be poured into a special container, called a cell, to be scanned. The result of each scan is a spectrum of the beer, which has to be processed in order to solve the problem. In this particular work, 36 beers of different brands and different alcohol contents were used. Following a rigorous methodology, the dataset was collected and subsequently analyzed using artificial neural networks (ANNs). It should be noted that most works place the alcohol signal in the range of 1110 to 1300 nm [2], and that several articles use NIR technology with large spectrophotometers to address similar problems [3][4][5][6].

Materials and Methods
The first task to be tackled in this study was to build the dataset, which required the following materials:

• Spectrophotometry cells. These are the containers into which the sample is poured.

• A glass container. The beers were poured into it to reduce the amount of gas as much as possible.

• Coffee filters. They were used to drastically reduce the amount of gas in a short period of time.

Once all the material was collected, the beers were left in the same room for 3 days to ensure that they were all at the same temperature before the measurements were performed. The measurements were taken with a portable NIR scanner that operates in the range between 900 and 1700 nm and therefore covers the target range for alcohol. The spectrometer was configured so that each scan was the average of 30 measurements taken with the Hadamard method. This process was repeated 3 times for each beer.

Once the liquid was in the cell, the spectrometer was placed 2 mm from the sample while the scan was performed. The resulting samples were labeled as "without alcohol" when the beer contained less than 1 degree of alcohol, while the label "with alcohol" was reserved for those containing more than 1 degree.
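The labeling rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the study: the function names, the record layout, and the example values are hypothetical, and the handling of a beer with exactly 1 degree is an assumption, since the text does not cover that boundary case.

```python
# Threshold taken from the text; all names below are hypothetical.
ALCOHOL_THRESHOLD_DEG = 1.0


def label_beer(alcohol_degrees):
    """Return the class label for a beer given its declared alcohol content."""
    # Assumption: exactly 1 degree is treated as "with alcohol",
    # because the text does not specify this boundary case.
    if alcohol_degrees < ALCOHOL_THRESHOLD_DEG:
        return "without alcohol"
    return "with alcohol"


def build_records(beers):
    """Expand each (name, degrees, scans) entry into one labeled record per scan."""
    records = []
    for name, degrees, scans in beers:
        label = label_beer(degrees)
        for spectrum in scans:
            records.append({"beer": name, "label": label, "spectrum": spectrum})
    return records


# Made-up values for illustration only: two beers, three scans each.
beers = [
    ("beer_a", 0.0, [[0.10, 0.12], [0.11, 0.12], [0.10, 0.13]]),
    ("beer_b", 5.4, [[0.20, 0.25], [0.21, 0.24], [0.20, 0.26]]),
]
records = build_records(beers)
```

Keeping the beer identifier in every record matters later, because it is what allows the three scans of one beer to be kept together during validation.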

Results
All ANNs developed in this work were implemented in TensorFlow and evaluated with a record-wise K-fold cross-validation scheme. Consequently, all three measurements from the same beer were used either for training or for testing, never for both. The developed ANNs are fully connected feed-forward networks with a single hidden layer. The optimizer was Adam and the loss function was binary cross-entropy. Table 1 shows the average of 50 training repetitions for each of the beers used for testing. In addition to the size of the hidden layer, the table shows the training and test accuracies together with their corresponding standard deviations.
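The record-wise split can be illustrated with a short, framework-independent Python sketch. It is not the code used in this work: the function name `record_wise_folds`, the random seed, and the value k = 6 are assumptions chosen for illustration, since the number of folds is not stated here. Setting k equal to the number of beers would give a leave-one-beer-out scheme.

```python
import random


def record_wise_folds(beer_ids, k, seed=0):
    """Split record indices into k folds so that each beer falls in exactly one test fold."""
    unique_beers = sorted(set(beer_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_beers)
    # Deal the shuffled beers round-robin into k groups.
    beer_to_fold = {beer: i % k for i, beer in enumerate(unique_beers)}
    folds = []
    for fold in range(k):
        test_idx = [i for i, b in enumerate(beer_ids) if beer_to_fold[b] == fold]
        train_idx = [i for i, b in enumerate(beer_ids) if beer_to_fold[b] != fold]
        folds.append((train_idx, test_idx))
    return folds


# 36 beers with 3 scans each, matching the dataset size described in the text.
beer_ids = [f"beer_{b:02d}" for b in range(36) for _ in range(3)]
folds = record_wise_folds(beer_ids, k=6)  # k=6 is an illustrative choice
```

Because the fold is assigned per beer rather than per scan, no beer contributes scans to both the training and the test side of any fold, which is the property the record-wise scheme is meant to guarantee.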

Discussion
First and foremost, we must emphasize that these are only preliminary results, although they are encouraging. In this work, a first dataset has been collected following a methodology pre-established by an expert in the field, and it has shown that machine learning can be very useful in moving the analysis from the laboratory to the field. As the measurements were collected, a difference between beers with alcohol and those without alcohol could already be seen in the range between 1100 and 1300 nm, so it was to be expected that machine learning techniques would be able to solve this problem. However, the high standard deviation observed on the test set requires additional study. The most probable reasons are the small size of the dataset and the use of a scheme as restrictive as record-wise K-fold cross-validation. Consequently, the plan for the near future is to increase the size of the dataset and to perform more tests to address this issue.
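The visual comparison between the two classes in the 1100 to 1300 nm band can be approximated numerically by subtracting the mean spectra of the two groups. The following Python sketch shows one way to do this. The function names and the toy spectra are hypothetical, and this is not presented as the analysis carried out in this work.

```python
def band_indices(wavelengths_nm, low_nm=1100.0, high_nm=1300.0):
    """Indices of the wavelengths that fall inside the band of interest."""
    return [i for i, w in enumerate(wavelengths_nm) if low_nm <= w <= high_nm]


def mean_spectrum(spectra):
    """Point-wise mean of a list of equally sized spectra."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


def band_difference(wavelengths_nm, with_alcohol, without_alcohol):
    """Mean 'with alcohol' minus mean 'without alcohol', restricted to the band."""
    idx = band_indices(wavelengths_nm)
    mean_with = mean_spectrum(with_alcohol)
    mean_without = mean_spectrum(without_alcohol)
    return [(wavelengths_nm[i], mean_with[i] - mean_without[i]) for i in idx]


# Toy spectra with made-up values, for illustration only.
wavelengths = [900.0, 1100.0, 1200.0, 1300.0, 1700.0]
with_alcohol = [[0.5, 0.6, 0.9, 0.7, 0.4], [0.5, 0.6, 0.7, 0.7, 0.4]]
without_alcohol = [[0.5, 0.5, 0.5, 0.5, 0.4], [0.5, 0.5, 0.5, 0.5, 0.4]]
diff = band_difference(wavelengths, with_alcohol, without_alcohol)
```

A difference that stays consistently away from zero inside the band, relative to the spread between repeated scans, would support the expectation that a classifier can separate the two classes.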