Development of a Pattern Recognition Tool for the Classification of Electronic Tongue Signals Using Machine Learning

Abstract: Electronic tongue type sensor arrays are made of different materials, and each sensor captures its signal independently. The signals captured when conducting electrochemical tests often have high dimensionality, which increases when performing the data unfolding process. This unfolding process consists of arranging the data coming from different experiments, sensors, and sample times into a two-dimensional matrix. In this work, a tool for the analysis of electronic tongue signals is described. The tool is developed in Matlab® App Designer to process and classify the data from different substances analyzed by an electronic tongue type sensor array. The data processing is carried out in the following stages: (1) data unfolding, (2) normalization, (3) dimensionality reduction, (4) classification through a supervised machine learning model, and finally (5) a cross-validation procedure to calculate a set of classification performance measures. Important characteristics of this tool are the possibility to tune the parameters of the dimensionality reduction and classifier algorithms, and to plot two- and three-dimensional scatter plots of the features after dimensionality reduction, in order to see the data separability between classes and the compactness within each class. The interface was successfully tested with two electronic tongue sensor array datasets with multi-frequency large amplitude pulse voltammetry (MLAPV) signals. The developed graphical user interface allows comparing different methods in each of the mentioned stages to find the best combination of methods and thus obtain the highest values of classification performance measures.
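As an illustration of stages (2) to (5), the following minimal Python sketch starts from an already unfolded matrix and chains normalization, dimensionality reduction, classification, and cross-validation. It is not the code of the Matlab tool: PCA and a 1-nearest-neighbour classifier are stand-ins chosen here for brevity, the data are synthetic, and all function names (`pca_fit`, `knn1_predict`, `loocv_accuracy`) are hypothetical.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the column mean and the leading principal axes (computed via SVD)."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:n_components].T

def knn1_predict(train_X, train_y, test_X):
    """Predict with a 1-nearest-neighbour rule using Euclidean distance."""
    dists = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    return train_y[dists.argmin(axis=1)]

def loocv_accuracy(X, y, n_components=2):
    """Leave-one-out cross-validation of normalize -> PCA -> 1-NN."""
    n = len(y)
    hits = 0
    for i in range(n):
        train = np.arange(n) != i
        # Stage 2: column-wise standardization fitted on the training rows only.
        mu = X[train].mean(axis=0)
        sd = X[train].std(axis=0)
        sd[sd == 0] = 1.0
        z_train = (X[train] - mu) / sd
        z_test = (X[i:i + 1] - mu) / sd
        # Stage 3: dimensionality reduction (PCA is an assumption of this sketch).
        center, axes = pca_fit(z_train, n_components)
        # Stage 4: classification in the reduced space.
        pred = knn1_predict((z_train - center) @ axes, y[train],
                            (z_test - center) @ axes)
        hits += int(pred[0] == y[i])
    # Stage 5: accuracy over all held-out samples.
    return hits / n

# Synthetic stand-in for an unfolded dataset: 3 classes x 6 samples, 100 features.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 6)
X = y[:, None] + 0.1 * rng.normal(size=(18, 100))

accuracy = loocv_accuracy(X, y)
print(f"LOOCV accuracy: {accuracy:.2f}")
```

Fitting the normalization and the projection inside each fold keeps the held-out sample from influencing the model it is evaluated on. Other performance measures could be accumulated in the same loop.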

The data set obtained from a multifrequency large amplitude pulse voltammetry (MLAPV) electronic tongue device comes from various types of sensors, whose magnitudes can have different scales [1]. These signals are characterized by high dimensionality [2], which can cause problems in machine learning models, both in pattern recognition and in the accuracy of data classification [3]. It is therefore necessary to process these data sets correctly in order to classify liquid substances with high precision.
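The effect of differing sensor scales can be illustrated with a short Python sketch. It applies one possible per-sensor normalization: every column is mean-centered and each sensor block is divided by that block's overall standard deviation. This particular scheme, the function name `group_scale`, and the synthetic data are assumptions made for illustration; the columns are assumed to be ordered sensor by sensor.

```python
import numpy as np

def group_scale(X, n_sensors):
    """Mean-center each column, then scale each sensor block to unit overall std.

    X is a samples x features matrix whose columns are grouped sensor by sensor.
    """
    n_samples, n_features = X.shape
    if n_features % n_sensors != 0:
        raise ValueError("features must divide evenly among sensors")
    blocks = X.reshape(n_samples, n_sensors, -1)
    centered = blocks - blocks.mean(axis=0, keepdims=True)
    # One scale factor per sensor, computed over all samples and time points.
    scale = centered.std(axis=(0, 2), keepdims=True)
    scale[scale == 0] = 1.0
    return (centered / scale).reshape(n_samples, n_features)

# Two hypothetical sensors whose magnitudes differ by a factor of 1000.
rng = np.random.default_rng(1)
X = rng.normal(size=(6, 8)) * np.repeat([1.0, 1000.0], 4)
scaled = group_scale(X, n_sensors=2)

print(X[:, :4].std(), X[:, 4:].std())            # very different spreads
print(scaled[:, :4].std(), scaled[:, 4:].std())  # comparable spreads
```

Without such a step, a distance-based classifier or a variance-based projection would be dominated by the sensor with the largest magnitude.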

In 2020, Leon-Medina et al. [2] developed a methodology that seeks to improve

The measurements of the responses of an electronic tongue system are discretized

In this work, two tests with the developed tool are performed using two different datasets. These tests are described below:

For the first test, a dataset obtained by means of an MLAPV electronic tongue developed by Liu et al. [5] is used. The electronic tongue consisted of a platinum pillar auxiliary sensor, an Ag/AgCl reference sensor, and six working electrodes made of different materials: gold, platinum, palladium, titanium, tungsten, and silver. In the experiment, the fourth electrode (titanium) was damaged, so it was not considered in the data analysis [5]. Seven liquids or aqueous matrices were used to collect the data for the first dataset: 1) red wine, 2) Chinese liquor, 3) beer, 4) black tea, 5) oolong tea, 6) Maofeng tea and 7) pu'er tea. Each was prepared at three different concentrations (14%, 25% and 100%) of the original solution mixed with distilled water, with three replications per concentration, that is, 9 samples for each liquid [2], for a total of 63 samples.
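To make the size of this dataset concrete, the following Python sketch rebuilds the sample count from the experimental design and unfolds a placeholder signal array into one row per sample. It uses 2050 measurement points per retained sensor, the figure reported for this dataset. The array contents (zeros), the variable names, and the liquid-by-liquid storage order of the samples are assumptions made for illustration.

```python
import numpy as np

n_liquids, n_concentrations, n_replicates = 7, 3, 3
n_sensors, n_points = 5, 2050  # five working electrodes retained

n_samples = n_liquids * n_concentrations * n_replicates

# Placeholder signals indexed as sample x sensor x time point.
signals = np.zeros((n_samples, n_sensors, n_points))

# Unfolding: the sensor signals of each sample are concatenated into one row.
unfolded = signals.reshape(n_samples, n_sensors * n_points)

# One class label per sample, assuming samples are stored liquid by liquid.
labels = np.repeat(np.arange(1, n_liquids + 1), n_concentrations * n_replicates)

print(n_samples)       # 63
print(unfolded.shape)  # (63, 10250)
```

Because each row has far more columns than there are samples, a dimensionality reduction stage before classification is a natural next step.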

With 2050 measurement points per sensor and 5 sensors in the electronic tongue, the unfolding procedure (described above, see Figure 1) yields a data matrix of size 63 × 10250.

The second test uses a dataset obtained from the study by Zhang et al. [6]. This second dataset contains the data collected from an MLAPV electronic tongue with five working electrodes made of gold, platinum, palladium, tungsten and silver. The auxiliary electrode is a platinum pillar and the reference electrode is Ag/AgCl [7]. For this study, 13 liquids or aqueous matrices were used (number of samples in parentheses): 1) beer (19), 2) red wine (8), 3) white alcohol (6), 4) black tea (9), 5) Maofeng tea (9), 6) pu'er tea (9), 7) oolong tea (9), 8) coffee (9), 9) milk (9), 10) cola (6), 11) vinegar (9), 12) medicine (6) and

Regarding the plots in the GUI, the loaded data, the normalized data, and the data after the dimensionality reduction process can all be visualized.

Through the GUI, the following tests are performed with the datasets described above (see Section 2).

In the Plot tab, the 2D and 3D scatter plots obtained after applying a dimensionality reduction technique are displayed. Figure 5 shows the graphs obtained with three differ-