Neural Reduction of Image Data in Order to Determine the Quality of Malting Barley

Image analysis using neural modeling is one of the most dynamically developing applications of artificial intelligence. The feature responsible for the widespread use of this technique is primarily the ability to automatically generalize knowledge, as well as the possibility of parallel analysis of empirical data. A properly conducted learning process of an artificial neural network (ANN) allows the classification of new, unknown data, which increases the practical efficiency of the generated models. Neural image analysis is a method that extracts information carried in the form of digital images. The paper focuses on the determination of imperfections, such as contamination and damage, in malting barley grains on the basis of information encoded in the graphic form of digital photographs of kernels. This choice was dictated by the current state of knowledge regarding the classification of contamination, which uses undesirable features of kernels to exclude them from use in the malting industry. Currently, the qualitative assessment of kernels is carried out by malthouse-certified employees acting as experts. Contaminants are separated from a sample of malting barley manually, and the percentages of previously defined groups of contamination are calculated. The analysis of the problem indicates a lack of effective methods of identifying the quality of barley kernels, for example with the use of information technology. New possibilities arise for using modern methods of artificial intelligence (such as neural image analysis) to determine impurities in malting barley. However, there remains the problem of effectively compressing graphic data to a form acceptable to ANN simulators. The aim of the work is to develop an effective procedure of graphical data compression supporting the qualitative assessment of malting barley with the use of modern information technologies. The developed image analysis can be implemented in dedicated software.


Introduction
Modern technologies used in the agri-food industry frequently help to improve the efficiency of production processes, which allows enterprises to increase their revenues and thus strengthen their position on the market [1]. Currently, the challenge for this branch of the economy is the production of agri-food products with the best quality parameters while maintaining optimal costs of production and distribution of the processed biological material [2][3][4]. It is therefore essential to search for new, increasingly sophisticated methods and technologies to meet these requirements. A new and still developing solution is so-called machine vision, which can replace human work in both qualitative and quantitative assessment processes. The undoubted advantages of this type of method are the objectivity of the assessment, its increased speed and, importantly, the elimination of expert fatigue [5,6].
The work focuses on the determination of impurities in malting barley grain (Latin: Hordeum vulgare). This choice was dictated by unsatisfactory reports concerning the state of the art of the classification methods currently used in the malting industry [7,8]. The qualitative evaluation of kernels is conducted by certified malt-house workers acting as experts: they manually separate the impurities from a malting barley sample and then calculate the percentages of the (predefined) impurity groups. The analysis of the problem area shows that there is no effective method of qualitative identification of barley kernels, e.g., one using information technology [9,10]. It is therefore justified to investigate the possibility of using modern methods of artificial intelligence to determine contamination or damage of malting barley [11][12][13].
The work aimed to develop a new, effective method of malting barley quality assessment with the use of neural image analysis techniques [14][15][16]. The biological material presented in the form of digital images was classified. In this context, the process of compression of graphical empirical data was analyzed, especially with the use of self-associative neural networks (SANN).

Materials
The object of the research was the malting barley used for the production of malt. In Poland, 32 registered malting barley varieties dominate, including 29 spring and 3 winter varieties (the small number of the latter is due to reduced resistance of barley to low temperatures) [7]. In order to supplement the range, malt houses often import other varieties of barley, e.g., from European Union countries. This is due to the precise requirements imposed on malt houses by international brewing concerns. In practice, this means combining malting barley varieties until the defined and expected brewing parameters are achieved. The material used in the research was obtained from the malt house of Soufflet Polska Limited Company (Poznań, Poland).
Experiments were conducted on a popular spring variety of malting barley, Sebastian, characterized by good technological features [17]. The most important technological characteristics of malting barley are extractivity, wort viscosity, and the Kolbach index (see Table 1); it is customary to express the brewing value synthetically (column 1, Table 1) [18]. The scheme of preparation of the kernel samples of the selected malting barley is shown in Figure 1.

To obtain digital images representing seeds of the Sebastian variety, an EPSON V750-M Pro 2D flatbed scanner was used, which allowed for obtaining high-quality sets of photos. The applied parameters of the scanner were as follows: optical resolution: 6400 dpi; optical density: 4 Dmax; color depth: 48-bit color input/output; converter: CCD; light source: cold cathode fluorescent lamp [17] (see Figure 2). A total of 176 digital images of malting barley kernels of the cultivar Sebastian were acquired.

For the processing and analysis of the digital images of kernels, the original IT system "Hordeum v.3.2" (see Figure 3) was developed; it is not only dedicated to image processing and analysis but is also a useful tool for generating the training files for ANN. The application was designed and built using the GUI (graphical user interface) standard implemented in the MATLAB 2014b environment (Figure 3). Elements of the Statistica v. 10 statistical package were used to create a neural compressor of the graphic empirical data; the standard procedure implemented in the "STATISTICA Neural Networks PL: autoassociative networks, nonlinear dimension reduction" module by StatSoft, described in Section 2.2, was used.

The analyses distinguished 12 types of imperfections (Table 2). The analysis aimed to obtain a precise description of the impurity groups of barley kernels, which were then defined by appropriate characteristic parameters. The most common groups of imperfections occurring in a given batch of barley during the production process in malt houses were considered.

Table 2. Types of impurities included in the "Hordeum v.3.2" IT system; among the 12 distinguished types are: halves, grain affected by pests, grain affected by mold, and other grains/seeds.

The created IT system "Hordeum v.3.2" (Figure 3) was equipped with a module designed to generate the graphic parameters characterizing the digital images. Sixty-four standard descriptors representing the kernel graphics were selected; these serve as the input variables of the training files necessary in the process of creating the ANN models (see Figure 5). The generated training data file, containing the above-mentioned representative features, was subsequently used in the process of creating a neural model. The structure of the training data set consisted of 64 input variables describing the geometry, shape factors, color, and texture of barley kernels. The created file contained 176 cases, divided in the ratio 2:1:1 into training, validation, and test subsets, respectively. The structure of the training file is shown in Figure 6.
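For illustration only, such a 2:1:1 partition of the 176 cases can be sketched in a few lines of Python; the feature matrix, labels, and seed below are stand-ins, not the actual descriptor file:

import numpy as np

# Stand-in data: 176 cases x 64 descriptors, plus placeholder labels.
rng = np.random.default_rng(seed=0)
X = rng.random((176, 64))
y = rng.integers(0, 2, size=176)

# Shuffle the case indices and split 2:1:1 -> 88 / 44 / 44 cases.
idx = rng.permutation(len(X))
n_train, n_val = len(X) // 2, len(X) // 4
train, val, test = np.split(idx, [n_train, n_train + n_val])

X_train, y_train = X[train], y[train]
X_val, y_val = X[val], y[val]
X_test, y_test = X[test], y[test]
print(len(train), len(val), len(test))  # 88 44 44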

Methods
The relatively large number of descriptors (64) compared to the number of training cases (176 images) may cause difficulties in neural analysis based on discrete optimization methods [19,20]. One solution in such cases is a statistical reduction of the input data in the analyzed set, e.g., the standard technique of PCA (Principal Component Analysis).
The methods of discrete compression of digital images exploit the redundancy of information encoded in graphic form and transform it so as to obtain a new representation of the data devoid of mutual stochastic relationships. The consequence of these operations is a "new" data space with a smaller dimension. Thus, data compression means expressing the initial set of information, encoded for example as coefficients characterizing the image, by means of a representation of smaller dimension. Decomposition of the image in its matrix representation is one of the frequently used methods (algorithms) of graphic data compression.
A transformation based on the covariance matrix is performed to remove correlations between adjacent pixels. Practice shows that encoding uncorrelated data produces better results, as the information contained in a given coefficient is not duplicated when encoding another representative parameter.
A relatively popular way to reduce the dimension of correlated multivariate data is the standard method of PCA [21]. It determines the directions of the maximum variability of the original input data by rotating the coordinate system in such a way that the maximum variance of the transformed data occurs along the new axes. It is expected to retain as much valuable information as possible in the processed data; however, the directions of maximum variance are not necessarily the directions carrying the maximum amount of information [20]. In technical sciences (signal processing, artificial neural networks, etc.), this method is referred to as the lossy Karhunen-Loève transform (lossy KLT). It determines a linear transformation involving the rotation of the data to a new coordinate system formed by the eigenvectors of the covariance matrix computed for the analyzed data. The eigenvalues (corresponding to the individual eigenvectors) determine how much of the variation in the analyzed data is represented by the respective eigenvectors. The innovative approach to the KLT method formalized by the Finnish scientist Erkki Oja is based on a special ANN topology, represented by self-associative neural networks (SANN). The idea of using neural networks to implement KLT transformations also allows for a generalization of this method, freeing it from the limitation resulting from the linear nature of the transformation matrix. In this way, SANNs are also able to execute a non-linear version of the KLT transformation.
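The linear KLT/PCA reduction described above can be illustrated with a short numpy sketch; the function and variable names are illustrative and do not come from the paper:

import numpy as np

def klt_reduce(X, k):
    """Project an n x d data matrix onto its k principal directions."""
    Xc = X - X.mean(axis=0)                 # center the data
    cov = np.cov(Xc, rowvar=False)          # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # rank directions by variance
    W = eigvecs[:, order[:k]]               # keep the top-k eigenvectors
    return Xc @ W                           # n x k compressed representation

X = np.random.default_rng(1).random((176, 64))
print(klt_reduce(X, k=16).shape)            # (176, 16)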
Self-associative neural networks (SANNs), usually in the form of linear networks or MLPs (multilayer perceptrons), have a specific topology. These networks are characterized by an identical number of neurons in the input and output layers [22,23]; they are therefore intended to reproduce on their outputs the values given at the input. The characteristic feature of the auto-associative network is that the hidden layer contains fewer neurons than the input and output layers, which forces a reduction of the dimension of the input vector.
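For illustration, such a five-layer self-associative topology, with 64 inputs and outputs and a 16-neuron middle layer, might be written as follows; the sketch assumes PyTorch rather than the Statistica simulator used in the paper, and the tanh activations are an assumption:

import torch.nn as nn

class SANN(nn.Module):
    """Self-associative MLP: 64-32-16-32-64, trained to reproduce its input."""
    def __init__(self, d_in=64, d_hidden=32, d_code=16):
        super().__init__()
        # Encoder half: input -> hidden -> bottleneck (the future output layer).
        self.encoder = nn.Sequential(
            nn.Linear(d_in, d_hidden), nn.Tanh(),
            nn.Linear(d_hidden, d_code), nn.Tanh(),
        )
        # Decoder half: bottleneck -> hidden -> output of the same width as the input.
        self.decoder = nn.Sequential(
            nn.Linear(d_code, d_hidden), nn.Tanh(),
            nn.Linear(d_hidden, d_in),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))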
Networks of this type are used successfully, mostly to reduce the dimension of the vector representing the input data [24,25], which significantly supports the process of creating an optimal neural topology for a given problem. In particular, this technique can be an effective tool for the nonlinear compression of various types of data. In order to perform the neural compression (reduction) of the empirical data, the corresponding module of the Statistica v. 10 package was used. The standard procedure consists of the following steps (a code sketch of the whole pipeline is given after this list):
1. Preparation of a data file for the training of the self-associative network. For this purpose, all the original output variables (i.e., the problem that will be solved after obtaining a compressed representation of the input data) were initially "Omitted" (at the stage of searching for methods of reducing the input data set, the output signals do not matter at all). Then all input variables were also treated as output data;
2. Creation of a five-layer MLP self-associative network. The middle layer contains significantly fewer neurons than the input or output layer; the other two hidden layers have a relatively large and equal number of neurons;
3. Training of the self-associative neural network (SANN) on the basis of the training file prepared as described above. For this purpose, the conjugate gradients (CG) method was used;
4. Removal of the last two layers of the SANN. After this operation, the created network converts the primary (numerous) input data into the less numerous data of the middle layer (formerly hidden, now output). The "reduced" network processes data from the input layer to the output (formerly hidden) layer, performing a non-linear dimension reduction;
5. Use of the obtained network to generate a new version of the input data with a reduced dimension. In this way, a representative training file with a reduced dimension of the input data vector is obtained;
6. Finally, creation of a new network solving the fundamental problem and training it using the data file with the reduced dimension.
The schematic procedure of neural classification is presented in Figure 7.
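A minimal end-to-end sketch of steps 1-6, reusing the SANN class from the previous listing, is given below. It assumes PyTorch and uses the Adam optimizer as a stand-in for the conjugate-gradient training applied in the paper; the data are placeholders:

import torch

X = torch.rand(176, 64)   # stand-in for the 64-descriptor training file
y = torch.rand(176, 1)    # stand-in target of the main problem

# Steps 1-3: train the self-associative network to reproduce its input.
sann = SANN()
opt = torch.optim.Adam(sann.parameters(), lr=1e-3)
loss_fn = torch.nn.MSELoss()
for _ in range(1000):
    opt.zero_grad()
    loss_fn(sann(X), X).backward()   # target equals input
    opt.step()

# Steps 4-5: discard the decoder; the encoder maps 64 variables to 16.
with torch.no_grad():
    X_reduced = sann.encoder(X)      # new training file: 176 x 16

# Step 6: train a new network (here MLP 16-14-1) on the reduced data.
clf = torch.nn.Sequential(
    torch.nn.Linear(16, 14), torch.nn.Tanh(),
    torch.nn.Linear(14, 1),
)
opt2 = torch.optim.Adam(clf.parameters(), lr=1e-3)
for _ in range(1000):
    opt2.zero_grad()
    loss_fn(clf(X_reduced), y).backward()
    opt2.step()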
For designing the neural models, an artificial neural network simulator implemented in the Statistica v. 10 package was used. The creation of the neural models was conducted in two stages. In the first stage, the efficient "Automatic network designer" implemented in the statistical environment was used; this tool automates and simplifies the initial search for the set of networks that best model the analyzed process. In the second stage, the "User network designer" tool was used repeatedly, modifying the initial parameter settings, the learning algorithms, and the network structure itself [26].
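The internals of these Statistica tools are not described in the paper; purely for illustration, a comparable automatic search over candidate hidden-layer sizes, ranked by validation RMS error, might look as follows (all data, sizes, and training settings are assumptions):

import torch

X_tr, y_tr = torch.rand(88, 16), torch.rand(88, 1)   # stand-in training subset
X_va, y_va = torch.rand(44, 16), torch.rand(44, 1)   # stand-in validation subset
loss_fn = torch.nn.MSELoss()

def fit_and_score(hidden):
    """Train a small 16-hidden-1 MLP and return its validation RMS error."""
    net = torch.nn.Sequential(
        torch.nn.Linear(16, hidden), torch.nn.Tanh(),
        torch.nn.Linear(hidden, 1),
    )
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(200):                 # short training run per candidate
        opt.zero_grad()
        loss_fn(net(X_tr), y_tr).backward()
        opt.step()
    with torch.no_grad():
        return torch.sqrt(loss_fn(net(X_va), y_va)).item()

scores = {h: fit_and_score(h) for h in range(4, 22, 2)}
best = min(scores, key=scores.get)       # topology with the lowest RMS error
print(best, scores[best])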

Results and Discussion
The created MLP (multilayer perceptron) neural model was trained with the use of the optimization algorithms implemented in the Statistica v. 10 package [5]. Following the procedure implemented in the "Neural Networks" module, the learning process of the auto-associative network was conducted using the standard CG (conjugate gradients) algorithm. The optimization of the SANN with the structure 64:32:16:32:64 was realized in 1000 epochs. Then, the generated network was divided along its symmetry axis (three hidden layers; Figure 8). The reduced three-layer MLP SANN (64-32-16) was further used to generate a new data set consisting of 16 compressed input variables and 176 training cases [27]. Subsequently, the second MLP network (solving the main problem) was created and trained on the data set with the reduced dimension (16 descriptors). For this purpose, the "Automatic designer" tool implemented in the Statistica v. 10 package was used. The training of the reduced neural model was performed using the BP (back propagation) algorithm, realized in 1000 epochs. The MLP topology 16-14-1 turned out to be the best neural network.
One-way (feed-forward) MLP neural networks are among the best researched network topologies and the most commonly used in practice. The multilayer perceptron represents a class of so-called parametric neural models [28]. It is a one-way, multi-layer neural network taught by means of the "with teacher" (supervised) technique, characterized by the number of neurons constituting its structure being significantly smaller than the number of cases in the training file. The most frequently used measure of the performance of an artificial neural network is the total error, known as the RMS (Root Mean Square) error, produced by the generated model on the data file (training, testing, and validation) [29]. It is determined by summing the squares of the individual errors, dividing the obtained sum by the number of cases, and taking the square root of the quotient:

RMS = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - z_i)^2}

where n is the number of cases, y_i are the real values, and z_i are the values determined by the network. The RMS error is usually the most interpretable single value summarizing the network error. The quality of the neural network generated as above should be considered good, as the RMS error for MLP 16-14-1 was, respectively:
• 0.051272 for the training file;
• 0.064537 for the validation file;
• 0.073453 for the test file.
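For illustration, the formula can be checked numerically in a few lines of Python (variable names are assumptions):

import numpy as np

def rms_error(y, z):
    """RMS: square root of the mean squared difference between real and predicted values."""
    y, z = np.asarray(y), np.asarray(z)
    return np.sqrt(np.mean((y - z) ** 2))

print(rms_error([1.0, 0.0, 1.0], [0.9, 0.1, 0.8]))  # ~0.1414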
The similar, low RMS errors for the training, validation, and test files prove the good generalization properties of the resultant ANN, and their small values imply good classification properties of the generated model. The model structure of the generated synergistic MLP classifier and the scheme of the simulation process are shown in Figure 8.
The created hybrid neural model was taught with the use of the optimization algorithms implemented in the Statistica v. 10 package [11,12]. Table 3 presents the set of the top 10 generated neural classifiers. Abbreviations: RBF: radial basis function; MLP: multilayer perceptron; PI: pseudo-inversion (linear least-squares optimization); KM: K-means (assignment of centers); KN: K-nearest neighbor (assignment of deviations); BP50: back propagation, 50 training epochs; CG54: conjugate gradient descent, 54 training epochs.

Conclusions
Neural modeling and image analysis methods for identifying the quality of malting barley proved to be effective tools supporting the decision-making processes occurring during beer production. The quality identification of malting barley based on digital photos of kernels was best performed by the reduced neural network of the multilayer perceptron type (MLP: 16-14-1). The conducted analysis allowed us to state that, for the correct qualitative classification of malting barley, it is enough to know the reduced representation of the graphic data, i.e., the 16 input variables obtained by the SANN compression of the original 64 descriptors.