Classification of Varieties of Grain Species by Artificial Neural Networks

In this study, an Artificial Neural Network (ANN) model was developed to classify varieties belonging to grain species. Varieties of bread wheat, durum wheat, barley, oat and triticale were used. Eleven physical properties of the grains were determined for these varieties: thousand kernel weight, geometric mean diameter, sphericity, kernel volume, surface area, bulk density, true density, porosity and colour parameters. These properties were found to differ significantly among the varieties. The ANN model was designed with 11 inputs, 2 hidden layers and 2 outputs. Thousand kernel weight, geometric mean diameter, sphericity, kernel volume, surface area, bulk density, true density, porosity and colour were used as input parameters, and species and variety as output parameters. In classifying the varieties with the ANN model, R², RMSE and mean error were found to be 0.99, 0.000624 and 0.009%, respectively. In classifying the species, these values were found to be 0.99, 0.000184 and 0.001%, respectively. All the results obtained from the ANN model were in accordance with the real data.


Introduction
When marketing agricultural products, it is important for producers, industrialists and consumers to know the varieties of the products concerned. To farm properly, producers want to know the variety of the product they sow. Marketers likewise want to be sure of the variety they sell in order to establish standards for target markets. For these reasons, reliable methods are necessary for the identification of varieties. Traditional classification methods are slow and complex. Moreover, the identification of varieties of grain species is carried out by subject matter experts, so the results are neither objective nor sound. In addition, properties of grain products such as shape, size, colour and texture cannot be described by a single mathematical function. It is extremely difficult to identify and classify these products because of their natural variability. To overcome these difficulties, fast, reliable and computer-based methods are preferred [1][2][3][4]. Determining the physical properties of agricultural materials is also crucial for accurately estimating the parameters and characteristics used in engineering design [5,6].
Computer technology is an interdisciplinary research area. Different artificial intelligence techniques, such as Artificial Neural Networks (ANN), fuzzy logic and genetic algorithms, are used by researchers for data analysis. In recent years, artificial intelligence methods have also become common in agricultural applications [7,8].
Agronomy 2018, 8, 123
ANN are used in the model-building phase of most engineering studies. They are very effective for modelling uncertain, nonlinear and complicated structures. Most classical software used to predict such structures fails to give a result. ANN models can give faster results and are capable of solving complicated problems [7].
ANN are systems designed to model the methods used by the human brain. They can be realized in hardware as electronic circuits and in software on computers. In line with the way the brain processes data, an ANN can be considered a parallel processor that collects information during a learning period, stores it in the connection weights between cells and generalizes from it. ANN are formed by combining artificial neuron cells. Generally, the cells are arranged in 3 layers, and these layers together constitute the network [7].
In biological applications, ANN is frequently used for classification and product identification. ANN is effective when working with non-linear and uncertain data. Therefore, it has significant potential for the classification and identification of agricultural products [9,10].
There are three types of layers in an ANN: input, hidden and output layers (Figure 1). Every layer comprises neurons, and the layers are connected to each other by weights. There are many learning algorithms for determining the weights. The most popular is the back-propagation learning algorithm, which minimizes the total error by changing the weights. Inputs coming from the previous layer are multiplied by the weights of the corresponding connections. To produce its output, every neuron processes the weighted inputs with a transfer function, which may be linear or non-linear. Data are randomly split into training and test sets. The objective is to find the weight values that minimize the difference between the real and estimated output values in the output layer. The trained network is then tested with the data in the test set. Training is finished when the test error has reached the specified tolerance value [11].
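The weight-update idea described above can be illustrated with a minimal sketch. It is not the model used in this study: it trains a single logistic neuron by plain gradient descent on a toy data set, and the function name, learning rate, epoch count and data are all illustrative assumptions.

```python
import numpy as np

def train_single_neuron(x, target, lr=0.5, epochs=2000, seed=0):
    """Train one logistic neuron by gradient descent on the squared error.

    Returns the weights, the bias and the mean squared error per epoch.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=x.shape[1])  # small random initial weights
    b = 0.0
    history = []
    for _ in range(epochs):
        net = x @ w + b                       # weighted sum of the inputs
        out = 1.0 / (1.0 + np.exp(-net))      # logistic transfer function
        err = out - target                    # estimated minus real output
        history.append(float(np.mean(err ** 2)))
        # Gradient of 0.5 * err**2 with respect to the weighted sum.
        delta = err * out * (1.0 - out)
        w -= lr * (x.T @ delta) / len(x)      # move weights against the gradient
        b -= lr * float(delta.mean())
    return w, b, history

# Toy data (hypothetical): logical AND of two binary inputs.
x_toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
t_toy = np.array([0.0, 0.0, 0.0, 1.0])
w_toy, b_toy, mse_history = train_single_neuron(x_toy, t_toy)
```

In a multi-layer network, back-propagation applies the same chain-rule step layer by layer, passing the error term from the output layer back to the hidden layers.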
ANN is widely used in the classification of grain products. Paliwal et al. [12], using the physical properties of grains, studied the classification success of various ANN structures. Visen et al. [9] classified 5 grain species using image processing and ANN techniques. They reported classification success rates of 98.7% for barley, 99.3% for spring wheat, 96.7% for durum wheat, 98.4% for oat and 96.9% for rye. Wang et al. [13] aimed to distinguish vitreous and non-vitreous kernels in durum wheat using image processing techniques and ANN, and achieved a classification success rate of 90.1%. Using image processing and ANN techniques, Visen et al. [14] classified barley, durum wheat, spring wheat, oat and rye. They reported classification success rates of 96.4% for barley, 90.8% for durum wheat, 98% for spring wheat, 95.5% for oat and 96.4% for rye. Baykan et al. [15] used image processing and ANN techniques to classify wheat kernels according to their species. They obtained 9 physical properties and the grey level average of the kernel, and classified these descriptive properties with a multi-layer perceptron. They reported a classification success rate of 72.62% for 5 wheat species. Dubey et al. [10] classified three bread wheat varieties grown in three different centres. Using ANN with 45 morphological features, they obtained a classification success rate of 88%. Babalık et al. [16] used 9 physical properties and the colour information of kernels to determine wheat species. They developed an ANN model from this descriptive information and obtained a classification success rate of 90.66%. Pazoki and Pazoki [17] used ANN techniques in their wheat classification study and obtained a mean success rate of 86.48%. Taner et al. [18] used ANN techniques to classify oats and obtained a classification success rate of 99.99%.
In the purchase of grain, pricing and classification are carried out by experts through visual inspection; thus, the results are subjective and may differ. Furthermore, grain species have a non-linear structure with respect to properties such as form, size and colour, which also makes the identification and classification of products difficult. Computer-aided, intelligent systems are needed to eliminate these drawbacks and to contribute to the design parameters required in the agricultural machinery sector. This study aims to determine the physical properties of varieties of grain species and to develop an ANN for classifying these varieties.

Materials and Methods
In the study, 28 bread wheat varieties, 11 durum wheat varieties, 8 barley varieties, 6 oat varieties and 4 triticale varieties were used as material (Table 1). The varieties were obtained from the Bahri Dagdas International Agricultural Research Institute (Konya, Turkey, 2015). The physical properties of the varieties were determined in the study. Thousand kernel weight, geometric mean diameter, sphericity, kernel volume, surface area, bulk density, true density, porosity and colour were the parameters used. The kernels were cleaned of all foreign material such as dust, stones and straw, and of immature and damaged kernels. Measurements were made at a moisture content of 8.9%.

Physical Properties
To determine the physical dimensions of the kernels, 100 kernels were randomly divided into groups of 10. The lengths, widths and thicknesses of the 10 kernels in each group were measured and averaged. A digital Vernier calliper with a sensitivity of 0.1 mm was used for the measurements [25,26].
Geometric mean diameters, sphericities, kernel volumes and kernel surface areas were calculated by the following formulas [23,25].
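The formulas themselves are not reproduced in this text. As a hedged illustration, the sketch below uses the definitions commonly applied to grains, which are assumed here rather than taken from the source: the geometric mean diameter as the cube root of length × width × thickness, and sphericity as that diameter divided by the length, expressed in percent. Kernel volume and surface area are left out because the exact expressions used by the authors cannot be recovered. The function names and the sample dimensions are hypothetical.

```python
def geometric_mean_diameter(length_mm, width_mm, thickness_mm):
    """Cube root of the product of the three principal kernel dimensions (mm)."""
    return (length_mm * width_mm * thickness_mm) ** (1.0 / 3.0)

def sphericity_percent(length_mm, width_mm, thickness_mm):
    """Geometric mean diameter relative to kernel length, in percent."""
    dg = geometric_mean_diameter(length_mm, width_mm, thickness_mm)
    return 100.0 * dg / length_mm

# Hypothetical kernel dimensions chosen only to keep the arithmetic simple.
dg_example = geometric_mean_diameter(8.0, 4.0, 2.0)
sphericity_example = sphericity_percent(8.0, 4.0, 2.0)
```

With these made-up dimensions the product is 64 mm³, so the geometric mean diameter is 4 mm and the sphericity is 50%.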
Thousand kernel weight was determined by weighing 400 kernels counted with 4 replications and taking their average [27,28].
Bulk density was measured according to the method of the Association of Official Analytical Chemists. In this method, kernels were poured into a 500 mL cylinder from a height of 15 cm. The surface was levelled by sweeping off the excess without applying any pressure, and the kernels in the cylinder were weighed. Bulk density was calculated as the ratio of the weight of the kernels to the cylinder volume [25,29,30].
True density was measured using the water displacement method. In this method, 500 mL of water was first poured into a 1000 mL cylinder. Then, 30 g of kernels were put into the water in this cylinder, and the rise in the water level was measured immediately. True density was calculated as the ratio of the weight of the kernels to the volume of liquid displaced [29,31].
To calculate the porosity, the following equation was used [23].
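The porosity equation is not reproduced in this text. The sketch below assumes the usual relation between the two densities, porosity = (1 − bulk density / true density) × 100, together with the density definitions given above. The function names and the sample masses and volumes are hypothetical.

```python
def bulk_density(kernel_mass_kg, container_volume_m3):
    """Mass of kernels filling the container divided by the container volume."""
    return kernel_mass_kg / container_volume_m3

def true_density(kernel_mass_kg, displaced_volume_m3):
    """Mass of kernels divided by the volume of water they displace."""
    return kernel_mass_kg / displaced_volume_m3

def porosity_percent(bulk_kg_m3, true_kg_m3):
    """Share of the bulk volume that is void space, in percent (assumed relation)."""
    return 100.0 * (1.0 - bulk_kg_m3 / true_kg_m3)

# Hypothetical measurements: 380 g of kernels in the 500 mL cylinder,
# and 30 g of kernels displacing 23 mL of water.
rho_bulk = bulk_density(0.380, 500e-6)
rho_true = true_density(0.030, 23e-6)
porosity_example = porosity_percent(rho_bulk, rho_true)
```

Applied to the extreme bulk and true density values reported later in the paper (482.80 and 997.36 kg/m³; 773.17 and 1271.88 kg/m³), this relation gives roughly 51.6% and 39.2%, which is close to the reported porosity range and suggests, without confirming, that the same relation was used.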
Colour parameters (L, a, b) were measured using the Hunter Lab Mini Scan XEplus Colourimeter (Hunter Associates Laboratory Inc., Reston, VA, USA). In the Hunter scale, an L value of 100 means white and zero means black; a positive a value indicates redness and a negative one greenness; a positive b value indicates yellowness and a negative one blueness [32,33].

Artificial Neural Networks
The ANN model was developed using the Matlab NN Toolbox (The Mathworks Inc., Natick, MA, USA). A total of 539 data were used in the model. Thousand kernel weight, geometric mean diameter, sphericity, kernel volume, surface area, bulk density, true density, porosity and colour parameters (L, a, b) were used as input parameters, and species and variety as output parameters.
While establishing the ANN model, all the data were normalized between 0 and 1 [34].
For normalization, the following equation was used. To obtain real values from the normalized values, the "y" value was calculated using the same formula.
To develop the ANN model, the normalized data were divided into a training set and a test set. The training set contained 502 data and the test set 37 data. The most suitable number of neurons in the hidden layers was sought in the range of 2-25 by trial and error. To find the most suitable number of epochs, values from 1 to 10,000 were tried, and the best value for the model was determined from these trials.
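Because the normalization equation is missing from this text, the sketch below assumes ordinary min-max scaling to the interval [0, 1], which matches the description but is not confirmed by the source. It also shows a random 502/37 split of 539 records. The data are random stand-ins, and all names are hypothetical.

```python
import numpy as np

def normalize(x, x_min, x_max):
    """Min-max scaling to [0, 1] (assumed form of the missing equation)."""
    return (x - x_min) / (x_max - x_min)

def denormalize(y, x_min, x_max):
    """Invert the scaling to recover values in the original units."""
    return y * (x_max - x_min) + x_min

rng = np.random.default_rng(0)
# Stand-in data: 539 records with 11 input properties each (not the real data).
data = rng.uniform(20.0, 60.0, size=(539, 11))

col_min = data.min(axis=0)   # per-property minimum
col_max = data.max(axis=0)   # per-property maximum
scaled = normalize(data, col_min, col_max)

# Random split into 502 training records and 37 test records.
order = rng.permutation(len(scaled))
train_set = scaled[order[:502]]
test_set = scaled[order[502:]]
```

Scaling each property separately keeps properties with large numeric ranges, such as the densities, from dominating those with small ranges, such as the colour parameters.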
In the ANN model, a feed-forward back-propagation multilayer perceptron network structure was used. The back-propagation algorithm is the most popular and commonly used algorithm for this type of network. It minimizes the total error by varying the weights in order to improve network performance [35,36]. The training algorithm used was the Levenberg-Marquardt algorithm [37,38].
Training of the network was continued until the test error reached the specified tolerance value. After training ended successfully, the network was tested with the test data [9].
To evaluate the performance of the model, RMSE and R², which are commonly used accuracy measures based on the concept of mean error, were calculated using the following formulas [39].
Here, RMSE is the root mean square error, R² the coefficient of determination, m the number of data, x the real value and x₁ the estimated value.
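The two formulas are not reproduced in this text. The sketch below uses the standard definitions of RMSE and of the coefficient of determination. Whether the authors used exactly this form of R² is an assumption, and the sample values are made up.

```python
import numpy as np

def rmse(real, estimated):
    """Root mean square error between real and estimated values."""
    real = np.asarray(real, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return float(np.sqrt(np.mean((estimated - real) ** 2)))

def r_squared(real, estimated):
    """Coefficient of determination, 1 - SS_res / SS_tot (standard form, assumed)."""
    real = np.asarray(real, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    ss_res = np.sum((real - estimated) ** 2)     # unexplained variation
    ss_tot = np.sum((real - real.mean()) ** 2)   # total variation
    return float(1.0 - ss_res / ss_tot)

# Made-up values for illustration only.
real_values = [1.0, 2.0, 3.0, 4.0]
estimated_values = [1.1, 1.9, 3.2, 3.8]
rmse_example = rmse(real_values, estimated_values)
r2_example = r_squared(real_values, estimated_values)
```

For these made-up values the squared errors sum to 0.10, giving an RMSE of about 0.158 and an R² of 0.98.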
The error between real values and estimated ones was calculated by means of the following equation [40].
Here, ε is the relative error, m the number of data, x the real value and x₁ the estimated value.
Data concerning the physical parameters were evaluated as a factorial experiment in a randomized complete block design, using the JMP statistical program (SAS Institute Inc., Cary, NC, USA) [41].
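The relative error equation referred to above is also missing from this text. A minimal sketch, assuming the common mean relative error form ε = (100/m) Σ |(x − x₁)/x|, is given below. The function name and the sample values are hypothetical, and the real values must be non-zero.

```python
import numpy as np

def mean_relative_error_percent(real, estimated):
    """Mean absolute relative error in percent (assumed form of the equation)."""
    real = np.asarray(real, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    # Each term is the absolute error relative to the real value.
    return float(100.0 * np.mean(np.abs((real - estimated) / real)))

# Made-up values: each estimate is off by 5% of the real value.
error_example = mean_relative_error_percent([2.0, 4.0], [1.9, 4.2])
```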

Physical Properties
The mean and standard deviation values of the physical properties of the bread wheat, durum wheat, barley, oat and triticale varieties used in the study are given in Table 2. All of the physical properties were found to be statistically significant (p < 0.05).

Thousand Kernel Weight
The highest mean thousand kernel weights were 48.47 g for durum wheat and 48.49 g for barley; the lowest, 33.83 g, was for oat, which fell into the last statistical group. Among the varieties, the highest thousand kernel weight was 56.80 g for the durum wheat variety Meram-2002, and the lowest was 28.18 g for the oat variety Faikbey. Babić et al. [42], in their study on bread wheat, obtained a mean thousand kernel weight of 44.01 g. Topal et al. [43], in their study on 16 lines and varieties of durum wheat, obtained a mean thousand kernel weight of 41.73 g. Güner [44] reported a mean thousand kernel weight of 47.38 g for durum wheat. Dursun and Güner [45], in their study on barley, obtained a thousand kernel weight of 43 g, and Güner [44] obtained 38.18 g in another study. Molenda and Horabik [46], in their study on oat, found a mean thousand kernel weight of 35.6 g, and Nelson [47] found 34.8 g.

Geometric Mean Diameter
Geometric mean diameter values varied from 3.93 mm to 4.44 mm. The highest value was obtained in barley and the lowest in bread wheat. The barley variety Larende (4.60 mm) had the highest geometric mean diameter, and the bread wheat variety Yakar-99 (3.71 mm) had the lowest. Babić et al. [42], in their study on bread wheat, obtained a geometric mean diameter of 3.13 mm; Markowski et al. [48] obtained 4.13 mm. Güner [44] obtained a geometric mean diameter of 4.35 mm for durum wheat. In studies on barley, geometric mean diameters were reported as 4.43 mm by Tavakoli et al. [49], 4.38 mm by Güner [44] and 4.54 mm by Song and Litchfield [50].

Sphericity
Sphericity values ranged from 34.76% to 60.85%. The highest value was obtained in the bread wheat variety Kınacı-97 (65.46%) and the lowest in the oat variety Faikbey (32.44%). Markowski et al. [48], in their study on bread wheat, determined a sphericity value of 57.12%. Güner [44] obtained a sphericity value of 55.06% for durum wheat, and Topal et al. [43] obtained 50.56%. Tavakoli et al. [49], in their study on barley, found 47.70%, and Güner [44] found 46.10%.

Kernel Volume
Kernel volume values ranged from 21.04 mm³ to 28.11 mm³. The highest value was obtained in the barley variety Larende (31.13 mm³) and the lowest in the bread wheat variety Karahan-99 (17.01 mm³). Markowski et al. [48], in their study on bread wheat, determined a kernel volume of 36.95 mm³. Güner [44] obtained a kernel volume of 35.76 mm³ for durum wheat; Nelson [47] found 26.1 mm³. Güner [44], in his studies on barley, obtained a kernel volume of 38.37 mm³. Nelson [47] obtained a mean kernel volume of 21.4 mm³ for spring oat and 26.8 mm³ for winter oat.

Surface Area
Oat (55.22 mm²) had the largest surface area, whereas bread wheat (40.96 mm²) had the smallest. Babić et al. [42], in their study on bread wheat, obtained a surface area of 30.07 mm². Topal et al. [43] reported a surface area of 26.93 mm² for durum wheat. Güner [44] obtained a surface area of 25.10 mm² for barley.

Bulk Density
Bulk density values varied from 482.80 kg/m³ to 773.17 kg/m³. The highest value was obtained in the wheat variety Pehlivan (804.44 kg/m³) and the lowest in the oat variety Argentina (449.99 kg/m³). Markowski et al. [48], in their study on bread wheat, reported a bulk density of 732.57 kg/m³. In studies on durum wheat, bulk density values were reported as 788 kg/m³ by Nelson [47], 815 kg/m³ by Güner [44] and 760 kg/m³ by Sokhansanj and Lang [51]. In studies on oat by different researchers, bulk density values varied in the range of 412-576 kg/m³ [23,46,47].

True Density
True density values ranged from 997.36 kg/m³ to 1271.88 kg/m³. The highest value was obtained in the bread wheat variety Alpu 2001 (1316.20 kg/m³) and the lowest in the oat variety Argentina (954.32 kg/m³). Markowski et al. [48], in their study on bread wheat, reported a true density of 1382.17 kg/m³. Nelson [47] obtained a true density of 1411 kg/m³ for durum wheat. In other durum wheat studies, the true density was reported as 1325 kg/m³ by Güner [44] and 1370 kg/m³ by Sokhansanj and Lang [51]. Güner [44], in his study on barley, obtained a true density of 995 kg/m³. Nelson [47] found a mean true density of 1314 kg/m³ for spring oat and 1295 kg/m³ for winter oat. In other studies on oat, true density values varied in the range of 950-1397 kg/m³ [23,46].

Porosity
Porosity values varied from 39.18% to 51.54%. The highest value was obtained in the oat line Y-1779 (53.42%) and the lowest in the bread wheat variety Kınacı-97 (37.48%). Markowski et al. [48], in their study on bread wheat, reported a porosity of 46.9%. Whereas Güner [44] found a porosity of 38.49%, Sokhansanj and Lang [51] found 45%. Güner [44] obtained a porosity of 31.25% for barley. Molenda and Horabik [46] obtained porosity values in the range of 59.5-62.5% for oat.

Colour
Values of the colour parameters (L, a, b) of the varieties are given in Table 2. On average, the L, a and b values were 50.74, 7.61 and 17.85, respectively. There was a statistically significant difference between species and between varieties (p < 0.05). Colour differences are important in the classification of wheat varieties [4].
All the physical parameters evaluated fell into statistically different groups, indicating that each of these parameters is important in the classification of varieties.

Artificial Neural Networks
In the ANN model, the network structure was designed in the form 11-(7-7)-2, consisting of 11 inputs, 2 hidden layers with 7 neurons each and 2 outputs (Figure 1). The Levenberg-Marquardt algorithm was used as the training algorithm [37,38]. As transfer functions, tansig was used in the first hidden layer, logsig in the second hidden layer and a linear function in the output layer. For the network, the lowest training error was obtained at an epoch number of 10,000.
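The 11-(7-7)-2 structure described above can be sketched as a forward pass. This is only an illustration under stated assumptions. The weights and biases are random placeholders, not the values from Tables 3-6. `tansig` and `logsig` follow their standard Matlab definitions. The function and variable names are hypothetical.

```python
import numpy as np

def tansig(n):
    """Hyperbolic tangent sigmoid: 2 / (1 + exp(-2n)) - 1."""
    return 2.0 / (1.0 + np.exp(-2.0 * n)) - 1.0

def logsig(n):
    """Logistic sigmoid: 1 / (1 + exp(-n))."""
    return 1.0 / (1.0 + np.exp(-n))

def forward(x, w1, b1, w2, b2, w3, b3):
    """Forward pass of an 11-(7-7)-2 network with a linear output layer."""
    f_j = tansig(w1 @ x + b1)    # first hidden layer, 7 neurons
    f_k = logsig(w2 @ f_j + b2)  # second hidden layer, 7 neurons
    return w3 @ f_k + b3         # two outputs: species and variety

rng = np.random.default_rng(0)
# Placeholder parameters with the shapes implied by the 11-(7-7)-2 structure.
w1, b1 = rng.normal(size=(7, 11)), rng.normal(size=7)
w2, b2 = rng.normal(size=(7, 7)), rng.normal(size=7)
w3, b3 = rng.normal(size=(2, 7)), rng.normal(size=2)

x_sample = rng.uniform(0.0, 1.0, size=11)  # one normalized input record
y_sample = forward(x_sample, w1, b1, w2, b2, w3, b3)
```

Replacing the random placeholders with trained weights and biases would let the same function reproduce the network's predictions for a normalized input record.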
The mathematical formula of the ANN model is given in the following equation.

$$y_m = \sum_{k} W^{3}_{km} F_k + b_m$$

LOGSIG transfer function for the second hidden layer (F_k):

$$NET_k = \sum_{j} W^{2}_{jk} F_j + b_k, \qquad F_k = \frac{1}{1 + e^{-NET_k}}$$

TANSIG transfer function for the first hidden layer (F_j):

$$NET_j = \sum_{i} W^{1}_{ij} x_i + b_j, \qquad F_j = \frac{2}{1 + e^{-2 NET_j}} - 1 \qquad (14)$$

The outputs were calculated using the equations above. In these equations, i is the number of inputs, j the number of neurons in the first hidden layer, k the number of neurons in the second hidden layer, m the number of outputs, W¹, W² and W³ the connection weights, x the input parameter, y_m the output parameter and b the bias. Weights are given in Tables 3-5 and bias values in Table 6.
Among the models obtained, the ANN model with the lowest RMSE and the highest R² value was determined to be the best fit. R² and RMSE values for species were 0.99 and 0.000027, respectively, in the training set, and 0.99 and 0.000184, respectively, in the test set. R² and RMSE values for varieties were 0.99 and 0.000318, respectively, in the training set, and 0.99 and 0.000624, respectively, in the test set (Table 7). The coefficient of determination (R²) of the correlation between the experimental data and the values predicted by the ANN model was found to be 99.99% for both species and varieties (Figures 2 and 3). The experimental data, the values predicted by the ANN model and the error values between them are given in Table 8. The mean error value was found to be 0.009% for variety and 0.001% for species.

Figure 2. Regression graphics between ANN-predicted values and experimental data for species.

Figure 3. Regression graphics between ANN-predicted values and experimental data for variety.


Table 1. Varieties used in the study.

Table 2. The mean and standard deviation values of the physical properties of the grain varieties.
Mean: Arithmetic mean. CV: Coefficient of Variation. LSD: Least Significant Difference.

Table 7. Performance of the ANN model.
