Yield and Quality Prediction of Winter Rapeseed—Artiﬁcial Neural Network and Random Forest Models

Abstract: As one of the greatest agricultural challenges, yield prediction is an important issue for producers, stakeholders, and the global trade market. Most of the variation in yield is attributed to environmental factors such as climate conditions, soil type and cultivation practices. Artificial neural networks (ANNs) and random forest regression (RFR) are machine learning tools that are widely used for crop yield prediction. However, research on the application of these mathematical models to the prediction of rapeseed yield and quality is limited. A four-year study (2015–2018) was carried out in the Republic of Serbia with 40 winter rapeseed genotypes. The field trial was set up as a randomized complete block design in three replications. ANN, based on the Broyden–Fletcher–Goldfarb–Shanno iterative algorithm, and RFR models were used for prediction of seed yield, oil and protein yield, oil and protein content, and 1000 seed weight, based on the year of production and genotype. The best production year for rapeseed cultivation was 2016, when the highest seed and oil yield were achieved, 2994 kg/ha and 1402 kg/ha, respectively. The RFR model showed better prediction capabilities compared to the ANN model (the r² values for prediction of output variables were 0.944, 0.935, 0.912, 0.886, 0.936 and 0.900, for oil and protein content, seed yield, 1000 seed weight, oil and protein yield, respectively).


Introduction
High and stable yield and oil content are the most important traits in rapeseed (Brassica napus L.) breeding programs. According to [1], in the last five years the world average rapeseed yield was about 2.1 t/ha. Rapeseed seed yield and quality vary depending on location, cultivar and their mutual interaction [2,3]. Seed yield is mainly affected by environmental variation such as climatic factors (temperature, precipitation, length of photoperiod, abiotic stresses), soil type, and cultivation practice (density and time of sowing, fertilization). Due to the abovementioned factors, seed yield prediction is an exceedingly challenging task.
Early yield prediction comes into focus especially in years when extreme weather events unfavourably influence crop yield. Being able to forecast low yield leaves room to issue timely warnings and develop a strategy to maintain a stable food supply chain. It is forecasted that in the near future precipitation levels will rise in northern Europe, which is expected, among other effects, to result in higher seed yield [4]. On the other hand, southern Europe will suffer from high temperatures accompanied by drought, both of which adversely affect yield [5]. Ref. [6] tested several regression models and concluded that in all of the models an increase in precipitation during autumn and winter was negatively correlated with the yield of winter rapeseed, whereas a temperature rise during flowering had a positive effect. However, other authors reported that higher temperatures negatively affect rapeseed yield [7][8][9]. Differences in the observed temperature effects may have arisen because of different growing conditions in the locations where the trials were set up. Namely, the trials of [6] were conducted in Denmark, where the climate was cooler with temperate springs. Hence, it is possible that the measured temperatures did not surpass the critical values above which, as in [7][8][9], they reflected negatively on seed yield. Seed and silique formation and development are the most important phenological phases that affect yield, which is mainly determined before ripening [10].
Yield prediction is an important part of the precision agriculture concept. Knowledge of weather and plant conditions may assist farmers, big producers, output buyers and suppliers in the early prediction of crop yield by providing them with valuable information regarding return and expected financial benefit. Data gained via remote sensing imaging with unmanned aerial vehicles (UAVs) are not only valuable for monitoring crop conditions, especially changes in crop nitrogen concentration, disease occurrence, flowering time and pod ripening [11][12][13], but can also help in estimating final yield. In the study of [14], remote sensing of vegetation indices with UAV during flowering was used to estimate rapeseed yield before harvest. They tested various vegetation indices, where the most accurate had an estimation error under 13%. Machine learning models are handy for different tasks, especially when considering living systems in which linear regression models often disregard complex interactions between variables. In [15], an enhanced vegetation index, solar-induced chlorophyll fluorescence, climate and different combinations of the mentioned variables were used as input data to compare the performance of different models. Non-linear models, such as random forest regression (RFR) and neural networks, outperformed linear models mostly because relations between examined variables were non-linear. Ref. [16] emphasized the efficacy of RFR in staple crop yield prediction. An RFR model that relies on near-infrared vegetation reflectance during several growth stages was successfully used to forecast rapeseed yield [17].
Apart from regression analysis, cutting-edge statistical models that use artificial neural network (ANN) models can be incorporated into yield predictions [18]. In addition, machine learning models are capable of establishing patterns and correlations among data [19]. Still, they do not reveal the actual cause of a relationship. This is why the dataset for a model of interest needs to go through a training phase first.
Lately, ANN was used for the estimation of crop yield and quality [18,20]. ANN models are considered to have higher accuracy in comparison with regression models [21]. The number of hidden nodes influences the precision of yield prediction, in that models with fewer nodes than the starting number of nodes perform better [21]. Ref. [22] reported that machine learning models perceive seed yield as a function of input variables, such as genotypes and environments. Ref. [23] developed an ANN with weather, soil and management data as input and predicted maize yield with 80% accuracy. Ref. [24] predicted genotypic effects of rapeseed lines and hybrids with the aid of SNP markers. Since the correlation of genetic prediction with phenotype for yield-related traits produced values similar to the estimated heritability, it was highlighted that this approach could be used to predict best-performing genotypes [24].
ANN is used as an additional tool to assist in the seed classification of rapeseed varieties [25,26]. In [18], input consisted of quantitative (precipitation, temperatures, applied fertilizers) and qualitative data (fertilizer type, liming, tillage type, sowing date and previous forecrop). Ref. [18] tested three models that differed in terms of predictive dates for plant development stages to foresee rapeseed yield.
To the best of the authors' knowledge, this is the first study to use the ANN model to predict yield-related traits as well as oil and protein content in rapeseed. The objective of this study was to investigate the possibility of predicting oil and protein content, seed yield, oil and protein yield, and 1000 seed weight, based on the year of production and genotype, using artificial neural network (ANN) and random forest regression (RFR) models.

Plant Material and Trial Design
Six traits related to yield and seed quality were surveyed, i.e., oil and protein content (OC, PC), seed yield (SY), oil and protein yield (OY, PY), and thousand seed weight (TSW), on 40 winter-type rapeseed genotypes (Table 1) during four consecutive years (2015-2018). The trial was set up as a randomized complete block design with three replicates at Rimski šančevi, Serbia (45°19′53.7″ N, 19°50′12.6″ E). The experimental plot measured 4 × 1.5 m, with 55-65 plants/m² at harvest. Sowing and harvesting were carried out at the optimum times, in September and June of each season, respectively. Prior to sowing, the soil received an adequate amount of NPK 15:15:15 (nitrogen, phosphorus, potassium) fertilizer (250-450 kg/ha), with respect to soil analysis results. Standard production technology was applied during plant cultivation. The yield was surveyed on each plot. Thousand seed weight was calculated by counting subsamples of 200 seeds per plot per replicate. Oil content in dry seeds was determined using Newport 4000 NMR and is represented as % of dry matter (d.m.). Nitrogen content was determined by the Dumas combustion method (EN ISO 16634-1:2008) and expressed in % of dry matter. Nitrogen content in % was multiplied by a conversion factor of 6.25 to obtain the overall protein content. Oil and protein yield in kg/ha were obtained by multiplying seed yield by seed oil and protein content, respectively.
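The two derived quantities described above (protein content from nitrogen content, and oil or protein yield from seed yield and content) can be illustrated with a minimal Python sketch. The function names and the 3.2% nitrogen value are hypothetical; the 2016 seed yield (2994 kg/ha) and oil content (46.83%) are the yearly means quoted in this paper.

```python
NITROGEN_TO_PROTEIN = 6.25  # conversion factor stated in the text


def protein_content(nitrogen_pct: float) -> float:
    """Protein content (% d.m.) from Dumas nitrogen content (% d.m.)."""
    return nitrogen_pct * NITROGEN_TO_PROTEIN


def component_yield(seed_yield_kg_ha: float, content_pct: float) -> float:
    """Oil or protein yield (kg/ha): seed yield times content given in %."""
    return seed_yield_kg_ha * content_pct / 100.0


# Hypothetical nitrogen reading, for illustration only.
example_protein = protein_content(3.2)

# 2016 yearly means quoted in the paper.
oil_yield_2016 = component_yield(2994, 46.83)
print(round(oil_yield_2016))  # 1402
```

The rounded result agrees with the 1402 kg/ha oil yield reported for 2016, which suggests the reported yields follow this simple product.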
Meteorological data (average daily temperature and precipitation) were collected from the meteorological station "Rimski šančevi" of the Republic Hydrometeorological Service of Serbia, which is located near the experimental field.
The colour plot diagram for mean genotypic values of the rapeseed samples was calculated and plotted using R software v.4.0.3 (64-bit version). The corrplot function was applied, with the "circle" method enabled, as a graphical tool to represent the correlation between the mean genotypic values of the observed samples.
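The matrix that such a plot visualises can be sketched as follows. This is an illustration in Python, not the R code used in the study, and it assumes Pearson correlation coefficients (the paper does not name the coefficient). The trait values are invented for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(traits):
    """Nested dict of pairwise correlations between named trait columns."""
    names = list(traits)
    return {a: {b: pearson_r(traits[a], traits[b]) for b in names} for a in names}


# Hypothetical mean genotypic values for four genotypes.
traits = {
    "seed_yield": [2100, 2500, 1900, 2800],
    "oil_content": [43.0, 45.0, 42.5, 46.0],
    "protein_content": [21.5, 20.0, 22.0, 19.0],
}
corr = correlation_matrix(traits)
print(corr["oil_content"]["protein_content"] < 0)  # True for these toy values
```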
Two machine learning algorithms, ANN and RFR, were employed to predict the oil and protein content, seed yield, oil and protein yield, and 1000 seed weight based on the year of production and genotype. Both methods are broadly utilized and have proved to be effective [27,28].

ANN Modeling
The artificial neural network model is inspired by the structure and function of the neural network of the human brain. An ANN consists of three types of layers: an input layer, one or more hidden layers, and an output layer. The nodes of such a network are interconnected and pass on information in a manner analogous to neurons in a brain. Our ANN model was built using data from Table 1. The inputs were the year of production and genotype. A multi-layer perceptron (MLP) scheme, which consisted of three layers, was used for modelling two artificial neural network (ANN) models for the prediction of oil and protein content, seed yield, oil and protein yield, and 1000 seed weight, based on the year of production and genotype. According to the literature, ANN models have proven quite capable of approximating non-linear functions [29][30][31][32]. This is important for the study of living organisms, where many relations between the examined variables are complex and non-linear. An important advantage of the ANN model is its ability to derive previously unseen relationships. Before the calculation, both input and output data were normalized (according to the min-max normalization scheme) to improve the behaviour of the ANN. During the iterative training process, input data were repeatedly presented to the network [33,34]. The Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm was used as an iterative method for solving unconstrained non-linear optimization during the ANN modelling.
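The min-max normalization mentioned above can be sketched as follows. This is a minimal illustration with hypothetical function names; the paper does not state how the categorical genotype input was encoded, so only a numeric variable (the year) is shown.

```python
def min_max_fit(values):
    """Return the (minimum, maximum) pair that defines the scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for a constant variable")
    return lo, hi


def min_max_scale(values, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    return [(v - lo) / (hi - lo) for v in values]


def min_max_inverse(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]


years = [2015, 2016, 2017, 2018]
lo, hi = min_max_fit(years)
scaled_years = min_max_scale(years, lo, hi)
restored_years = min_max_inverse(scaled_years, lo, hi)
```

Because outputs are scaled as well, network predictions have to be passed through the inverse mapping before they can be compared with measurements in kg/ha, g, or %.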
The experimental database for the ANN was randomly divided into training, cross-validation, and testing datasets (with 60%, 20%, and 20% of experimental data, respectively). The training dataset was used for the learning cycle of the ANN and also for the evaluation of the optimal number of neurons in the hidden layer and the weight coefficient of each neuron in the network. A series of different topologies were used, in which the number of hidden neurons varied from 5 to 10, and the training process of the network was run 100,000 times with random initial values of weights and biases. The optimization process was based on minimization of the validation error. It was assumed that successful training was achieved when the learning and cross-validation error curves approached zero.
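The 60/20/20 random partition can be sketched as below. The sketch assumes one record per genotype-year combination (40 × 4 = 160 records); the paper does not state whether replicate-level data were used, and the function name and seed are hypothetical.

```python
import random


def split_dataset(records, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle records and cut them into training, validation and test parts."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(records)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Assumed layout: 4 years x 40 genotypes (genotypes numbered 1..40 here).
records = [(year, genotype) for year in range(2015, 2019) for genotype in range(1, 41)]
train, val, test = split_dataset(records)
print(len(train), len(val), len(test))  # 96 32 32
```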
Coefficients associated with the hidden layer (weights and biases) were grouped in matrices $W_1$ and $B_1$, respectively. Similarly, coefficients associated with the output layer were grouped in matrices $W_2$ and $B_2$. It is possible to represent the neural network by using matrix notation ($Y$ is the matrix of the output variables, $f_1$ and $f_2$ are transfer functions in the hidden and output layers, respectively, and $X$ is the matrix of input variables) [35]:

$$Y = f_2\left(W_2 \cdot f_1\left(W_1 \cdot X + B_1\right) + B_2\right)$$

Weight coefficients (elements of matrices $W_1$ and $W_2$) were determined during the ANN learning cycle. They were updated using optimization procedures to minimize the error between the network and experimental outputs [33,36,37], according to the sum of squares (SOS) and BFGS algorithms, used to speed up and stabilize convergence [38]. The coefficients of determination were used as parameters to check the performance of the obtained ANN model.
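The matrix description above corresponds to a two-stage forward pass, sketched here in plain Python. The transfer functions are assumptions (hyperbolic tangent in the hidden layer, identity in the output layer), since the paper does not name them, and all weights in the example are placeholders. The helper also counts the coefficients of a 2-8-6 network, the topology reported for the final model.

```python
import math


def count_coefficients(n_in, n_hidden, n_out):
    """Number of weights plus biases in a single-hidden-layer MLP."""
    return (n_in * n_hidden + n_hidden) + (n_hidden * n_out + n_out)


def forward(x, W1, B1, W2, B2, f1=math.tanh, f2=lambda z: z):
    """Compute f2(W2 . f1(W1 . x + B1) + B2) for one input vector x."""
    hidden = [f1(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, B1)]
    return [f2(sum(w * h for w, h in zip(row, hidden)) + b) for row, b in zip(W2, B2)]


print(count_coefficients(2, 8, 6))  # 78

# Placeholder parameters: zero weights, so the output equals the output biases.
W1 = [[0.0, 0.0] for _ in range(8)]
B1 = [0.0] * 8
W2 = [[0.0] * 8 for _ in range(6)]
B2 = [0.5] * 6
outputs = forward([0.25, 0.75], W1, B1, W2, B2)
```

The count of 78 matches the number of weight and bias coefficients the paper reports for its MLP 2-8-6 network.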
The collected data for ANN modelling were processed statistically using the software package StatSoft Statistica, ver. 10.0, Palo Alto, CA, USA.

RFR Modeling
The random forest (RF) model is a broadly employed machine learning algorithm that is built upon decision trees to predict outputs from predictor variables [39]. The RF model can be utilized for classification or regression purposes. In random forest regression, the prediction is the mean of the predictions of the individual decision trees, which are developed on the training dataset [40]. Both ANN and RFR as machine learning models have limitations regarding interpretation, which is very important, particularly in life sciences. Still, they offer valuable insights not only into yield assessment, but also into seed quality parameters, such as oil and protein content. In addition, the RF model can reveal the importance of features. The RFR models were constructed upon the data presented in Table 1. Similarly to the ANN model, the inputs for the RFR models were the year of production and genotype. During random forest regression model calculation for the prediction of seed yield, oil and protein yield, 1000 seed weight, and oil and protein content, based on the year of production and genotype, a large set of decision trees was constructed and each tree was built according to a specific bootstrap sample within the training dataset [41]. In this study, the bootstrap function was employed to randomly split the dataset into homogeneous subsets, namely training and test subsets, which comprised 60% and 40% of the entire data, respectively [42]. New sub-samples were selected from the input sample dataset and multiple trees were added to the RFR structure to fit the obtained sub-samples. During the training cycle, the RFR model averaged the results of the created trees in order to minimize the error of prediction [28]. During the RFR calculation, the number of trees parameter was set to 100, 200, 300, 400, 500, and 10,000, while the random test data proportion was set to 40% and the sample proportion was 50%.
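The core idea of averaging trees grown on bootstrap samples can be sketched in plain Python. This is a deliberately simplified stand-in, not the Statistica implementation: each "tree" is a depth-one regression stump that splits on one randomly chosen categorical input and predicts the mean of each level, whereas a real random forest grows deeper trees. The data, function names, and seed are hypothetical; the 50% sample proportion follows the setting quoted in the text.

```python
import random
from statistics import mean


def fit_stump(sample, feature):
    """Depth-one regression tree on a categorical feature: one leaf per level."""
    groups = {}
    for row in sample:
        groups.setdefault(row[feature], []).append(row["y"])
    leaves = {level: mean(ys) for level, ys in groups.items()}
    fallback = mean(row["y"] for row in sample)  # used for unseen levels
    return lambda row: leaves.get(row[feature], fallback)


def fit_forest(train, n_trees=100, sample_fraction=0.5, seed=0):
    """Grow stumps on bootstrap samples with a randomly chosen split feature."""
    rng = random.Random(seed)
    k = max(1, int(sample_fraction * len(train)))
    trees = []
    for _ in range(n_trees):
        sample = [rng.choice(train) for _ in range(k)]  # drawn with replacement
        feature = rng.choice(["year", "genotype"])
        trees.append(fit_stump(sample, feature))
    return trees


def predict(trees, row):
    """Ensemble prediction: the mean of the individual tree predictions."""
    return mean(tree(row) for tree in trees)


# Invented toy data: a yield-like response for 4 years x 5 genotypes.
train = [{"year": y, "genotype": g, "y": 2000 + 100 * (y - 2015) + 10 * g}
         for y in range(2015, 2019) for g in range(1, 6)]
trees = fit_forest(train)
prediction = predict(trees, {"year": 2016, "genotype": 3})
```

Averaging over many resampled learners is what reduces the variance of the prediction; the number of trees is the tuning parameter varied in the study.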
The building of the RFR models was performed using StatSoft Statistica, ver. 10.0, Palo Alto, CA, USA.

The Accuracy of the Model
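The developed models were verified numerically using the coefficient of determination (r²), reduced chi-square (χ²), mean bias error (MBE), root mean square error (RMSE), and mean percentage error (MPE). MBE and RMSE have the same unit as the variable. These commonly used parameters can be calculated as follows [43]:

$$\chi^2 = \frac{\sum_{i=1}^{N} \left(x_{exp,i} - x_{pre,i}\right)^2}{N - n}$$

$$RMSE = \left[\frac{1}{N} \sum_{i=1}^{N} \left(x_{pre,i} - x_{exp,i}\right)^2\right]^{1/2}$$

$$MBE = \frac{1}{N} \sum_{i=1}^{N} \left(x_{pre,i} - x_{exp,i}\right)$$

$$MPE = \frac{100}{N} \sum_{i=1}^{N} \frac{\left|x_{pre,i} - x_{exp,i}\right|}{x_{exp,i}}$$

where $x_{exp,i}$ stands for the experimental values and $x_{pre,i}$ are the predicted values calculated by the model; $N$ and $n$ are the number of observations and constants, respectively.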
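The verification statistics named above can be sketched in Python under their usual definitions. Two points are assumptions rather than statements from the paper: r² is computed here as one minus the ratio of residual to total sum of squares, and the sign convention for MBE is predicted minus experimental. The example values are invented.

```python
import math


def fit_statistics(x_exp, x_pre, n_constants):
    """Goodness-of-fit statistics for experimental vs. predicted values."""
    n_obs = len(x_exp)
    residuals = [p - e for e, p in zip(x_exp, x_pre)]  # predicted minus experimental
    ss_res = sum(r * r for r in residuals)
    mean_exp = sum(x_exp) / n_obs
    ss_tot = sum((e - mean_exp) ** 2 for e in x_exp)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "chi2": ss_res / (n_obs - n_constants),
        "MBE": sum(residuals) / n_obs,
        "RMSE": math.sqrt(ss_res / n_obs),
        "MPE": 100.0 / n_obs * sum(abs(r) / e for r, e in zip(residuals, x_exp)),
    }


# Invented seed yields (kg/ha): the model under-predicts two of three plots.
stats = fit_statistics([2000, 2500, 3000], [1900, 2500, 2900], n_constants=1)
```

With this sign convention, a negative MBE means the model predicts lower values than were observed on average.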

Yield-Related Components
Seed and oil yield, as well as oil content, had the highest values in 2016. That year was favourable for rapeseed growing and over half of the examined genotypes yielded more than 2950 kg/ha. According to four-year mean values, Jelena and NS-L-32 belong to the same group with the highest seed yield as determined by Duncan post hoc test (Table 2). On the other hand, NS-L-45 exhibited the lowest yield. The top two genotypes with the highest oil yield were the same as for seed yield, namely Jelena and NS-L-32. NS-L-45 had the lowest oil yield. Protein yield varied among years, whereby in 2016 and 2018 their average values differed by only 19.44 kg/ha. The average protein yield was 444.93 kg/ha. The highest protein yield was noted for NS-L-32 and NS-L-136 and the lowest for Kata and NS-L-45. The mean genotypic value for 1000 seed weight in the period 2015-2018 was 4.28 g. NS-L-44 had the highest and Express the lowest 1000 seed weight. The mean seed oil content ranged between 41.57% and 46.85%, with a grand mean of 44.41%. The highest yearly average of 46.83% was recorded in 2016 and the lowest, 41.56%, in 2015. Protein content was the highest (22.57%) in 2018 and the lowest (18.24%) in 2016. Valeska svetla had the highest seed protein content (23%), which was 2% more than the grand mean for all other genotypes.

Correlation Analysis
Statistically significant correlations (p ≤ 0.05) were found for all analysed traits. During 2015-2018, oil content was in a strong negative correlation with protein content (Figure 1). In Figure 1, the colour and size of each circle depend on the correlation coefficient: blue indicates a positive correlation and red a negative one, and the size of the circle increases with the absolute value of the coefficient. The highest positive correlations were found between seed and oil yield (r = 0.995), seed and protein yield (r = 0.943), and oil and protein yield (r = 0.921). Oil content was positively correlated with seed yield, 1000 seed weight, and oil and protein yield, and its correlation with all traits except 1000 seed weight was strong. Interestingly, only 1000 seed weight was weakly correlated with all traits, negatively with protein content and positively with the other traits.

ANN Model
The acquired optimal neural network model showed good generalization capabilities for the experimental data, and could accurately predict the oil and protein content, seed yield, oil and protein yield, and 1000 seed weight, based on the year of production and genotype. The number of hidden neurons in the ANN model was eight (network MLP 2-8-6), which gave the highest values of r² (during the training cycle, r² values for the output variables were 0.742, 0.757, 0.853, 0.705, 0.872 and 0.807, for oil and protein content, seed yield, 1000 seed weight, oil and protein yield, respectively); see Table 3. The potential of the ANN model to predict yield and quality components is presented by scatter plots (Figure 2). The distribution pattern of predicted values differed in comparison with the scatter plots obtained by the RFR model (Figure 3).

The ANN model predicted experimental variables reasonably well for a broad range of the process variables. For the ANN model, the predicted values were very close to the measured values in most cases, in terms of r² values. The obtained ANN model for the prediction of output variables was built upon 78 weights-bias coefficients due to the high nonlinearity of the observed system [44,45].
The goodness of fit between experimental measurements and model-calculated outputs, represented as ANN performance, is shown in Table 4. For seed yield, RMSE was 303.25 kg/ha, which accounts for 14.11% of the overall observed yield mean. The RMSE for oil yield and oil content was 139.88 kg/ha and 1.25%, respectively, which accounts for 14.45% and 2.81% of the overall observed oil yield and content, respectively.

RFR Model
The acquired optimal random forest models showed good prediction capabilities for the experimental data, and could be used to adequately predict the oil and protein content, seed yield, oil and protein yield, and 1000 seed weight, based on the year of production and genotype. The numbers of trees that gave the highest values of r² in the RFR models were 1000, 590, 590, 1000, 200, and 1000 for oil and protein content, seed yield, 1000 seed weight, oil and protein yield, respectively (during the training cycle, r² values for the output variables were 0.944, 0.935, 0.912, 0.886, 0.936, and 0.900, respectively); see Table 5. The RFR model showed much better prediction characteristics for oil and protein content, seed yield, oil and protein yield, and 1000 seed weight based on the year of production and genotype in comparison with the ANN model (Tables 4 and 5).
The lack-of-fit tests for the RFR and ANN models were not significant, which means the models satisfactorily predicted the output variables.

Discussion
In this research, 40 rapeseed genotypes were analysed during four years for the assessment of yield and quality components. NS-L-45 was the lowest yielding genotype, probably due to low performance in 2015, which was below 1000 kg/ha. In this study the following pattern was observed: genotypes that had the highest (Jelena, NS-L-32) and the lowest yield (NS-L-45) also had the highest and lowest oil yield. Ref. [10] reported that the highest seed and oil yield were recorded in the same environments (locations) and years. Since they evaluated only three genotypes, our pattern cannot be extrapolated and discussed for comparison of genotype performance. In 2016, rapeseed had the highest yearly mean value for oil content (46.83%). Throughout May of the same year, during flowering and at the beginning of the seed filling phase, precipitation was higher than the long-term average (64.6 mm for the period 1964-2014). Although precipitation is one of the main factors that positively influence oil content in rapeseed [46,47], it should be kept in mind that it is not the only factor influencing oil content, since in years with high precipitation during seed filling oil content may not be as high as expected [48]. This is in line with our study, as in May 2015 precipitation was three times higher than the long-term average and the oil content was lowest in that year. Considering that oil and protein content are negatively correlated traits [49][50][51], 2016, the year with the lowest average protein content, was advantageous in terms of oil content. Still, not all analysed genotypes with high oil content had a lower share of proteins, e.g., in 2015 NS-L-7 had 2% higher oil content and 0.71% higher protein than average, and in 2017 NS-H-R-2 had both higher oil and protein content relative to the average values for that year. These genotypes are regarded as good resources for further breeding towards high oil and protein content, because their protein content does not drop abruptly with increasing oil content, as in the case of other genotypes.
Most traits that are used for rapeseed breeding are polygenic and represent the result of the interaction of several components. Knowledge regarding trait correlations is important for success in breeding. Due to the low heritability of yield, indirect selection appears to be the best breeding solution. The strength of the correlation between two analysed traits may differ in different agroecological growing conditions. A strong negative correlation between oil and protein content was previously reported [48,52,53]. An increase in oil content in the seed can arise either at the expense of decreasing protein content, or through a reduction of other seed components [49]. Oil content is under the control of a large number of genes that have additive and non-additive effects, whereas the environment has an impact only on additive components of genetic variance [54]. Refs. [55,56] also reported a high positive correlation between oil content and seed yield. However, unlike our results, they did not find a significant correlation between oil content and oil yield. Given the high positive correlation between seed yield on the one hand, and oil and protein yield on the other, it can be concluded that oil and protein yield tend to increase with higher seed yield, as can be seen in Figure 1. The analysis of 20 rapeseed traits with path analysis revealed 1000 seed weight to be the most important trait that influences yield [57]. A positive relationship between 1000 seed weight and seed yield was also reported by [58,59]. In contrast, [60] reported a negative correlation between these traits. The observed differences probably occurred because of the stronger influence of environmental (weather) variables on yield and 1000 seed weight over the analysed years. It can be hypothesized that in years with adverse weather conditions, rapeseed plants will decrease the number of seeds per silique, but seed size may increase, thus resulting in higher seed weight.
Similar to our dataset, which consisted of temperature, precipitation, and cultivar data, the list of variables most often used for prediction of crop yield with different machine learning models [19] ranked temperature first, rainfall third, and crop information (e.g., cultivar, crop density) fourth. The process of seed filling is generally susceptible to environmental conditions, especially temperature and precipitation. Thus, we assume that in that period information regarding weather conditions is more important than crop information for yield prediction.
The use of classic statistical procedures for the analysis of both dependent and independent variables is not as efficient as the use of machine learning models. Machine learning models make it possible to predict a larger number of variables. Non-linear machine learning models for the evaluation of yield-related traits enable deciphering non-linear relations among dependent and independent variables [61]. Prediction models can be efficiently used for rapeseed and other crop yield prediction, offering the possibility for early yield assessment, thus supporting farmers in the decision-making process toward optimum production. ANN and RFR, among other machine learning models, cope well with the analysis of complex data. The developed mathematical models provided an efficient insight into the prediction of oil and protein content, seed yield, 1000 seed weight, oil and protein yield, and the influences of production parameters, such as year of production and crop genotype, on the abovementioned traits. With the aid of these models, it is easier to predict the effects of different weather circumstances, or of the selected cultivar, on yield and quality, as well as to choose which cultivar will have the best performance in a certain environment. Knowledge regarding cultivars' reaction to specific environmental conditions is valuable for the estimation of their final performance. Additionally, information on weather, such as temperature and precipitation, is accessible to farmers, while cultivar/hybrid recommendations and production technology can be obtained from the agricultural extension service. All of this should be incorporated into applicable models and used for yield forecasting.
ANN was successfully used for the prediction of oil content in other species, such as sesame (Sesamum indicum L.) and ajowan (Carum copticum L.) [62,63]. The best fit of predicted to measured traits in our ANN model was observed for oil yield (r² = 0.851). Negative MBE values occurred for all traits except protein content. This indicates that the predicted values were smaller than the observed ones. SOS values obtained with the ANN model were of the same order of magnitude as experimental errors for output variables reported in the literature [33,37].
A high r² indicates that the variation was accounted for and that the data fitted the proposed RFR model satisfactorily [64,65]. The RMSE for seed yield was 227.08 kg/ha, which represents 10.56% of the overall observed yield mean. Compared with the RMSE of the ANN model, the RFR model produced a lower RMSE. This finding also goes in favour of using RFR for rapeseed yield prediction.
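The comparison of the two seed yield RMSE values can be made explicit with a short calculation. The four input numbers are those quoted in this paper (ANN: 303.25 kg/ha, 14.11% of the mean; RFR: 227.08 kg/ha, 10.56% of the mean); the derived mean yield and relative reduction are back-calculated here for illustration and are not values reported by the authors.

```python
# Seed yield RMSE (kg/ha) and its share of the observed mean (%), as quoted.
ann_rmse, ann_pct = 303.25, 14.11
rfr_rmse, rfr_pct = 227.08, 10.56


def implied_mean(rmse, pct_of_mean):
    """Mean yield implied by an RMSE and its stated percentage of the mean."""
    return rmse / (pct_of_mean / 100.0)


def relative_reduction(reference, improved):
    """Percentage by which `improved` is lower than `reference`."""
    return 100.0 * (reference - improved) / reference


mean_from_ann = implied_mean(ann_rmse, ann_pct)
mean_from_rfr = implied_mean(rfr_rmse, rfr_pct)
rmse_reduction = relative_reduction(ann_rmse, rfr_rmse)
print(round(mean_from_ann), round(mean_from_rfr), round(rmse_reduction, 1))
```

Both percentages imply an overall mean seed yield of roughly 2150 kg/ha, and the RFR error is about a quarter lower than the ANN error.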

Conclusions
The best performances of the analysed rapeseed genotypes (e.g., highest seed and oil yield) were achieved in 2016. The highest positive correlation was found between seed and oil yield. In order to forecast yield and quality components, machine learning models were developed based on available genotype and weather data. The current study suggests that RFR and ANN modelling can be successfully exploited for the purpose of rapeseed oil and protein content, seed yield, oil and protein yield, and 1000 seed weight prediction, based on the year of production and genotype. The artificial neural network model showed itself to be adequate for the prediction of output variables. The highest r² values were obtained with the RFR model. The mentioned r² values justified the use of the developed models in the prediction of the observed parameters. According to the results, the RFR models were more accurate and their results were closer to the experimental data, in comparison with the ANN models. It is assumed that during the phase of seed filling, input data regarding environmental conditions are more valuable for yield prediction. The incorporation of more input data can improve the efficiency of both tested models. The tested models proved their usefulness for yield prediction and suggested the possibility to use them for the prediction of oil and protein content. This study has the potential to direct new ways for promising research related to rapeseed quality prediction, such as fatty acids and glucosinolates contents. In the end, it will encourage and promote research related to the use of machine learning algorithms for yield forecasts.

Figure 1. Colour correlation graph between genotypic values for six analysed traits during four-year period. Numerical data represent the coefficients of correlations.

Figure 2. Comparison between experimentally obtained and ANN model predicted values of (a) oil and (b) protein content, (c) yield, (d) 1000 seed weight, (e) oil yield, and (f) protein yield.

Figure 3. Comparison between experimentally obtained and RFR model predicted values of (a) oil and (b) protein content, (c) yield, (d) 1000 seed weight, (e) oil yield, and (f) protein yield.

Table 2. Mean genotypic values for six analysed traits during four-year period.

Table 3. Artificial neural network model summary (performance and errors), for training, testing, and validation cycles.

Table 4. The goodness of fit tests for the developed ANN model.

Table 5. The goodness of fit tests for the developed RFR model.
