Application of Artificial Neural Networks for Multi-Criteria Yield Prediction of Winter Rapeseed

The aim of the work was to produce three independent, multi-criteria models for the prediction of winter rapeseed yield. The models were constructed so that the yield prediction can be carried out on three dates: April 15th, May 31st, and June 30th. The models were built with artificial neural networks with multi-layer perceptron (MLP) topology, on the basis of meteorological data (temperature and precipitation) and information about mineral fertilisation. The data were collected in the years 2008–2015 from 328 production fields located in Greater Poland, Poland. The quality of the forecasts produced by the neural models was assessed by determining forecast errors using four indicators: RAE (relative approximation error), RMS (root mean square error), MAE (mean absolute error), and MAPE (mean absolute percentage error). An important feature of the produced prediction models is the ability to make the forecast in the current agrotechnical year on the basis of the current weather and fertiliser information. The lowest MAPE value, 7.51%, was obtained for the neural model WR15_04 (April 15th), based on the MLP network with structure 15:15-18-11-1:1. The other models reached MAPE values of 7.85% for model WR31_05 (May 31st) and 8.12% for model WR30_06 (June 30th). The sensitivity analysis identified the factors that have the greatest impact on winter rapeseed yields. In two networks, the highest rank of 1 was obtained by the same independent variable: the sum of precipitation from September 1st to December 31st of the previous year. In model WR15_04, however, the highest rank was obtained by the sum of molybdenum fertilization in the current year (MO_CY).
The models of winter rapeseed yield produced in the work will be the basis for the construction of new forecasting tools, which may be an important element of precision agriculture and the main element of decision support systems.


Introduction
The prediction of the quantity and quality of crop yields is very important in terms of planning, use of the means of production, current decision-making, transport, stockholding, and risk management [1][2][3]. The prediction of yields during the growing season is the basis for estimating production levels and expected yields at the end of the growing season, and therefore the amount of income [4].
The results of plant production are strongly influenced by atmospheric conditions, which may vary due to climate change. Reliable estimates of the effects of climate change require the integration of meteorological and crop data in the produced models [5]. The yield of plants depends on a large number of factors, which are often correlated with each other and directly or indirectly affect the yield of a particular plant. The most common factors are soil factors (pH, structure, organic matter content, nutrient levels), weather and climate factors (air temperature, precipitation, sunshine), soil cultivation technology, plant variety, fertilisation technology and level, plant protection, harvesting technology, and crop rotation [6].
Modern technologies of cultivation and harvesting of plants have a growing impact on the quantity and quality of yields. This is also connected with the possibility of using linear and non-linear prediction models to perform simulations before harvest and, as a consequence, to optimize the production process [1]. For example, in [7], an artificial neural network (ANN) and multiple linear regression (MLR) were used to build a yield model for ajowan (Trachyspermum ammi L.). It was shown that all quality parameters of the model, i.e., the determination coefficient (R²), mean absolute error (MAE), and root mean square error (RMS), are better for the ANN model than for MLR. Similarly, in [8], the ANN and MLR methods were used to produce a yield model for safflower (Carthamus tinctorius L.). The results of the analyses (R², MAE, RMS) also confirm better results for ANN models than for MLR models. Therefore, crop yield models are used to develop forecasting tools, which can be an important element of high-precision agriculture and the main element of decision-making support systems [9]. Precision agriculture can help in managing crop production inputs in an environmentally friendly way. By using site-specific knowledge, precision agriculture can target rates of fertilizer, seed, and chemicals to soil and other conditions. The concepts of precision agriculture and sustainability are inextricably linked. From the first time a global positioning system was used on agricultural equipment, the potential for environmental benefits has been discussed. Intuitively, applying fertilizers and pesticides only where and when they are needed should reduce the environmental burden [10].
Rapeseed is one of the most important oilseeds. It is a basic raw material for the food industry and an element of renewable energy sources as a component of biofuel for diesel engines. In this context, competition in land use between food and energy crops is becoming a critical issue. As a result, there is a growing demand for decision-making support tools [20][21][22].
Rapeseed is grown mainly in Europe, Canada, China, and India. In Poland, winter rapeseed was cultivated in 2016 on 826,946 ha of sown area. The average yield per 1 ha was 26.8 dt and the total production was 2,219,270 tons. Winter rapeseed ranks third in Poland in terms of total cultivated area. In 2016, Polish winter rapeseed production ranked third among the 28 member states of the European Union. First place was taken by France, with an area of 1,550,720 ha and a total production of 4,727,961 tons [23].
In recent years, the growing importance of rapeseed cultivation has led to many attempts to adapt commonly known models of cultivation systems to simulate the yields of winter rapeseed. Examples are CERES (Crop Environment Resource Synthesis) [24], AquaCrop (a water-driven model) [25], DSSAT (Decision Support System for Agrotechnology Transfer) [26], and APSIM (Agricultural Production Systems sIMulator) [27]. For a reliable and accurate yield prediction, it is necessary to obtain meteorological, cultivation, and fertilizer information. Unfortunately, most models are based on information from specialised research. Consequently, prediction models based on such detailed data cannot be used by a wide range of users.
Therefore, the aim of this paper is to create multi-criteria yield models on the basis of information generally available to every farmer. These models are an important element of precision agriculture, which is part of the concept of sustainability, including respect for the natural environment. The novelty of this paper is the selection of a data structure for the produced models. The data are divided into meteorological data and information on fertilisation from the previous year and from the current year. The data for the current year were additionally divided into three ranges corresponding to the forecast dates: 15 April, 31 May, and 30 June. All data necessary to simulate the yield come from the period before the winter rapeseed harvest. Such actions lead to the minimisation of the forecast error. The forecast dates, i.e., 15 April, 31 May, and 30 June, were defined in cooperation with farmers. On these dates, in Greater Poland, an initial estimation of winter rapeseed yield is made. Moreover, it is possible to simulate the yield at any time for future weather and fertilisation parameters. This makes it possible to build different scenarios for cultivation and care, harvest, storage, and making decisions about selling grain, considering price trends.

Sustainability 2019, 11

Materials and Methods
Predictive neural models were built based on data collected in the years 2008–2015 from winter rapeseed fields located in Poland, in the central and south-western part of Greater Poland, and particularly in the districts of Poznań (52°24'29.759"N 16°56'0.672"E), Kościan (52°5'10.77"N 16°38'41.998"E), and Gostyń (51°53'5.762"N 17°0'47.829"E) (Figure 1). In total, data from 328 fields were used for model construction and verification (Table 1). This information formed the basis of a database for the construction of predictive neural models, which was divided into two sets, A and B. Set A (292 fields) is composed of information from 2008–2014, which was used to build the models. Set B (36 fields) consists of information from 2015, which was not involved in model building, but was used for model validation. Meteorological data (air temperature and precipitation) for the research area and period were obtained from Davis stationary and mobile meteorological stations located closest to the research area, namely in Kórnik, Gola, Turew, Piotrowo, and Stary Gołębin.
The neural predictive models were constructed for three prediction dates in a calendar year: April 15th, May 31st, and June 30th. The models have been named, respectively, WR15_04, WR31_05, and WR30_06. The models included factors (independent variables) that affect crop yields and are easily available to agricultural producers (Table 2).
This approach to the prediction of winter rapeseed yields enables forecasts to be made and expected yields to be simulated directly before harvesting, in the same agricultural year.

Table 2 (fragment; only part of the table survives). Recovered column headings: Model WR30_06; The scope of data.
R9-12_LY (mm): the sum of precipitation from 1 September to 31 December of the previous year.
The average air temperature from 1 September to 31: + + +; 4.9-9.4.
The sum of Zn fertilization in the current year: + + +; 10-560.
"+": the variable exists in the model; "-": the variable does not exist in the model.

Method of Construction of Neural Models
Independent variables for the construction of neural models were selected in such a way that each neural network used a different number of independent variables, which are presented in Tables 1 and 2.
In the selection of a network topology and learning method, account was taken of the network's ability to approximate and generalise, based on measures of network quality. Using the Statistica v7.1 [28] program, it was possible to test networks with different architectures. For each of the neural models, WR15_04, WR31_05, and WR30_06, 10,000 networks were tested with the use of the automated network designer (AND). Network selection was made based on the best parameters determining the network quality.
The set of empirical data was divided randomly into a learning set, a validation set, and a test set in the proportions of 70%-15%-15%, taking account of the number of fields included in the study. The sizes of the sets were as follows: learning set, 204 cases; validation set, 44 cases; test set, 44 cases.
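The random division described above can be illustrated with a short Python sketch. It is not the procedure implemented in Statistica; the function name `split_cases`, the fixed seed, and the use of consecutive integers as field identifiers are assumptions made only for illustration. The sketch shows how a 15% share of the 292 fields in set A rounds to 44 cases, leaving 204 cases for learning.

```python
import random


def split_cases(case_ids, n_validation, n_test, seed=0):
    """Randomly split case identifiers into learning, validation, and test sets."""
    shuffled = list(case_ids)
    random.Random(seed).shuffle(shuffled)  # seeded only to make the sketch repeatable
    validation = shuffled[:n_validation]
    test = shuffled[n_validation:n_validation + n_test]
    learning = shuffled[n_validation + n_test:]
    return learning, validation, test


# Hypothetical identifiers for the 292 fields of set A.
fields = list(range(1, 293))

# 15% of 292 is 43.8, which rounds to 44 cases for validation and for testing.
n_part = round(0.15 * len(fields))
learning, validation, test = split_cases(fields, n_part, n_part)

print(len(learning), len(validation), len(test))
```

Each field lands in exactly one of the three sets, so the set sizes always add up to the number of fields.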

Methodology for Validating the Neural Models
Following the construction of neural models using the automated network designer, each model was evaluated based on information obtained from Statistica, namely the standard deviation, mean error, error deviation, mean absolute error, deviation quotient, and correlation. The best model was selected based on the smallest value of the mean absolute error and the largest value of the correlation.
In the next step, the predictive ability of the constructed neural models was evaluated using ex post measures of the prediction error, comparing data from set B with the results of the predictions made by the models built on set A. These errors are computed on the basis of materials from the past, namely expired predictions and the corresponding actual values of the predicted variable. The prediction error is the difference between the observed and predicted value.
Validation of the constructed models was performed based on data from the last year of the study (2015), covering 36 fields of winter rapeseed. These data were not used in the construction of the neural models. The quality of the predictions was evaluated using a methodology widely described in the literature [2,29-34].
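• RAE (relative approximation error):
$$\mathrm{RAE} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} y_i^2}} \tag{1}$$
• RMS (root mean square error):
$$\mathrm{RMS} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}} \tag{2}$$
• MAE (mean absolute error):
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{3}$$
• MAPE (mean absolute percentage error):
$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \cdot 100\% \tag{4}$$
where $n$ is the number of observations, $y_i$ are the actual values obtained during research, and $\hat{y}_i$ are the values given by the model.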
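The four ex post measures can be sketched in Python as follows. The definitions of RMS, MAE, and MAPE are the standard ones; the form of RAE used here (root of the summed squared errors divided by the summed squared actual values) is an assumption about the definition intended in this work. The yield values in the example are hypothetical and are not taken from the study.

```python
import math


def rae(actual, predicted):
    """Relative approximation error (assumed form): sqrt(sum of squared errors / sum of squared actuals)."""
    squared_errors = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return math.sqrt(squared_errors / sum(y ** 2 for y in actual))


def rms(actual, predicted):
    """Root mean square error."""
    squared_errors = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return math.sqrt(squared_errors / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(actual, predicted)) / len(actual)


# Hypothetical observed and predicted yields (t/ha), for illustration only.
actual = [4.0, 2.0, 5.0]
predicted = [3.0, 2.5, 5.0]

for name, measure in [("RAE", rae), ("RMS", rms), ("MAE", mae), ("MAPE", mape)]:
    print(name, round(measure(actual, predicted), 4))
```

RAE and MAPE are relative measures, whereas RMS and MAE are expressed in the units of the yield, which is why the measures are reported together.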
For better visualisation of the relations between the observed and predicted yield, graphs were plotted showing those relations for each prediction date.

Neural Network Sensitivity Analysis
In order to check which of the examined independent features contribute the most to explaining the variability of biological yields of winter rapeseed, a sensitivity analysis of the constructed neural networks was carried out. After a specific input variable (independent trait) is removed from the model, its influence on the total error of the neural network can be observed, which allows the significance of individual independent traits (their influence on the output variable, i.e., yield) to be determined. Two indicators were used for this purpose. The error quotient is the ratio of the error obtained without a given feature to the error obtained using all independent features; the larger this value, the greater the significance of the given feature. If it is less than 1, the feature may be removed from the model in order to improve its quality, although this is not a compulsory procedure. The rank orders the features numerically by decreasing error, with a rank of 1 indicating the greatest significance for the network.
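The two indicators can be illustrated with a minimal Python sketch. It makes two assumptions that are not stated in the text: "removing" a variable is emulated by replacing its values with the variable's mean, and the network error is measured as the root mean square error. The toy linear model, the variable names `x0` and `x1`, and all numbers are hypothetical stand-ins for a trained network and its inputs.

```python
import math


def rms_error(model, rows, targets):
    """Root mean square error of a model over a set of cases."""
    return math.sqrt(sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows))


def sensitivity(model, rows, targets, names):
    """Return the error quotient and the rank of every input variable."""
    baseline = rms_error(model, rows, targets)  # error with all variables present
    quotients = {}
    for j, name in enumerate(names):
        mean_j = sum(r[j] for r in rows) / len(rows)
        # Assumption: a variable is "removed" by fixing it at its mean value.
        masked = [list(r[:j]) + [mean_j] + list(r[j + 1:]) for r in rows]
        quotients[name] = rms_error(model, masked, targets) / baseline
    ordered = sorted(names, key=lambda n: quotients[n], reverse=True)
    ranks = {name: position + 1 for position, name in enumerate(ordered)}
    return quotients, ranks


def toy_model(row):
    """Hypothetical stand-in for a trained network with two inputs."""
    return 2.0 * row[0] + 0.1 * row[1]


rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 15.0], [4.0, 5.0]]
noise = [0.1, -0.1, 0.1, -0.1]
targets = [toy_model(r) + e for r, e in zip(rows, noise)]

quotients, ranks = sensitivity(toy_model, rows, targets, ["x0", "x1"])
print(quotients, ranks)
```

In this toy case the first variable drives the output more strongly, so its removal increases the error most and it receives rank 1.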

Results
As a result of the analyses, one neural model was selected for each prediction date. Basic information on the quality of the neural models, WR15_04, WR31_05, and WR30_06, is given in Table 3. The general structure of the designed neural network model is presented in Figure 2. To determine the quality of the prediction, the ex post measures were computed using formulae (1)-(4). The results are given in Table 4.
In the next step, graphs were plotted showing the relationship between the actual and forecasted yield for each prediction date. Figure 3 shows this relationship for the models WR15_04, WR31_05, and WR30_06, respectively. Figure 3 shows a comparison of the yields forecast by all of the models with the observed yield. Fields no. 29 and 33 clearly differ from the average value of the observed yield, which amounted to 3.81 t·ha⁻¹. The observed yield was 2.14 t·ha⁻¹ for field no. 29 and 4.63 t·ha⁻¹ for field no. 33. The average value of the forecast yield for models WR15_04, WR31_05, and WR30_06 amounted to 3.92 t·ha⁻¹, 3.58 t·ha⁻¹, and 3.87 t·ha⁻¹, respectively.

Network Sensitivity Analysis
In the last step of the computations, network sensitivity analysis was carried out for all of the constructed neural models. The results of this analysis are given in Table 5.
As shown in Figure 4, model WR31_05 has the best match between the real yield and the forecasted yield, with a determination coefficient R² of 0.6286. The other models deviate only slightly from this result: R² was 0.6187 for model WR15_04 and 0.5976 for model WR30_06.
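As an illustration of how a determination coefficient of this kind can be computed, the sketch below calculates R² as the squared Pearson correlation between observed and forecast yields, which equals the R² of a straight line fitted to the scatter plot. Whether the values reported here were obtained in exactly this way is an assumption, and the numbers in the example are hypothetical.

```python
def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    covariance = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    variance_o = sum((o - mean_o) ** 2 for o in observed)
    variance_p = sum((p - mean_p) ** 2 for p in predicted)
    return covariance ** 2 / (variance_o * variance_p)


# Hypothetical yields (t/ha), for illustration only.
perfect_fit = r_squared([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
partial_fit = r_squared([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
print(perfect_fit, partial_fit)
```

A value of 1 means that the forecasts lie exactly on a straight line when plotted against the observations; lower values indicate more scatter around that line.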

Discussion
Three independent, multi-criteria models of winter rapeseed yield were produced in this study. The models were built on weather and crop information from seven years (2008–2014) and were validated on the basis of data from 2015. The model WR15_04 was based on field data, including 15 independent variables listed in Table 2. The data included basic weather information from the previous and current year, as well as data on fertilisation with micro and macro elements. Model WR31_05 (19 traits) was additionally enriched with weather data from April and May, while in model WR30_06 (21 traits), the weather information from June was added. In each model, the yield of winter rapeseed, expressed in t·ha⁻¹ of cultivated area, was forecasted.
A common problem in the prediction of plant yields using neural models is the selection of the appropriate network topology. The topology most frequently used for prediction is the multilayer perceptron (MLP), which gives the best forecasting results. A good model should adequately describe the behaviour of the system [31]. This means that the model under construction should be similar to the tested empirical system, from which data for research, analysis, and calculations are taken.
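To make the MLP topology concrete, the sketch below runs a forward pass through a network with the layer sizes reported for model WR15_04 (structure 15:15-18-11-1:1). Several points are assumptions: the notation is read as 15 inputs, two hidden layers of 18 and 11 neurons, and one output; the hidden layers use the hyperbolic tangent and the output is linear; and the weights are random rather than trained. It illustrates the data flow only, not the model built in Statistica.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create one fully connected layer with random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(layers, inputs):
    """Propagate an input vector through the network."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        # Assumption: tanh in hidden layers, identity in the output layer.
        activations = sums if is_output else [math.tanh(s) for s in sums]
    return activations


rng = random.Random(0)
sizes = [15, 18, 11, 1]  # inputs, hidden layer 1, hidden layer 2, output
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

# Number of trainable parameters (weights plus biases) in this topology.
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)

output = forward(layers, [0.0] * 15)
print(n_params, output)
```

With these layer sizes the network has 509 weights and biases, a large number relative to the 204 learning cases, which is one reason a separate validation set is useful when selecting the topology.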
In view of the above, four ex post error measures were used in this work, i.e., relative approximation error (RAE), root mean square error (RMS), mean absolute error (MAE), and mean absolute percentage error (MAPE).They were used to determine the quality of models and to determine yield forecast errors for winter rapeseed.
Table 4 shows the ex post error values for all models created. MAPE is one of the most commonly used indicators of prediction error [2,35]. The lowest MAPE value, 7.51%, was obtained for the WR15_04 neural model based on the MLP network with a 15:15-18-11-1:1 structure. A similar result (7.85%) was obtained for model WR31_05. The highest MAPE, 8.12%, was obtained for model WR30_06. Based on the literature [36], threshold values for the assessment of errors may be indicated. If the error is less than 10%, the degree of fit of the model is perfect; when it is in the range of 10% to 20%, the degree of fit of the model is good. In the range of 20% to 30%, the error is acceptable, while above 30%, the degree of fit is bad and such a model is not usable. In this paper, all MAPE values were below 10%. For a process that is significantly influenced by random conditions, the results obtained for all models are highly satisfying. The other ex post error measures (RAE, RMS, and MAE) also reached a satisfying level for all models produced (Table 4). On the basis of the results obtained, three graphs were created to illustrate the relation between the actual yield and the forecast (Figure 4).
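The threshold scheme quoted from [36] can be written as a small Python function and applied to the three MAPE values reported in this paper. The function name `fit_category` is hypothetical, and the treatment of values falling exactly on a boundary (10%, 20%, or 30%) is an assumption, since the text does not specify it.

```python
def fit_category(mape_percent):
    """Classify the degree of fit of a model from its MAPE (in percent)."""
    if mape_percent < 10.0:
        return "perfect"
    if mape_percent <= 20.0:  # boundary handling is an assumption
        return "good"
    if mape_percent <= 30.0:
        return "acceptable"
    return "bad"


# MAPE values reported for the three models.
reported_mape = {"WR15_04": 7.51, "WR31_05": 7.85, "WR30_06": 8.12}
categories = {model: fit_category(value) for model, value in reported_mape.items()}
print(categories)
```

All three models fall into the first category, in line with the statement that every MAPE value was below 10%.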
In the next step, a network sensitivity analysis was carried out for all neural models produced. In models WR31_05 and WR30_06, the highest rank of 1 was obtained by the independent variable describing the sum of precipitation in the period from 1st September to 31st December of the previous year (R9-12_LY). This means that this factor had the greatest influence on the yield of winter rapeseed in the forecasts made at the end of May and June. It is known that water is an essential factor for the correct and rapid germination of seeds and for the growth and development of the leaf rosette before the end of vegetation. The accumulation of water resources in the soil during winter rest is an important reservoir to cover the water needs of the canopy in early spring.
In model WR15_04, the highest rank was obtained by the sum of Mo fertilization (MO_CY). This means that after sowing, in the period of initial plant growth, this component has the greatest influence on the final yield. Interestingly, the second place was occupied by the average temperature in the period from 1st September to 31st December of the previous year (T9-12_LY). The values of the error quotient for these two features differed only slightly: 1.1935 (MO_CY) and 1.1933 (T9-12_LY).
Molybdenum is an essential element for the proper growth and development of plants, and it occurs in soil in trace amounts. The availability of this element depends on the pH of the soil, the concentration of certain oxides (e.g., iron), and the amount of organic compounds in the soil. The plant needs molybdenum to carry out multiple oxidation and reduction reactions [37]. Molybdenum deficiency is characteristic of the Brassicaceae family, to which rapeseed belongs. Deficient plants show gray-green discoloration of leaves, which become flaccid [38]. Such plants have a low content of chlorophyll and ascorbic acid [39]. The main aim of fertilizing rapeseed with molybdenum is to increase its resistance to low temperatures and to prepare the future yield structure.
To sum up, the above results show that the prediction of winter rapeseed yield with the use of artificial neural networks gives satisfactory forecasts. However, in order to optimize the models, further research should be undertaken to obtain data from a larger number of fields and to further analyse the number of independent factors in the models.

Conclusions
Prediction of agricultural crop yields is a useful tool in rational management of the means of production and responsible management of crops in the era of climate change. Forecasting winter rapeseed yields using artificial neural networks makes it possible to obtain an accurate yield forecast before harvesting. This paper presents three neural models, each of which predicts the yield on a different date of the calendar year: April 15th (model WR15_04), May 31st (model WR31_05), and June 30th (model WR30_06). All tested models were characterized by high forecast accuracy, which was confirmed by very good values of their qualitative parameters. The mean absolute percentage error (MAPE), considered the basic indicator of model quality, did not exceed 10% in any model. The sensitivity analysis of the neural networks indicates a high influence of autumn-winter precipitation and molybdenum fertilization in shaping winter rapeseed yields. The presented method of neural modelling extends the range of plant yield modelling and may be an important element of precision agriculture. This method, after some modifications, may also be used for forecasting yields of other cultivated plants, which may result in measurable macro- and microeconomic effects. Moreover, the concept of neural modelling presented in this paper may contribute to sustainability by reducing the doses of mineral fertilizers while maintaining high yields of cultivated plants.

Figure 2. General structure of the neural network.

Table 1. The number of productive fields of winter rapeseed divided into two sets, A and B.

Table 2. Data structure in neural prediction models.

Table 3. The quality and structure of the neural models produced.

Table 4. Ex post prediction error measures of the analyzed neural models.

Table 5. Sensitivity analysis of the neural networks.