Forecasting of Winter Wheat Yield: A Mathematical Model and Field Experiments

380-93-998-30-73


Introduction
The world population is increasing and it is predicted to exceed 9 billion by 2050 [1].Agriculture is required to significantly increase (up to 110%) food production to meet food demands [2,3].Thus, population growth has a negative impact on food resources [4].The food and agricultural organization (FAO) states that food security has become an urgent problem for a number of countries [5,6].
Wheat is ranked third among food crops [7].This crop covers around 21% of the world's food demand.Around 220 million hectares of arable land worldwide are used for wheat cultivation [8].Total wheat production exceeds 700 million tons [9,10].The European Union is ranked first in wheat production [11].China is a world leader in wheat production.Among European countries, Germany, France, and Ukraine are in the top ten wheat producers [12].
Agriculture 2023, 13, 41 2 of 22 In the last decade, there has been a decrease in wheat production [13][14][15][16].Climate change and biofuel production are among the primary problems of production growth.Average air temperature is rising [17].Agricultural production, including wheat production, is sensitive to climate change, which affects crop yields [18][19][20][21].The climate becomes drought.Thereby, irrigated agriculture is expected to expand.This kind of agriculture currently utilizes around 20% of arable land and produces up to 40% of total food [22].The production of biofuels has increased.Reducing greenhouse gas emissions, diversifying vehicle fuels, and promoting renewable energy are the main reasons for the above [23].The European Union has determined main targets to mitigate climate change.They include a 20% increase in renewable energy consumption [24].Biofuels (based on crop origin) are an alternative to fossil fuels [25,26].Their use in transport facilities is a priority for reducing carbon dioxide emissions in many countries [27].In 2017, the European Union used around 7% of the total arable land for industrial crops [28].Thus, biofuel production reduces available land for food production.
Agriculture must provide enough food for the growing population.Therefore, agricultural management and policymakers should use forecasting methods for crop production [6,29].Statistics, modeling, time series, learning machine, etc., are used for prediction [30].Reliable wheat yield forecasting is imperative.These methods help farmers to monitor yield and identify threats (weather conditions, fertilizer management, etc.) [31].Crop yield information is necessary for strategic planning [32].
Yield is an important indicator for characterizing grain production.Yield forecasting is an important task for any country.The accuracy of forecasting determines the solution of some problems, including organizing reserve funds of food, volumes of grain storage, etc.It affects the formation of an effective foreign trade policy (including the import/export plan and price, optimum management of growing crops).The yield indicator is also the basis for assessing the profitability of agricultural companies.Therefore, estimating yield is an important tool for effective management.
In Ukraine, since 1990, wheat yield has ranged from 1980 to 4160 kg/ha [33].It is behind the world's leaders; 7530 kg/ha in Germany [34].Irrigated winter wheat yield is in the range of 3550 to 5290 kg/ha [35].In the world, irrigated winter wheat yield is up to 7990 kg/ha in arid and semi-arid zones [36].Thus, there are reserves for increasing yields.It is necessary to optimize the use of available energy and material resources to realize these reserves.It is important for the arid and semi-arid zones of both Ukraine and other countries.
Significant weather and economic instability dictate the importance of yield forecasting.Crop yield forecasting is difficult because crop formation is associated with factors such as agricultural practice, weather conditions, the characteristics of biological systems, etc.
Currently, various methodical approaches to yield forecasting have been developed and applied in practice: 1.
The building of regression models based on a set of statistics obtained on the basis of remote and meteorological observations [38,42]; 4.
The analysis of synoptic processes [43].
The approaches of the first, second, and fifth groups are not accurate enough.Groups 3 and 4 are the most widely used approaches.In most cases, meteorological data are used to build regressions or to model plant growth.This type of forecast does not take into account the actual state of the soil, the use of fertilizers, and other chemicals.Dynamic models are most widely used in practice [5].However, they do not take into account the entire history of changes in yields and the conditions of grain production, which significantly limit the accuracy of existing models.
The main feature of yield is its stochastic changes.In this regard, the theory of random functions and random sequences must be used to predict crop yields.Methods and algorithms of the above theory take into account various random factors (precipitation, air temperature, soil temperature, etc.) [44], as well as the values of deterministic parameters (soil structure, crop variety, tillage practices, dosage and composition of fertilizers, etc.) [45][46][47][48][49]. Crop yields have been studied using simulation models [50,51].These models most accurately reveal the impact of agronomic factors on crop yields [52][53][54].
However, practitioners and authorities require mathematical models for different crops, climate zones, tillage practices, etc.The purpose of this article is to develop a mathematical model for predicting the winter wheat yield in a semi-arid zone.The novelty of this study is the development of a mathematical model based on stochastic data, such as precipitation, plant density, fertilizer, micro fertilizers, the effective temperature sum of the autumn vegetation, the amount of water used for irrigation, and preceding crops.This model has been developed for a semi-arid zone.Its modeling algorithm was built based on random non-stationary sequences of input variables.The main requirement for the method developed is the absence of any significant restrictions on the random process of grain crop yields.The maximum consideration of stochastic characteristics will allow us to achieve the best accuracy of the modeling problem.

Materials and Methods
This study focuses on the forecasting of winter wheat yield and proposes the use of a methodology combining statistical analysis and performing field experiments.This methodology comprises the following steps: the collection of field experiment data; the modeling of yield as a function of selected parameters.Field experiments were performed in the Mykolaiv region (Ukraine).

Field Experiment
The experiments were carried out in 2019 and 2020 in the Mykolaiv province, Ukraine (46 • 58 06" N; 31 • 42 39" E).The area of the experimental field is equal to 10 ha.Our experiment had a randomized design with three replications.Rapeseed and corn were preceding crops for winter wheat.The soil had the following properties: pH-from 6.8 to 7.2; organic carbon-from 2.9 to 3.2 g•kg −1 ; phosphorus-from 31 to 38 mg•kg −1 ; potassium-from 332 to 525 mg•kg −1 ; bulk density-1380 kg•m −3 .Winter wheat was grown under practice that is conventional for southern Ukraine (Table 1).The experiments were carried out over two years on irrigated lands.A field experiment was established by the method of randomized split plots.All studies, observations, and samplings were performed in quadruplicate.We used a sequential arrangement of plots in one tier.They were located in relation to organizational and technical factors: the convenience of tillage, fertilization, sowing, harvesting, etc.The total number of plots was 32.We investigated six factors in the field experiments.Factor A was the preceding crops (rapeseed and corn); factor B was the plant density, million plants per hectare: 4.0, 4.5, and 5.0; factor C was the fertilizer type and dosage (N:P:K); factor D was the micronutrient fertilizer type and dosage; factor E was the irrigation scheme; factor F was the total effective temperature of the autumn growing season.The sum of average air temperatures in autumn is the sum of average daily temperatures above +5 • C.This indicator characterizes the amount of heat necessary for the plant development process.Wheat during the autumn vegetation should gain a sum of effective temperatures from 300 to 350 • C.During the three-year experiments, differences in precipitation were observed.This study considered the influence of annual precipitation on wheat yield.
A preceding crop leaves nutrients in the soil.Thus, preceding crops influence yield.Two preceding crops (maize and rapeseed) were the limitation of this study.We chose maize because it is a widespread crop, and its production is around 50% of gross national grain production.Rapeseed is one of the best preceding crops.

Measurement of Yield
The method of mechanized harvesting was used to determine winter wheat yield.Wheat grain was harvested by a Sampo 500 combine harvester.Harvesting was carried out from a selective typical plot of the field.The yield was calculated by the equation: where MG is the mass of wheat grain from a plot, kg; PA is the area of a plot, ha.

Methodology for the Synthesis of Winter Wheat Yield Models
The formation and use of models of changes in winter wheat yield indicators involves the implementation of the following stages: Stage 1.The collection of statistical data on grain yields and cultivation conditions; Stage 2. The estimation of moment functions M X λ (ν)X s (i) based on obtained statistical data; Stage 3. The determination of the optimal order of stochastic connections of the random vector {X};

Verification of a Developed Mathematical Model
The developed mathematical model was verified using such indicators as the mean relative error, the standard deviation of error, and the coefficient of error variation.A mean relative error is where Ym i is the ith yield calculated by the mathematical model, kg/ha; Ye i is the ith experimental yield, kg/ha; n is the number of experimental yields.
To find the standard deviation, we used the following formula where δ i is the ith error, %.Finally, the coefficient of error variation is equal to

Field Experiment
Three-year field experiments on winter wheat growing were carried out on a farm of Mykolaiv National Agrarian University (the Mykolaiv region).The results are presented in Table A1 (preceding crop-maize) and Table A2 (preceding crop-rapeseed).Wheat yields were in the range from 4310 kg/ha to 6020 kg/ha.We can see that rapeseed was a better preceding crop than maize.This predecessor provided higher yields (from 4440 kg/ha to 6020 kg/ha).It was 3% higher compared to maize.Experimental data were used to develop our mathematical model.
We analyzed the impact of changing each factor (variable) on the yield.The results of the analysis are presented in Table 2.We assumed that all variables, except for the one being studied, are constants.As can be seen, nitrogen fertilizers had the strongest effect on yield.The irrigation scheme had the least influence.The sum of average air temperatures in autumn ranked second, with a value of −9.33%.Plant density and a preceding crop had approximately the same effect on the yield.We analyzed the variables that most strongly affect yield: nitrogen, the effective temperature sum of the autumn vegetation, and plant density (Figures 1-3).Linear functions were built to determine their slopes.We can see that rapeseed was a better preceding crop for winter wheat.Rapeseed allows farmers to obtain higher yields.Slopes of linear functions varied from −22.17 to 8.64 (Table 3).The only variable that had a negative slope was the effective temperature sum of the autumn vegetation.An increase in the effective temperature sum of the autumn vegetation decreased winter wheat yield.However, fertilizer management could offset this negative impact.

Canonical Decomposition of a Random Sequence of Yield Index and Characteristics of Production Conditions
To form realizable algorithms for modeling random sequences, certain restrictions are imposed on the properties of the sequence under study.For example, it is assumed that: a) the sequence under study is normal (has a normal probability distribution law) or a stationary sequence generated by a normal one in nonlinear systems; b) the sequence is non-stationary, but with stationary increments; c) the sequence is Markovian, etc.
For these classes of random sequences, there are quite efficient modeling algorithms.Therefore, to obtain a random sequence with a given correlation matrix (without taking into account distribution densities), the method of linear transformations is successfully used [58,59].One of the varieties of the method of linear transformations-the canonical decomposition of V.S. Pugachev [60]-allows us to form the values of a sequence of random variables that are dependent within the framework of linear relationships, taking into account their one-dimensional distribution densities.The Fourier series is widely used to model a stationary random sequence [61]; the apparatus for modeling stationary normal sequences is well developed [62] (there are two operators for generating values and several approaches have been developed [63,64] for determining their parameters); the simplest solution is the problem of modeling Markov sequences [65], which is reduced to the method of conditional distributions for the simplest case-the use of only a twodimensional distribution density.However, the use of simplifying assumptions about the

Canonical Decomposition of a Random Sequence of Yield Index and Characteristics of Production Conditions
To form realizable algorithms for modeling random sequences, certain restrictions are imposed on the properties of the sequence under study.For example, it is assumed that: (a) the sequence under study is normal (has a normal probability distribution law) or a stationary sequence generated by a normal one in nonlinear systems; (b) the sequence is non-stationary, but with stationary increments; (c) the sequence is Markovian, etc.
For these classes of random sequences, there are quite efficient modeling algorithms.Therefore, to obtain a random sequence with a given correlation matrix (without taking into account distribution densities), the method of linear transformations is successfully used [58,59].One of the varieties of the method of linear transformations-the canonical decomposition of V.S. Pugachev [60]-allows us to form the values of a sequence of random variables that are dependent within the framework of linear relationships, taking into account their one-dimensional distribution densities.The Fourier series is widely used to model a stationary random sequence [61]; the apparatus for modeling stationary normal sequences is well developed [62] (there are two operators for generating values and several approaches have been developed [63,64] for determining their parameters); the simplest solution is the problem of modeling Markov sequences [65], which is reduced to the method of conditional distributions for the simplest case-the use of only a two-dimensional distribution density.However, the use of simplifying assumptions about the properties of a random sequence in the formation of a modeling algorithm naturally reduces the accuracy of the representation of a random sequence.The most universal from the point of view of the restrictions (linearity, Markov property, stationarity, monotonicity, scalarness, etc.), which are superimposed on the properties of sequences of random variables, is the method based on a non-linear canonical decomposition [66].
The subject of study is a random sequence {X(i)},i = 1, I, where X(i), i = 1, I − 1random values that determine the conditions of production of grain crops (temperature, amount of precipitation, number of sunny days, volume of mineral and organic fertilizers, etc.), X(I)-indicator of grain crop yield.
The nonlinear canonical decomposition of the investigated vector {X(i)}, can be written as [67]: The random coefficients W (λ) ν and non-random coordinate functions ϑ of the mathematical yield model ( 1) are determined by the recurrence relations: where M[ ] is the mathematical expectation; D λ (ν) are the variances of the random coefficient The nonlinear model (1) of the random vector {X} = X(i), i =1, Each of these coefficients contains information about the corresponding value X λ (i), λ = 1, N, i = 1, I, and the coordinate functions ϑ describe the probabilistic relations of the order λ + h between the point ν and i ν, i = 1, I .Expression (5) provides an optimal description of the studied sequence {X}(where X(i), i =1, I − 1 are the technological parameters; X(I) is the yield) according to the criterion of the minimum mean square of the modeling error.Expression (1) is also true if some stochastic relations of the random vector {X} = X(i), i =1, I are missing.In this case, the corresponding coordinate functions take the value zero and these relations are automatically excluded from the canonical decomposition.
The legitimacy of the approach used to form representation ( 5) is confirmed by the proposition that it is possible to construct a canonical decomposition of the sequence f 1 Y 1 , . . . ,f n Y n , where Y ν , ν = 1, n is a vector random variable and f ν (.), ν = 1, n is a nonlinear function.

Predictive Model of Changes in Yield Indicators Depending on the Initial Conditions of Production
The sequential fixation of known values x ν (j) in the canonical decomposition (5) (the values of random coefficients become known W (ν) j ) using the mathematical expectation operation obtains an extrapolation algorithm [67]: The expression m (1, i) of the future value x(I) of the yield indicator, provided that the values x ν (j), ν = 1, N, j = 1, I − 1 are used to calculate this estimate; i.e., I − 1 indicators that characterize the conditions of production of grain crops are known.

The expression for estimation m (k,N) x
(1, i) can be written in the following explicit form [68]: where where mod N ( ) is the division modulo N.

Synthesis of Models of Changes in Winter Wheat Yield
Based on statistical yield data from the period 2019-2021, as a result of conducting experiments at the innovative training ground of the Ukrainian National Academy of Sciences, it was determined that the main factors affecting winter wheat yield are as follows [69][70][71][72] That is, the random vector takes the form {X(i)},i = 1, 7: X(1)amount of average annual precipitation; X(2)-sowing rate; X(3)-amount of mineral nutrition; X(4)amount of microfertilizer; X(5)-the sum of the effective temperatures of autumn vegetation; X(6)-volume of water used for irrigation; X(7)-winter wheat yield.
It was determined that the most significant stochastic relations affecting the yield index of winter wheat are relations of the third order: X ν (i)X µ (j) = 0, i, j = 1, 7, ν, µ ≤ 3; X ν (i)X µ (j) = 0, i, j = 1, 7, ν, µ > 3 (the analysis of the ripening process was not performed, therefore, the aftereffect interval in this case degenerates into a point: the moment of harvesting).

Verification of the Developed Model
We determined the approximation error.Mean relative errors were analyzed for three approximations: a linear model, a second-order polynomial model, and a third-order polynomial model.The linear model had the highest mean relative error of 11.9942%.The second-order polynomial model showed a mean error of 4.8582%.Therefore, the above models are not recommended for use.
We revealed that the mean relative error depends on the preceding crop.When the preceding crop was maize, the mean relative error for a third-order polynomial model was 1.7884%.Its value varied from 0.0056% to 7.2168%.The standard deviation was 1.4691.The coefficient of variation was rather high, and it was equal to 82.15%.If the preceding crop was rapeseed, the mean relative error was higher, 2.7532%.The standard deviation was almost the same (1.4532).The coefficient of variation was 52.78%.Hence, the errors varied in a wide range.Although, the mean relative error was of low value.Thereby, the developed third-order polynomial model was proven to predict winter wheat yield.
The maximum relative error of 7.61% (the preceding crop was a rapeseed) was under the following conditions: plant density-5.0 million per hectare, the effective temperature sum of the autumn vegetation-345 • C, micro-fertilizer-scheme II, irrigation scheme-600 + 1000, preceding crop-rapeseed, nitrogen-120 kg/ha.The minimum relative error of 0.021% was under the following conditions: plant density-4.5 million per hectare, the effective temperature sum of the autumn vegetation-369 • C, micro-fertilizerscheme II, irrigation scheme-700 + 900, preceding crop-rapeseed, nitrogen-120 kg/ha (Figure 4).The use of another preceding crop (maize) changed relative errors.The maximum relative error of 7.22% was at the plant density of 4.5 million per hectare and the effective temperature sum of the autumn vegetation of 369 • C. The minimum relative error of 0.0056% was observed at the plant density-4.0 million per hectare, the effective temperature sum of the autumn vegetation-369 • C, and the second scheme of micro-fertilizer (Figure 5).
To further verify the prediction ability of the developed mathematical model, we applied it to five farms in the Mykolaiv region in 2021.Their actual yields ranged from 5025 to 5284 kg/ha.The relationship between the prediction and observation was found to validate the accuracy of our model.Results are presented in Table 10.The developed model produced an average accuracy of 2.48%.Errors varied from 1.48 to 4.03%.irrigation scheme-700 + 900, preceding crop-rapeseed, nitrogen-120 kg/ha (Figure 4).The use of another preceding crop (maize) changed relative errors.The maximum relative error of 7.22% was at the plant density of 4.5 million per hectare and the effective temperature sum of the autumn vegetation of 369 °C.The minimum relative error of 0.0056% was observed at the plant density-4.0 million per hectare, the effective temperature sum of the autumn vegetation-369 °C, and the second scheme of micro-fertilizer (Figure 5).The use of another preceding crop (maize) changed relative errors.The maximum relative error of 7.22% was at the plant density of 4.5 million per hectare and the effective temperature sum of the autumn vegetation of 369 °C.The minimum relative error of 0.0056% was observed at the plant density-4.0 million per hectare, the effective temperature sum of the autumn vegetation-369 °C, and the second scheme of micro-fertilizer (Figure 5).

Conclusions
Wheat is an important food crop.Climate change and an increase in world population has increased the importance of wheat yield forecasting.It is a principal problem for both farmers and authorities.
In this study, proposed a mathematical method by which to solve a significant practical problem of modeling winter wheat yields.The mathematical model was developed based on the results of a three-year field experiment.The mathematical forecasting model can use an arbitrary number of variables affecting the yield, preceding crops, fertilizers, plant density, an irrigation scheme, the effective temperature sum of the autumn vegetation, and micro-fertilizers.The structure and computational algorithm do not depend on the number of variables and the order of the nonlinear stochastic model.
The modeling method can use various functional dependences of yield on random factors (linearity, stationarity, Markovianity, monotonicity, etc.).The mathematical model uses weather and technological stochastic indicators.It allows us to achieve the best accuracy.The mean relative error does not exceed 2.75%; whereas a linear extrapolation gives the mean relative error of 9-12%.Therefore, the only nonlinear model is suitable for practical application.
Further studies are planned to develop a mathematical model for energy, environmental, and economic assessment of wheat cultivation as a function of agricultural practices and weather conditions.They will build on our previous papers [55,73,74].

Stage 4 .
The calculation of the characteristics of the canonical distribution of the random vector {X}; Stage 5.The calculation of the parameters of the mathematical model; Stage 6.The calculation of productivity indicators based on the predictive model under different initial conditions of production; Stage 7. The assessment of yield modeling accuracy.

Figure 2 .
Figure 2. Yield versus the effective temperature sum of the autumn vegetation.

Figure 2 .
Figure 2. Yield versus the effective temperature sum of the autumn vegetation.

Figure 2 . 21 Figure 3 .
Figure 2. Yield versus the effective temperature sum of the autumn vegetation.

Figure 4 .
Figure 4.The relative error (the preceding crop is rapeseed).

Figure 5 .
Figure 5.The relative error (the preceding crop is maize).

Figure 4 .
Figure 4.The relative error (the preceding crop is rapeseed).

Figure 4 .
Figure 4.The relative error (the preceding crop is rapeseed).

Figure 5 .
Figure 5.The relative error (the preceding crop is maize).Figure 5.The relative error (the preceding crop is maize).

Figure 5 .
Figure 5.The relative error (the preceding crop is maize).Figure 5.The relative error (the preceding crop is maize).

Table 1 .
Production of winter wheat.

Table 2 .
The weight of each variable.

Table 3 .
The slope of a linear function.

Table 3 .
The slope of a linear function. :