Simpliﬁed and Hybrid Remote Sensing-Based Delineation of Management Zones for Nitrogen Variable Rate Application in Wheat

: Enhancing digital and precision agriculture is currently inevitable to overcome the economic and environmental challenges of the agriculture in the 21st century. The purpose of this study was to generate and compare management zones (MZ) based on the Sentinel-2 satellite data for variable rate application of mineral nitrogen in wheat production, calculated using different remote sensing (RS)-based models under varied soil, yield and crop data availability. Three models were applied, including (1) a modiﬁed “RS- and threshold-based clustering”, (2) a “hybrid-based, unsupervised clustering”, in which data from different sources were combined for MZ delineation, and (3) a “RS-based, unsupervised clustering”. Various data processing methods including machine learning were used in the model development. Statistical tests such as the Paired Sample T -test, Kruskal–Wallis H-test and Wilcoxon signed-rank test were applied to evaluate the ﬁnal delineated MZ maps. Additionally, a procedure for improving models based on information about phenological phases and the occurrence of agricultural drought was implemented. The results showed that information on agronomy and climate enables improving and optimizing MZ delineation. The integration of prior knowledge on new climate conditions (drought) in image selection was tested for effective use of the models. Lack of this information led to the infeasibility of obtaining optimal results. Models that solely rely on remote sensing information are comparatively less expensive than hybrid models. Additionally, remote sensing-based models enable delineating MZ for fertilizer recommendations that are temporally closer to fertilization times.


Introduction
In recent years, there has been an intense growth in the world population, which is projected to reach 9.7 billion by 2050 [1]. Population growth puts enormous pressure on agricultural productivity growth, but also on the increasing environmental impact of the agri-food sector [2,3]. In many regions of the world, small farms are the main food producers, and this group will be under pressure to increase production efficiency [4].
The research was conducted on the 50.2 ha field on the experimental farm of the Poznań University of Life Sciences, RGD Brody, located in Brody (52.43 N, 16.29 E according to WGS84), Wielkopolskie Voivodeship, Poland ( Figure 1). The field dedicated to the research was covered with winter wheat crop (Triticum aestivum cv. 'RGT Reform'), carried out in the reduced soil tillage system and non-irrigated.
The average annual precipitation sum of the study area  was 599 mm, and the annual mean air temperature was 8.5 °C, while in 2019 and 2020, the annual precipitation sum was 462 and 520 mm respectively, with the mean air temperature of the study at 10.8 and 10.6 °C, respectively. Meteorological data were obtained from the meteorological station of Brody Experimental Station and were recorded according to the World Meteorological Organization guidelines.
The farm soils in the study area are light, loamy sands, developed on loamy sands overlying loamy material, and are classified as Albic Luvisols according to World Reference Base nomenclature [63,64].

Soil Sampling
Soil physical and chemical properties were determined by infield sampling and laboratory analysis. Soil samples were taken mechanically, in a semi-automatic operation, from the 0-30 cm field layer in a 4 ha grid, where one average, mixed sample consisted of The average annual precipitation sum of the study area (1960-2019) was 599 mm, and the annual mean air temperature was 8.5 • C, while in 2019 and 2020, the annual precipitation sum was 462 and 520 mm respectively, with the mean air temperature of the study at 10.8 and 10.6 • C, respectively. Meteorological data were obtained from the meteorological station of Brody Experimental Station and were recorded according to the World Meteorological Organization guidelines.
The farm soils in the study area are light, loamy sands, developed on loamy sands overlying loamy material, and are classified as Albic Luvisols according to World Reference Base nomenclature [63,64].

Soil Sampling
Soil physical and chemical properties were determined by infield sampling and laboratory analysis. Soil samples were taken mechanically, in a semi-automatic operation, from the 0-30 cm field layer in a 4 ha grid, where one average, mixed sample consisted Agriculture 2021, 11, 1104 5 of 24 of 16 primary samples. Sampling was carried out in spring 2020. In the present study, analyses were performed to determine pH KCl , phosphorous (P 2 O 5 ), potassium (K 2 O) and magnesium (Mg) contents. These are the most commonly used parameters determined by farmers in agrotechnical practice, especially small-scale farming, because of the low implementation costs. Additionally, an analysis of soil organic matter (OM) content was performed as one of the important factors determining sorption and water properties of arable soils.
Soil physico-chemical parameters were determined as follows: pH in 1M KCl according to PN-ISO 10390-1997, available phosphorus and potassium by the Egner-Riehm method [65] according to PN-R-04023 and PN-R-04022 respectively, and magnesium by the Schachtschabel method [66] according to PN-R-04020. OM content was determined by the Tiurin method (T methode) [67] with the Van Bemmelen coefficient of 1.724.

Yield Data
Winter wheat yield data were recorded automatically during harvest in 2019 and 2020 with a modified Claas Lexion 480 combine harvester. Data were recorded at a temporal resolution of 1 Hz for each of the harvester passes. The recorded raw yield data were postprocessed and filtered to mitigate lag times and exclude outliers. A detailed description of the combine harvester prototype equipped with a system for monitoring qualitative and quantitative grain parameters is described by Czechlowski and Wojciechowski [68,69].

Elevation Data
The basic hypsometric data were obtained using the measuring system of the modified combine harvester described by Czechlowski and Wojciechowski [64]. The combine was equipped with a Novatel RT2 PROPAK V3 GNSS receiver with a GPS-702-GG: a dualfrequency (L1/L2) antenna and a SmallTRIP 3.2 GPRS/NTRIP modem with automatic connection to the Real-Time Kinematic (RTK) NAWGEO service of the ASG-Eupos network. The possibility of using this type of data to create digital elevation models of agricultural field surfaces was reported by Czechlowski et al. [70]. The slope map was generated based on the interpolated elevation map (DEM) as a percent slope (See Preprocessing Section).

RS Data
A time series of Sentinel-2 L2A images from 1 January 2018 to 1 July 2020 was downloaded from The Copernicus Open Access Hub [71]. Sentinel-2 L2A data are atmospherically and geometrically corrected. Additionally, layers such as a scene classification layer, cloud mask, cloud shadow mask and snow mask were provided along with Sentinel-2 L2A raw data. During the downloading stage, it was attempted to download the data with minimum possible cloud cover by checking the quick layer of every data point. The quick layer of each data point is accessible in the Copernicus Open Access Hub which is embedded in the detailed information of each Sentinel-2 scene. Finally, 119 scenes of Sentinel-2 L2A were downloaded. Table 1 shows the number of downloaded data per year and month.  2018  1  2  1  5  8  2  6  7  5  8  2  1  2019  0  4  2  6  5  8  3  3  5  5  2  4  2020  3  3  4  7  4 3 -* -* -* -* -* -* * Not analysed.

Models
Three models were separately implemented to delineate MZ. In the following sections, each model is described in detail. First, a brief introduction of each model is provided, then its flowchart is presented and related parts are explained.
2.3.1. Model-1 (RS-and Threshold-Based Clustering) The first model was fundamentally adopted from Georgi et al. [17], who developed a segmentation algorithm for generating MZ of within-field crop patterns by solely using multi-temporal and multi-spectral satellite images. Thus, the input to the model was the time series of RS data. However, the RapidEye data used in [17] was replaced with Sentinel-2 L2A data because of its free and open data policy. Moreover, a different approach was applied for selecting cloud-free and cloud shadow-free data using mask layers embedded in Sentinel-2 L2A products. The workflow of model-1 is summarized in Figure 2 and the whole process of this model was subdivided into 4 parts (Sentinel-2 data processing, Data selection, Processing of NIR bands, and Segmentation and classification), and a detailed description of each part is provided in the following sections.

Models
Three models were separately implemented to delineate MZ. In the following sections, each model is described in detail. First, a brief introduction of each model is provided, then its flowchart is presented and related parts are explained.
2.3.1. Model-1 (RS-and Threshold-Based Clustering) The first model was fundamentally adopted from Georgi et al. [17], who developed a segmentation algorithm for generating MZ of within-field crop patterns by solely using multi-temporal and multi-spectral satellite images. Thus, the input to the model was the time series of RS data. However, the RapidEye data used in [17] was replaced with Sentinel-2 L2A data because of its free and open data policy. Moreover, a different approach was applied for selecting cloud-free and cloud shadow-free data using mask layers embedded in Sentinel-2 L2A products. The workflow of model-1 is summarized in Figure 2 and the whole process of this model was subdivided into 4 parts (Sentinel-2 data processing, Data selection, Processing of NIR bands, and Segmentation and classification), and a detailed description of each part is provided in the following sections.

Sentinel-2 Data Processing
As inputs to the model, 119 scenes of the Sentinel-2 L2A were imported. First, all bands were resampled to 10 m pixel size, which was inevitable due to differences in spatial resolutions. The cloud-free and cloud shadow-free data were extracted by investigating the quality scene classification layer, quality cloud confidence, cloud probability mask,

Sentinel-2 Data Processing
As inputs to the model, 119 scenes of the Sentinel-2 L2A were imported. First, all bands were resampled to 10 m pixel size, which was inevitable due to differences in spatial resolutions. The cloud-free and cloud shadow-free data were extracted by investigating the quality scene classification layer, quality cloud confidence, cloud probability mask, cloud mask and shadow mask layers embedded in Sentinel-2 L2A products. For each scene, the mentioned layers were investigated by visual quality control. Finally, NDVI was calculated for all images. The entire workflow was conducted in SNAP V7 software [72].

Data Selection
The data were selected with two constraints. Standard deviation (SD) was calculated for each NDVI data point, and the SD values <0.02 were dropped to exclude the images of dense vegetation cover with no spatial patterns. Then, the mean of each NDVI data point was calculated, and the values between 0.3 and 0.78 were selected since the values <0.3 and >0.78 depict the corresponding images of vegetation canopy, which is too sparse (bare soil background) or too dense, respectively [73]. When the canopy becomes too dense, NDVI saturates because red reflectance does not change much, but near-IR reflectance increases [73]. This stage was implemented in Python 3.8 by using 'rasterio', 'unidip' and 'numpy' packages. For selected dates, NIR bands of Sentinel-2 data were extracted, on which the consequent steps were implemented. This model was conducted based on NIR bands, while NDVI data were used only for data selection since ratio indices such as NDVI cause noise patterns and artifacts that challenge the MZ delineation [17,74]. This resulted in 24 out of 119 raster data, as summarized in Table 2.  Processing of NIR Bands Each image was converted to relative values by Equation (1), i.e., a normalization to a percentage, where 100% was equal to the average NIR value of each image.
where Minimum is the minimum value of the whole scene and Mean is the mean value of the whole scene. Then, an average of NIR bands for each year was calculated, thus the NIR time series of each year formed a raster stack. Then, normalization and averaging for the generated data were applied as well.

Segmentation and Classification
A 3 × 3 median filter was applied to eliminate the small zones and smooth the class boundaries, and the result was normalized as previously mentioned. Eventually, a thresholding method was implemented on the classification, whereby the 10%, 35%, 65% and 90% quantiles were calculated, and the final raster was classified into five classes. These quantile values were empirically chosen [17]. The final five classes were termed 'very low' (1), 'low' (2), 'average' (3), 'high' (4) and 'very high' (5), which correspond to yield expectancy. The processing of NIR bands, classification and mapping was conducted in QGIS V3.10 [75].

Model-2 (Hybrid-Based, Unsupervised Clustering)
The Second approach is based on a hybrid model for MZ delineation that combined data from different sources (see Figure 3). This method was recently applied by researchers in several studies [44][45][46][47]. However, in these studies [44][45][46][47], principal component analysis (PCA) was utilized to reduce data dimensionality and minimize the dependencies among variables. In this study, PCA was replaced with machine learning-based feature selection (random forest (RF) feature importance). The flowchart of model-2 is shown in Figure 3 and the whole workflow of this model was subdivided into 5 parts (Input data, Preprocessing, Processing, Output (MZ map) and Validation), and a detailed description of each part is provided in the following sections. The Second approach is based on a hybrid model for MZ delineation that combined data from different sources (see Figure 3). This method was recently applied by researchers in several studies [44][45][46][47]. However, in these studies [44][45][46][47], principal component analysis (PCA) was utilized to reduce data dimensionality and minimize the dependencies among variables. In this study, PCA was replaced with machine learning-based feature selection (random forest (RF) feature importance). The flowchart of model-2 is shown in Figure 3 and the whole workflow of this model was subdivided into 5 parts (Input data, Preprocessing, Processing, Output (MZ map) and Validation), and a detailed description of each part is provided in the following sections.

Input Data
For this method, data from 4 different sources were integrated, including (1) soil nutrition data comprising soil pHKCl, P2O5, K2O, Mg and OM, (2) topographical data comprising elevation and slope of the study area, (3) yield data from 2019 and 2020 and (4) RS data comprising NDVI and biophysical variables of LAI, FAPAR and FVC of Sentinel-2 data at the heading stage of wheat (15 May 2020 for this study area) [37]. The NDVI and biophysical variables were derived in SNAP V7 [72].

Preprocessing
Descriptive statistics (minimum, maximum, mean, SD, standard error (SE), coefficient of variation (CV), skewness and kurtosis) of soil, elevation and yield samples were calculated. Since the locations of soil, yield and elevation data are different, no geostatistical analysis was possible prior to correlation analysis and feature selection. Thus, semivariogram parameters (nugget (C0), sill (C + C0) and range) were estimated to represent the spatial distribution of soil, yield and elevation data [76]. Several semi-variogram models, including circular, spherical, tetraspherical, pentaspherical, exponential, Gaussian, rational quadratic, hole effect, k-Bessel, J-Bessel and stable, were evaluated. The best-fit model with the lowest root-mean-square (RMS) error was selected for each data point. Then, the data were interpolated using the best-fit model and ordinary kriging (OK) procedure. Since the resolution of the satellite data is 10 m, all the variables (e.g., soil, yield and elevation data) were interpolated to 10 m spatial resolution. Based on Martins et al.

Input Data
For this method, data from 4 different sources were integrated, including (1) soil nutrition data comprising soil pH KCl , P 2 O 5 , K 2 O, Mg and OM, (2) topographical data comprising elevation and slope of the study area, (3) yield data from 2019 and 2020 and (4) RS data comprising NDVI and biophysical variables of LAI, FAPAR and FVC of Sentinel-2 data at the heading stage of wheat (15 May 2020 for this study area) [37]. The NDVI and biophysical variables were derived in SNAP V7 [72].

Preprocessing
Descriptive statistics (minimum, maximum, mean, SD, standard error (SE), coefficient of variation (CV), skewness and kurtosis) of soil, elevation and yield samples were calculated. Since the locations of soil, yield and elevation data are different, no geostatistical analysis was possible prior to correlation analysis and feature selection. Thus, semi-variogram parameters (nugget (C0), sill (C + C0) and range) were estimated to represent the spatial distribution of soil, yield and elevation data [76]. Several semi-variogram models, including circular, spherical, tetraspherical, pentaspherical, exponential, Gaussian, rational quadratic, hole effect, k-Bessel, J-Bessel and stable, were evaluated. The best-fit model with the lowest root-mean-square (RMS) error was selected for each data point. Then, the data were interpolated using the best-fit model and ordinary kriging (OK) procedure. Since the resolution of the satellite data is 10 m, all the variables (e.g., soil, yield and elevation data) were interpolated to 10 m spatial resolution. Based on Martins et al. [44], variables that are temporally stable and correlated with crop yield were selected, which are crucially significant to delineate MZ [47]. Therefore, a correlation matrix was generated using Spearman's correlation to specify the relationship among variables on interpolated data with the spatial resolution of 10 m. Unlike other recent studies [44][45][46][47], in this study, RF with variance reduction criterion was utilized instead of PCA to rank the features, reduce data dimensionality and minimize the dependencies amongst variables. The final variables were selected by considering Spearman's correlation matrix (using correlation coefficient criterion) and RF feature importance (using mean squared error criterion). Deriving descriptive statistics, correlation analysis and feature selection was performed in Python 3.8 using 'pandas', 'scipy', 'rasterio', 'numpy', 'matplotlib', 'seaborn' and 'sklearn' packages. The geostatistical analysis and interpolation were conducted in ArcGIS V10.7 [77].
Processing MZ delineation was performed by the fuzzy c-means algorithm using Management Zone Analyst (MZA) software V1.0. [36,78]. Furthermore, two types of cluster validity functions, fuzzy performance index (FPI) [79,80] and normalized classification entropy (NCE) [81], were used to determine the optimum number of zones. The FPI is a measure of the degree of separation, i.e., fuzziness, between classes, with values ranging from 0 to 1 [36]. Additionally, the NCE measures the degree of disorganization between classes [36].
The minimum values of these indices suggest the optimum number of clusters since it represents the least membership sharing (FPI) or the highest amount of organization (NCE), as shown in Equations (2) and (3) [36]. The settings used in the MZA software included similarity measure = Mahalanobis distance, fuzziness exponent = 1.3, the maximum number of iterations = 300, convergence criteria = 0.0001, minimum number of zones = 2 and maximum number of zones = 8. Finally, the ideal number of zones was selected by considering the lowest values of FPI and NCE: where c is the number of clusters, n is the number of observations, u ik is the fuzzy membership and log a is the natural logarithm. Following clustering, mapping was conducted in QGIS V3.10 [75].

Model-3 (RS-Based, Unsupervised Clustering)
The approach applied here differed from model-1, in that a K-means clustering algorithm was conducted to compare the classification of this model (threshold-based clustering) with a simple clustering procedure. Other components of model-3 were similar to model-1 ( Figure 2). As can be seen from Figure 4, the whole workflow of model-3 was subdivided into 3 sections (Sentinel-2 data processing, Data selection and Classification) and the description of each part was provided in Section 2.3.1.

Model Improvement
To improve the result of the final MZ maps, the RS-based models (model-1 and model-3) were enriched with climate and agronomy information. The agronomic infor-mation used is the date of sowing and the dates of subsequent spring mineral nitrogen fertilization treatments, which are carried out three times, as is typical for this region. The timing of the three application rates is linked to the phenological phases of plant development and to climate and soil conditions (beginning of vegetation, stem-shooting phase, earing). Agronomic and drought information were used to select RS images. This is a knowledge-based selection of input RS data under new climate conditions, such as drought. Additionally, both RS-based models were performed concerning phenological phases, which starts from seeding to harvesting time using expert knowledge. To overcome this, the models should be run with single-year RS data and considering phenological phases in a specific year. The abnormal climate condition, such as drought in the analysed period, can impact the final yield map, which affects the performance of the models. The RS-based models were performed only for 2020 RS data to avoid this, by selecting data after 19 September 2019 (seeding date) in the 2019-2020 season. This resulted in 5 RS datasets that passed the data selection constraints ( Table 2). Since this analysis was conducted for single-year data, the generated MZ maps can guide fertilization before the fertilization time approaches considering the growth stages of wheat. In this study area, fertilization was performed at four dates in the 2019-2020 season (23 January 2020, 17 February 2020, 25 March 2020 and 21 April 2020). With regard to fertilization dates and available selected RS data, two MZ maps for fertilization were generated for 25 March 2020 and 21 April 2020.  The K-means algorithm was performed with 5 classes and 100 iterations in SNAP V7 [72]. Moreover, mapping was conducted in QGIS V3.10 [75].

Model Improvement
To improve the result of the final MZ maps, the RS-based models (model-1 and model-3) were enriched with climate and agronomy information. The agronomic information used is the date of sowing and the dates of subsequent spring mineral nitrogen fertilization treatments, which are carried out three times, as is typical for this region. The timing of the three application rates is linked to the phenological phases of plant development and to climate and soil conditions (beginning of vegetation, stem-shooting phase, earing). Agronomic and drought information were used to select RS images. This is a knowledge-based selection of input RS data under new climate conditions, such as drought. Additionally, both RS-based models were performed concerning phenological phases, which starts from seeding to harvesting time using expert knowledge. To overcome this, the models should be run with single-year RS data and considering phenological phases in a specific year. The abnormal climate condition, such as drought in the analysed period, can impact the final yield map, which affects the performance of the models. The RS-based models were performed only for 2020 RS data to avoid this, by selecting data after 19 September 2019 (seeding date) in the 2019-2020 season. This resulted in 5 RS

Sampling for Validation
A stratified random sampling procedure was conducted for drawing validation data. Yield maps were converted to relative values (see the Processing of NIR Bands Section) and then averaged based on the models over the available years [17]. For each sample, the relative yield value and corresponding class ID were sampled based on the MZ map [17]. The size and conjectured SD of each class were considered to determine the sample size of each class. The number of samples was computed by Equation (4) [82]: available to determine the S i of each class by the assumption that S i is higher for classes with low area proportion. Afterwards, S o was assumed equal to 0.01 [82]. Finally, equal distribution (ED i ) (Equation (5)) and weighted distribution (WD i ) (Equation (6)) of each class' samples were calculated to determine the sample size of each class. The final number of samples for each class (N i ) (Equation (7)) was assessed by averaging these distributions. The sampling procedure was performed in QGIS V3.10 [75].

Validation
To validate and evaluate the final MZ, a variety of statistical tests were performed. The Paired Sample T-test, Kruskal-Wallis H-test and Wilcoxon signed-rank test were applied to samples of each class and compared with each other to explore whether there were statistically significant differences. The Paired Sample T-test requires normally distributed data, so it was performed on logarithmically transformed sample values for the second time after the test was performed without normalization. The purpose of MZ delineation is to classify the wheat parcel into homogeneous zones, thus the separability of final zones was tested. Finally, boxplots for each model were used along with fitted lines through the medians of each class. The validation was conducted in Python 3.8 by using 'pandas', 'scipy', 'numpy' and 'matplotlib' packages. Figure 5a shows the delineated, 5-class MZ map based on model-1. The higher the number of zones, the better the crop vitality and yield expectancy will be. Based on this map, the north and south-west parts of the field (Zones 5 and 4) show a more productive crop pattern compared with the west and east parts. The statistical tests (Table S1 see the (Supplementary Material)) indicated inseparability between classes for values with p > 0.05, and these values are highlighted in red in Table S1. According to Table S1, although the result of the Kruskal-Wallis H-Test, which compares the separabilities of the zones all at once, showed that all classes are separable with the p-value of 2.9 × 10 −54 in all other tests, the pairs of the zones 3-4, 3-5 and 4-5 are inseparable. The results of the Paired Samples T-test for the pairs of the zones 3-4, 3-5 and 4-5 were 0.60, 0.63 and 0.95, respectively. The p-values of the Paired Samples T-test (log of data) for the mentioned zones were equal to 0.86, 0.80 and 0.70, respectively. Finally, the results of the Wilcoxon signed-rank test for these zones were 0.34, 0.08 and 0.77. Moreover, in the Wilcoxon signed-rank test, the pair of the zones 1-2 did not support the separability hypothesis, with a p-value of 0.08. On the other hand, values with p < 0.05 indicate separable zones. Additionally, the boxplot shown in Figure 6a confirms the result of the statistical tests, with overlaps observed for the pairs of the zones 3-5 and 1-2.

Model-2
The results of descriptive statistics for soil, yield and elevation samples are summarized in Table 3. Despite the sufficient number of samples for elevation (DEM), OM and yields for 2020 and 2019, the numbers of soil pH KCl , P 2 O 5 , K 2 O and Mg samples were fewer than the expected number for interpolation as there were just 14 samples in the 50 ha area. The authors of this study are well aware of the small sample size, but this is the typical soil sampling density used in state public advisory practice. As shown in Table 3, the average yield in 2020 dropped by approximately 400 kg ha −1 compared with that of 2019. This lower productivity is attributed to the severe spring drought that occurred in major parts of Poland, including this study area [83]. The CV of the analysed attributes can be categorized from low (CV < 12%) to moderate (12% ≤ CV < 60%) based on the classification suggested by Warrick and Nielsen [84].

Model-2
The results of descriptive statistics for soil, yield and elevation samples are summarized in Table 3. Despite the sufficient number of samples for elevation (DEM), OM and yields for 2020 and 2019, the numbers of soil pHKCl, P2O5, K2O and Mg samples were fewer   Table 4 represents the results of the geostatistical analysis. It suggested the bestfit models to be exponential (DEM, yield 2019, yield 2020) J-Bessel (OM, pH KCl ), Gaussian (K 2 O, Mg) and Hole Effect (P 2 O 5 ), based on the minimum RMSE. The values of Nugget/Sill could be used to determine the degree of the spatial autocorrelation, in which the values < 25%, 25-75% and >75% suggest strong, moderate and weak spatial dependencies, respectively [85]. DEM and OM showed strong spatial dependence, while other parameters showed moderate degrees. The range of the semi-variogram was the distance over which the samples are correlated with each other [86]. A low value of Nugget/Sill and a high range of an attribute generally indicate that high precision can be obtained by kriging [85].  Figure 7 shows the maps using the best-fit model, including interpolated soil, elevation and yield, along with RS data.
The maps suggested a high correlation of RS data with the yield map of 2020. Moreover, high values were generally observed in the central part of the field. Soil pH KCl and P 2 O 5 maps were consistent since the values of P 2 O 5 were high and neutral in the west and east parts of the field. However, the values were low and somewhat acidic in the southern part. In terms of K 2 O, Mg and OM, higher values were observed in the Eastern part. Besides, DEM and slope values were higher in the western part. Figures 8 and 9 show the results of correlation analysis and selection of features with high correlation with yield 2020 data. The RS data (NDVI, LAI, FCV and FAPAR) were highly and positively correlated with yield 2020. The feature selection also showed that yield 2019 and NDVI are appropriate features for clustering.
Besides, the optimum number of classes was found by computing two cluster validity indices (FPI, NCE). Figure 10 shows the plotted values of FPI and NCE against the number of clusters, with the optimum number being the value at which FPI and NCE are minimum, i.e., 5 clusters. This was consistent with model-1 results in terms of the number of zones.
The result of MZ in 5 classes by fuzzy c-means clustering (Figure 5b) showed a better condition in the central part of the field from west to east (Zone 5). This map was consistent with the features (NDVI and yield 2019 data) that were used in the MZ delineation process. As reported in Table S2, the p-value of the Kruskal-Wallis H-Test for this model was 3.2 × 10 −117 , which means that all of the zones support the separability hypothesis. However, the results of other statistical tests showed that the pair of the zones 2-3 did not support the separability hypothesis, with p-values of 0.37 (Paired Samples T-test), 0.46 (Paired Samples T-test (log of data)) and 0.55 (Wilcoxon signed-rank test). This can also be observed in the boxplot (Figure 6b). The maps suggested a high correlation of RS data with the yield map of 2020. Moreover, high values were generally observed in the central part of the field. Soil pHKCl and P2O5 maps were consistent since the values of P2O5 were high and neutral in the west and east parts of the field. However, the values were low and somewhat acidic in the southern part. In terms of K2O, Mg and OM, higher values were observed in the Eastern part. Besides, DEM and slope values were higher in the western part. Figures 8 and 9 show the results of correlation analysis and selection of features with high correlation with yield 2020 data. The RS data (NDVI, LAI, FCV and FAPAR) were highly and positively correlated with yield 2020. The feature selection also showed that yield 2019 and NDVI are appropriate features for clustering.  Figure 5c shows the delineated MZ map for model-3, which suggests that the central, eastern and western parts of the field have higher yield expectancy. However, a low yield pattern was observed at the edge of the field. According to Table S3, the p-value of the Kruskal-Wallis H-Test for this model was 1.2 × 10 −102 . Nevertheless, the pair of the zones 4-5 did not support the separability hypothesis, with p-values of 0.37 (Paired Samples T-test), 0.18 (Paired Samples T-test (log of data)) and 0.99 (Wilcoxon signed-rank test), as they are also overlapping along with the pair of the zones 1-2 (Figure 6c). Agriculture 2021, 11, x FOR PEER REVIEW 15 of 24 Figure 8. Spearman's correlation matrix amongst soil attributes, RS data, yield data and geomorphology data. Besides, the optimum number of classes was found by computing two cluster validity indices (FPI, NCE). Figure 10 shows the plotted values of FPI and NCE against the number of clusters, with the optimum number being the value at which FPI and NCE are minimum, i.e., 5 clusters. This was consistent with model-1 results in terms of the number of zones.  Besides, the optimum number of classes was found by computing two cluster validity indices (FPI, NCE). Figure 10 shows the plotted values of FPI and NCE against the number of clusters, with the optimum number being the value at which FPI and NCE are minimum, i.e., 5 clusters. This was consistent with model-1 results in terms of the number of zones.

3.4.1.
Model-1 Figure 11 shows the delineated MZ maps in 5 classes for model-1 considering agronomy and climate information. Figure 11a shows the MZ map that is based on the RS data before 25 March 2020 (see Table 2). Thus, it can be used as a fertilization recommendation map at 25 March 2020. Figure 11b shows the result of model-1 with the RS data before 21 April 2020. Additionally, this map can be utilized as a fertilizer recommendation map at 21 April 2020. As can be seen in Tables S4 and S5, all zones passed the separability hypothesis (p-value < 0.5), thus all zones are separable. The boxplots of these maps also confirmed the results of statistical tests (Figure 12). Zones 4 and 5 passed the test, though they showed partial overlap. The result of MZ in 5 classes by fuzzy c-means clustering (Figure 5b) showed condition in the central part of the field from west to east (Zone 5). This map w sistent with the features (NDVI and yield 2019 data) that were used in the MZ deli process. As reported in Table S2, the p-value of the Kruskal-Wallis H-Test for this was 3.2 × 10 -117 , which means that all of the zones support the separability hyp However, the results of other statistical tests showed that the pair of the zones 2-3 support the separability hypothesis, with p-values of 0.37 (Paired Samples T-tes (Paired Samples T-test (log of data)) and 0.55 (Wilcoxon signed-rank test). This c be observed in the boxplot (Figure 6b). Figure 5c shows the delineated MZ map for model-3, which suggests that the eastern and western parts of the field have higher yield expectancy. However, a lo pattern was observed at the edge of the field. According to Table S3, the p-value Kruskal-Wallis H-Test for this model was 1.2 × 10 -102 . Nevertheless, the pair of th

Model-3
Likewise, MZ delineation was conducted for model-3 with the hypothesis (agronomy and climate information) that was considered for improving the MZ results. Figure 13 shows the MZ maps for two dates before fertilization. As shown in Table S6, all zones were separable (p-value < 0.5). However, zones 3 and 4 did not pass the statistical test, and the pvalues of statistical tests were 0.78 (Paired Samples T-test), 0.73 (Paired Samples T-test (log of data)) and 0.08 (Wilcoxon signed-rank test) (Table S7). The boxplot of both maps confirms the statistical tests ( Figure 14). Similar to model-1, zones 4 and 5 were overlapping.

Model-3
Likewise, MZ delineation was conducted for model-3 with the hypothesis (agronomy and climate information) that was considered for improving the MZ results. Figure 13 shows the MZ maps for two dates before fertilization. As shown in Table S6, all zones were separable (p-value < 0.5). However, zones 3 and 4 did not pass the statistical test, and the p-values of statistical tests were 0.78 (Paired Samples T-test), 0.73 (Paired Samples Ttest (log of data)) and 0.08 (Wilcoxon signed-rank test) (Table S7). The boxplot of both maps confirms the statistical tests ( Figure 14). Similar to model-1, zones 4 and 5 were overlapping.

Model-3
Likewise, MZ delineation was conducted for model-3 with the hypothesis (agronomy and climate information) that was considered for improving the MZ results. Figure 13 shows the MZ maps for two dates before fertilization. As shown in Table S6, all zones were separable (p-value < 0.5). However, zones 3 and 4 did not pass the statistical test, and the p-values of statistical tests were 0.78 (Paired Samples T-test), 0.73 (Paired Samples T-

Discussion
This study followed a comparative analysis for MZ mapping that was aimed at optimizing fertilization in the context of PA. The comparative analysis has been tested by experiments based on the Sentinel-2 satellite data. The results showed only marginal consistency among the mapping outputs without any additional agronomy and climate information, which confirms the hypothesized challenges in the best model selection. Although statistical tests were considered to be appropriate validation tools, it is also suggested that the quality of the final results be eventually confirmed by a farmer and those engaged in the ongoing cultivation process.
In terms of the applied models, model-1 was partially adopted from Georgi et al. [17], and solely relied on RS data without any additional source of information. One of the advantages of this model is the utilization of NIR bands that are less noisy and more physical than band ratio vegetation indices such as NDVI. Therefore, NIR bands were used in

Discussion
This study followed a comparative analysis for MZ mapping that was aimed at opti mizing fertilization in the context of PA. The comparative analysis has been tested by ex periments based on the Sentinel-2 satellite data. The results showed only marginal con sistency among the mapping outputs without any additional agronomy and climate in formation, which confirms the hypothesized challenges in the best model selection. Alt hough statistical tests were considered to be appropriate validation tools, it is also sug

Discussion
This study followed a comparative analysis for MZ mapping that was aimed at optimizing fertilization in the context of PA. The comparative analysis has been tested by experiments based on the Sentinel-2 satellite data. The results showed only marginal consistency among the mapping outputs without any additional agronomy and climate information, which confirms the hypothesized challenges in the best model selection. Although statistical tests were considered to be appropriate validation tools, it is also suggested that the quality of the final results be eventually confirmed by a farmer and those engaged in the ongoing cultivation process.
In terms of the applied models, model-1 was partially adopted from Georgi et al. [17], and solely relied on RS data without any additional source of information. One of the advantages of this model is the utilization of NIR bands that are less noisy and more physical than band ratio vegetation indices such as NDVI. Therefore, NIR bands were used in model-3 as well. Besides, labelling of the zones from low to high was convenient compared with using a thresholding procedure for MZ delineation in other methods. On the other hand, labelling and sorting the zones in a clustering method (e.g., k-means) is a demanding task. All in all, the main disadvantage of model-1 was the lack of agronomy and climate information that can negatively impact the result.
Model-2 integrated data from multiple sources that potentially provide additional information for MZ delineation. This resulted in an enhanced accuracy of the final results. Nevertheless, one may note that such a workflow increases the costs, whereas this model is considered impractical without the incorporated sample data (soil, yield, elevation, etc.). Here, the optimum number of soil sampling data was not accessible (14 samples in 50 ha) ( Table 3). Furthermore, the soil sampling data which were applied here comprised only chemical attributes and lacked soil physical attributes, such as soil compaction, texture and electric conductivity, providing further influential information on soil condition. Additionally, the certainty of utilizing RS data was confirmed in this model, as previously claimed by Martins et al. [47] and Song et al. [37]. Besides, the results of this study showed the high correlation of RS data with the yield map in correlation analysis and feature selection (See Figures 8 and 9). The k-means clustering method was used in model-3 to eliminate the NIR bands' preprocessing part (see Figure 2) and reduce the data computation of model-1. However, the labelling and sorting of the zones remain an issue in this method, as previously mentioned.
Although Georgi et al. [17] indicated that a three-year timeframe would be sufficient to depict the crop pattern for MZ delineation, the results of this study witnessed an inferior performance considering the years 2018, 2019 and 2020. Instead, focusing on one crop growing season in 2019-2020 yielded the best results. Climate change is to blame for this, as it influences crop patterns and results in year-to-year variations in crop yield. For example, in 2020, the average yield had dropped by 400 kg/ha in comparison to 2019 (see Table 3). As a result, one of the finest data sources for MZ delineation would be high-resolution remote sensing data, such as unmanned aerial vehicle (UAV) data, as it contains detailed spatial information and one UAV data point before the fertilization date can be used to create MZ and apply the generated MZ map for fertilization. Additionally, Nawar et al. [2] asserted that in terms of performance and cost, remote sensing data such as UAV or satellite data are more suitable for informing variable rate nitrogen fertilizer application than soil characteristics' data, such as electric conductivity and soil texture.
Several studies [45,47] recently applied Cohen's Kappa coefficient to evaluate delineated MZ maps with yield maps. However, this method was avoided due to some rationales. First, the classification of yield maps entails highly expert agronomic knowledge by which elements such as climate and agronomy information should be considered to classify the yield maps. Further, yield maps inherently comprise continuous values, subject to bias and information loss when converted and interpreted in categorical/integer data values. Last but not least, statistical tests from Georgi et al. [17] were adopted, which we think will provide an appropriate platform for validating the MZ map.
Further analyses for improvement of MZ delineation showed superior performance of model-1 and model-3 when agronomy and climate information were considered. The output MZ maps of both performed better and were similar and consistent, unlike other delineated MZ maps that lacked agronomy and climate information (see Figures 5,11 and 13). This supports the importance of this information.
As an additional analysis, RS-based models were calibrated by solely considering agronomy information. Thus, the data within the phenological phases from 2018 to 2020 (2 seasons) were selected, i.e., leaving out data before seeding and after harvesting in two seasons. However, the results showed inferior performance of the RS-based models compared to the model considering both agronomy and climate information. This is parallel to suggestions presented in [4] indicating that information on crops, soil or phenological phases increases the quality of the model. Thus, incorporating both agronomy and climate information was strongly suggested in MZ delineation for PA. This issue is inevitable under new climate conditions with more frequent drought and other extreme events. As such, the requirement of considering this information is to conduct a model in a single year. Accordingly, the generated MZ map can be used as a guide for fertilization.
In a similar study, Santaga et al. [45] investigated simplified and advanced models for creating variable nitrogen fertilization maps using satellite imagery, yield maps and protein maps embedded in FMIS architecture. In contrast to the results of [45], in this study, in addition to simplified models, hybrid models were tested using non-advanced soil data, which should be considered more accessible to agricultural producers than yield or protein maps. In both studies, both simplified and hybrid or advanced models can be effectively implemented in the expanding FMIS [87].

Conclusions
This study compared three models based on the Sentinel-2 satellite data for management zones' (MZ) delineation in the context of precision agriculture (PA) on an example of a winter wheat field. The remote sensing (RS)-based models 1 and 3 did not require any additional data and were thus considered less expensive than model-2. Additionally, agronomy (wheat phenological phases) and climate information (drought) were considered to improve MZ maps when applying RS-based models. The results also showed that MZ delineation is prone to uncertain results, particularly under new and rapidly changing climate conditions, if no agronomy expert knowledge and climate information were incorporated. These findings enhance the understanding of the role of new climate conditions for RS-based PA algorithms in rainfed farming with the potential agricultural drought and its impact in applying fertilizer. As drought stress is on the rise and had a significant impact on the growth of most arable crops in central Europe, it is suggested to perform low-cost RS-based techniques only for the current season. This information provides additional, reliable and current information to improve MZ delineation for optimizing fertilization in a single-year context.
In conclusion, an algorithm has been designed and its use has been evaluated for a few test cases for the integration of soil, crop and yield information, together with knowledge about agronomy and climate information. This will improve the results of MZ delineation and generate a guiding map to prescribe variable rates of fertilization before the necessary fertilizer application dates.
It is recommended for future research to use remote sensing data with high spatial resolution, such as satellite images with higher resolution and drone images, in the delineation of MZ maps since delineation of MZ maps with single-year data is feasible using the proposed method. It is also suggested to use cost-benefit analysis to evaluate the implemented MZ maps, e.g., in the context of variable rate fertilization. However, it was concluded that the quality of the final MZ maps should be eventually calibrated in collaboration with farmers and all these steps help farmers set up their fertilization operation to address problem areas and maximize yield.
This study has compared three models for creating MZ, focused on implementation in cloud-based farm management information systems (FMIS). The results of this work, as well as those of Santaga et al. [45], indicate that for FMIS, it is necessary to develop not only advanced models for creating MZ or variable nitrogen fertilization strategies, but also simplified and hybrid models for those users who do not have multiyear crop vegetation data or simplified soil data. Further work in this area is recommended.