Monitoring the Spatial Variability of Knapweed ( Centaurea diluta Aiton) in Wheat Crops Using Geostatistics and UAV Imagery: Probability Maps for Risk Assessment in Site-Speciﬁc Control

: Assessing the spatial distribution of weeds within a ﬁeld is a key step to the success of site-speciﬁc weed management strategies. Centaurea diluta (knapweed) is an emerging weed that is causing a major agronomic problem in southern and central Spain because of its large size, the difﬁculty of controlling it, and its high competitive ability. The main objectives of this study were to examine the spatial variability of C. diluta density in two wheat ﬁelds by multivariate geostatistical methods using unmanned aerial vehicle (UAV) imagery as secondary information and to delineate potential control zones for site-speciﬁc treatments based on occurrence probability maps of weed infestation. The primary variable was obtained by grid weed density ﬁeld samplings, and the secondary variables were derived from UAV imagery acquired the same day as the weed ﬁeld surveys. Kriging and cokriging with UAV-derived variables that displayed a strong correlation with weed density were used to compare C. diluta density mapping performance. The accuracy of the predictions was assessed by cross-validation. Cokriging with UAV-derived secondary variables generated more accurate weed density maps with a lower RMSE compare with kriging and cokriging with RVI, NDVI, ExR, and ExR(2) (the best methods for the prediction of knapweed density). Cokriged estimates were used to generate probability maps for risk assessment when implementing site-speciﬁc weed control by indicator kriging. This multivariate geostatistical approach enabled the delineation of winter wheat ﬁelds into two zones for different prescription treatments according to the C. diluta density and the economic threshold.


Introduction
The International Society for Precision Agriculture (2019) gave a clear official definition of precision agriculture as follows: "precision agriculture is a management strategy that gathers, processes and analyses temporal, spatial and individual data and combines it with other information to support management decisions according to estimated variability for improved resource use efficiency, productivity, quality, profitability and sustainability of agricultural production" [1].
Site-specific weed management (SSWM) is the application of the precision agriculture concept to one particular aspect of agricultural production: weed control. Site-specific weed management is based on the fact that weed populations commonly have a patchy distribution within crop fields [2][3][4][5][6][7][8][9][10][11], and SSWM implies applying chemical and/or physical weed control measures only where and when they are really needed [12]. Site-specific management is already commonplace in many aspects of farming. However, the uptake of SSWM is lacking, largely because of the unavailability of accurate within-field weed species distribution maps [13].
Accurately assessing the spatial distribution of weeds within a field is a key step to the success of site-specific weed management strategies. Patchy distribution mapping could be used to prevent the spread of weeds to clean areas, which is particularly relevant for invasions of new weed species and resistant or emerging, troublesome weeds, and to implement SSWM to control those weeds. This strategy would complement management of hard-to-control weeds and prevent any further spread, although weeds are located at densities that do not cause economic problems. This weed control strategy has tremendous potential to provide economic and environmental benefits. Therefore, accurate, appropriate, and well-timed weed maps are a key component in taking full advantage of site-specific herbicide applications and avoiding over-application [14][15][16].
To develop effective SSWM strategies based on maps, robust methods for weed data acquisition and analysis procedures that integrate knowledge about the within-field spatial variability of infestations are required [5,13]. Regrettably, traditional weed sampling is not viable, as such information cannot be obtained exclusively from traditional weed sampling at a number of discrete locations because it is costly, labor-intensive, and requires a large number of weed field samples to achieve a good representation of weed infestations. To overcome the limitations of spatially-scarce weed field data, methods of interpolation are needed to estimate weeds in unsampled areas. Spatial interpolation methods create continuous surfaces from the weed field sampling points. A commonly used method of interpolation is kriging and its variants, which was developed in the domain of geostatistics. For a detailed presentation of the theory of geostatistics, interested readers should refer to textbooks, such as [17][18][19][20], among others, as only an outline of the interpolation methods used will be given here.
Geostatistical methods are based on the regionalized variable theory in which values of a random function are similar when they are close to each other, and the similarity decreases as the distance between data locations increases. To assess whether data are spatially correlated and to what extent, the semivariogram was developed. The semivariogram allows determining whether a variable presents spatial dependence (i.e., regular or patchy distribution) and quantifies this dependence. After that, for mapping a kriging interpolation method, a generic geostatistical term for a range of least-squares methods is needed to provide the best linear unbiased predictions for an estimate of the accuracy of this prediction (i.e., kriging variances or errors). Kriging usually performs better than other interpolation methods because it takes into account the manner in which a property varies in space through the semivariogram [21].
The success of any spatial interpolation method will depend on the quality of the data obtained. Maps based on observations at coarse resolutions will be poor in detail. Although kriging has been widely used to interpolate data and to create weed and treatment maps, e.g., [3,4,10,22,23], often because of the costs of sampling, there exist few sample sites to describe the variation in sufficient detail, which can lead to considerable uncertainty in the kriging estimations. Thus, researchers have suggested that this might not be the most suitable method for estimating weed densities [24,25]. To solve that, it is possible to improve the reliability of a weed distribution model by incorporating supplementary data and applying multivariate geostatistics, which offer a set of linear unbiased estimators with minimum error variance by a combination of kriging with secondary variables more densely sampled and spatially correlated with the primary variable (i.e., weeds in the case of SSWM) as suggested by [20,26]. The data set may not cover all variables at all sample locations and may use additional covariates that are sampled more intensely than the prediction variable [18]. The secondary variable may be sampled more intensely than the primary variable because it is easier or cheaper to measure, reducing the cost of sampling and increasing the accuracy of prediction of the sparsely sampled primary variable [19,27]. Elevation data, remote sensing data, and the apparent electrical conductivity of the soil and crop yield monitor data, among others, are generally collected at fine spatial resolutions much denser than, e.g., weed sampling data, and could, therefore, be used as secondary information sources. Multivariate spatial interpolation methods, such as cokriging, allow a variable of interest (primary variable) to be cokriged at a specific location from the data about itself and about secondary variables in the neighborhood and, then, a crosssemivariogram is used to quantify cross-spatial auto-covariance between the primary and secondary variables, which, in turn, is used in the interpolation procedure. Details of the cokriging system can be found in [18][19][20]26].
Remote sensing can offer a means of deriving continuous spatial coverage of agroenvironmental information for large areas at regular time intervals, and can, therefore, be extremely useful in providing secondary variables in the whole study area to estimate variables of interest [28]. Thus, cokriging has been successfully used to combine secondary data derived from satellite imagery or piloted airborne platforms for mapping herbaceous biomass [29], estimating forest parameters [27,30] or woody vegetation [31]. However, data from piloted airborne or satellite platforms are often not suitable for weed mapping because of their low spatial and temporal resolutions [32].
The need for more and better information to accurately map weed distribution within a field can be met by using other remote platforms, e.g., unmanned aerial vehicles (UAV), which can generate data with a high spatial-resolution. The use of UAVs with low-altitude flights (e.g., 30-60 m altitude) provides images with a very high spatial resolution for early mapping of weeds for in-season post-emergence SSWM programs, as reported by [33][34][35][36][37][38][39][40][41].
In addition, and as reported by [5], this increasing use of UAVs can provide a quick, timely, and economical source of secondary information when flying at higher altitudes (e.g., 120 m), which means a larger area of study with a lower UAV battery cost to estimate sparsely sampled primary variables, such as weed species, and, therefore, a more costeffective solution to improve their estimation accuracy without intensifying the field sampling of the primary variable.
Despite the above-mentioned advantages, thus far, the use of cokriging with UAVderived secondary variables to improve the spatial mapping of a sparsely sampled variable is scarce. There are few applications in soil science and environmental science. In precision agriculture, recently [42,43], geostatistical analysis and UAV multispectral imagery were used for the early detection of Xylella fastidiosa in olive trees and to provide probabilistic maps of infection risk. In weed research, to the best of the authors' knowledge, only one recent study used cokriging in combination with UAV imagery to improve the spatial mapping of Papaver rhoeas in winter wheat fields [5]. This study demonstrated that the multivariate geostatistical method of cokriging can be used to improve the accuracy of weed infestation maps at late phenological stages, combining sparsely sampled target variable and high-resolution UAV imagery. It would also be helpful to know the risk one runs if one does not spray certain areas of the field. The knowledge of the probability that an area within the field is below a certain threshold can be useful in making decisions for or against weed control using patch spraying [3,44]. The cokriged estimates represent useful tools for decision makers to develop appropriate SSWM strategies. In this sense, the display of cokriged estimates together with other agronomic variables, such as the infestation severity index or economic threshold, allows the production of thematic maps of individual attributes and may be an effective support for decision making.
For example, indicator kriging can be implemented for probabilistic mapping of weed density. Indicator kriging is a non-parametric geostatistical method similar to kriging that, rather than working with actual values, transforms the primary variable into a binary one based on pre-defined threshold values and maps the probability of exceeding the threshold [20,45]. This is directly useful for probabilistic decision-making. Using binary variables, indicator kriging proceeds the same as ordinary kriging or cokriging to predict the probability that the indicator-transformed variable is above the threshold value [20,46]. The indicator approach is based on the interpretation of the conditional probability of exceeding the cut-off value as the conditional expectation of an indicator variable [47]. After a fitting a semivariogram model, the application of kriging (or cokriging) directly to the indicator variables allows estimation of the probability of exceedance at each node of the interpolation grid. It is then possible to identify zones within the field where there is a great probability that the indicator variable is above a threshold. For example, if an infestation severity index is used as threshold value, the indicator kriging maps would display the probability that the weed density will be above the infestation severity index and could be used to assess the risk when implementing a SSWM, e.g., in patchy weed control technology that would occur if those areas with a moderate or high infestation severity were not treated with herbicides.
To practice SSWM by using UAV imagery, a subsequent phase consists of converting probability maps to prescription maps (i.e., a set of grids with the corresponding weed infestation values such as weed coverage or weed density) and to delineate potential zones for site-specific control only in areas where weed density exceeds an economic threshold indicating at which weed density an herbicide application is economically justified on a field [48][49][50]. This georeferenced information is exported to the sprayer to conduct the treatment according to the position of each grid and the weed-treatment decision (e.g., spray or do not spray) following an SSWM strategy. The ultimate objective is to decrease the amount of herbicide in comparison to a uniform weed treatment [12,37,51,52].
Although weed maps at late growth stages (e.g., flowering) for in-season SSWM strategies entails that late post-emergence weed control treatments at a late crop stage are generally inappropriate or ineffective, these weed maps may be useful for weed control in pre-emergence the following year if the weed patches are spatially stable over time.
In addition, the late-season spatial distributions maps within a field may also have other uses, e.g., they could be of benefit to prevent the spread of weeds to clean areas, which is particularly relevant for invasions of new weed species or emerging weeds that threaten an area, thus, helping the development of SSWM strategies. The current restrictions on the number and use of herbicides in cropping systems as well as the tillage frequency in European Union legislation have made it more difficult to control troublesome weeds and have led to an increase in the occurrence of lesser-known weeds, such as Centaurea diluta Aiton (common name: knapweed) [53], which is an emerging troublesome annual weed that has spread into southern Spain in recent years and is of increasing concern to farmers in the dryland cropping systems [54]. This taxon belongs to the complex genus Centaurea L. (Asteraceae), a taxonomically complex group centered in the Mediterranean region [55]. Centaurea L. (Asteraceae) includes species, such as C. calcitrapa L., C. iberica Trevir.ex Spreng, C. cyanus L., and C. solstitialis L. [56,57], which are considered troublesome weeds, whereas others, including C. diffusa Lam. and C. melitensis L., are of minor agronomic importance [53]. Most species are distributed in the eastern Mediterranean region and the Irano-Turanian regions, although some species are exclusive to the western Mediterranean region and others have a wider distribution. A total of 94 Centaurea species have been described in the Iberian Peninsula including the increasingly troublesome annual herb, C. diluta [53].
Centaurea diluta is an annual dicot native to north Africa and south-western Europe, including Spain. It is a vigorous plant that can reach up to 350 cm in height and produce a large number of capitula and hairy achenes (3 mm), which can be easily dispersed [58]. This species is becoming a major agronomic weed, primarily because of its large size and high competitive ability. There is little published information on methods for controlling C. diluta except for several technical reports [58,59]. C. diluta's impact has increased, largely because its seeds can easily contaminate cereal grains at harvest and can also be mistakenly seeded with cereal grains at planting [60]. Although the increasing importance of this weed in southern Spain and the possibility of invasion into northern Spain make C. diluta an important weed species to target for further studies [53], no research has been conducted to map its spatial distribution within field crops, which is the first step to develop and implement SSWM strategies. A better understanding of the spatial distribution of its infestations and an analysis of risk assessment are critical for the development and implementation of effective SSWM.
To our knowledge, there are no published studies on the spatial distribution of C. diluta in winter wheat fields and the potential for SSWM in winter wheat has not yet been assessed. Thus, the main goals of this research were to examine the spatial variability of C. diluta density at late phenological stages in winter wheat fields by combining multivariate geostatistical methods and high-resolution UAV imagery data and to delineate potential control zones for site-specific treatments based on occurrence probability maps of weed infestation. The novelty of the present work is not in the spatial interpolation methodology itself but in its application in a joint multivariate statistical analysis of weed density, UAV-derived secondary variables, and other agronomic variables, such as the infestation severity index or economic threshold, in order to detect and map C. diluta at late phenological stages in winter wheat fields and to generate probability maps for risk assessment in a SSWM. The specific objectives were to: (1) evaluate different variables derived from UAV imagery (wavebands and derivative products, such as band ratios and vegetation indexes) and their correlation with the weed density data to be used as source of secondary information in the spatial interpolation methods, (2) map the spatial distribution of C. diluta infestation by a spatial interpolation method of kriging and cokriging with UAV-derived secondary variables that yield a higher significant correlation with the primary variable, (3) generate probability maps for risk assessment in patchy weed control by indicator kriging and the infestation severity index, and (4) delineate potential zones for site-specific treatments based on occurrence probability maps of weed infestation developed by indicator kriging.

Study Area and Data Collection of Centaurea diluta
Field surveys were carried out in April 2018 in the provinces of Cordoba and Seville-typical agricultural areas in Andalusia (southern Spain)-to look for winter wheat fields infested with C. diluta. Two commercial winter fields were chosen to conduct the study because knapweed was the only broad-leaved weed in both fields with no other problematic species. One field was located in the province of Seville (Lantejuela), and the other was in the province of Córdoba (Fernan Nuñez). The fields were farmer-managed and the typical agronomic practices for the regions were used.
The primary variable (i.e., C. diluta density, which was the target variable of study) was obtained by field surveys. In mid-May, when the winter wheat showed the typical greenyellow color of the beginning of the maturation stage and C. diluta patches displayed an intensive pink-purple color corresponding to the flowering growth stage, an intensive grid weed sampling was carried out in selected areas within the larger fields, and their borders were at least 20 m from the main borders of the fields. Weed assessments (knapweed density) were performed following a grid of 10 m × 10 m, within two 0.25 m 2 frames. The details of the sampled areas and the resulting sampled sites in each field are shown in Table 1. The position of each frame was georeferenced using a real time kinematic (RTK) GPS linked to a reference station from the global navigation satellite system (GNSS) network from the Institute for Statistics and Cartography of Andalusia (IECA), Spain, and the total number of C. diluta plants was manually counted within the frame. The data were multiplied by 2 so that, for analysis, the weed density was characterized by the number of plants per m 2 . The number of plants per m 2 for each frame was assigned to the location of the sampling point within the field.

Acquisition and Processing of UAV Imagery
The secondary variables were derived from the UAV images taken the same day as the weed field samplings. The UAV imagery were collected following the same methodology reported by [5]. A model MD4-1000 quadrocopter UAV (microdrones GmbH, Siegen, Germany) and two sensors with different spectral and spatial resolutions separately mounted on the UAV: a still point-and-shoot camera model Olympus PEN E-PM1 (Olympus Corporation, Tokyo, Japan) and a modified, commercial, off-the-shelf camera, model Sony ILCE-6000 (Sony Corporation, Tokyo, Japan), were used to collect the remote images. The Olympus camera takes 12-megapixel images in true color (red, R; green, G; and blue, B, wavebands) and the Sony camera acquires 24-megapixel images and was modified to capture information in both near-infrared (NIR) and visible light (G and R).  4 87. The UAV images were collected at a flight altitude of 120 m with a transverse overlap of 60% and a longitudinal overlap of 90%, thus, allowing the generation of quality orthomosaicked images, according to previous research [39,61,62]. A set of artificial ground control points were placed in the fields and geo-referenced using the GNSS-RTK system to obtain the ground control points' coordinates to perform the imagery orthorectification and mosaicking process. In the course of the UAV flights, a 1 m × 1 m barium sulphate standard Spectralon ® panel (Labsphere Inc., North Sutton, NH, USA) was also placed in the middle of the fields to calibrate the spectral data. The orthorectification and mosaicking of the imagery set into a single image of the whole experimental field (Figure 1) was performed using Agisoft PhotoScan Professional Edition software (Agisoft LLC, St. Petersburg, Russia). An in-depth description regarding the configuration of the UAV flights, camera sensor specifications, and image pre-processing (georeferencing, calibration, orthorectification, and mosaicking processes) can be found in [39,[61][62][63]. Using ENVI 5.0 software (Research Systems Inc., Broomfield, CO, USA, 2012), the georeferenced sampling units from field surveys were identified and located in the corresponding ortho-mosaic and assigned to one of the classes considered in the study (i.e., weeds and crop) based on the information provided by the on-ground field sampling; and its digital values in B, G, R, and NIR wavebands were extracted. In addition to the four wavebands, three band ratios and eleven vegetation indexes derived from wavebands were calculated ( Table 2). The eighteen variables derived from UAV imagery were used in the multivariate spatial interpolation methods.
In this study the C. diluta density was considered the primary variable to be estimated and the UAV-derived variable was considered secondary information to increase the estimation accuracy. For each field, the weed density data were treated as a study case and statistically analyzed. Data distributions were described using classical descriptors (mean, maximum, standard deviation, skewness, and coefficient of variation). Similarly, the mean and standard deviation of the 18 UAV-derived secondary variables were calculated and the Kolmogorov-Smirnov test was performed on all data sets to verify that the variables were normally distributed. The data were transformed as needed to meet the normality assumptions prior to estimating weed density [64]. The linear correlation between the 18 secondary variables and C. diluta density was also examined using Pearson's coefficient to guide the choice of the multivariate data set to be submitted to geostatistical analysis.

Spatial Variability of Centaurea diluta: Kriging and Cokriging
Weed density data at sampling points were used for the prediction of values at unknown points and to map the spatial distribution of C. diluta infestations, first using the traditional geostatistical interpolation method of ordinary kriging, which does not consider the secondary data, and secondly, to include UAV-derived data in the spatial estimation of knapweed density, cokriging was used. The spatial structure of knapweed in each winter wheat field was described by computing a semivariogram that expresses the spatial dependence between the weed density at different separation distances and directions, whereas kriging and cokriging were used to interpolate and map the C. diluta density.
Cokriging is an extension of kriging that allows for the inclusion of additional spatial information when it is related to the primary variable and that takes into account the spatial cross-correlation between two variables, i.e., the primary variable (C. diluta density in this case) and another secondary variable (in this study, derived from UAV imagery data) to improve the estimation of the studied variable. Cokriging was performed with those UAV-derived variables that showed a moderate-strong correlation with the weed density.
To apply cokriging, experimental direct and cross-semivariograms for the knapweed density and the secondary variables (Table 2), as well as the cross-semivariograms between them, were first calculated separately. Potential directional trends in the spatial distribution of the weed density were checked by calculating the experimental semivariograms both isotropically and anisotropically. The selection of the theoretical semivariogram models was based on visual inspection, and they were fitted with a weighted least squares approximation and a specific semivariogram function, e.g., a spherical or exponential function. The parameters of the model (i.e., range, nugget, and sill) were determined. The degree of spatial dependence of the C. diluta was calculated using the spatial dependence index (i.e., the nugget variance expressed as a percentage of the total semivariance or sill). Spatial dependence values close to zero (≤25%) indicate a variable strongly spatially dependent or strongly distributed in patches, and spatial dependence values close to 100 (≥75%) indicate a weakly spatially dependent variable [75]. The semivariogram model was then used for kriging and cokriging to estimate the weed density on a regular grid of 0.5 m.
The comparison of the spatial interpolation methods' results and the evaluation of the accuracy of predictions in unknown areas was assessed applying the leave-one-out cross validation [17] by temporally removing one weed density observation at a time from the data set, re-estimating it from the remaining C. diluta density data, and then determining the corresponding error as the differences between the observed and the estimated values using the kriging and cokriging. Thus, repeating this procedure for all the weed density experimental data in each field, the statistical parameters were calculated for comparison among the interpolation methods. The statistical parameters calculated were the mean error (ME), the root mean square error (RMSE) and the root mean squared standardized error (RMSSE) [17].
The mean error measures the bias of the prediction and should be close to zero for unbiased methods. This indicates whether the model is, on average, producing estimates that are overestimating or underestimating the observed values [76]. The root mean square error measures the average precision of the prediction and should be as small as possible. The model that performs the best will be the one with the smallest RMSE. This would suggest that the predictions are impartial and close to the respective real values [76]. The root mean squared standardized error is the root of the mean squared difference between the observed value and the estimated value standardized by the kriging variance. If the model is accurate, the mean squared error should equal the kriging variance and then the RMSSE value should be close to 1 [17,77]. If the RMSSE is greater than 1, then the variability of the predictions is underestimated; if the RMSSE is less than 1, the variability of the predictions is overestimated [17]. The last step of the geostatistical analysis was to map the spatial distribution of the C. diluta density based on the kriged and cokriged estimates.
Evaluation of the accuracy of predictions methods was carried out as a quantitative assessment of the results of kriging and cokriging. When comparing the results estimated with kriging and cokriging with the UAV-derived variables (wavebands, band ratios, and vegetation indexes) the RSME of the predictions was analyzed. The RMSE of kriging, which did not consider the secondary data, was compared with that of different approaches of cokriging by using the RMSE of kriging as the standard. For each prediction method, the RMSE was calculated as an overall indication of the map precision quality [5].

Probability Maps for Risk Assessment in a Patchy Centaurea diluta Control: Indicator Kriging
The display of cokriged estimates after back transformation allowed the production of thematic maps of individual attributes and may be an effective support for site-specific decision making. In this study, indicator kriging was performed to characterize the probability of surveying weed density values greater than a critical threshold value. Rather than working with actual values, this technique transforms the primary variable (C. diluta density values estimated by the best performing cokriging method in this case) into a binary one based on the infestation severity index threshold, and maps the probabilities of the critical value being exceeded at each node of the interpolation grid, in order to develop probability maps for risk assessment in patchy weed control.
For each indicator kriging, the primary variable (i.e., the weed density values estimated by cokriging with NDVI, RVI, ExR, and ExR(2) UAV-derived variables) was transformed into an indicator variable (discrete binary variable) with only two possible outcomes using the threshold (binary threshold definition function) of the ArcGIS Geostatistical Analyst module. Using the threshold function, if values are above the threshold, they become 1, and 0 otherwise. It should be noted that the threshold values accounted during the indicator kriging process supposed a critical aspect of this analysis.
There is no recommendation in the literature regarding the threshold value of the C. diluta density probability levels to be used as a threshold; thus, the values chosen here were established taking into account the weed counting in the field studies as well as their large size and high competitive ability. In this research, the indicator variables were created using the infestation severity index of 2.2 plants per m 2 as the threshold value adapted from [3,78,79].
The practice of indicator kriging involves calculating and modelling indicator semivariograms of indicator-transformed data at a range of thresholds, which should cover the range of the input data. Then, the semivariograms model parameters were determined. The model performance was checked by the leave-one-out cross validation of the indicator kriging prediction of infestation severity (i.e., the weed density) greater than the defined threshold. This works the same as for parametric kriging: removing one data point at a time from the data set and predicting the probability of a true indicator from the remaining data, and then comparing this probability with the actual value of the indicator. The spatial structure of indicator variables, which integrates the information of the weed density, was used to map the probabilities of finding sites with weed densities exceeding the infestation severity threshold by indicator kriging on a regular grid of 0.44 m in Fernan Nuñez and 0.45 m in Lantejuela.
The probability maps of sites with a C. diluta density exceeding the threshold value could be a useful tool to generate site-specific prescription treatment maps. Based on the estimated probabilities described above and setting the desired cut-off value, prescriptions maps were developed by indicator kriging with an additional cut-off. Indicator kriging with additional cut-offs is a special case of cokriging. In this case, the cut-off acts as an additional variable. Additional cut-offs can compensate for the loss of information caused by coding data with indicator functions; however, as a cokriging method requires fitting cross-covariances, it requires additional modelling decisions and parameter estimation. As mentioned above, there are also no recommendations regarding the probability levels to be used as cut-offs for C. diluta, thus, the value chosen here was somewhat arbitrary and based on the economic threshold of other weed species of a similarly sized group-adapted from [79]. The economic threshold, i.e., the weed infestation severity index causing a reduction in the net winter wheat yield amounting to the control treatment cost, was estimated at a low index, and therefore, the cut-off value (i.e., the critical value for the probability map creation) was chosen equal to 0.6 plants per m 2 , as this was assumed as a limiting value for C. diluta control. Using binary variables, indicator kriging with the additional cut-off proceeded the same as cokriging. The subsequent phase consisted of converting probability maps to prescription maps and delineating zones for site-specific treatments, and, for this, the resulting cokriging geostatistical layers were converted to block interpolated grids using a cell size equivalent to the treatment machinery size. Therefore, the resulting interpolation maps showed the probabilities for the resulting blocks to be below or above the cut-off value (i.e., the economic threshold).
All geostatistical analyses, i.e., semivariogram modelling, kriging, cokriging, indicator kriging, and indicator kriging with the additional cut-off, were performed using the ArcGis 10.8 software and its Geostatistical Analyst extension.

Exploratory Data Analysis
The statistical analysis of the data revealed substantial spatial variation in the C. diluta density either within fields or between and a generally biased distribution of the data (Table 1). In Fernan Nuñez, knapweed was present at a low density (2.04 plants m −2 ) and at a moderate density in Lantejuela (3.75 plants m −2 ). Although the weed density data were not high (more than 3 plants m −2 in the field with the highest values), the infestation levels cannot be considered insignificant or negligible because C. diluta has been considered an emerging troublesome weed, and these density values could become a major agronomic problem in wheat fields primarily because of its large size and high competitive ability [59]. The primary variable data showed positive skewness, with values far away from 0, which is indicative of a non-normal distribution. In this case, the elevated values of the coefficient of variation confirmed that the weed density did not follow a normal distribution, and therefore, transformations were needed. The data were logarithmically transformed after a constant value of 1 was added to stabilize the variances and to approximately normalize the data prior to estimating the weed density [64]. The skewness values of log-transformed data (between 0.77 and 0.99) and the lower coefficient of variation of log-transformed data suggested a normal frequency distribution, allowing the successive procedures of variography and spatial interpolations. The 18 secondary variables derived from the UAV imagery showed low positive skewness and no transformations were needed (results not reported).
From the analysis carried out using Pearson's linear correlation coefficient (Table 3), a statistically significant correlation between most UAV-derived secondary variables and weed density was apparent-although not to the same extent. Since these values proved to be significant, the use of high-resolution UAV imagery as secondary information into the mapping of knapweed density was worth accounting for. Therefore, cokriging techniques were used for further data analysis.
In the two fields studied, the results indicated a strong correlation between the C. diluta density and the R waveband, the R/B band ratio, and the NDVI, RVI, DVI ExR, and ExR(2) vegetation indexes (r > |0.5|); and a moderate correlation with the B and NIR wavebands, the R/G and B/G band ratios, and the NGRDI and ExGR vegetation indexes (|0.20| < r ≤ |0.50|). For the other secondary variables under consideration (G and ExG, ExG(2), GDVI, and GNDVI), no statistically significant correlation with weed density was found (r ≤ |0.20|) ( Table 3). These results agree with those of previous research, which suggests the robustness of the NDVI index with respect to quantification of vegetation at NDVI [80], justifying its use as the vegetation index most used in agricultural studies [81][82][83]. The lack of consistency of performance observed for other vegetation indexes can be attributed to the fact that the measured spectral response of a given environment depends uniquely on the atmosphere, sensor calibration, ambient lighting conditions, soil background, and the homogeneity of the scene [84]. As the correlation increases, the infor-mation provided by the secondary variable to the primary variable increases; thus, stronger correlations will result in more accurate estimations of multivariate geostatistics [18]. The exploratory statistics allowed us to identify the secondary variables derived from the UAV imagery that showed a statistically significant correlation with the weed density and that would therefore be useful in determining the spatial distribution of C. diluta density. The multivariate spatial interpolation methods to estimate weed density for the whole winter wheat field were performed with those UAV-derived variables that displayed a moderate-strong correlation with the weed density data (i.e., r > |0.2|). The wavebands R, B, and NIR; the band ratios B/G, R/G, and R/B; and the vegetation indexes NDVI, RVI, DVI, NGRDI, ExR, ExR(2), and ExGR were used in the subsequent cokriging analyses.

Spatial Variability of Centaurea diluta
A first examination of the directional experimental semivariograms for the knapweed density showed that there was no significant difference in the variation according to direction. Therefore, omnidirectional direct and cross-semivariograms were calculated. Nested semivariogram structures were not used since adequate fits were achieved with a simple structure. For both spatial interpolation methods, the exponential theoretical model was selected as the optimal in the two fields, corroborating other studies that described this model as one of the best adjusted with weed data along with the spherical model [3,5,10,23,24]. The parameters of the fitted direct-and cross-semivariograms and their cross-validation results for each field are presented in Tables 4 and 5, respectively. The nugget effect can be caused by measurement errors or by the fact the grid spacing used is not enough to detect the spatial variability between samples. Most of the direct and crosssemivariograms models showed a zero or very low nugget component, supporting that the sampling intervals used in this study were appropriated and that no further sampling on a finer spatial scale was needed. The semivariogram range values were of about 21 m in Lantejuela to almost 32 m Fernan Nuñez (Table 4), which also corroborated that the sample grids were large enough to determine the spatial variability range for knapweed infestations and that they were appropriate for field surveys. The spatial dependence index was below 25% in both locations (Tables 4 and 5), indicating a high spatial dependence between measurements and a low contribution of stochastic measurement error. The low spatial dependence index values indicated that weed density is a variable strongly spatially dependent or strongly distributed in patches and confirmed that the selection of the geostatistical spatial interpolation method was reasonable and beneficial because the spatial structure inherent to the data sets could be well implemented with a semivariogram model in the interpolation process. The cross-validation results for the two fields confirmed that the direct-semivariogram models met the cross-validation conditions and were suitable for kriging interpolation ( Table 4). The isotropic exponential cross-semivariograms models calculated between the C. diluta density and the secondary variables adequately described the variation of the knapweed density for each field even though the parameters of the cross-semivariogram models varied depending on the field and UAV secondary variable considered. As a rule, the better-structured cross-semivariograms were generated with those secondary variables highly correlated with weed density ( Table 5). The range and the sill of the cross-semivariograms models varied from field to field, but they were similar for a given field regardless of the UAV-derived variable considered. The cross-validation results also varied from one secondary variable to another and between fields; however, adequate statistical parameters were obtained for all models ( Table 5). The cross-validation statistical parameters were used as comparison criteria for the models with different secondary variables. The models that performed the best were the ones with the smallest RMSE and the closest mean error to zero and were considered suitable for the cokriging interpolations-that is, the R waveband, the R/B band ratio, and the RVI, NDVI, DVI, ExR, and ExR(2) vegetation indexes.
On the other hand, the RMSSE values of the ordinary kriging were greater than 1 in both fields (i.e., 2.0133 and 2.9935 in Fernan Nuñez and Lantejuela, respectively) ( Table 4) indicating that kriging was, on average, producing estimates that underestimated the density observed data, whereas those of cokriging were, in general, closest to 1, suggesting that the predictions were impartial and close to the respective real weed density value.
Underestimation of C. diluta infestations could contribute to increasing the current knapweed agronomic problem in winter wheat fields. Thus, in similar field conditions, we highly recommended the use of the most accurate spatial interpolation method, i.e., cokriging with UAV-derived secondary to avoid future infestations and their spread into other fields, since its seeds can easily contaminate cereal grains at harvest and can also be mistakenly seeded with cereal grains at planting [60].
The direct-and cross-semivariogram models selected were then used for kriging and cokriging to estimate the weed density on a regular grid of 0.5 m in both winter wheat fields. The kriging and cokriging values were transformed back to the original scale to map the spatial distribution of C. diluta at late phenological stages. Representative maps of these observations are shown in Figures 2 and 3.  A visual evaluation revealed a strong to moderately patchy distribution of knapweed density, which was also supported by the direct-and cross-semivariogram analyses. In general, perceptible differences existed between the kriged ( Figure 2) and cokriged maps, as well as among the cokriged maps with the different secondary variables used; however, cokriging generated similar trends in the spatial variability of C. diluta density in the two fields ( Figure 3).
The kriged maps appeared to be too smooth to reproduce the measured maximum and minimum values. Overall, this technique tended to underestimate the weed density. The cokriged maps were characterized by a higher spatial variation with several hot spots that were not defined or appreciable in the kriged maps because ordinary kriging over-smoothed the spatial variability of the C. diluta density.
The cokriged maps exhibited more local detail and less smoothness in their depiction of the variability of weed density. Comparable results were found for other weed species cokriged with UAV-derived variables as covariates [5,13,23]. The improved local detail of the cokriging maps was due to the finer spatial resolution of the UAV-derived secondary variables. In addition, the UAV-derived secondary variables that exhibited less local detail with cokriging were also the secondary variables least correlated with the weed density (Table 3). This result could be due to the relatively low cross-correlation between these variables (e.g., B, B/G, R/G, NGRDI, or ExGR) and the weed density; thus, modelling of the cross-semivariogram was made difficult. The result could also be due to the geometric arrangement of the weed sampling points themselves and how that arrangement affected the modelling of the semivariograms and cross-semivariograms [18,26].
The spatial distribution maps of C. diluta produced by cokriging with UAV-derived variables had the best visual resolution compared to those produced using only spatial information (kriging) in both fields. There was some similarity in the weed map patterns as produced by cokriging (Figure 3). However, cokriging with R and R/B over-smoothed the spatial variability of the knapweed density. Comparatively, cokriging with vegetation indexes reflected the local variation more than cokriging with wavebands and band ratios. Few other observations could be made from the visual assessment of the kriged and cokriged maps. Differences in the mapping characteristics of the best-fit semivariogram models made it difficult to conclude real differences among maps. Maps are effective for identifying trends; however, the actual differences between methods should be analyzed quantitatively [85]. This was carried out as a quantitative assessment of the results of kriging and cokriging.
The quantitative comparison of the interpolations results in terms of RMSE showed that the RMSE for cokriging was consistently lower than the RMSE for kriging as expected if the cokriging helps to improve the prediction of the knapweed density values at unknown points (Table 6). Table 6. Average performances of spatial interpolation methods and improvement by cokriging (%). The conventional geostatistical method (kriging) exhibited the highest RMSE value between observed and predicted density data with an average value of 3.384. The best performing cokriging method was the one that resulted in the lowest RMSE.

Prediction
In summary, the results confirmed that using cokriging with high resolution UAVderived secondary variables significantly improved the accuracy of the C. diluta density estimations compared to kriging, as stated by the reductions in RMSE values.
The findings of the current study show that the lower correlations between the knapweed density and wavebands (e.g., B and NIR) and band ratios (e.g., B/G and R/G) resulted in the lowest reductions (largest gains) in RMSE, whereas the vegetation indexes exhibited the largest reduction in RMSE.
The RMSE values of cokriging with R, B, and NIR, 1.39, 1.54, and 1.62, indicated 59%, 54%, and 52% improvements in performance relative to kriging, respectively. In general, the RMSE values of cokriging with vegetation indexes were lower and closer to one than those obtained with cokriging with wavebands or band ratios. Cokriging with RVI, NDVI, ExR, and ExR(2) provided the best performance compared to kriging, showing more than 69% improvements. Therefore, cokriging with RVI, NDVI, ExR, and ExR(2) were clearly the best methods for the prediction of C. diluta density (Table 6). These results are similar to those reported by authors working with different weed species [5] and are in agreement with previous studies, which found that a reduction in kriging variance was observed as the correlation between the covariate and primary variable increased. The potential of cokriging to improve kriging estimates is high if the associations between the primary and secondary variables are robust and important [17,26,29,86]. The efficacy of cokriging is more than a function of correlation between a secondary and target variable. The efficacy also includes the strength of the spatial cross-correlation between the covariate and the target variable, the geometric sampling pattern, and the ratio of the sampling intensities of the covariate and the target variable [87].

Probability Maps for Risk Assessment in Patchy Centaurea diluta Control
Semivariogram model parameters and cross-validation results for the indicatortransformed data (i.e., weed density values estimated by cokriging with NDVI, RVI, ExR, and ExR(2) transformed into indicator variables) are reported in Table 7. Omnidirectionalstable semivariograms were fitted with a weighted least squares approximation. After fitting the semivariogram models to the experimental semivariograms of the indicator variables, the application of cokriging directly to the indicator variables allowed us to estimate the probability of exceedance at each node of the interpolation grid. As shown by cross-validation results for both fields, although the statistics also varied from one indicator variable to another and among the fields, adequate cross-validation results were obtained for all semivariogram models, indicating that the adopted models would be suitable for indicator kriging analysis (Table 7). It was then possible to map the probabilities of finding sites with C. diluta issues by indicator kriging in both winter wheat fields. Like the C. diluta density data, semivariograms of the different indicator variables that integrated information of weed density showed strong spatial structures with a range of spatial dependence of about 28 m in Fernan Nuñez and 22 m in Lantejuela, which was also corroborated from the visual assessment of the probability maps of sites with weed density exceeding the threshold value. The probability maps corresponding to the testing with different indicator variables for both fields (Figure 4) appeared consistent with the cokriging maps ( Figure 3). In general, the spatial pattern changed very slightly among the indicator kriged maps based on the indicator variable used.
Small visual discrepancies existed between the indicator kriging maps based on the indicator variable used; however, indicator kriging generated similar trends in the spatial distribution of the probabilities of the weed density exceeding the threshold in both fields. In Fernan Nuñez, the probability maps showed two clearly differentiated wide areas: the lower zone where the probability of exceedance was greater than 80% and the upper right with a probability of less than 30%. Similarly, in Lantejuela, the indicator kriged maps showed an area in the upper left part of the field where the probability of occurrence of C. diluta infestations above the threshold of 2.2 plants per m 2 was high (e.g., a probability greater than 80%) and a wide area in the left part of the field where the probability was low; however, this probability increased quickly, becoming greater than 80% in several hot patches that were very well defined and easily appreciable.
For interpreting these maps, the zones with greater probability values (light areas) were those that were more likely to surpass the limit value of the threshold. It would be helpful to know the risk one runs if one does not spray certain areas of the field. Therefore, the knowledge of the probability of an area within the field to be above a certain threshold can be useful in making decisions for or against weed control using patch spraying. The light areas have high probabilities to be above the threshold. That means the risk for the farmer is high if he does not spray these zones. It is clearly seen that areas with high probabilities correspond to areas where the C. diluta occurrence in the original data set was above the threshold. If farmer wants to run a low risk, the probability of the occurrence of weed infestations must be set to a high value to determine the areas to be treated with herbicides. The higher the probability is, the lower the risk. Such pieces of information can be advantageously used in weed management to formulate useful recommendations for SSWM, such as applying herbicide treatments at those locations where the probability of weed density above the threshold is the highest. The probability maps produced by indicator variables made from estimates of cokriging with RVI and ExR had the best visual resolution and cross-validation results (Figure 4). These probability maps of sites with C. diluta density exceeding the threshold value were used to generate site-specific prescription treatment maps. Based on the estimated probabilities described previously and setting the desired cut-off equal to 0.6 plants per m 2 , prescription treatments maps were developed by indicator kriging with an additional cutoff. The resulting geostatistical layers were converted to block interpolated grids using a cell size of 15 m × 15 m, because this size is the usual treatment machinery size in the study area. Therefore, the resulting interpolation maps (prescription maps) showed the probabilities for the resulting 15 m by 15 m areas to be below or above the cut-off value (i.e., the economic threshold). Thus, the blocks with values above the critical value (i.e., >0.6 weeds per m 2 ) exceeded the economic threshold and would require weed control actions.
The knowledge of the probability of an area within the field to be above a certain cut-off can be useful to delineate potential zones for site-specific control only in areas where the weed density exceeds an economic threshold ( Figure 5). This multivariate geostatistical approach enabled the delineation of the winter wheat field into two zones with different herbicide prescriptions related to differences in the knapweed density and economic threshold.
In the site-specific prescription treatment maps, it is then possible to identify two potential zones within the fields: zones that should be treated with herbicide (i.e., blocks with knapweed density values exceeding the economic threshold) and zones that should not be treated (i.e., blocks with no or very low weed density, of less than 0.6 plants m −2 ) ( Figure 5). These prescription maps can be exported to the sprayer to conduct the treatment according to the position of each block and the weed-treatment decision (e.g., spray or do not spray) following an SSWM strategy. The ultimate objective is to decrease the amount of herbicide in comparison to a uniform weed treatment. SSWM based on maps of probabilities of exceedance of cut-off value could lead to herbicide savings. Thus, economic and environmental savings can be attained by applying herbicide only to infested weed zones according to the economic threshold. If post-emergence herbicides to control C. diluta were only applied to control infested areas, the average herbicide costs could be approximately 80%. The cut-off value was defined to allow the farmer to make a decision based on numerical results and not only on visual, subjective, and arbitrary criteria. Thus, the farmer can either choose to treat the block or not treat the block. The site-specific prescription treatment maps could be applied in late post-emergence (for in-season control) or in pre-emergence the following year. Site-specific post-emergence control treatments at a late crop stage are generally inappropriate or ineffective for in-season SSWM strategies. Nevertheless, these maps may be useful for weed control in pre-emergence the following year if weed patches are spatially stable over time.
For numerous weed species, it was found that patches occurred more or less at the same location over years indicating that the within-field weed distributions are relatively stable between years. If weeds are expected at the same locations over years, this makes any map produced useful in future years ( [6,13,88,89], among many others). Although, the greater the time between sampling and prediction, the less accurate the prediction becomes [13].
In southern Spain, sunflower-wheat is a common and traditional crop rotation. C. diluta has become a major agronomic weed in the rotation as it infests both crops and is also considered to be a problem in sunflower. In winter wheat, it is effectively controlled by pre-emergence herbicide treatments. However, post-emergence herbicides cannot adequately control C. diluta mainly because of its staggered emergence and quick growth [90]. In sunflower, knapweed is also a major agronomic problem since herbicides cannot be applied, as the weed and crop belong to the same dicotyledonous botanical family; thus, tillage is frequently conducted to reduce C. diluta infestations [58,59]. Since legislation in the European Union has restricted the number and use of herbicides as well as tillage frequency in cropping systems, knapweed has become more difficult to control, and this has led to an increase in its occurrence in southern Spain. In recent years, it has spread into southern Spain and is of increasing concern to farmers in that region.
The main difficulty of the proposed approach for identifying C. diluta-infested areas is to choose an appropriate probability threshold above which control action must be undertaken. Decision making is straightforward for the areas with a very high or very low probability, whereas it is much more difficult for the areas with intermediate probabilities in the range of (0.30-0.70) [91]. Therefore, the choice of a probability threshold remains mainly subjective.
Up until recently, knapweed has not been considered a major agronomic weed, and as such, relatively little agronomic research has been conducted to study its behavior and performance as a weed species in crop fields. There is also relatively little published information on methods for controlling C. diluta, except for several technical reports [58,59,79]. Hence, there are also no recommendations regarding the probability levels to be used as thresholds and cut-offs for C. diluta density. In this research, values were established taking into account the weed counting in the field studies, the large size, and the abundant aerial part as well as the high competitive ability, which together mean that, even at low densities, knapweed could entail a serious agronomic problem in field crops.
The threshold values accounted during the indicator kriging and indicator kriging with an additional cut-off process supposes a critical aspect of this analysis; therefore, with this troublesome weed, it would be highly recommended and more profitable to carry out a conservative approach and to establish a low threshold and cut-off value to avoid future infestations and its spread into other fields.

Conclusions
The results of the present study demonstrated that spatial interpolation methods in a joint multivariate geostatistical analysis of weed density and high resolution UAV-derived secondary variables can be successfully used to examine and accurately map the spatial distribution of C. diluta density at late phenological stages in winter wheat crops.
Cokriging with UAV-derived secondary variables reduced the errors of interpolation, and therefore, improved the accuracy of weed-patches mapping, which was confirmed by cross-validation. The best resolutions of weed infestation maps were obtained when cokriging was performed with the UAV-derived secondary variables that yielded the highest correlation with knapweed and showed the strongest spatial cross-semivariogram structures.
The results also suggest the great potential of high-resolution UAV imagery to be used in precision agriculture as a source of secondary information for improving the estimation of sparsely sampled variables using multivariate geostatistics without intensifying the sampling of the primary variable. Thus, the display of cokriged estimates allowed the production of thematic maps of individual attributes and was an effective support for site-specific decision making. In this study, indicator kriging was performed for characterizing the probability of surveying weed density values greater than a critical threshold value. The knowledge of the probability of an area within the field to be above a certain cut-off was used to delineate the winter wheat field into two zones with different herbicide prescriptions treating only those areas where the C. diluta density exceeded an economic threshold. These prescription maps can be exported to the sprayer to conduct the treatment according to the position of each block and the weed-treatment decision (e.g., spray or do not spray) following an SSWM strategy and attaining herbicide, economic, and environmental savings.
In conclusion, sparse samplings of the C. diluta density in-field combined with remotely sensed secondary information obtained from UAV imagery and data processing according to geostatistical procedures may lead to a better understanding of the spatial variability of weeds. As a consequence, better recommendations can be made about spatially variable weed management, thus, reducing the production risk in a spatial sense.
This study demonstrated the usefulness of multivariate geostatistical procedures to produce a quantitative description of spatial variability that will be used for yielding thematic probability maps. These late-season spatial distribution maps within a field may also have other uses, e.g., they could also be useful to complete or improve other mapping studies at very early phenological stages and in other crops and could be used together for developing effective post-emergence SSWM strategies, particularly in those cases in which the detection and mapping at early phenological stages is complicated.
Up until recently, C. diluta has not been considered a major agronomic weed, and, as such, no research has been conducted to map its spatial distribution within field crops, which is the first step to develop and implement SSWM strategies. There is also relatively little published information on methods for controlling knapweed. However, the increasing importance of this weed in southern Spain and the possibility of invasion into northern Spain as climate change makes C. diluta an important weed species to target for further studies. Further studies are required to investigate the application of this approach to different crops (e.g., sunflower, since sunflower-wheat is the main rotation in Andalusia) and under different cropping and management strategies.