1. Introduction
Information on the spatial and temporal variability of soil attributes plays a crucial role in the development of effective soil management strategies. By examining these data, farmers can adopt the most suitable cultivars and plant population densities for each specific point within the production area. This, in turn, facilitates the precise determination of the required amounts of fertilizers and soil acidity correctives, not only to maximize financial returns but also to promote more sustainable production [1].
However, developing an effective strategy for collecting data to characterize the spatial and temporal variability of soil attributes is a complex and challenging task. Research has highlighted the importance of establishing dense sampling grids, with a minimum density of one sample per hectare, to adequately capture the variability of soil attributes [2,3,4]. Webster and Oliver [5] demonstrated that variograms used to infer soil attributes at unsampled points are unreliable when based on fewer than 100 data points, potentially leading to inaccurate estimates with significant margins of error. Therefore, although grid sampling can provide a precise basis for variable rate application, its costs and labor requirements, especially in extensive areas with high variability, suggest that other approaches may be more economical [6].
To understand the spatial and temporal variability of soil attributes without the need to establish dense sampling grids, studies have demonstrated the potential of using soil sensors or the crop itself as a ‘soil sensor’ [1]. Apparent soil electrical conductivity sensors, yield maps, and canopy reflectance indices can provide maps with different spatial and temporal variability patterns, which can be used to delineate homogeneous areas known as Management Zones (MZs) [7,8,9,10,11,12]. Within each MZ, soil attributes are assumed to vary little, so a single composite sample is recommended. Targeted management practices are then established for each MZ based on its attribute levels [1]. This strategy reduces soil sampling costs compared to dense sampling grids [13,14]. At the same time, it provides a better distribution of management practices (cultivars, plant density, fertilizers) than the conventional soil sampling method, in which only a single attribute level, and consequently a single management strategy, is determined for the entire area.
Although the development of MZs through these methods represents an advancement in precision agriculture, their adoption among farmers remains limited. This limitation is largely due to difficulties in accessing reliable historical yield maps, electrical conductivity data, and multispectral satellite image time series with high temporal resolution. For example, yield maps have been available since the early 1990s, yet their adoption is still limited to only 5% to 25% of the total cultivated area in the United States for crops such as winter wheat, cotton, sorghum, and rice, and 45% for corn and soybean crops [15]. Apparent soil electrical conductivity is an attractive alternative because it can be quickly and easily measured for fields using electromagnetic induction instruments. However, this type of data collection strongly depends on specialized service providers for data acquisition and interpretation, whose availability varies across agricultural regions, complicating the implementation of this technology.
The use of multispectral optical images, freely available from orbital platforms such as Landsat-8 and Sentinel-2, enables remote service delivery and extensive spatial coverage. However, their application faces significant challenges, such as cloud cover, which compromises consistent data acquisition. This issue is particularly critical in tropical regions, where average annual cloud cover can reach approximately 66%, hindering the construction of representative historical time series [16,17]. Therefore, to expand farmers’ adoption of MZs, it is essential to develop alternative methods capable of efficiently characterizing the spatial and temporal variability of soil attributes, with a lower cost per unit area and broader spatial coverage.
A promising line of research for characterizing the spatial and temporal variability of soil attributes through MZs is the use of Synthetic Aperture Radar (SAR) data. The Sentinel-1 mission, part of the European Union’s Copernicus program and currently consisting of the Sentinel-1A satellite, freely provides SAR imagery with a spatial resolution of 20 × 22 m and a temporal resolution of 12 days [18]. Equipped with an active C-band SAR sensor operating at a central frequency of 5.405 GHz with dual polarization (vertical–vertical and vertical–horizontal), this satellite can image through cloud cover and acquire imagery both day and night [19,20,21,22]. Moreover, its microwaves, which have a much longer wavelength than optical radiation, can penetrate the superficial vegetation layers and, in some cases, reach deeper soil layers. In areas without vegetation cover, wave penetration into the soil makes SAR data sensitive to both the dielectric properties (such as soil moisture) and geometric characteristics (such as surface roughness) of the soil [23,24]. In agricultural contexts, SAR backscatter data have been used, either alone or in combination with multispectral data, for various applications, including soil moisture estimation [25,26,27], the assessment of soil physical properties [28,29,30,31], and the estimation of multispectral indices such as the Normalized Difference Vegetation Index (NDVI) [32,33,34].
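As a rough illustration of the longer wavelength mentioned above, the wavelength implied by the stated 5.405 GHz central frequency can be worked out from the speed of light. This is a back-of-the-envelope calculation, not a figure taken from the cited sources:

```latex
\lambda = \frac{c}{f}
        = \frac{2.998 \times 10^{8}\ \mathrm{m\,s^{-1}}}{5.405 \times 10^{9}\ \mathrm{Hz}}
        \approx 0.0555\ \mathrm{m} \approx 5.5\ \mathrm{cm}
```

A wavelength of a few centimetres is several orders of magnitude longer than the micrometre-scale wavelengths recorded by multispectral optical sensors, which is consistent with the cloud and canopy penetration described above.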
These properties highlight the potential of Sentinel-1 SAR imagery as a rich source of spatiotemporal information for estimating soil attributes through the delineation of MZs. Temporal profiles of dual-polarization backscatter, VV (vertical–vertical) and VH (vertical–horizontal), can be built from SAR data and complemented by the calculation of specific SAR indices. These temporal profiles can then be analyzed using unsupervised classification techniques to identify regions with similar backscatter responses, potentially associated with variations in soil attributes. Although the fuzzy clustering of time series of vegetation indices, such as NDVI, has been widely used for the delineation of MZs, this study introduces two main innovations: (i) the application of autoencoders for compressing SAR time series and (ii) the direct comparison of different sampling strategies, including the conventional method (a single composite sample), rectangular cells ranging from 5 to 10 ha (one composite sample per cell), and random cells with varying sizes and shapes. Thus, the objective of this study was to develop a method for mapping soil attributes through the delineation of MZs using SAR data provided by Sentinel-1.
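The idea of building temporal profiles of backscatter and SAR indices can be sketched as follows. This is a minimal illustration only: the index definitions used in the study are not given in this section, so the sketch assumes that the VV/VH ratio is taken on linear power values and uses the commonly cited dual-polarization form of the radar vegetation index, 4·VH/(VV + VH). The function names and the tiny input arrays are hypothetical.

```python
import numpy as np

def db_to_linear(sigma0_db):
    """Convert backscatter from decibels to linear power units."""
    return 10.0 ** (np.asarray(sigma0_db, dtype=float) / 10.0)

def vv_vh_ratio(vv_db, vh_db):
    """VV/VH ratio on linear power (assumed convention, see lead-in)."""
    return db_to_linear(vv_db) / db_to_linear(vh_db)

def rvi_dual_pol(vv_db, vh_db):
    """Dual-polarization radar vegetation index: 4*VH / (VV + VH)."""
    vv, vh = db_to_linear(vv_db), db_to_linear(vh_db)
    return 4.0 * vh / (vv + vh)

# Hypothetical stack: 3 acquisition dates (rows) x 2 pixels (columns), in dB.
vv_db = np.array([[-10.0, -9.0], [-8.0, -8.5], [-11.0, -10.0]])
vh_db = np.array([[-20.0, -17.0], [-15.0, -16.0], [-19.0, -18.0]])

# Each column is the temporal profile of one pixel.
ratio_profile = vv_vh_ratio(vv_db, vh_db)
rvi_profile = rvi_dual_pol(vv_db, vh_db)
```

Each column of `ratio_profile` or `rvi_profile` is one pixel's time series, which is the kind of input an unsupervised classifier would group into zones.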
4. Discussion
In the context of remote sensing applied to agricultural fields dedicated to grain production, SAR backscatter time series were used in this study to delineate MZs. Although the Sentinel-1 satellite has a nominal temporal resolution of 12 days for South America, a shorter revisit interval was observed, possibly due to the overlapping of imaging swaths during consecutive satellite passes. Previous studies have demonstrated that higher temporal resolution enhances trend detection and the identification of spatiotemporal patterns related to crop phenology [54]. Moran et al. [55] pointed out that, in the case of C-band SAR data, temporal resolutions between 3 and 6 days are more suitable for distinguishing crop types and monitoring their phenology, while daily monitoring is necessary to capture the rapid changes in soil moisture conditions. Therefore, the higher temporal resolution observed provides improved conditions for understanding and interpreting variations in backscatter over time, potentially contributing to a more accurate delineation of MZs and greater accuracy in estimating soil attributes.
The analysis of the relationship between VV and VH backscatter values and SAR indices revealed that certain indices are strongly correlated with one another, indicating potential redundancies. This finding aligns with Ref. [56], which found that the RVI, NRPB, VH/VV, and VV/VH indices show high mutual correlation, with values greater than 0.95 or less than −0.95. Additionally, in both our analysis and the cited study, the RVI4SI index displayed the lowest correlation with the other indices. These findings suggest that, regardless of the agricultural fields studied, the relationship between VV and VH backscatter values and SAR indices tends to follow similar patterns.
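The redundancy check described above can be sketched as a pairwise Pearson correlation screen using the ±0.95 threshold quoted in the text. The function name and the synthetic series below are hypothetical and stand in for the real index time series; this is not the authors' code.

```python
from itertools import combinations

import numpy as np

def redundant_pairs(series_by_name, threshold=0.95):
    """Return (name_a, name_b, r) for index pairs with |Pearson r| > threshold."""
    names = list(series_by_name)
    data = np.vstack([np.ravel(series_by_name[name]) for name in names])
    corr = np.corrcoef(data)  # rows are variables, so corr is names x names
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        if abs(corr[i, j]) > threshold:
            pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

# Synthetic stand-ins: index_b is an exact decreasing linear function of
# index_a, while index_c is independent noise.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
synthetic = {
    "index_a": a,
    "index_b": -2.0 * a + 1.0,
    "index_c": rng.normal(size=200),
}
flagged = redundant_pairs(synthetic)
```

Pairs flagged this way carry nearly the same information, so keeping one index from each pair would be a reasonable way to limit redundant inputs.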
From September to December, an increase in VV and VH backscatter was observed. This increase may be related to the return of the rainy season: monthly accumulated precipitation rises during this period, elevating soil moisture content. The planting of agricultural crops, which occurs between September and October, also contributes, as increasing biomass intensifies signal backscatter [57]. During the same period, however, the VV/VH index declined. Studies report that the VV polarization is more sensitive to soil moisture than the VH polarization, which would explain the reduction in the VV/VH index during this period [58,59].
Between December and January, VV and VH backscatter values stabilized. With the crop biomass fully developed, the canopy attenuates the signal in both polarizations, reducing their sensitivity to soil moisture variation. Ref. [59] showed that the sensitivity of the VV and VH bands to soil moisture variation decreases as vegetation cover (NDVI) increases, and that this sensitivity is stronger in the VV polarization than in the cross-polarization VH. El Hajj et al. [57] demonstrated that the C-band signal in VV polarization penetrates the maize canopy even when the crop is at its biomass peak (NDVI > 0.7), whereas penetration was limited in wheat and pastures. Therefore, as the crop canopy develops, vegetation may become the primary contributor to the backscattered signal through volume scattering, while the influence of soil may become secondary. Finally, between April and August, there is a strong downward trend in backscatter values for both VV and VH polarizations. This behavior may be associated with the decrease in precipitation during this period, resulting in lower soil moisture content. Since the decrease in backscatter is more pronounced in VV polarization than in VH, an increase in the VV/VH ratio is observed.
The evaluation of experimental semivariograms in GRID-1, for both fields, highlighted the spatial dependence of soil attributes. The SDI, which relates the nugget effect to the sill to quantify the spatial dependence of these attributes, was below 75% for most attributes. Following the classification suggested by [52], these attributes therefore show either strong (SDI below 25%) or moderate (SDI between 25% and 75%) spatial dependence. In this context, kriging is a well-suited method for the interpolation and estimation of soil attributes at unsampled locations.
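The SDI classification above can be written out as a short sketch. It assumes that the SDI is the nugget expressed as a percentage of the total sill (nugget plus partial sill) and that values above 75% are labeled weak dependence, the usual complement of the two classes named in the text. The function names are hypothetical.

```python
def spatial_dependence_index(nugget, sill):
    """SDI in percent: nugget relative to the total sill (assumed definition)."""
    if sill <= 0:
        raise ValueError("sill must be positive")
    return 100.0 * nugget / sill

def classify_dependence(sdi_percent):
    """Map an SDI value to the classes used in the text."""
    if sdi_percent < 25.0:
        return "strong"
    if sdi_percent <= 75.0:
        return "moderate"
    return "weak"  # assumed label for SDI above 75%

# Hypothetical semivariogram parameters, for illustration only.
example_sdi = spatial_dependence_index(nugget=0.1, sill=1.0)
example_class = classify_dependence(example_sdi)
```

A small nugget relative to the sill means that most of the variance is spatially structured, which is the situation in which kriging is expected to perform well.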
The VV and VH backscatter bands, along with the VV/VH and RVI4SI indices, showed variations in the size, shape, and number of MZs when subjected to Approaches 1 and 2. Approach 2, which applies clustering to features extracted from SAR time series via autoencoders, tended to generate more MZs in both fields than Approach 1, which performs clustering directly on the time series. Autoencoders belong to a specific class of deep artificial neural networks. They are designed to compress an input into a more compact representation and then reverse that compression, aiming for the reconstructed input to resemble the original as closely as possible [60]. The features extracted by the autoencoder, that is, the compact representation, can capture nuances and patterns in the data that the raw representation cannot. This leads to a more detailed segmentation of the fields, resulting in a higher number of MZs. Another point to consider is that SAR images are characterized by high levels of noise [61]. The features extracted by autoencoders therefore represent a less noisy version of the original data, as the learning process extracts the patterns that explain the temporal behavior of the backscatter. This factor may result in more accurate clustering and an increase in the number of MZs.
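The compress-then-reconstruct idea can be illustrated with a deliberately small autoencoder written in plain NumPy. This is a sketch under stated assumptions, not the architecture used in the study: it has a single tanh hidden layer, a linear decoder, and full-batch gradient descent, and it is trained on synthetic profiles rather than Sentinel-1 data. All names and hyperparameters are hypothetical.

```python
import numpy as np

def train_autoencoder(x, code_size=3, epochs=2000, lr=0.01, seed=0):
    """Train a one-hidden-layer autoencoder; return (encode, losses).

    x: array of shape (n_pixels, n_dates), assumed already standardized.
    """
    rng = np.random.default_rng(seed)
    n_dates = x.shape[1]
    w_enc = rng.normal(scale=0.1, size=(n_dates, code_size))
    b_enc = np.zeros(code_size)
    w_dec = rng.normal(scale=0.1, size=(code_size, n_dates))
    b_dec = np.zeros(n_dates)
    losses = []
    for _ in range(epochs):
        code = np.tanh(x @ w_enc + b_enc)   # compact representation
        recon = code @ w_dec + b_dec        # reconstruction of the input
        err = recon - x
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared reconstruction error.
        g_recon = 2.0 * err / err.size
        g_code = g_recon @ w_dec.T
        g_pre = g_code * (1.0 - code ** 2)  # derivative of tanh
        w_dec -= lr * (code.T @ g_recon)
        b_dec -= lr * g_recon.sum(axis=0)
        w_enc -= lr * (x.T @ g_pre)
        b_enc -= lr * g_pre.sum(axis=0)

    def encode(z):
        return np.tanh(z @ w_enc + b_enc)

    return encode, losses

# Synthetic profiles: 300 pixels x 12 dates driven by two latent factors.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 2.0 * np.pi, 12)
latent = rng.normal(size=(300, 2))
profiles = latent[:, :1] * np.sin(t) + latent[:, 1:] * np.cos(t)
profiles = profiles + 0.1 * rng.normal(size=profiles.shape)
profiles = (profiles - profiles.mean(axis=0)) / profiles.std(axis=0)

encode, losses = train_autoencoder(profiles)
codes = encode(profiles)  # features that a clustering step would receive
```

In a setup like Approach 2, `codes` rather than `profiles` would be passed to the clustering algorithm. Because the bottleneck cannot reproduce pixel-level noise, the codes tend to retain the shared temporal patterns.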
The VV/VH index, combined with the autoencoder-based Approach 2, tended to exhibit lower RMSE values for soil attribute estimation under the LOOCV strategy. It was thus able to produce MZs with greater precision than the other SAR indices. Combining information from the VV and VH backscatter bands has shown superior performance compared to the isolated use of each band in various applications [62,63]. This can be explained by the fact that the VV/VH ratio minimizes acquisition system errors and provides more consistent indications over time than the isolated VH or VV backscatter, as pointed out in [34]. Additionally, certain studies indicate that the VV/VH index correlates more closely with the NDVI at specific phenological stages of the crop. This suggests that the index helps in understanding not only the spatial variability of soil moisture but also the canopy structure and crop biomass, which are crucial aspects for defining MZs [64].
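One way to read the LOOCV-based RMSE used above is sketched below. The exact validation procedure is not described in this section, so the sketch assumes that each held-out sample is predicted by the mean of the remaining samples in the same zone. The function name and the clay values are hypothetical.

```python
import numpy as np

def loocv_rmse_by_zone(values, zone_ids):
    """Leave-one-out RMSE when each sample is predicted by its zone mean."""
    values = np.asarray(values, dtype=float)
    zone_ids = np.asarray(zone_ids)
    errors = []
    for i in range(values.size):
        same_zone = zone_ids == zone_ids[i]
        same_zone[i] = False  # hold out the sample being predicted
        if not same_zone.any():
            continue  # a zone with a single sample cannot be validated
        errors.append(values[same_zone].mean() - values[i])
    if not errors:
        raise ValueError("no zone has at least two samples")
    return float(np.sqrt(np.mean(np.square(errors))))

# Hypothetical clay contents (%) at four sampling points.
clay = [30.0, 32.0, 50.0, 54.0]
rmse_zones = loocv_rmse_by_zone(clay, zone_ids=[1, 1, 2, 2])   # two zones
rmse_single = loocv_rmse_by_zone(clay, zone_ids=[0, 0, 0, 0])  # whole field
```

In this toy case the two-zone partition yields a much lower RMSE than treating the field as one unit. The same kind of comparison underlies the ranking of zoning and sampling strategies.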
When analyzing the MZs derived from the VV/VH index using autoencoders in both fields, at least one MZ mean differed statistically for every soil attribute except V. Despite the high variability of Clay in the region, statistical differences were also detected in temporally unstable attributes, such as Ca²⁺, Mg²⁺, K⁺, etc. The ability of plants to access these attributes is strongly influenced by the soil’s ability to retain water in its macro- and micropores. Thus, the sensitivity of SAR data to soil moisture, as evidenced in several studies, is crucial in identifying the variability of these macro- and micronutrients present in the soil [25,59,65].
Comparing the MZs generated by the VV/VH index using autoencoders with the randomly created cells (CEL-RND) highlighted the potential of the SAR index to delineate MZs. Only for the soil attributes K⁺ and V in Field A was the RMSE of CEL-RND lower than the RMSE of the MZs in more than 50% of the scenarios. The best performance of the VV/VH index against CEL-RND was observed for Clay in both fields. The Clay fraction of the soil is intrinsically linked to water retention [66]. Therefore, the sensitivity of SAR data to soil moisture may be one of the explanations for the close correspondence observed between the MZs and Clay variability in the fields. Clay content exhibits a relatively stable spatial distribution over time, which can provide greater robustness and consistency in the delineation of MZs using the proposed method, especially considering that these zones are expected to have a predominantly static character. However, this study employed a five-year time series, and further research is needed to investigate the minimum time interval required to construct a time series that leads to consistent MZ delineations over time.
When evaluating the various sampling methods, the method based on GRID-1 stood out, recording the lowest RMSE for all soil attributes. This result can be explained by the fact that the fields investigated in this study exhibited strong and moderate spatial dependence for soil attributes, as indicated by the SDI. In such contexts, the fitting of semivariograms combined with kriging interpolation, the approach adopted in our study, tends to provide good estimates. In contrast, the significant spatial variability suggests that the CONV method, which attempts to represent the field through a single soil sample, may not be efficient. This is reinforced by the fact that, in the fields analyzed in this study, the CONV method had the highest RMSE values, indicating lower accuracy in the estimates of soil attributes.
The MZs delineated from the SAR data generally outperformed the CONV, CEL, and CEL-RND methods and were only occasionally surpassed by CEL and CEL-RND. Therefore, in scenarios with limited financial resources where conventional sampling would otherwise be chosen, SAR data can be used to guide sampling through MZs. The methodology presented in this study offers specialists the opportunity to provide services remotely, eliminating the need for field trips. This results in cost savings and facilitates the implementation of precision agriculture, even for small farmers. However, future research should investigate the impact of shorter time series in areas without overlapping satellite passes, where temporal resolution is consequently lower. As evidenced, there is also significant variation in backscatter intensity throughout the year, primarily influenced by fluctuations in precipitation, so future studies should assess the possibility of using images acquired during specific periods of the year. Finally, both areas analyzed in this study are located in the Brazilian Cerrado, a region characterized by a tropical climate and relatively flat terrain. It is therefore recommended that future studies be conducted in other parts of the country, especially in southern Brazil, which includes temperate climate zones and distinct topographic features such as high altitudinal variation and steep slopes. Expanding the validation in this way would allow for a broader assessment of the applicability and scalability of the proposed methodology across diverse geographic contexts.
5. Conclusions
The strategy combining autoencoders with the VV/VH index resulted in more accurate estimates of soil attributes compared to other Synthetic Aperture Radar (SAR) indices. The GRID-1 method, which uses a high-density point grid followed by kriging interpolation, stood out as the most effective technique for mapping soil attributes, while the conventional soil sampling method (CONV) performed the least satisfactorily. The Management Zones (MZs) delineated using the VV/VH index based on autoencoders outperformed the CONV method, the random cell-based soil sampling method (CEL-RND), and, in many cases, the rectangular cell-based soil sampling method (CEL). These findings are encouraging and indicate the potential of SAR data in analyzing soil variability and defining MZs.
This method has the potential to be integrated into digital platforms that use, for example, Google Earth Engine to provide Sentinel-1 data at the scale of agricultural fields. MZs generated using autoencoders applied to the VV/VH index can serve as an intelligent guide for soil sampling, directing sampling points in a more efficient and representative way. In addition to providing the sampling points, the shapefile with the Management Zones can be made available to guide agricultural machinery operations in defining crop treatment levels, such as cultivars, planting density, and fertilizers. Given that training the autoencoder is computationally intensive, this step would normally be performed offline by technical teams, but once trained, the model can be deployed as a lightweight module for automated MZ delineation on cloud platforms. This facilitates accessibility and scalability, even for small farmers.