Since the onset of agriculture, soils have lost their organic carbon to such an extent that the soil functions of many croplands are threatened. There is strong demand for mapping and monitoring critical soil properties and, in particular, soil organic carbon (SOC). After all, SOC stock is one of the three sub-indicators for the Sustainable Development Goal (SDG) 15.3.1 defining the proportion of land that is degraded over total land area [1
]. The Voluntary Guidelines for Sustainable Soil Management published by the Intergovernmental Technical Panel on Soils [2] also clearly identify the loss of SOC as one of the main causes of soil degradation and lay down a set of good practices to enhance the soil organic matter content. These practices will be used as guidelines for the cross-compliance rules of the Common Agricultural Policy (CAP) in the European Union related to the good agricultural and environmental condition (GAEC) of the land for the post-2020 period [3
]. In this context, the continuous spatial and temporal monitoring of the organic carbon content in agricultural soils becomes extremely important, not only from the environmental perspective, but also in economic terms to ensure that the beneficiaries of the CAP respect their cross-compliance obligations.
Pilot studies have demonstrated the potential of remote sensing in the optical domain for SOC mapping. This technique is generally used in croplands where the following conditions prevail: (i) the SOC content is uniform throughout the plough layer (usually the top 20 to 30 cm), (ii) the soil is bare and any soil crust has been ploughed in, as is the case just after the seeding of summer crops (such as maize, sugar beet, or potato) or winter crops (such as winter cereals), and (iii) the surface water content is low, since the surface quickly dries out during the cloud-free conditions required for acquiring an image. Remote sensing products can contribute toward establishing national or regional reference systems for soil properties as an alternative to the costly techniques relying on traditional sampling and analysis. High-resolution and up-to-date SOC maps produced by remote sensing techniques will be valuable tools to address issues related to the SDGs or the Common Agricultural Policy at the regional scale, while, at the same time, providing within-field SOC patterns. The latter can be used to optimize cropland management and reduce environmental impacts, for example by reducing fertilizer input through accounting for the mineralization of soil organic matter, by enhancing infiltration, and by reducing local flooding.
The EU (European Union) Copernicus satellite program can give a boost to high-resolution SOC mapping in croplands covering large areas and at low costs. The Sentinel-2 mission of the EU Copernicus program is a constellation of two satellites (Sentinel-2A and Sentinel-2B) equipped with the Multi-Spectral Instrument (MSI) that collects optical imagery over land and coastal waters. The main strengths of Sentinel-2 data are: (i) their availability: everybody can freely explore and download the images, (ii) the short revisit time: five days under the same viewing angle, (iii) the fine spatial resolution: 10 m in the visible and 20 m in the near infrared (NIR) and short wave infrared (SWIR) region, (iv) the presence of 13 bands including two bands acquiring data in the SWIR region, and (v) high quality physically-calibrated data. The spatial and spectral characteristics of the MSI/Sentinel-2 sensor are, thus, promising for soil applications. In particular, the two SWIR bands, even if they are quite broad, can be particularly useful to exploit both the correlation between SOC content and soil brightness and the spectral signature that is assigned to soil organic carbon (SOC) and soil texture in this region [4
]. The availability and short revisit time of the MSI/Sentinel-2 sensor yield a large amount of satellite data, which increases the chance of acquiring usable (e.g., cloud-free) images throughout the year. This is especially important for imaging spectroscopy of the topsoil because the time window in which bare soil can be found in croplands of the temperate regions is narrow (i.e., after seedbed preparation). At the same time, the collection of a series of satellite images for the same area allows multi-temporal analysis or tessellated processes.
Some authors have already used Sentinel-2 data for soil variable prediction and mapping [5] and obtained encouraging results, especially for the SOC content in the plough layer. However, some issues still need to be addressed to improve a soil product based on Sentinel-2 data. One of these is the choice of the sampling strategy for a calibration dataset that is representative of the SOC variability of the investigated area. Castaldi et al. [9
] demonstrated that Sentinel-2 spectral data can support algorithms to select the sample locations that ensure a wide range of SOC content. However, a new field campaign and the related soil chemical analysis are time-consuming and expensive in comparison to using an existing soil spectral library. Castaldi et al. [10
] conceived a bottom-up approach that exploits the Land Use and Coverage Area frame Survey (LUCAS) European topsoil dataset and airborne data for SOC mapping without a spectral transfer function between laboratory and remote spectral data. While this approach proved to be successful in a relatively small pilot area restricted to bare cropland soils, the main challenge for SOC estimation from an entire Sentinel-2 tile (i.e., 100 by 100 km) is the minimization of disturbing factors, such as vegetation, residues, roughness, and soil moisture. Clearly, the search for ideal soil conditions (i.e., croplands in seedbed condition) reduces the time window to get a useful image in temperate regions. Rogge et al. [12
] used time series of Landsat and Sentinel-2 scenes. Their approach masks photosynthetically active vegetation using the normalized difference vegetation index (NDVI). The resulting product provides, among other things, the average reflectance of bare soil pixels in croplands. Given the position of most of the MSI/Sentinel-2 bands, such an approach is effective in removing the disturbing effect of growing crops, but is less suitable to mask wet soils or crop residues. Although the two wavelengths used in hyperspectral imagery to calculate the normalized soil moisture index (NSMI) [13] are very close to the two SWIR bands of Sentinel-2, the capability of the multi-spectral data for dry vegetation detection and soil moisture quantification still has to be proven. In this regard, Demattê et al. [14
] applied different thresholds based on Landsat spectral data in order to select bare soil during the dry season in Brazil and remove soil spectra affected by straw or vegetation residues.
In this study, we aim to test the effect of the threshold for a spectral index linked to soil moisture and crop residues on the performance of SOC prediction models using Sentinel-2 data and the European LUCAS topsoil database. On the one hand, such an approach allows a reasoned compromise to be derived between prediction quality and spatial coverage of the map. On the other hand, exploiting the LUCAS database reduces the need for new calibration data and an ad hoc field campaign. We tested this approach within a large area (100 by 100 km) in north-eastern Germany characterized by different soil types.
Several differences exist between a laboratory spectrum of a dried and sieved soil sample and the spectrum recorded by the satellite sensor for the pixel where the sample was taken. One of the most evident differences is related to the distance between the sensor and the target. The atmospheric disturbance and lower signal-to-noise ratio of the remote sensor affect the spectral output and, consequently, the estimation accuracy of the soil property. Bands B12 (2185 nm) and B11 (1610 nm) of the MSI/Sentinel-2 sensor, in combination with other bands, showed the highest correlation with SOC using the LUCAS spectra acquired in the laboratory (Figure 5). We observed a weakening of the correlations after removing the two calibration samples with the highest SOC values (Pearson’s coefficient between 0.5 and 0.62). This decrease in correlation especially concerns the combinations of bands including B11 and B12. These two bands proved to be very important for calibration sets with a large SOC variability [6
]. In other words, the differences in terms of soil brightness become more evident.
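The sensitivity of the correlation to a few high-SOC samples can be checked with a short script. The following is a minimal sketch, not the procedure used in the study: the function names and the `n_drop` parameter are hypothetical, and the index values and SOC contents are assumed to be available as plain arrays.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D sequences."""
    return float(np.corrcoef(x, y)[0, 1])


def pearson_without_top(index_values, soc, n_drop=2):
    """Pearson r after dropping the n_drop samples with the highest SOC.

    index_values: spectral index (or band) value per sample (hypothetical input)
    soc:          measured SOC content per sample
    """
    index_values = np.asarray(index_values, dtype=float)
    soc = np.asarray(soc, dtype=float)
    order = np.argsort(soc)                # ascending SOC
    keep = order[: len(soc) - n_drop]      # discard the n_drop largest values
    return pearson_r(index_values[keep], soc[keep])
```

Comparing `pearson_r(index_values, soc)` with `pearson_without_top(index_values, soc)` shows how much of the correlation is carried by the extreme samples.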
However, when we applied these indices involving the SWIR bands (B11 and B12) to real Sentinel-2 data, the accuracy was always unsatisfactory and it decreased with an increasing NBR2 value. The loss of prediction capability of the SWIR bands from the laboratory to the Sentinel-2 data is mainly due to three factors: (i) the large bandwidth of the two bands (141 nm for B11 and 238 nm for B12) and their lower spatial resolution (20 m) as compared to the other Sentinel-2 bands, (ii) the lower signal-to-noise ratio (SNR) as compared to that of the visible and infrared bands [6], and (iii) the mixed influence of several disturbing factors on the spectral signal of the SWIR bands due to variable soil texture and mineralogy (clay or carbonate content), soil moisture [23], or dry vegetation cover [24].
Generally, the best validation results were obtained using a combination of red edge bands (RE-CI) or red/red edge bands (RRE-CI; Table 2, Figure 11). The exponential calibration models of Figure 6a,b show that a small variation of the index values between 0.10 and 0.12 corresponds to abrupt changes in SOC content, while the two indices are less sensitive for SOC values lower than 100 g kg⁻¹. Many authors reported a wide absorption feature related to SOC around 664 nm [4
], which includes the spectral range of B4 and B5. The MSI/Sentinel-2 band centered at 740 nm (B6) is spectrally close to the absorption feature associated with the overtone of the N-H bond [25
]. This is a chemical bond that characterizes some compounds of soil organic matter, e.g., amino acids [27
]. Castaldi et al. [6
] analyzed the relative variable importance for estimating SOC in the Demmin area using the MSI/Sentinel-2 bands as variables in a random forest regression model. Their results showed the most important bands to be the B4 and B5 as well as the two SWIR bands (B11 and B12).
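An exponential calibration between a spectral index and SOC content can be sketched as follows. This is an illustrative example under stated assumptions: the model form SOC = a · exp(b · index) and the log-linear least-squares fit are our choices for the sketch and are not necessarily the fitting procedure used for the models in this study, and the function names are hypothetical.

```python
import numpy as np


def fit_exponential(index_values, soc):
    """Fit soc = a * exp(b * index) by linear least squares on log(soc).

    Assumes all SOC values are strictly positive. Fitting in log space
    weights relative rather than absolute errors.
    """
    index_values = np.asarray(index_values, dtype=float)
    soc = np.asarray(soc, dtype=float)
    b, log_a = np.polyfit(index_values, np.log(soc), 1)  # slope, intercept
    return float(np.exp(log_a)), float(b)


def predict_exponential(index_values, a, b):
    """Predict SOC from index values with the fitted parameters."""
    return a * np.exp(b * np.asarray(index_values, dtype=float))
```

With a steep exponent `b`, a small change in the index produces a large change in the predicted SOC at the upper end of the range, which is the behavior described for the calibration curves.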
The structural complexity of the organic matter entails a large heterogeneity of the components, which, in turn, involves a large variability of spectral responses and, consequently, the need to calibrate ‘local’ SOC estimation models based on geographical or spectral proximity. After all, these empirical models can hardly be applied in areas characterized by different soil types. The SOC indices mainly exploit the link between SOC content and soil brightness. This link can be masked or weakened as a result of variation in mineralogy that affects the soil color as well as the type of clay minerals and iron oxide contents. Consequently, the transferability of these spectral indices is likely to be more straightforward for young soil types, where variability in clay minerals and iron oxide content is not yet pronounced.
The availability of a sufficient number of LUCAS soil samples within the study area and their wide spatial distribution (Figure 1), together with the large variability in SOC content (Table 1), allowed the calibration of reliable SOC models [9
]. The only drawback in the LUCAS database is that it is based on a stratified random sampling strategy in geographical space (rather than feature space) [6
]. Since the area covered by peaty topsoil in cropland is limited to the center of the kettle holes [28], the number of samples with high SOC content in the LUCAS database is very limited (Table 1). It is likely that the LUCAS samples in other European regions are not numerous enough [29
] or not representative of the soil variability of the investigated area. Until now, remote sensing or airborne spectroscopy was used for small areas that are mostly rather homogeneous in parent material and soil forming factors [30
]. Castaldi et al. [9
] demonstrated that sampling strategies based on the feature space, where the spectral bands were used as ancillary data, were most efficient when the Demmin area was stratified according to ‘soil scapes,’ i.e., distinguishing sandy and clay topsoil (Figure 1). Other factors that are likely to improve the SOC prediction models are the absence of a soil crust (since a crust does not represent the spectral signal of the topsoil [30]) and the large variability in SOC content. Gomez et al. [31
] reported that, in contrast to clay and calcium carbonate content, SOC could not be predicted in Mediterranean environments due to the very low and rather constant SOC content. It should be noted that only 10% to 15% of the landscape is not affected by erosion [28
]. The fact that the landscape is derived from relatively young glacial sediments explains why differences in pedogenic oxides, leaching of calcium carbonate, and differentiation in clay minerals have not yet had enough time to develop and are, therefore, unlikely to greatly influence the soil spectra. Consequently, the SOC indices can effectively exploit the link between SOC content and soil brightness in young soil types, where this link is not masked or weakened by the mineralogical variation that develops with the age of pedogenesis.
Straw, other crop residues, or dry vegetation are disturbing factors for soil imaging spectroscopy. Currently, several methods exist to map dry vegetation fractional cover based on hyperspectral imagery. Such methods can be used to separate pure soils from soils partly covered by vegetation [24
]. We selected green vegetation pixels using NDVI and two other indices based on the visible region and applied the NBR2 index to detect dry vegetation or crop residues. Demattê et al. [14
] already showed that the NBR2 values are correlated with the amount of dry vegetation at the soil surface. In their study, the index provided negative, or close to zero, values for pixels with a 100% soil signal, while the NBR2 values increased with straw cover. Therefore, it is reasonable to adopt a very low NBR2 threshold in order to select pure soil pixels for remote sensing applications (Figure 9). We tested different NBR2 thresholds and sought to understand how they affect the SOC estimation accuracy. The best validation accuracy was obtained using an NBR2 threshold of 0.05; with a less restrictive threshold, the estimation accuracy decreased sharply (Figure 9). The SOC prediction accuracy of the RE-CI and RRE-CI indices decreases with an increasing NBR2 threshold (Figure 10). This is mainly due to an underestimation of the low SOC content. Although the statistical accuracy in terms of R², RPD, and RMSE is still satisfactory, the SOC underestimation already appears for an NBR2 threshold of 0.075 and becomes striking for an NBR2 threshold of 0.1 (Figure 10b,c). Demattê et al. [14] remarked that the NBR2 index is not particularly effective for pixels with a residue cover between 20% and 80% and that a stricter NBR2 threshold (<0.05) does not improve detection of pure soil pixels. On the contrary, a less severe threshold (>0.05) will result in the selection of pixels that are almost completely covered by dry vegetation or residues.
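The pixel selection described above can be sketched as a pair of threshold tests on normalized difference indices. This is a simplified illustration, not the processing chain of the study: it assumes the common definitions NDVI = (NIR − red)/(NIR + red) and NBR2 = (SWIR1 − SWIR2)/(SWIR1 + SWIR2), uses only NDVI for green vegetation (the two additional visible-band indices are omitted), and the NDVI limit of 0.25 is a placeholder value that is not taken from the text. Function and parameter names are hypothetical.

```python
import numpy as np


def normalized_difference(a, b):
    """Return (a - b) / (a + b); NaN where the denominator is zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.full(denom.shape, np.nan)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out


def bare_soil_mask(red, nir, swir1, swir2, ndvi_max=0.25, nbr2_max=0.05):
    """Boolean mask of pixels treated as bare, dry, residue-free soil.

    red, nir, swir1, swir2: reflectance arrays of equal shape (for
    Sentinel-2, assumed to be a red band, a NIR band, B11, and B12).
    ndvi_max: placeholder NDVI limit for green vegetation (assumption).
    nbr2_max: NBR2 limit; 0.05 gave the best validation accuracy here.
    """
    ndvi = normalized_difference(nir, red)
    nbr2 = normalized_difference(swir1, swir2)
    # Comparisons with NaN are False, so undefined pixels are excluded.
    return (ndvi < ndvi_max) & (nbr2 < nbr2_max)
```

Raising `nbr2_max` retains more pixels at the cost of admitting residue-covered or moist surfaces, which is the trade-off examined with the different thresholds.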
The spectrum of a bare soil pixel can also be affected by soil moisture, which entails a nonlinear reduction of the reflectance over the entire spectrum [32
]. The difficulty of predicting soil moisture effects has led many researchers to refrain from using laboratory spectral libraries for the calibration of remote sensing data, or has prompted them to find alternative solutions such as spectral transfer methods [33] or a bottom-up approach [11]. We addressed the issue from another perspective, by detecting and selecting only the bare soil pixels that mimic the conditions of a dry soil sample. Several methods exist that could be applied to mask moist soil pixels based on hyperspectral imagery [34
]. The adaptation of these methods to multispectral satellites has yet to be investigated. Some authors investigated the suitability of normalized soil moisture indices to be applied for remote sensing [13
]. Both normalized indices, calibrated by Haubrock et al. [13] and Castaldi et al. [34], exploit the combination of two SWIR wavelengths: 1800 nm and 2119 nm in Haubrock et al. [13] (NSMI = normalized soil moisture index) and between 1700 nm and 2100 nm in Castaldi et al. [34] (SMIR_A = soil moisture index for remote sensing). The SWIR bands of the MSI/Sentinel-2 sensor used for the NBR2 are very close to those employed for the NSMI and SMIR_A indices. B11 is centered at 1610 nm and has its upper boundary at 1682 nm, while B12 is centered at 2185 nm and has its lower boundary at 2078 nm. Thus, the NBR2 can be used as a proxy for soil moisture characterization.
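For reference, the structural similarity between the two indices can be written out explicitly. The expressions below assume the usual normalized-difference form of the NSMI, with R denoting reflectance at the stated wavelength, and the common definition of the NBR2 applied to the Sentinel-2 SWIR bands. Neither formula is given in this text, so both should be checked against the cited sources.

```latex
\mathrm{NSMI} = \frac{R_{1800} - R_{2119}}{R_{1800} + R_{2119}},
\qquad
\mathrm{NBR2} = \frac{B_{11} - B_{12}}{B_{11} + B_{12}}
```

Both are normalized differences between a shorter and a longer SWIR wavelength, which is why a broad-band index such as the NBR2 may respond to soil moisture in a similar direction as the narrow-band NSMI.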
Soil roughness is another disturbing factor for SOC estimation by satellite data. Generally, its influence is limited when the satellite images are acquired for smooth soils in a seedbed condition. However, this particular condition occurs only twice yearly and lasts several days to a week in temperate regions. Therefore, the time window for acquiring Sentinel-2 images is rather restricted. Usually, the periods of the year when soil is most likely to be found in a seedbed condition are spring, when summer crops such as sugar beet, potatoes, or maize are sown, and the end of the summer, when winter cereals are sown.
Taking into account the above-mentioned restrictions improves the SOC estimation accuracy, on the one hand, but decreases the extent of the mapped surface, on the other hand (Figure 7 and Figure 8). Nevertheless, the pixel selection process can be tuned according to the goal of the survey, with the accuracy obtained for each NBR2 threshold serving as an indication of the reliability of the resulting SOC map.
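The trade-off between accuracy and coverage can be tabulated per threshold. The sketch below is a simplification under stated assumptions: it evaluates one fixed set of predictions on the validation samples retained by each threshold, whereas a full analysis would recalibrate the model for each threshold; coverage is expressed as the fraction of samples retained rather than of map pixels; R² is computed as 1 − SS_res/SS_tot and RPD as the standard deviation of the observations divided by the RMSE. All names are hypothetical.

```python
import numpy as np


def accuracy_metrics(observed, predicted):
    """RMSE, R2 (1 - SS_res / SS_tot), and RPD (SD of observed / RMSE)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = observed - predicted
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    ss_tot = float(np.sum((observed - observed.mean()) ** 2))
    r2 = 1.0 - float(np.sum(residuals ** 2)) / ss_tot
    rpd = float(np.std(observed, ddof=1)) / rmse
    return {"rmse": rmse, "r2": r2, "rpd": rpd}


def sweep_nbr2_thresholds(nbr2, observed, predicted, thresholds, min_samples=3):
    """Accuracy and retained fraction for each NBR2 threshold.

    nbr2: NBR2 value at each validation sample location.
    Thresholds that keep fewer than min_samples samples are skipped.
    """
    nbr2 = np.asarray(nbr2, dtype=float)
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rows = []
    for threshold in thresholds:
        keep = nbr2 < threshold
        if keep.sum() < min_samples:
            continue
        row = accuracy_metrics(observed[keep], predicted[keep])
        row["threshold"] = float(threshold)
        row["coverage"] = float(keep.mean())  # fraction of samples retained
        rows.append(row)
    return rows
```

Reading the resulting rows side by side makes explicit how much accuracy is given up for each gain in coverage, which can guide the choice of threshold for a given survey goal.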