Use of Sentinel-2 Time-Series Images for Classification and Uncertainty Analysis of Inherent Biophysical Property: Case of Soil Texture Mapping

The Sentinel-2 mission of the European Space Agency (ESA) Copernicus program provides multispectral remote sensing data at decametric spatial resolution and high temporal resolution. The objective of this work is to evaluate the ability of Sentinel-2 time-series data to enable classification of an inherent biophysical property, in terms of accuracy and uncertainty estimation. The tested inherent biophysical property was the soil texture. Soil texture classification was performed on each individual Sentinel-2 image with a linear support vector machine. Two sources of uncertainty were studied: uncertainties due to the Sentinel-2 acquisition date and uncertainties due to the soil sample selection in the training dataset. The first uncertainty analysis was achieved by analyzing the diversity of classification results obtained from the time series of soil texture classifications, considering that the temporal resolution is akin to a repetition of spectral measurements. The second uncertainty analysis was achieved from each individual Sentinel-2 image, based on a bootstrapping procedure corresponding to 100 independent classifications obtained with different training data. The Simpson index was used to compute this diversity in the classification results. This work was carried out in an Indian cultivated region (84 km2, part of Berambadi catchment, in the Karnataka state). It used a time-series of six Sentinel-2 images acquired from February to April 2017 and 130 soil surface samples, collected over the study area and characterized in terms of texture. The classification analysis showed the following: (i) each single-date image analysis resulted in moderate performances for soil texture classification, and (ii) high confusion was obtained between neighboring textural classes, and low confusion was obtained between remote textural classes. The uncertainty analysis showed that (i) the classification of remote textural classes (clay and sandy loam) was more certain than classifications of intermediate classes (sandy clay and sandy clay loam), (ii) a final soil textural map can be produced depending on the allowed uncertainty, and iii) a higher level of allowed uncertainty leads to increased bare soil coverage. These results illustrate the potential of Sentinel-2 for providing input for modeling environmental processes and crop management.


Introduction
Soil provides key environmental functions such as food production, water storage and redistribution, pollutant filtering and carbon storage.The accurate characterization of soil properties over cultivated areas, including soil organic matter, soil texture or iron content, is essential for agricultural engineering work such as land consolidation, drainage management, soil erosion limitation and irrigation systems.This characterization provides valuable information to improve soil use and management actions over cultivated areas [1,2].Environmental modeling increasingly contributes to decision making on soil management from local to global scales, but its operational use requires accurate and highly spatially referenced soil information as inputs.The most common way to obtain information on soil characterization is soil maps.However, the spatial resolutions of these maps are limited by the cost of collecting soil samples and measuring soil properties, and the spatial uncertainty of soil characterizations is usually unknown.
Laboratory Visible, Near-InfraRed and Short Wave InfraRed (VNIR-SWIR, 400-2500 nm) spectroscopy has become a good alternative to costly physical and chemical soil analysis for the estimation of a large range of soil properties, including soil organic matter, iron content and soil texture (e.g., [3,4]).This latter soil property refers to the composition of the soil in terms of the proportion of fine, medium, and large particles (clay, silt, and sand, respectively) in a specific soil mass and is one of the most studied properties by VNIR-SWIR data for three major reasons [4].First, texture is an important physical property that plays a key role in soil functions, including fertility and water retention.Fine-textured soils (such as clayey soils) have more total pore space and individual pores with smaller diameters than coarse-textured soils (such as sandy soils).The adhesive force at the molecular scale between water and soil leads to higher levels of water retention in fine-textured soils than in coarse-textured soils.Second, soil texture influences both spectral intensity and absorption band depth in the VNIR-SWIR spectral domain [3,5].A spectrum with a high content of clay fraction (fine particle sizes) will tend to have a higher albedo than a spectrum of sandy or loamy soil (coarse particle sizes).Third, soil texture does not vary in the short term, implying that the field collection of soil samples, soil texture analysis and VNIR-SWIR data acquisition may be carried out separately over time.
VNIR-SWIR hyperspectral airborne and satellite imagery has been shown to be a promising technology for quantitative soil surface property mapping (e.g., using a VNIR-SWIR hyperspectral airborne sensor [6][7][8]; or using a VNIR-SWIR hyperspectral satellite sensor [9,10]).VNIR-SWIR multispectral satellite imagery showed limited utility for quantitative soil surface property mapping applications compared to hyperspectral imagery.Indeed, the studies of quantitative soil texture mapping using hyperspectral data provided more accurate predictions [11,12] than those using multispectral data [13][14][15].This could be due to (i) insufficient spectral information limiting prediction accuracy for soil properties [16] and (ii) the insufficient spatial resolution of VNIR-SWIR multispectral satellite sensors (e.g., 15 and 30 m for ASTER data depending on spectral bands, 30 m for LANDSAT data), which does not match with the patterns of interest required for the definition of soil structure [17].Nevertheless, VNIR-SWIR multispectral satellite imagery has been successfully used for qualitative soil surface property mapping [18,19].
Whatever the VNIR-SWIR imaging sensor, the predictions performances of quantitative soil property are expressed using figure of merits such as coefficients of determination, root mean square errors or bias [20] and studying variograms of spatialized predicted properties [8].In addition, whatever the VNIR-SWIR imaging sensor, the predictions performances of qualitative soil property are expressed using figure of merits such as the coefficient of determination and kappa and studying the confusion matrices.In addition, to our knowledge, only [21] has tried to estimate predictions uncertainties in addition to quantitative predictions performances.
With the successful launch of the Sentinel-2 satellites, the Copernicus program has provided global coverage of terrestrial surfaces with multispectral images, with a revisit time of five days, since 2017.Images are acquired with a spatial resolution of 10 m to 60 m on 13 spectral bands from 400 to 2500 nm.Several studies have illustrated the relevance of individual Sentinel-2 images for environmental applications, including soil organic mapping [22,23], soil moisture mapping using the synergy of Sentinel-1 (radar) and Sentinel-2 data [24,25], and Mediterranean seagrass mapping [26].The importance of the multitemporal dimension of Sentinel-2 data was also evidenced for environmental applications, including the detection of precursory motions before landslide failures [27], the evaluation of the extent of forest fires [28], the estimation of vegetation phenological stages [29], water use by cotton [30], and the improvement of crop type mapping [31].In Soil Science, the multitemporal dimension may allow to (i) increase the probability of image acquisition in clear sky conditions during periods with consistent bare soil coverage over cultivated areas (e.g., between October and November in Mediterranean areas, during the plowing time) and (ii) provide several repetitions of spectral measurements of the surface.Combining time series data for extending bare soil coverage has been recently studied (e.g., [32,33]), and in our paper we propose another example of added value of using the time series data.
The objective of this work is to explore a Sentinel-2 time-series images for soil texture classification in terms of accuracy and uncertainty estimation.Supervised classifications are performed on each individual Sentinel-2 image, and two sources of uncertainty were studied.Uncertainties due to the training data were studied using a bootstrapping procedure applied to each individual image.In addition, uncertainties due to the Sentinel-2 acquisition date were studied from the frequency of classifications in the time-series classification, based on the hypothesis that soil texture does not vary in the short term.This work was carried out in an Indian cultivated region (Karnataka state).

Study Area
The Berambadi catchment is a subcatchment of the South Gundal located in the Deccan Plateau of Southern India (Figure 1a) and covers 84 km 2 .Our study area is located in the eastern part of the Berambadi catchment, which is covered by crop fields and corresponds to 60% of the catchment, while the western part is covered by forest (Figure 1).The Berambadi catchment belongs to the Kabini Critical Zone Observatory [34,35], which is part of the French Network of Critical Zone Observatories (OZCAR) [36].The climate is tropical subhumid with an average rainfall of 800 mm/year and a PET of 1100 mm (aridity index P/PET of 0.7).The monsoon dynamics drive three main seasons: dry season (winter in January and February, summer from March to May), Kharif (southwest monsoon season, from June to September) and Rabi (north-east monsoon season, from October to December).Black soils (Vertisols and Vertic intergrades) are found mostly in the valley bottoms, while red soils (Ferralsols and Chromic Luvisols) cover the uplands and hillslopes [37].
Remote Sens. 2019, 11, x FOR PEER REVIEW 3 of 21 the synergy of Sentinel-1 (radar) and Sentinel-2 data [24,25], and Mediterranean seagrass mapping [26].The importance of the multitemporal dimension of Sentinel-2 data was also evidenced for environmental applications, including the detection of precursory motions before landslide failures [27], the evaluation of the extent of forest fires [28], the estimation of vegetation phenological stages [29], water use by cotton [30], and the improvement of crop type mapping [31].In Soil Science, the multitemporal dimension may allow to (i) increase the probability of image acquisition in clear sky conditions during periods with consistent bare soil coverage over cultivated areas (e.g., between October and November in Mediterranean areas, during the plowing time) and (ii) provide several repetitions of spectral measurements of the surface.Combining time series data for extending bare soil coverage has been recently studied (e.g., [32,33]), and in our paper we propose another example of added value of using the time series data.The objective of this work is to explore a Sentinel-2 time-series images for soil texture classification in terms of accuracy and uncertainty estimation.Supervised classifications are performed on each individual Sentinel-2 image, and two sources of uncertainty were studied.Uncertainties due to the training data were studied using a bootstrapping procedure applied to each individual image.In addition, uncertainties due to the Sentinel-2 acquisition date were studied from the frequency of classifications in the time-series classification, based on the hypothesis that soil texture does not vary in the short term.This work was carried out in an Indian cultivated region (Karnataka state).

Study Area
The Berambadi catchment is a subcatchment of the South Gundal located in the Deccan Plateau of Southern India (Figure 1a) and covers 84km 2 .Our study area is located in the eastern part of the Berambadi catchment, which is covered by crop fields and corresponds to 60% of the catchment, while the western part is covered by forest (Figure 1).The Berambadi catchment belongs to the Kabini Critical Zone Observatory [34,35], which is part of the French Network of Critical Zone Observatories (OZCAR) [36].The climate is tropical subhumid with an average rainfall of 800 mm/year and a PET of 1100 mm (aridity index P/PET of 0.7).The monsoon dynamics drive three main seasons: dry season (winter in January and February, summer from March to May), Kharif (southwest monsoon season, from June to September) and Rabi (north-east monsoon season, from October to December).Black soils (Vertisols and Vertic intergrades) are found mostly in the valley bottoms, while red soils (Ferralsols and Chromic Luvisols) cover the uplands and hillslopes [37].

Sentinel-2 Images
Sentinel-2 satellites provide systematic global acquisitions of high spatial resolution multispectral imagery and are part of Copernicus program.Sentinel-2A was launched in 2015, followed by the Sentinel-2B launch in 2017.The combination of both satellites delivers a revisit period of up to five days under cloud-free conditions.The images include thirteen spectral bands in the VNIR-SWIR spectral domain, from 10 to 60 m of spatial resolution.Only ten bands remain after atmospheric correction, as the three bands acquired at 60 m spatial resolution (coastal at 443 nm, water vapor at 1375 nm and cirrus at 1376 nm) are used directly to perform atmospheric corrections and cloud detection.Here, atmospheric and topographic corrections were performed with Sen2Cor processor [38], and the level-2A images (bottom of atmosphere terrain corrected reflectance) were used to perform soil classification.The six spectral bands initially acquired with 20 m spatial resolution were resampled to 10 m.
Six Sentinel-2 images, including the Berambadi catchment during the dry season of 2017, were identified.Four images were acquired in February, under clear sky conditions (Figure 2).A fifth image was acquired on the 4th of April, under clear sky conditions and after a rainfall measured at 6 am over the Maddur Village (Figure 2).The sixth image was acquired on the 24th of April under clear sky conditions and after four days without rainfall (Figure 2).Once atmospherically and topographically corrected, the images were filtered to select only pixels corresponding to bare soil and to mask pixels corresponding to urban areas, bodies of water, crops and natural vegetation.Crops and natural vegetation were masked using the normalized difference vegetation index (NDVI) calculated using spectral bands at 842 nm and 665 nm [39,40].A unique threshold for all images was fixed at 0.3, based on visual interpretation from an expert with extensive field knowledge.Urban areas and lakes were masked using a land use map available for the study area.Between 73% and 87% of the pixels were finally selected as bare soil in the Berambadi catchment, depending on the acquisition date.Overall, the six images shared 65% pixels corresponding to bare soil.For the sake of comparison among dates, this study focused on the pixels identified as bare soils and common to all images.

Soil Data
A total of 216 soil surface samples were collected over the Berambadi catchment (Figure 1b), including 176 soil surface samples collected in 2015 and 40 soil surface samples collected in 2017.The sampling was based on the soil physiographic relationship.Soil samples were collected in transects across the landforms.The major landforms are hills and hillslopes, gently sloping uplands and nearly level lowlands.Each sample was located with a GPS.The determination of soil textural classes was performed by experienced soil surveyors, using the feel method based on codified field tests [41] and many comparisons with soil analyses.Soil texture was classified into four classes: clay (C), sandy clay (SC), sandy clay loam (SCL) and sandy loam (SL).
Among the 216 soil surface samples associated with soil texture observation, 130 were located over bare soil pixels shared between the six images.The soil texture among these 130 samples was distributed as follows: 33% for SL, 27% for SCL, 23% for SC and 17% for C.

Methods
Soil texture classification was performed for each individual Sentinel-2 image over bare soil pixels shared between the six images.An uncertainty analysis was realized from each Sentinel-2 data, based on a bootstrapping procedure and from the time-series of soil texture classifications obtained from the time-series of Sentinel-2 images.Figure 3 illustrates the process flow.The classification model developments, training and validation partitions and prediction accuracy measurements were implemented in R using the caret package [42].

Classification of Individual Sentinel-2 Images
The spectral information from the pixels with corresponding soil texture measurements was extracted for each image (Figure 3a).For each classification, spectral information and corresponding textural class were then split into training (70%) and validation (30%) datasets using a stratified random sampling for training and validation data to follow a similar distribution of soil texture classes (Figure 3b).
Soil texture classification was performed using a linear support vector machine (Lin-SVM) classifier (Figure 3b), which is one of the most classical method for images classification [43][44][45].The Lin-SVM was initially developed by [46] for binary classification and can be extended to multiclass problems.The SVM method consists of optimizing hyperplane separating classes by maximizing the margin between the classes' closest points, called support vectors.The best separating hyperplane can be found by solving a nonlinear convex problem that depends on a cost parameter C.This parameter C allows for reducing misclassifying in the training database while maximizing the margins between classes.Ten values ranging between 0 and 5 were tested for C, and the optimal value was defined for the best overall accuracy obtained when performing a 5-fold cross-validation on the training dataset.A detailed description of the Lin-SVM can be found in [47], and more recently in [43].
The accuracy and kappa coefficients were used to measure the performance of the classifications (Figure 3b).Accuracy is commonly measured as the percentage of pixels correctly classified in the validation set.The Kappa coefficient compares the observed accuracy with an expected accuracy (random chance).The kappa statistic is used to evaluate both a single classifier and classifiers among themselves, which is an important evaluation, especially on an imbalanced data set.A kappa coefficient of 1 indicates perfect classification, and a kappa coefficient of 0 corresponds to a random classification [48].Based on [49], kappa values greater than 0.80 represent strong agreement, Kappa values between 0.4 and 0.8 represent moderate agreement, and Kappa values below 0.4 represent poor agreement.Finally, the confusion matrix was also calculated from each classification model (Figure 3b).

Expressions of Uncertainty
As previously investigated by [21,50], a bootstrap procedure was used to study prediction uncertainties due to the training data.The bootstrapping involves repeated stratified random sampling for training and validation data to follow a similar distribution of soil texture classes, with replacement of the available data [51].By repeating the process of sampling and applying the model, probability distributions of the prediction realizations are generated for each pixel.Here, a bootstrap procedure was applied to the original dataset to define 100 sets of training and validation subsets for each individual image.From each bootstrap sample, a classification model was fitted and applied to generate a soil classification (Figure 3b).After the bootstrap procedure, a single-date map of the most frequent class obtained for each pixel of each Sentinel-2 image was produced (Figure 3c), providing a time-series classification.
The frequency of classifications in the time-series classification was used for studying prediction uncertainties due to the Sentinel-2 acquisition date (Figure 3d).A final map of the most frequent class obtained for each pixel over the time-series classification was produced (Figure 3d).
The diversity of classifications was measured for each pixel by computing Simpson's diversity index, corresponding to the diversity of classes assigned to each pixel.Simpson's diversity index (SI) is one of the well-known indices used to measure a degree of diversity [52] and is calculated as follows: where p t is the proportion of classification of class t, and k is the number of classes.SI reaches a minimum value (SI = 0) when all classification results are identical, so the proportion of this given class is 1 and the proportion of the other classes is 0. SI reaches its maximum when all classes are equally distributed, which corresponds to SI = (1 − 1/k).In our case, SI was calculated for each bare soil pixel of each Sentinel-2 image and after the 100 iterations to study uncertainties due to the training data (Figure 3c).SI was calculated for each bare soil pixel of the time-series of Sentinel-2 images to study uncertainties due to the Sentinel-2 acquisition date (Figure 3d).

Classification Model Performance
The classification models derived from the Sentinel-2 image acquired on the 24th of April 2017 outperformed other models by providing moderate accuracy (mean accuracy = 0.5) and fair agreement (mean kappa = 0.31) (Figure 4).These classification models provided a low confusion rate of classification for samples corresponding to remote textural classes in terms of texture (C and SL), and the highest confusion rate was obtained between neighboring textural classes (e.g., SCL and SL) (Table 1).The prediction ability for the SCL class systematically showed a poor score of correct predictions (Table 1).
The classification corresponding to the five other images resulted in lower performances (for both accuracy and kappa) than the classification obtained with the image acquired on the 24th of April 2017, with the lowest performances obtained when using the image acquired on the 4th of April 2017 (Figure 4).The confusion matrices showed similar patterns among dates, with low confusion rates of classification between remote textural classes (C and SL) and frequent confusion between neighboring textural classes (e.g., C and SL) (Table 1).
Finally, the standard deviation of classification performances at each date over the 100 models expressed in accuracy and kappa coefficients was similar among the six images: the standard deviation reached 0.05 for accuracy and 0.08 for Kappa (Figure 4).

Single-Date Classification Maps
The most frequent textural classes after the bootstrap process provided a single-date classification of soil texture.The Pearson coefficient of correlation between these six single-date classifications was superior at 0.74 (Table 2).Soil texture from valley bottoms was mainly classified as C, while soil texture from hillslopes was mainly classified as SL and SCL (Figures 1b and 5).The distribution of the most and least common textural classes estimated by the 100 classification models over the study area was also similar among Sentinel-2 images; the SL class was dominant among the estimated classes, while SC was almost never attributed to pixels (Figure 6).The image acquired on the 24th of April 2017, associated with the most accurate classification performance (Figure 4a), provided the most specific classification with Pearson coefficients of correlation inferior to 0.8 with other dates (Table 2, Figure 5f).On the northeast part of the study area, poorly covered by field samples, the image acquired on 24th of April 2017 provided a classification of a mixture of SCL and C classes (orange rectangle, Figure 5f), whereas this same area was mostly classified as a mixture of SCL and SL classes from the four images acquired in February (Figure 5a-d).
In addition, in the northwest part of the study area, for which two soil samples were collected and analyzed as the SC class, the image acquired on the 24th of April 2017 provided a classification of mixed C and SCL classes (black rectangle, Figure 5f), whereas this same area was mostly classified as SCL class based on the four images acquired in February (Figure 5a-d).

Uncertainties due to the Training Data Set
The highest uncertainty, i.e., the highest values of SI calculated from the bootstrap process, were obtained using the Sentinel-2 image acquired on the 4th of April 2017 (Figure 7e).The high values of the SI obtained from this date were clustered in the western part of the study area over hillslopes close to the forest (orange rectangles in Figure 7e).Moreover, even if urban pixels were masked in the pretreatment of the Sentinel-2 data (Section 2.2), the main road was not identified in the land use map of the catchment; thus, it was not masked and was associated with high SI (Figure 7e).
The lowest values of SI calculated from the bootstrap process were obtained using the Sentinel-2 image acquired on the 24th of April 2017 (Figure 7f).As obtained from the 4th of April 2017, the main road pixels are associated with high SI (Figure 7f).A wasteland located in the eastern part of the catchment is also associated with high values of SI (orange rectangle in Figure 7e).From the 4 Sentinel-2 images acquired on February, the highest SI values were obtained over a bottom valley (orange parallelogram in Figure 7a-d).Finally, some areas were associated with low SI values regardless of the Sentinel-2 image used (e.g., located by the red rectangle on Figure 7).For each pixel and at each date, the SI values were expressed in regard to the most frequent textural class after the bootstrap process (Figure 8).Lower SI values were obtained consistently among images for pixels classified as C and SL, the two remote textural classes (Figure 8).This consistency in the classification process for repetitions on a given image and among images suggests lower uncertainty for these classes.Conversely, the SC class appeared as the most uncertain class for all images (Figure 8), as SI calculated on pixels attributed to this SC class was mostly superior to 0.6, and for two Sentinel-2 dates, the SC class was never affected as the most frequent class over the 100 iterations.Pixels attributed to the SCL class showed intermediate uncertainty based on SI (Figure 8).

Uncertainties due to the Image Acquisition Date
The lower values of SI calculated from the set of six classifications were obtained over pixels for which the most frequent textural classes among this set of six classifications are C and SL (Figure 9a,b).The pixels for which the most frequent textural class among this set of six classifications is SLC are associated with an intermediate SI (Figure 9a,b).No pixels are classified as SC from the set of six classifications, suggesting high uncertainty for this class (Figure 9a).As obtained from single-date classifications, the main road pixels are associated with high SI (Figure 9b); high SI values are also obtained over a bottom valley (orange parallelogram in Figures 7a-d and 9b), and some areas are associated with low SI values (e.g., located by the red rectangle on Figures 7 and 9b).Some other regions are associated with higher SI than those obtained from single-date classifications (e.g., located by the orange rectangle on Figure 9b).
Remote Sens. 2019, 11, x FOR PEER REVIEW 14 of 21 classifications, suggesting high uncertainty for this class (Figure 9a).As obtained from single-date classifications, the main road pixels are associated with high SI (Figure 9b); high SI values are also obtained over a bottom valley (orange parallelogram in Figures 7a-d and 9b), and some areas are associated with low SI values (e.g., located by the red rectangle on Figures 7 and 9b).Some other regions are associated with higher SI than those obtained from single-date classifications (e.g., located by the orange rectangle on Figure 9b).

Toward the Use of an Uncertainty Threshold
The estimated uncertainties using the SI index can be used as thresholds for filtering the pixels that are considered too uncertain for being documented by a predicted soil texture.This loss in pixels having predictions may be compensated by better prediction performances for the pixels that remain predicted.We verified this both for SI calculated from the set of six single data classifications and for SI calculated from the set of 100 bootstrapped classifications of the best image (24th of April).
As expected, the gradients of variations of uncertainty and amount of classified bare soil are opposite.From the set of the six single-date classifications, 57% of bare soil pixels are classified if an SI of 0 is allowed (black triangles on Figure 10), and the classification accuracy associated with these bare soil pixels is estimated to be 0.63 (blue triangles on Figure 10).A nearly full coverage of classified bare soil pixels (93%) could be obtained for a maximum SI of 0.44, which corresponds to a class identified four times among classification processes performed over the six Sentinel-2 images (black triangles on Figure 10).The classification accuracy associated with this maximum SI of 0.44 is estimated to be 0.6 (blue triangles on Figure 10).
From the set of the 100 classifications obtained by bootstrap from the image acquired on the 24th of April, 39% of classified bare soil pixels are classified if an SI of 0 is allowed (black circles on Figure 10), and the associated classification accuracy is estimated to be 0.73 (blue circles on Figure 10).A nearly full coverage of classified bare soil pixels (93%) could be obtained for a maximum SI of 0.46 (black circles on Figure 10).The classification accuracy associated with this maximum SI of 0.46 is estimated to be 0.56 (blue circles on Figure 10).
Globally, if the allowed SI is inferior to 0.3, the set of the 100 classifications obtained by bootstrap from the image acquired on 24th of April provided better accuracy but lower classified coverage than the set of the six single-date classifications (Figure 10).From an SI of 0.3, the classified coverage remains the same, regardless of the set of classifications used.With an SI of 0.3 to 0.46, the accuracy obtained from the set of the 100 classifications obtained by bootstrap from the image acquired on the 24th of April is lower than the accuracy obtained from the set of the six single-date classifications.From an SI of 0.46, this tendency is opposite: the accuracy obtained from the set of the

Toward the Use of an Uncertainty Threshold
The estimated uncertainties using the SI index can be used as thresholds for filtering the pixels that are considered too uncertain for being documented by a predicted soil texture.This loss in pixels having predictions may be compensated by better prediction performances for the pixels that remain predicted.We verified this both for SI calculated from the set of six single data classifications and for SI calculated from the set of 100 bootstrapped classifications of the best image (24th of April).
As expected, the gradients of variations of uncertainty and amount of classified bare soil are opposite.From the set of the six single-date classifications, 57% of bare soil pixels are classified if an SI of 0 is allowed (black triangles on Figure 10), and the classification accuracy associated with these bare soil pixels is estimated to be 0.63 (blue triangles on Figure 10).A nearly full coverage of classified bare soil pixels (93%) could be obtained for a maximum SI of 0.44, which corresponds to a class identified four times among classification processes performed over the six Sentinel-2 images (black triangles on Figure 10).The classification accuracy associated with this maximum SI of 0.44 is estimated to be 0.6 (blue triangles on Figure 10).
From the set of the 100 classifications obtained by bootstrap from the image acquired on the 24th of April, 39% of classified bare soil pixels are classified if an SI of 0 is allowed (black circles on Figure 10), and the associated classification accuracy is estimated to be 0.73 (blue circles on Figure 10).A nearly full coverage of classified bare soil pixels (93%) could be obtained for a maximum SI of 0.46 (black circles on Figure 10).The classification accuracy associated with this maximum SI of 0.46 is estimated to be 0.56 (blue circles on Figure 10).100 classifications obtained by bootstrap from the image acquired on the 24th of April is higher than the accuracy obtained from the set of the six single-date classifications (Figure 10).

Classification Performance by Individual Sentinel-2 Data
The classification models provided low to moderate performances for soil texture classification depending on the date of acquisition of the Sentinel-2 image used (accuracy ranging from 0.43 to 0.5).These performances were lower than those obtained by [18], who reported an accuracy of approximately 0.58 to 0.64 for soil texture classification using Landsat-5 TM images and lower than those obtained by [19], who reported an overall accuracy of 0.64 for soil texture classification and using Landsat-5 TM images.
The best performance of soil texture classification was obtained using the image acquired on the 24th of April 2017, and lower performance was obtained when using images acquired in February (Figure 4), which may be explained by higher soil moisture in the winter season.The lowest performance, obtained when using the image acquired on the 4th of April 2017 (Figure 4) compared to the 24th of April 2017, may also be explained by soil moisture resulting from rainfall that occurred days before the acquisition (Figure 2), impacting the soil reflectance and reducing the spectral dissimilarity among classes.These results are in agreement with the results obtained by [53], who described a progressive decrease of performance for soil organic matter estimation from the drier to the wetter subsets.Soil surface roughness and crop residue over fields may also contribute to the poor discrimination among textural classes in addition to soil surface moisture, as suggested by [18], who attempted to classify soil texture from Landsat images.Globally, if the allowed SI is inferior to 0.3, the set of the 100 classifications obtained by bootstrap from the image acquired on 24th of April provided better accuracy but lower classified coverage than the set of the six single-date classifications (Figure 10).From an SI of 0.3, the classified coverage remains the same, regardless of the set of classifications used.With an SI of 0.3 to 0.46, the accuracy obtained from the set of the 100 classifications obtained by bootstrap from the image acquired on the 24th of April is lower than the accuracy obtained from the set of the six single-date classifications.From an SI of 0.46, this tendency is opposite: the accuracy obtained from the set of the 100 classifications obtained by bootstrap from the image acquired on the 24th of April is higher than the accuracy obtained from the set of the six single-date classifications (Figure 10).

Classification Performance by Individual Sentinel-2 Data
The classification models provided low to moderate performances for soil texture classification depending on the date of acquisition of the Sentinel-2 image used (accuracy ranging from 0.43 to 0.5).These performances were lower than those obtained by [18], who reported an accuracy of approximately 0.58 to 0.64 for soil texture classification using Landsat-5 TM images and lower than those obtained by [19], who reported an overall accuracy of 0.64 for soil texture classification and using Landsat-5 TM images.
The best performance of soil texture classification was obtained using the image acquired on the 24th of April 2017, and lower performance was obtained when using images acquired in February (Figure 4), which may be explained by higher soil moisture in the winter season.The lowest performance, obtained when using the image acquired on the 4th of April 2017 (Figure 4) compared to the 24th of April 2017, may also be explained by soil moisture resulting from rainfall that occurred days before the acquisition (Figure 2), impacting the soil reflectance and reducing the spectral dissimilarity among classes.These results are in agreement with the results obtained by [53], who described a progressive decrease of performance for soil organic matter estimation from the drier to the wetter subsets.Soil surface roughness and crop residue over fields may also contribute to the poor discrimination among textural classes in addition to soil surface moisture, as suggested by [18], who attempted to classify soil texture from Landsat images.
Soil texture classification maps produced in our study showed good discrimination ability for (i) hillslopes and uplands, which correspond to coarse soil texture (sandy loam) due to erosion processes, and (ii) valleys, which correspond to finer soil texture (clay) due to deposition processes.Low confusion was obtained between the C and SL classes corresponding to remote textural classes (Table 1), and high confusion was obtained between neighboring textural classes, i.e., between C and SC and between SL and SCL, independent of the acquisition time of the Sentinel-2 image (Table 1).Dematte et al. [19] also succeeded in discriminating remote textural classes (clayey from sandy soils) using Landsat-5 TM images.As neighboring textural classes (e.g., C and SC) require almost similar types of management and behave similarly in crop production, whereas remote textural classes require different types of management, soil resource management policies may strongly benefit from the capacity of Sentinel-2 data to accurately identify these soil texture properties over regional scales.

How Training Dataset Selection and Acquisition Date Are Sources of Uncertainties
The bootstrap procedure provided an uncertainty expression in class prediction for each Sentinel-2 image and each bare soil pixel due to soil sample selection in the training dataset.Despite the scarcity of studies about uncertainty estimations of soil property prediction, the bootstrap procedure seems to be the most commonly used way to estimate it, as tested by [21,50,54].These previous works estimated quantitative soil properties (soil organic carbon for [50,54]; clay content for [21]), so their uncertainties are represented by prediction variances.As we studied the soil texture class, which is a qualitative soil property, our uncertainties are represented by Simpson's diversity index.
Classification models seem to be highly sensitive to the soil sample selection in the training dataset, regardless of the Sentinel-2 image used (Figure 7).As an example, only 53% of the bare soil pixels were classified in the same textural class after the 100 bootstrap applied to the image acquired on the 24th of April 2017 (black circles on Figure 10).Regardless of the Sentinel-2 image used, low uncertainties were mostly obtained for pixels associated with the remote textural classes C and SL, while high uncertainties were mostly obtained for pixels associated with the class SC (Figure 8).Thus, if one classification is produced from one soil sample selection, the risk of misclassification of class SC is high, while the risk of misclassification of classes C and SL is lower.The high uncertainty associated with the textural class SC cannot be explained by an underrepresentation of class SC, as it corresponds to 30% of the training samples.The image acquired on the 4th of April 2017, which provided the lowest classification performance (Figure 4a), was associated with the highest uncertainties (Figure 7e).Therefore, this uncertainties map focuses attention on the use of a soil textural classification obtained from an image probably impacted by soil moisture resulting from rainfall that occurred days before the acquisition (Figure 2).
Based on the hypothesis that soil texture does not significantly change within months in natural conditions, the Sentinel-2 time-series provided an uncertainty expression in class prediction for each bare soil pixel due to the acquisition date.Classification models seem to be highly sensitive to the acquisition date, as only 57% of the bare soil pixels were classified in the same textural class among the time-series classification (black triangles on Figure 10).The lowest uncertainties were obtained for pixels associated with the remote textural classes C and SL.Inversely, the class SC was never predicted by the six images over the same bare soil pixels.This confirms that pixels for which soil texture is identified as SC based on individual images should be considered very carefully.
Uncertainties due to the acquisition date and soil sample selection in the training dataset are theoretically independent.Despite this independence, some specific areas were always associated with low uncertainties (red rectangle, Figures 7 and 9b).Conversely, other areas were associated with different levels of uncertainty depending on the images used (orange rectangles, Figures 7  and 9b).Therefore, this study demonstrated that uncertainties may be important, variable and spatially heterogeneous.However, hierarchical uncertainties cannot be proposed based on these results.The number of soil samples (130), the number of Sentinel-2 images (6) and the time period (4 months) might serve as limitations on clear hierarchical uncertainties analysis.

Additional Uncertainties and Inaccuracies
The classification accuracy over a pixel i may depend on the composition of this pixel i and the composition of pixels of the training database.As greater fraction of materials other than the soil (such as the dry and green vegetation) are covering these pixels (i and the ones used to train the classification model), the higher would be the importance of perturbation factors of VNIR-SWIR spectral measurements [55,56].Therefore, the step of vegetated pixels mask is crucial.In VNIR-SWIR hyperspectral studies, both NDVI and Cellulose Absorption Index (CAI) spectral indexes [57] are used to identify green and dry vegetation pixels, respectively, and mask them [58].In VNIR-SWIR multispectral studies, including using the Sentinel-2 sensor, the CAI may not be calculated as it needs spectral information measured at 2000, 2100 and 2200 nm.Spectral information measured at 2100 and 2200 nm are included in one band of Sentinel-2 (centered at 2190 nm with a band width of 180 nm), and the 2000 nm wavelength is not covered by Sentinel-2.So a part of inaccuracy and uncertainty may be due to crop residue coverage which cannot be removed from Sentinel-2 data.
The classification accuracy may also depend on the accuracy of class determination on training data.In this study, the determination of soil textural classes was performed on the field by experienced soil surveyors, using the feel method based on codified field tests [41].Expertise of soil surveyors is more uncertain than laboratory routine process based on physico-chemical analysis.Confusion between soil textural classes may happen, in particular between neighboring textural classes.Among 60 soil textural classes determined from both physico-chemical analysis and the feel method over soil samples collected on 17 soil profiles on the Berambadi area, 23.3 % are misclassified by the feel method (Table 3).In addition, the higher confusion happens between both neighboring textural classes SC and SCL (Table 3).This high confusion between SC and SCL determined by the feel method may explain the high difficulty of the SVM model to correctly classify this class SC whatever the acquisition date and the training dataset.So the use of physico-chemical analysis, instead of the feel method, to determine the soil textural classes is likely to improve the capacity of Sentinel-2 data for mapping soil textural classes.Importantly, in terms of uncertainties, the confusion between SC and SCL by the feel method may also explain the high uncertainties associated with the predicted class SC after each bootstrap.So our uncertainty maps provide valuable information for avoiding misclassification due to the reference data.

How Uncertainty and Classification Estimations May be Used Together
Uncertainty estimation accompanying biophysical property estimation by remote sensing is a well-researched topic (e.g., [21,59,60]).The Global Soil Map consortium, which aims to make a new digital soil map of the world at fine resolution, proposes to use emerging technologies such as optical remote sensing for soil mapping and predicting soil properties [61].Uncertainty estimations must be provided in association with soil property estimations [62].Therefore, the proposed approach providing an estimation of two uncertainties coupled to a soil property classification is in line with actual research in soil science.
One viable option is to use the whole classification (over all bare soil pixels) coupled to the whole uncertainty maps.Another strategy could be finding a compromise between the proportion of classified pixels and the model performance by choosing an acceptable uncertainty threshold.A high proportion of classified pixels may be obtained in accepting high uncertainties in the classification results, and inversely, a low proportion of classified pixels may be obtained in accepting low uncertainties in the classification results (Figure 10).The uncertainty threshold may depend on the use of the soil textural map as an input in a crop model, as a covariable in a digital soil mapping approach or for management decisions.

Conclusions
Considering the low temporal variation of soil properties, classification or estimation of soil surface properties based on VNIR-SWIR remote sensing data are usually performed by using only one acquisition during a dry period.This paper shows the benefits of using Sentinel-2 data for classification of soil surface texture as our soil texture classification maps showed good discrimination ability for hillslopes and uplands which correspond to coarse soil texture due to erosion processes, and for valleys which correspond to finer soil texture due to deposition processes.In addition, this work highlights the impact of both the training dataset and the Sentinel-2 acquisition date on soil textural soil classifications.Indeed, both the bootstrap process and Sentinel-2 time series allow the production of uncertainty maps associated with soil property classifications, showing that remote textural classes are mapped with a high confidence level, whereas intermediate soil texture classes are mapped with lower confidence.
This kind of uncertainty map will potentially help to better use soil property maps obtained by VNIR-SWIR remote sensing data by providing confidence levels for soil property maps used as input data for environmental models (e.g., crop models) or in digital soil mapping approaches.Finally, this approach using Sentinel-2 time-series to produce soil surface texture classification with associated uncertainty maps will be tested on other soil properties characterized by limited short-term variations, such as carbonates and iron.

Figure 1 .
Figure 1.(a) Location of the Berambadi catchment in India (red point) and (b) the 216 collected soil surface samples plotted over a SRTM Digital Elevation Model, from black (low elevation) to white (high elevation) areas.

Figure 1 .
Figure 1.(a) Location of the Berambadi catchment in India (red point) and (b) the 216 collected soil surface samples plotted over a SRTM Digital Elevation Model, from black (low elevation) to white (high elevation) areas.

Figure 2 .
Figure 2. Rainfall (in mm) measured every day at 6 am, over Maddur village (Berambadi).Vertical red lines indicate the dates of Sentinel-2 images acquisition.

Figure 3 :
Figure 3: Workflow of the soil texture classification associated with the uncertainty analysis.(a) Extraction of spectral information associated to field soil texture, (b) classification process, (c) intra-date uncertainty analysis and intra-date soil texture map and (d) inter-date uncertainty analysis and final soil texture map

Figure 3 .
Figure 3. Workflow of the soil texture classification associated with the uncertainty analysis.(a) Extraction of spectral information associated to field soil texture, (b) classification process, (c) intra-date uncertainty analysis and intra-date soil texture map and (d) inter-date uncertainty analysis and final soil texture map.

Figure 4 .
Figure 4. (a) Accuracy and (b) kappa coefficient obtained for 100 classification models built using each Sentinel-2 image.

Figure 6 .
Figure 6.Distribution of the most frequent textural class estimated over bare soils pixels at each acquisition date.

Figure 8 .
Figure 8. Boxplot of Simpson's diversity index per class (the most frequent textural class in the set of 100 estimations) and per image.

Figure 9 .
Figure 9. Map of (a) the most frequent textural class and (b) the Simpson's diversity index, both calculated from the time series of six classifications.The white areas correspond to the masked mixed pixels.

Figure 9 .
Figure 9. Map of (a) the most frequent textural class and (b) the Simpson's diversity index, both calculated from the time series of six classifications.The white areas correspond to the masked mixed pixels.

Figure 10 .
Figure 10.Proportion of classified bare soil pixels and classification performance (accuracy), versus a maximal value of SI allowed.Triangles correspond to results obtained from the set of six classifications.Circles correspond to results obtained from the bootstrap process applied to the image acquired on 24th of April.

Figure 10 .
Figure 10.Proportion of classified bare soil pixels and classification performance (accuracy), versus a maximal value of SI allowed.Triangles correspond to results obtained from the set of six classifications.Circles correspond to results obtained from the bootstrap process applied to the image acquired on 24th of April.

Table 3 .
Confusion matrix obtained from 60 soil textural classes determined from both physico-chemical analysis and the feel method over soil samples collected on 17 soil profiles on the Berambadi area.

Table 1 .
Mean of Confusion Matrixes (expressed in %) obtained by the 100 classification models from each image.Each row of the confusion matrix represents the instances in a predicted class while each column represents the instances in an observed class.

Table 2 .
Pearson coefficient of correlation (R) between the classification (most frequent class after the bootstrap process) obtained at each date.