Multi-Temporal Satellite Images on Topsoil Attribute Quantiﬁcation and the Relationship with Soil Classes and Geology

: The mapping of soil attributes provides support to agricultural planning and land use monitoring, which consequently aids the improvement of soil quality and food production. Landsat 5 Thematic Mapper (TM) images are often used to estimate a given soil attribute (i.e., clay), but have the potential to model many other attributes, providing input for soil mapping applications. In this paper, we aim to evaluate a Bare Soil Composite Image (BSCI) from the state of S ã o Paulo, Brazil, calculated from a multi-temporal dataset, and study its relationship with topsoil properties, such as soil class and geology. The method presented detects bare soil in satellite images in a time series of 16 years, based on Landsat 5 TM observations. The compilation derived a BSCI for the agricultural sites (242,000 hectare area) characterized by very complex geology. Soil properties were analyzed to calibrate prediction models using 740 soil samples (0–20 cm) collected of the area. Partial least squares regression (PLSR) based on the BSCI spectral dataset was performed to quantify soil attributes. The method identiﬁed that a single image represents 7 to 20% of bare soil while the compilation of the multi-temporal dataset increases to 53%. Clay content had the best soil attribute prediction estimates (R 2 = 0.75, root mean square error (RMSE) = 89.84 g kg − 1 , and accuracy = 74%). Soil organic matter, cation exchange capacity and sandy soils also achieved moderate predictions. The BSCI demonstrates a strong relationship with legacy geological maps detecting variations in soils. From a single composite image, it was possible to use spectroscopy to evaluate several environmental parameters. This technique could greatly improve soil mapping and consequently aid several applications, such as land use of land degradation, updating

5 years of Landsat imagery and the Barest Soil Composite technique. The information was subsequently employed to map sand, silt, clay, and organic matter at the study site. Rogge et al. [23] developed the Soil Composite Mapping Processor (SCMaP) as an approach to overcome limited soil exposure and map soil characteristics in Germany.
The Piracicaba region in central São Paulo, Brazil, is characterized by complex relief, with high variability in soils formed by different parent materials. Soil image information is important for this region due to agribusiness intensification of the sugar and bioenergy industries. Thus, there is a specific time-window for satellite sensor monitoring of bare soil because of the crop cultivation during the year. The bare soil information could be used to calibrate models and map topsoil attributes. Furthermore, the integration of multi-temporal images would be beneficial to improving soil coverage and creating a single composite image dataset. Our hypothesis is that the retrieval of this unique information could be related to soil spatial variability. This study aims to create a multi-temporal Bare Soil Composite Image (BSCI) and verify its usefulness in the mapping of soil properties, and the relationship with soil classes and geology. The relationship with soil mineralogy, color and clay fraction presented in legacy maps is also established. This study intends to show that image spectroscopy can be applied to several environmental parameters.

Materials and Methods
The proposed method is to assess bare soil, by masking green vegetation and crop residue targets from Landsat images. The compilation of time series images creates a BSCI, providing soil surface reflectance data for modeling topsoil properties (Figure 1).
Remote Sens. 2018, 10, x FOR PEER REVIEW 3 of 21 developed the Soil Composite Mapping Processor (SCMaP) as an approach to overcome limited soil exposure and map soil characteristics in Germany. The Piracicaba region in central São Paulo, Brazil, is characterized by complex relief, with high variability in soils formed by different parent materials. Soil image information is important for this region due to agribusiness intensification of the sugar and bioenergy industries. Thus, there is a specific time-window for satellite sensor monitoring of bare soil because of the crop cultivation during the year. The bare soil information could be used to calibrate models and map topsoil attributes. Furthermore, the integration of multi-temporal images would be beneficial to improving soil coverage and creating a single composite image dataset. Our hypothesis is that the retrieval of this unique information could be related to soil spatial variability. This study aims to create a multitemporal Bare Soil Composite Image (BSCI) and verify its usefulness in the mapping of soil properties, and the relationship with soil classes and geology. The relationship with soil mineralogy, color and clay fraction presented in legacy maps is also established. This study intends to show that image spectroscopy can be applied to several environmental parameters.

Materials and Methods
The proposed method is to assess bare soil, by masking green vegetation and crop residue targets from Landsat images. The compilation of time series images creates a BSCI, providing soil surface reflectance data for modeling topsoil properties (Figure 1).

Figure 1.
Flowchart of the method applied to obtain a multi-temporal bare soil image and predict soil clay content, Cation Exchange Capacity (CEC), and Organic Matter (OM) maps.

Study Site and Soil Samples
The 242,000 hectare study site covers eight cities (Piracicaba, Saltinho, Mombuca, Rio das Pedras, Iracemápolis, Charqueada, Capivari and Rafard) in central São Paulo State, Brazil ( Figure 2). The region is classified as humid subtropical on the Koppen classification system and characterized by

Study Site and Soil Samples
The 242,000 hectare study site covers eight cities (Piracicaba, Saltinho, Mombuca, Rio das Pedras, Iracemápolis, Charqueada, Capivari and Rafard) in central São Paulo State, Brazil ( Figure 2). The region is classified as humid subtropical on the Koppen classification system and characterized by dry winter (May-September) and hot rainy summers (October-April) [24]. The average annual rainfall is approximately 1273 mm year −1 and the average annual temperature is 24 • C. The landscape is gently undulating and rolling uplands, with slopes between 0 and 42%.
Remote Sens. 2018, 10, x FOR PEER REVIEW 4 of 21 dry winter (May-September) and hot rainy summers (October-April) [24]. The average annual rainfall is approximately 1273 mm year −1 and the average annual temperature is 24 °C. The landscape is gently undulating and rolling uplands, with slopes between 0 and 42%. Agriculture is the predominant land use in the area, with plowing being a common practice in preparation for sowing. The plowing activity break up and turn over soil in a depth of 0 to 20 cm, it keeps the surface layer homogenous. Thus, topsoil samples (Epipedon) were collected in toposequence scheme with an auger (0-20 cm). The toposequence was designed using several parameters as a variation of soil type [25], geology [26], and Digital Elevation Model (DEM) [27] observations ( Figure 3). The topsoil samples corresponded to 740 points, which represent the variability of the entire area ( Figure 2). Sampling sites were geo-referenced using global positioning system (GPS).
The soil samples were oven-dried for 48 h at 50 °C, ground and sieved (2 mm mesh) to analyze chemical and particle size properties according to the Brazilian Soil Classification System [28]. The densimeter method was used to obtain soil particle size, coarse and fine sand, silt, and clay [29]. The following chemical attributes were determined according to Raij et al. [30]: exchangeable bases (Ca 2+ , Mg 2+ , and K + ), available phosphorus and soil organic carbon. From the chemical attributes, the sum of bases (SB), cation exchange capacity (CEC), base saturation (V%) and aluminum saturation (m%) were calculated. Agriculture is the predominant land use in the area, with plowing being a common practice in preparation for sowing. The plowing activity break up and turn over soil in a depth of 0 to 20 cm, it keeps the surface layer homogenous. Thus, topsoil samples (Epipedon) were collected in toposequence scheme with an auger (0-20 cm). The toposequence was designed using several parameters as a variation of soil type [25], geology [26], and Digital Elevation Model (DEM) [27] observations ( Figure 3). The topsoil samples corresponded to 740 points, which represent the variability of the entire area ( Figure 2). Sampling sites were geo-referenced using global positioning system (GPS).
The soil samples were oven-dried for 48 h at 50 • C, ground and sieved (2 mm mesh) to analyze chemical and particle size properties according to the Brazilian Soil Classification System [28]. The densimeter method was used to obtain soil particle size, coarse and fine sand, silt, and clay [29]. The following chemical attributes were determined according to Raij et al. [30]: exchangeable bases (Ca 2+ , Mg 2+ , and K + ), available phosphorus and soil organic carbon. From the chemical attributes, the sum of bases (SB), cation exchange capacity (CEC), base saturation (V%) and aluminum saturation (m%) were calculated. At the study site, the geology (Figure 3a) includes fine to medium, reddish, silty-clay sandstone of the Botucatu and Pirambóia Formation and dike elements of basic intrusive diabase rocks of the  [26], (b) Soil map, 1:100,000 scale [25] according to the Brazilian Soil Classification System [28], (c) Surface soil textural classes based on the soil map.
At the study site, the geology ( Figure 3a) includes fine to medium, reddish, silty-clay sandstone of the Botucatu and Pirambóia Formation and dike elements of basic intrusive diabase rocks of the Serra Geral Formation. There are places where the Pirambóia Formation covers the Corumbataí Formation, presenting the incidence of chert marks of the contact between these two Formations. The Irati Formation (Passa Dois Group) arises on the top of sandy deposits where the relief is gently undulating to flat, but as it approaches the hillshade areas the diabase rocks of Serra Geral formation occur (Figure 3b). The diabase rocks also ascend between the siltstones that are the predominant lithology of the Itararé and Tatuí Formation (Tubarão group).
The high soil variability is mainly due to the complexity and diversity of the parent material of the neo-Cenozoic surficial deposits of various textures (Figure 3b). Here, the soil types were classified at the suborder level according to the Brazilian Soil Classification System (SiBCS) [28]. The corresponding Soil Taxonomy [31], and World Reference Base [32] classes are shown in Table 1. The soil types on the landscape demonstrate the occurrence of soils derived from diabase rocks, such as Ferralsol (LV, LVA, and LVf), which is in the highest position of the relief, while Nitosol (NV and NVL) is over smooth to gently undulating relief.
Soils of common occurrence derived from sediments include Leptsol (RL) and Lixisol (PVA, PV), which may be associated or not, and Arenosol (RQ) derived from the substrate of the Pirambóia Formation (sandstones). Nearby, on the flat relief, there is Planosol (SX) and Gleysol (GX) above drainage conditions ( Figure 3b). Topsoil textures have a large variability between the western area of the region (large areas dominated by sandy textures) and the remaining areas (clayey, clay-loamy, and loamy texture) (Figure 3c).  [28], and corresponding Soil Taxonomy [31], and Word Reference Base [32].

SiBCS Classification
Soil Taxonomy  Thirteen cloud-free Landsat 5 TM images from the dry season (June to September) between 1994 and 2010 were selected. The dry season was chosen to reduce moisture conditions in the agricultural fields as high soil moisture content decreases soil reflectance and affects the features of the spectral profile, influencing absorption features [33].

Image Preprocessing
The geometric and atmospheric corrections and the spectral analysis were performed using the ENVI (Environment for Visualizing Images) program (Exelis Visual Information Solutions, Boulder, Colorado). Clipping processes, index calculation, and masks of the multispectral images were executed on the ArcGIS program [34].
The geometric corrections were performed using control points obtained in the field with GPS (Global Position System) equipment [35]; the images were then reprojected to the Universal Transverse Mercator system (UTM/Zone 23/South). The atmospheric correction was performed with the FLAASH module (Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes), which integrates the MODTRAN (Moderate Resolution Atmospheric Transmission) radiation transfer code and accounts for water vapor and aerosol retrieval [36]. This preprocessing aimed to normalize the digital data and eliminate the addictive effects of the atmosphere.
The steps of this technique, according to Chander et al. [37], consist of converting the digital numbers from pixels to radiance, for which the Radiometric Calibration tool of the ENVI program was used. Then, the conversion of the radiance to top-of-atmosphere reflectance and the top-of-atmosphere reflectance to surface reflectance were performed using the FLAASH module. The images resulted in gray level or minimum digital number (zero value) corresponding to 0% reflectance, and the highest gray level (255), through to 100% reflectance [38], thus, the reflectance data of the historical series of the satellite images could be compared to each other.
The reflectance data of the soil (texture variation), sugarcane, and straw obtained under laboratory conditions indicated regions of reflection and absorption in the spectral bands of Landsat 5. In the electromagnetic spectrum, the TM3 band captures strong absorption by green vegetation reflection, and TM4 expressively reflects the energy of vegetation targets. It indicates good contrast between pixels of vegetation and those without it (bare soil) [39]. The TM5 and TM7 bands also contrast both targets. These bands in the electromagnetic spectrum are positioned in the region of strong reflection of soils and high absorption of healthy vegetation [40] (Figure 4a). TM3 (RED), TM4 (NIR), TM5 (SWIR1) and TM7 (SWIR2) were used to compose the RGB (5, 4, 3 or 7, 4, 3) image and the spectral indices. Colorado). Clipping processes, index calculation, and masks of the multispectral images were executed on the ArcGIS program [34]. The geometric corrections were performed using control points obtained in the field with GPS (Global Position System) equipment [35]; the images were then reprojected to the Universal Transverse Mercator system (UTM/Zone 23/South). The atmospheric correction was performed with the FLAASH module (Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes), which integrates the MODTRAN (Moderate Resolution Atmospheric Transmission) radiation transfer code and accounts for water vapor and aerosol retrieval [36]. This preprocessing aimed to normalize the digital data and eliminate the addictive effects of the atmosphere.
The steps of this technique, according to Chander et al. [37], consist of converting the digital numbers from pixels to radiance, for which the Radiometric Calibration tool of the ENVI program was used. Then, the conversion of the radiance to top-of-atmosphere reflectance and the top-ofatmosphere reflectance to surface reflectance were performed using the FLAASH module. The images resulted in gray level or minimum digital number (zero value) corresponding to 0% reflectance, and the highest gray level (255), through to 100% reflectance [38], thus, the reflectance data of the historical series of the satellite images could be compared to each other.
The reflectance data of the soil (texture variation), sugarcane, and straw obtained under laboratory conditions indicated regions of reflection and absorption in the spectral bands of Landsat 5. In the electromagnetic spectrum, the TM3 band captures strong absorption by green vegetation reflection, and TM4 expressively reflects the energy of vegetation targets. It indicates good contrast between pixels of vegetation and those without it (bare soil) [39]. The TM5 and TM7 bands also contrast both targets. These bands in the electromagnetic spectrum are positioned in the region of strong reflection of soils and high absorption of healthy vegetation [40] (Figure 4a). TM3 (RED), TM4 (NIR), TM5 (SWIR1) and TM7 (SWIR2) were used to compose the RGB (5, 4, 3 or 7, 4, 3) image and the spectral indices. Estimation of soil properties, such as clay content, is hampered if vegetation or crop residues cover the pixels. These targets are often spectrally similar; however, they differ in intensity and shape of the VIS-NIR-SWIR wavelengths.

Soil Masking
Pixels of bare soil are required to evaluate soil properties. A model was developed in the model builder tool of the ArcGIS program to detect residue cover and green vegetation classes in each Estimation of soil properties, such as clay content, is hampered if vegetation or crop residues cover the pixels. These targets are often spectrally similar; however, they differ in intensity and shape of the VIS-NIR-SWIR wavelengths.

Soil Masking
Pixels of bare soil are required to evaluate soil properties. A model was developed in the model builder tool of the ArcGIS program to detect residue cover and green vegetation classes in each image. This model is a diagram that chains together geoprocessing tools and build automate workflows. The model was built to create a mask of bare soil, the chains of the workflows analyzed simultaneously the Normalized Difference Vegetation Index (NDVI) profile and MID-Infrared index profile over all the images, another process was to clip water reservoirs, and urban sites. The mask targets were flagged as not available values (NA) (Figure 1).
The NDVI is an index that differentiates vegetation in satellite images, evaluating the variation of a green area within a certain period. The index calculation is represented in Equation (1) [41] and varies from −1 to 1. In Figure 4a, the values below the thresholds were an input to the model creating a mask for non-vegetation patterns (i.e., bare soil or straw).
where, NDVI is the Normalized Difference Vegetation Index; NIR is the reflectance value on near infrared of Band 4-TM4; and R is the reflectance value on red of Band 3-TM3. The MID-Infrared index was calculated to mask soil texture classes and straw (Figure 4a,b). Then, as described by Demattê et al. [18], bands TM5 and TM7 (near lignin band) were used to discriminate between these two objects. The MID-Infrared index (Equation (2)) is used to establish thresholds to contrast soils and straw to obtain only pixels of bare soil. Hence, values below the bottom limits were used to create the model to mask straw ( Figure 4a).
where, MID-Infrared is the normalized difference between Band 5-TM5 and Band 7-TM7 index; SWIR1 is the reflectance value on near-infrared Band 5-TM5; SWIR 2 is the reflectance value on near-infrared Band 7-TM7.

Design of BSCI
A composite image was constructed, adapted from Dematê et al. [18], which operated an algorithm in R program language [42]. The algorithm requires the bare soil mask image and the MID-Infrared index image as input files. The lowest value of the MID-Infrared index was established to standardize the multi-temporal occurrences of bare soil for the same pixel. This algorithm selected the pixel of lowest MID-Infrared index value from all possible occurrences for a given pixel to finally create the BSCI (Figure 1).

Evaluation of the BSCI
From the BSCI, multi-temporal spectra data (TM bands) from the soil sample points ( Figure 2) were extracted using the Extract Multi-Value by Points tool of ArcGIS program [26] for spatializing soil attributes.
The spectral profile and soil line concept [43] were applied to the multi-temporal spectra data to verify the BSCI. The soil line concept is useful for evaluation of the bare soil pixel [19]. The soil line is a two-dimensional diagram (scatterplot) that contrasts the reflectance values obtained from TM3 and TM4 bands, which starts between the X-axis and the Y axis at 45 degree inclination. It presents linear correlation (1:1), and the bare soil pixels must be located close to the 1:1 line.
A comparison between the BSCI, the conventional soil and geology map [25,26] (Figure 3) was carried out. The Tukey's test for mean comparison of reflectance according to each geology classes was used when p < 0.05. The area was divided into six subsites to make comparisons related to the Remote Sens. 2018, 10, 1571 9 of 21 mineralogy of the soils, the composition of the clay fraction and the color of the satellite images. These were the only maps available for the area, besides the scale map and the spatial resolution that did not match with the satellite resolution.

Soil Attribute Modeling
The partial least square regression (PLSR) was used to calibrate soil attribute (clay, sand, CEC, soil organic matter (SOM)) models using Unscramble software [44]. For each soil attribute, the model calibration divided the multi-temporal spectra data of BSCI into calibration and validation sets, of which 80% of the spectra were used in calibration, while the remaining 20% were employed for validation. The models were evaluated based on the coefficient of determination (R 2 ), root mean square error (RMSE) and the ratio of performance to deviation (RPD) for both calibration and validation datasets [45].
A confusion matrix was also used to validate the predicted maps by comparing the measured class from the validation points (20%) and the respective predicted class. Overall accuracy, the Kappa coefficient [47], and the user's accuracy and the producer's accuracy were calculated. The overall accuracy is the ratio between the sum of all pixels correctly classified and the total number of validation pixels. The Kappa coefficient indicates the level of agreement between measured and predicted observations. This coefficient is used to account for the possibility of data being correctly clustered by chance [48] (Equation (3)).
where, K is the Kappa coefficient; θ 1 is the weighted observed agreement; and θ 2 is the weighted chance agreement. Components located on the diagonal of the confusion matrix have the maximum weight on a linear scale. Components further from the correct ones receive lower weights. The level of agreement classification by Landis and Koch [49] was considered to evaluate map quality. Kappa values from 0.21 to 0.40 are considered as fair agreement; from 0.41 to 0.60, moderate agreement; from 0.61 to 0.80, substantial agreement and values above 0.81, almost perfect agreement.

Soil Characterization
The soil property (0-20 cm) summary statistics indicate high variability of all physical and chemical soil attributes ( Table 2). The chemical attributes have a higher variability than the physical attributes, although their status is suitable for plant development. Croplands are the most common land use of the region. Common soil textures are sand, loam, and clay-loam ( Figure 5). Clay content, for instance, is between 41 and 765 g kg −1 , with a variation of 70% (Table 2). SOM presents values between 0 and 52 g kg −1 , varying by 50%. The amplitude of values for SOM is related to tropical conditions, which are highly affected by environmental factors and especially by soil management practices, which impacts the steady state over time [50]. The clay, silt and sand contents are less impacted by agricultural practices when compared to the chemical properties and SOM. However, the high variability presented in Figure 5 shows the influence of the complex geology of the region.
Remote Sens. 2018, 10, x FOR PEER REVIEW 10 of 21 Figure 5. Soil textural triangle of the measured samples, representing variation [21]. Red spots indicate sandy soil, yellow indicates loamy soil, blue is clay-loamy soil, green represents clay soil, and light blue shows silty soil.

Multi-Temporal Bare Soil Detection
The percentage of bare soil for each of the 13 images varied from 7% (20 June 2007) to 20% (24 August 1996) (Figure 6), with an average bare soil coverage of 12 % in the area. Many pixels of bare soil occur in the same croplands, indicating that with time the bare soil occurrences overlap. In fact, in only a few images, it was observed that there were moments in time when the increase in bare soil is higher than the average. The selection period covered the months of July and August, which corresponds to the dry season, often coinciding with the practice of plowing management. The availability of satellite images in the dry season is limited but efficient. Thus, it is possible to detect large areas of bare soil in the dry season. Moreover, images are often influenced by clouds during the rainy period and are consequently not suitable for analysis.
Some pixels had no bare soil at any time, which were classified as NA. These NAs correspond to land uses such as urban areas (6% of the area), water bodies, forests, perennial crops, and grassland (41% of the area). Thus, it will be difficult to reach 100% bare soil for this study site. The BSCI presented 53% (128,260 ha) of bare soil coverage ( Figure 6). The BSCI is defined as an almost continuous soil surface constructed from 16 years of 30 m spatial resolution Landsat images, Figure 5. Soil textural triangle of the measured samples, representing variation [21]. Red spots indicate sandy soil, yellow indicates loamy soil, blue is clay-loamy soil, green represents clay soil, and light blue shows silty soil.

Multi-Temporal Bare Soil Detection
The percentage of bare soil for each of the 13 images varied from 7% (20 June 2007) to 20% (24 August 1996) (Figure 6), with an average bare soil coverage of 12% in the area. Many pixels of bare soil occur in the same croplands, indicating that with time the bare soil occurrences overlap. In fact, in only a few images, it was observed that there were moments in time when the increase in bare soil is higher than the average. The selection period covered the months of July and August, which corresponds to the dry season, often coinciding with the practice of plowing management. The availability of satellite images in the dry season is limited but efficient. Thus, it is possible to detect large areas of bare soil in the dry season. Moreover, images are often influenced by clouds during the rainy period and are consequently not suitable for analysis.
Some pixels had no bare soil at any time, which were classified as NA. These NAs correspond to land uses such as urban areas (6% of the area), water bodies, forests, perennial crops, and grassland (41% of the area). Thus, it will be difficult to reach 100% bare soil for this study site. The BSCI presented 53% (128,260 ha) of bare soil coverage ( Figure 6). The BSCI is defined as an almost continuous soil surface constructed from 16 years of 30 m spatial resolution Landsat images, representing the spectral surface variability of soils for the agricultural area of Piracicaba. The results demonstrated that the temporal coverage of bare soil is dependent on Landsat coverage, weather conditions and land management system according to crop type.

Bare Soil Composite Image Related to Soil and Geology
The BSCI shows more bare soil in subsites 1, 2 and 3 (Figure 7), which are related to areas with sugarcane crops. However, the BSCI in sub-areas 4, 5 and 6 illustrates very steep sites with less bare soils, upon which mechanization is hindered, leading to a prevalence of grassland and perennial crops, where the soil is rarely exposed by plowing.
The degree of soil evolution and weathering intensity determines the mineralogical composition of the clay fraction, which is essential for evaluating the soil spectral profile [51]. Tropical soils may have different proportions of oxides in their mineralogy, causing color variations and consequently influencing the spectral reflectance retrieved from satellite images [19]. Soils derived from volcanic rocks and with high iron content have a lower reflectance intensity (high absorption), while soils with medium to low iron content have intermediate and high values of reflectance intensity (low absorption) [52].  Figure 6. Percentage of bare soil occurrences from the satellite images and the Bare Soil Composite Image (BSCI), which represents total bare soil for the entire area of 242,000 hectares.

Bare Soil Composite Image Related to Soil and Geology
The BSCI shows more bare soil in subsites 1, 2 and 3 (Figure 7), which are related to areas with sugarcane crops. However, the BSCI in sub-areas 4, 5 and 6 illustrates very steep sites with less bare soils, upon which mechanization is hindered, leading to a prevalence of grassland and perennial crops, where the soil is rarely exposed by plowing.
The degree of soil evolution and weathering intensity determines the mineralogical composition of the clay fraction, which is essential for evaluating the soil spectral profile [51]. Tropical soils may have different proportions of oxides in their mineralogy, causing color variations and consequently influencing the spectral reflectance retrieved from satellite images [19]. Soils derived from volcanic rocks and with high iron content have a lower reflectance intensity (high absorption), while soils with medium to low iron content have intermediate and high values of reflectance intensity (low absorption) [52].
The variations in the BSCI are represented by the false color composition (R-TM5, G-TM4, B-TM3) (Figure 7), and are mostly related to clay proportions, soil color, and mineralogy (e.g., oxides, kaolinite, and quartz). The RGB values vary from 0 to 255, whereby low values (low reflectance) generate darker colors that indicate higher amounts of clay and oxides. Lighter shades indicate sandy soils, which have higher reflectance due to lower clay and iron oxide content. Clearly, the BSCI variations represented by false color compositions are extremely important because they enable quick assessment of soil texture and mineral distribution across the study site.  (Figure 7), and are mostly related to clay proportions, soil color, and mineralogy (e.g., oxides, kaolinite, and quartz). The RGB values vary from 0 to 255, whereby low values (low reflectance) generate darker colors that indicate higher amounts of clay and oxides. Lighter shades indicate sandy soils, which have higher reflectance due to lower clay and iron oxide content. Clearly, the BSCI variations represented by false color compositions are extremely important because they enable quick assessment of soil texture and mineral distribution across the study site.
The legacy geological and soil type maps were important in the establishment of relevant relationships between parent material, soil types, and image reflectance data. The mean reflectance of all bands is statistically different according to geology classes in accordance with the Tukey test at p < 0.05 significance (Table 3). This demonstrates that the BSCI correctly detects variations in soils, which are related to the parent material. The legacy geological and soil type maps were important in the establishment of relevant relationships between parent material, soil types, and image reflectance data. The mean reflectance of all bands is statistically different according to geology classes in accordance with the Tukey test at p < 0.05 significance (Table 3). This demonstrates that the BSCI correctly detects variations in soils, which are related to the parent material. The qualitative comparison covered 80% of the study site [25,26] (Figure 3). The proportion of soil surface texture variation was 33% for sandy, 18% for clay-loam, 9% for loamy, and 4% for clay soils, although 34% were not classified (Figure 3c). Locations discriminated as "no classification" correspond to areas not covered by the legacy soil map (semi-detail scale). These sites refer to Lixisol (PV) and less representative soil classes, including Planosol, Gleysol, and Cambisol (SX, GX, and CX). These patterns are not frequent on the legacy map but appear with more frequency as the number of field observations increases. In this case, the BSCI could achieve greater detail from the soil surface as it has 30 m spatial resolution, which is more detailed than the legacy soil maps.
The BSCI correctly detects small scale variations in soil texture, as in subsites 5 and 6 (Figure 7), where small abrupt changes of clay-loamy and clayey soils can be observed in a predominantly sandy texture location. Moreover, the BSCI detected the neo-Cenozoic sites located in the highest parts of the landscape. These sites correspond to subsite 1, 2 and 3, in stains of diabase, where occurrences of Ferrolsol and Nitosol (LV/LVf/NV/NVL) are described. These spots are detected as clayey and clay-loamy texture by the BSCI. Thus, the BSCI variation in texture for these soils matches the legacy maps.
Despite the low scale of the lithology map (1:100.000), the Botucatu and Pirambóia Formations (Jbp) are associated with Arenosol (RQ) (subsite 4- Figure 7), and they are sandy texture throughout the soil profile. The sandy texture detected by the satellite image enables the inference that they are Arenosol (RQ). Nevertheless, Lixisol (PV) may be sandy/loamy at the surface and loamy/clay-loamy or clayey in subsurface layers; but the image does not detect the B-horizon, thus confusing PV with RQ.
These results show that BSCI is an effective tool that can provide support to new soil surveys and update legacy maps. Furthermore, it may be used to extrapolate or interpolate areas with no available information on soil attributes. Considering that satellite images may be limited in the discrimination of some soil classes, such information may be combined with relief data when mapping soil types [53] (e.g., slope, curvature, topographic index). In addition, the relief could improve the detection of sites where Planosol, Gleysol, and Cambisol (SX, GX, and CX) appear, especially when related to drainage features.

Evaluation of Spectral Data (Spectral Profile and Soil Line)
The soil spectra were obtained from the Landsat image of each sampling location (Figure 8a). A great concern when using satellite images is the influence of soil moisture and surface roughness (shadow) in the spectral profile [22,54]. The spectral features are more affected by soil moisture, which is related to the water absorption bands around 1800 and 2119 nm (SWIR range) [55]. Clayey soils with significant amounts of SOM have higher water holding capacity than sandy soils, thus soil moisture has a positive correlation with clay content and SOM, and, consequently, a negative correlation with soil sand content [22]. To reduce the influence of external factors, atmospheric correction and data transformation to reflectance were performed. In addition, images from the dry season were selected and their quality with MID-Infrared index assessed, before calculating the BSCI [33]. Meanwhile, the profile demonstrates intensity and form of the soil spectral data. The intensity of BSCI spectra increases from Band 1 to Band 5. Besides that, sandy soils present higher reflectance and ascending shape when compared to clay-loamy soils (Figure 8b).
Among the spectral features influenced by quartz, there is an angular variation in TM5 [19,54]. Soils with loamy or clayey texture have lower angles in comparison to sandy soils, which agrees with the findings of Madeira, Netto and Baptista [56]. Pixels corresponding to vegetated areas may be clearly differentiated from other targets by the angle of 119 • and by their high reflectance in band TM4 (NIR). The low spectral reflectance of water, which absorbs energy in the entire electromagnetic spectrum, is characterized by an angle of 90 • .
The spectral behavior of areas covered with crop residues in the satellite images has a shape and intensity similar to those shown by sandy soils, which cause misunderstanding. However, two aspects differentiate straw from sandy soil: (1) straw presents higher reflectance intensity, and (2) the angular difference of TM5 is greater for straw than for sandy soils. Therefore, through the MID-Infrared index, it is possible to differentiate satellite images with an exposition of straw from those with bare sandy soils.
Here, the MID-Infrared index threshold was defined based on interpretation of the images (Figure 4a,b) and related to field observation analysis and Landsat images. Thus, this index is efficient for a drier period with sufficient land cover. This work brings the novelty of using a threshold for each image, instead of a mean value for all images. It increases the sensibility of the procedure, providing a realistic representation of bare soil conditions.
The spectral profile (Figure 8b) confirms that the BSCI only corresponds to bare soil, as it presents similar shape and intensity compared to the satellite multispectral data of Demattê et al. [19]. Besides the granulometric properties, iron content is also responsible for variations in satellite TM5 reflectance [19,54]. Clayey and clay-loamy soils that present high hematite (iron oxide) content have lower reflectance intensity with a horizontal curve. This is an evident feature among samples from the same parent material, as in the case of soils derived from diabase, such as in the Ferralsol (LV/LVf) and Nitosol (NV/NVL). Yet, as the iron content decreases the soil reflectance increases, as in the Ferralsol (LVA) and Lixisol (PV/PVA). Furthermore, in these soils, the hematite decreases, and the goethite increases. Therefore, as the sandy texture becomes more pronounced (e.g., in the Arenosol-RQ), higher reflectance is observed, with an even higher peak in TM5.
Soils with loamy or clayey texture have lower angles in comparison to sandy soils, which agrees with the findings of Madeira, Netto and Baptista [56]. Pixels corresponding to vegetated areas may be clearly differentiated from other targets by the angle of 119° and by their high reflectance in band TM4 (NIR). The low spectral reflectance of water, which absorbs energy in the entire electromagnetic spectrum, is characterized by an angle of 90°.
The spectral behavior of areas covered with crop residues in the satellite images has a shape and intensity similar to those shown by sandy soils, which cause misunderstanding. However, two aspects differentiate straw from sandy soil: (1) straw presents higher reflectance intensity, and (2) the angular difference of TM5 is greater for straw than for sandy soils. Therefore, through the MID-Infrared index, it is possible to differentiate satellite images with an exposition of straw from those with bare sandy soils.
Here, the MID-Infrared index threshold was defined based on interpretation of the images (Figure 4a,b) and related to field observation analysis and Landsat images. Thus, this index is efficient for a drier period with sufficient land cover. This work brings the novelty of using a threshold for each image, instead of a mean value for all images. It increases the sensibility of the procedure, providing a realistic representation of bare soil conditions.
The spectral profile (Figure 8b) confirms that the BSCI only corresponds to bare soil, as it presents similar shape and intensity compared to the satellite multispectral data of Demattê et al. [19]. Besides the granulometric properties, iron content is also responsible for variations in satellite TM5 reflectance [19,54]. Clayey and clay-loamy soils that present high hematite (iron oxide) content have lower reflectance intensity with a horizontal curve. This is an evident feature among samples from the same parent material, as in the case of soils derived from diabase, such as in the Ferralsol (LV/LVf) and Nitosol (NV/NVL). Yet, as the iron content decreases the soil reflectance increases, as in the Ferralsol (LVA) and Lixisol (PV/PVA). Furthermore, in these soils, the hematite decreases, and the goethite increases. Therefore, as the sandy texture becomes more pronounced (e.g., in the Arenosol-RQ), higher reflectance is observed, with an even higher peak in TM5.  The soil line plot from bands TM3 and TM4 (red and NIR wavelengths) (Figure 8c) is useful to understand surface variations and identify bare soil areas [35,36]. Such a method is employed to discriminate different soil types based on texture, iron content and organic matter variations [19,54]. Bare soil occurs in regions where the albedo approaches the 1:1 line of the plot, but in sandy soils, the reflectance spreads (Figure 8c). The low reflectance values in red (TM3) and high values in NIR (TM4) are found in pixels with higher proportions of vegetation. In the bottom left portion of the scatter plot are the pixels corresponding to water reservoirs (Figure 8c).
The soil line of the BSCI was found by calculating the intercept of lines, slope, and coefficient of determination (R 2 ), which change according to variations in bare soil reflectance [42] (Figure 8d).
The statistics of the BSCI soil line show high R 2 , which confirms BSCI pixels as being bare soils. The reflectance variation of soil texture was adjusted by different soil lines. Similar parameters were observed, but with slight changes from clay-loamy to loamy soils. Additionally, each soil texture class had a slope shift from the 1:1 line. The slope coefficient and intercepts varied with reflectance, whereby clayey soils had lower intensity (Ferralson-LVf), presenting negative intercept values and higher slope coefficient values, which agrees with Demattê and Terra [54]. Clayey soils (19 observations) and clay-loamy soils (218 observations) on the soil line plot (Figure 8c) correspond to ferric oxide soils according to Demattê et al. [19]. Thus, Ferrasols and Nitossols (LVf/LV/NV/NVL) are positioned near the origin of the soil line. The loamy soil albedo (295 observations) and sandy soils (203 observations) tend to be positioned further from the origin of the soil line plot. In our study, these texture classes correspond to Lixisols (PV) and Arenosols (RQ), respectively, and agree with the findings of Demattê et al. [19].

Calibration and Validation
Soil attribute prediction of the cross-validation of the validation dataset is shown by the coefficient of determination (R 2 ), root mean square error (RMSE) and the ratio of performance to deviation (RPD) for sand, clay, CEC and soil organic matter (SOM) (Figure 9). SOM presented the lowest error among the soil attributes. RPD presents good prediction accuracy, according to Chang et al. [57], as it is in the interval between 1.5 and 2.0. Dunn et al. [58] propose that models with RPD greater than 2.0 are excellent; therefore, the clay content and CEC models are both stable and excellent in their prediction accuracy. According to Viscarra Rossel et al. [59], the RPD values for clay and CEC models are considered good and very good. RPD values equal to 1.4 and 1.5 for the SOM and sand content models are considered fair and may be useful for mapping purposes. The results of the PLSR for clay content, CEC and SOM ( Figure 10) showed a similar pattern that matches well with the geology and soil type of the study site (as described in Sections 2.1 and 3.3.1). Furthermore, the high Moran index demonstrated that spatial distribution of the predicted attributes Predicted versus measured values for the soil attributes are presented in (Figure 9). For clay and sand, the range of the predicted values is similar to the measured values. CEC and SOM estimates had a slightly narrower range compared to the measured values, indicating underestimation for these attributes. We noted similar mean values in the summary statistics from the measured ( Table 2) and validation predicted dataset values (Figure 9). The standard deviation of the predicted value is smaller, especially for SOM and CEC, meaning that the extreme values are not well represented.
Considering the distance of 800 km from the Earth's surface to the satellite sensor, as well as the spatial resolution of 30 m, the predictions present satisfactory performance. Furthermore, this method is fast and suitable for several temporal and spatial scales. The tendency is increasing demand for regional and global methods to obtain soil information [9]. The reason for such increasing demand is related to ecosystem services and global land monitoring, aiming to reduce soil loss and degradation [60]. The method presented in the present study has the potential to quantify soil attributes, providing spatially explicit information over extended areas. Other similar studies, based only on Landsat data, have also shown positive results in their predictions of soil attributes [20][21][22][23]. Therefore, digital soil mapping integrated with other geotechnologies has the potential to aid soil surveys.

Soil Attribute Mapping and Autocorrelation Analysis
The results of the PLSR for clay content, CEC and SOM ( Figure 10) showed a similar pattern that matches well with the geology and soil type of the study site (as described in Sections 2.1 and 3.3.1). Furthermore, the high Moran index demonstrated that spatial distribution of the predicted attributes is clustered.
Sand content is almost inversely related to clay content, thus only the soil texture map based on clay content is presented (Figure 10a). In tropical soils, where weathering is very intense, the occurrence of silt is very low ( Figure 5). Thus, the majority of granulometric particles are constituted of sand and clay. Predicted clay varied from 3 to 797 g kg −1 (Figure 10a). Loamy soils accounted for 47% of the classified pixels, followed by 33% for clay-loamy soil and 18% for sandy soil; the remaining were those of clayey soil. Textural classes varying from clay-loamy to clayey are related to areas of intensive cultivation. As expected, in these areas the CEC was classified as very high, with values above 80 mmol c kg −1 .
Predicted SOM content varied from 0 to 32 g kg −1 and predicted CEC varied from 0 to 140 mmol c kg −1 . Among the distribution of organic matter content pixels, 65% included the medium class of SOM and approximately 5% of the pixels represent very high class.
In loamy, clay-loamy, and clayey soil areas, very high class CEC predominated, representing 70% of the pixels. In areas with sandy soils, the mean values of medium class CEC predominated, which represents 21% of the pixels, while less than 1% have values below 30 mmol c kg −1 . In these areas, around 10% of the pixels presented values lower than 10 g kg −1 of SOM, mostly represented by Arenosol (RQ). These low SOM contents are related to the rapid decomposition of organic matter content in sandy soils.
Clay content is related to intrinsic factors of formation, such as parent material. However, CEC and SOM are related to extrinsic factors, which can be influenced by the type of land management. As such, defining textural classes is advantageous, as they do not vary with anthropic factors and may assist in defining homogeneous soils. Furthermore, to capture variability of CEC and SOM over time, application of PLSR models to each moment in time is suggested, as implemented by Diek et al. [22]. Despite the detected influence of soil moisture over the season, this technique may be efficient in monitoring soil attributes over time.
Remote Sens. 2018, 10, x FOR PEER REVIEW 17 of 21 time, application of PLSR models to each moment in time is suggested, as implemented by Diek et al. [22]. Despite the detected influence of soil moisture over the season, this technique may be efficient in monitoring soil attributes over time. The additional validation of maps is presented in the confusion matrix, where 123 points were used (Table 4). From these results, producer accuracy (precision) and user accuracy of the soil attribute maps, and consequently the Kappa coefficient (K) were determined. The overall accuracy of soil texture mapping represents 74 % (Kappa = 0.63), which is considered very good [49]. Loamy and clay-loamy soils respectively presented higher producer accuracy at 93% and 87%, demonstrating that the classification probability agrees with the field validation measurements. Concerning the user accuracy, almost all classes demonstrated high probability, with loam, clay-loam and clay having a chance of being correctly predicted in 72%, 71% and 96% of the evaluated cases. Only the sand class presented low value of user accuracy due to confusion with the loam class, which could be related to loss of absorption features in the spectra.
The additional validation of maps is presented in the confusion matrix, where 123 points were used (Table 4). From these results, producer accuracy (precision) and user accuracy of the soil attribute maps, and consequently the Kappa coefficient (K) were determined. The overall accuracy of soil texture mapping represents 74% (Kappa = 0.63), which is considered very good [49]. Loamy and clay-loamy soils respectively presented higher producer accuracy at 93% and 87%, demonstrating that the classification probability agrees with the field validation measurements. Concerning the user accuracy, almost all classes demonstrated high probability, with loam, clay-loam and clay having a chance of being correctly predicted in 72%, 71% and 96% of the evaluated cases. Only the sand class presented low value of user accuracy due to confusion with the loam class, which could be related to loss of absorption features in the spectra.
The overall accuracy of the mapping shows 46% (Kappa = 0.23) and 52% (Kappa = 031), for SOM and CEC respectively, which is considered fair [47], whereas the high class of SOM shows 75% producer accuracy, which presented less confusion and fewer errors, being more precise when compared to other classes. The other observed classes had low values of producer accuracy, indicating a low amount of measured soil samples erroneously included in these classes. All SOM classes presented an average probability of 45%, which represents the SOM as in the measured field soil sample.
The CEC class presented over 50% precision for the medium, high, and very high classes. However, the high class presented lower user accuracy, which was caused by the high confusion with very high class. Medium and very high CEC classes demonstrated a probability of 71% for user accuracy. Digital soil attribute maps derived from the combination of multi-temporal soil spectra and PLSR models demonstrated reliable agreement for texture and moderate agreement for SOM and CEC.

Conclusions
The multi-temporal images provide historical and temporal information for a given area, enabling reconstruction and creation of a single composite image of bare soil. The BSCI can determine up to seven times more information than a single image from a single year. Without the system, the user would have about 7% bare soil coverage for the area, which increased to 53% bare soil when applying the technique presented in this study.
The NDVI and MID-Infrared indices are efficient in retrieving bare soil occurrences. The method is quick and capable of separating bare soil from straw, and vegetation.
The spectral data, despite the 800 km distance from the target, were efficient in determining soil attributes such as clay (R 2 over 0.75). Even though sand, OM and CEC presented lower values, their estimates are still significant. All digital soil maps presented high spatial autocorrelation, indicating the importance of RS products for spatial modeling.
Satellite images of bare topsoil detect variations in soils texture, which is related to parent material that is statistically significant. Soils with equal clay content on the surface and in the subsurface (Ferralsols) could be detected, but if the soil has a textural gradient (i.e., Lixisol), this technique is limited.
The use of a product with 30 m resolution, which represented 242,000 ha with an almost continuous bare soil surface, opens possibilities in several areas such as land use planning, environmental monitoring, attribute mapping, precision agriculture, digital soil mapping, and updating legacy surveys.