In recent years, Rio Grande do Sul State has accounted, on average, for about 17% of the national grain production. It is the third-largest soybean producer in Brazil [13], and both yield and crop area are still rising. The State has a total area of 282,062 km² with 496 municipalities. Its soybean production is concentrated in the center-north region. The average annual rainfall is 1,500 mm, relatively well distributed throughout the year but subject to dry periods. The State climate is subtropical, with four well-defined seasons. The 39 municipalities analyzed are entirely covered by the Landsat scene at path/row 222-080. All of them lie within an intensive soybean production area, which is shown in Figure 1.
2.2. Satellite Data
The data sources used for the algorithm development were the following: (i) Landsat-5 TM images, obtained from Instituto Nacional de Pesquisas Espaciais (INPE: www.dgi.inpe.br); (ii) Shuttle Radar Topography Mission (SRTM) data, used to generate a slope map with 90 m spatial resolution, according to [15], in order to exclude areas unsuitable for mechanization (slope > 12%); (iii) annual soybean agricultural statistics, at State and municipality level, from IBGE, used to evaluate the soybean area estimates obtained with the present procedure for the 39 municipalities; (iv) a soybean reference thematic map, available for crop year 2000/2001 [3] and obtained from multi-temporal Landsat TM image analysis at 30 m spatial resolution, used to evaluate the soybean thematic map produced for 2000/2001 in this study; and (v) geolocation reference images from the Global Land Survey (GLS), which is composed of cloud-free images with good-quality geo-referencing metrics. This product was used to provide an accurate geo-registration of the selected images listed in Table 1.
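The slope exclusion in item (ii) can be illustrated with a minimal sketch. It assumes the SRTM elevations are already available as a NumPy array in metres on a regular 90 m grid and uses a simple finite-difference slope, which is not necessarily the procedure of [15]; the function names and the synthetic example are hypothetical.

```python
import numpy as np

def slope_percent(dem, cell_size=90.0):
    """Return terrain slope in percent for a DEM given in metres.

    Assumption: square cells of `cell_size` metres (90 m for the SRTM
    product mentioned in the text) and a plain finite-difference gradient.
    """
    # np.gradient returns the derivative along rows (y) and then columns (x).
    dz_dy, dz_dx = np.gradient(np.asarray(dem, dtype=float), cell_size)
    return 100.0 * np.hypot(dz_dx, dz_dy)

def mechanizable_mask(dem, cell_size=90.0, max_slope=12.0):
    """True where the slope does not exceed the 12% limit quoted in the text."""
    return slope_percent(dem, cell_size) <= max_slope

# Synthetic planes: 9 m rise per 90 m cell (10%) and 18 m rise per cell (20%).
gentle = np.tile(np.arange(5) * 9.0, (5, 1))
steep = np.tile(np.arange(5) * 18.0, (5, 1))
print(mechanizable_mask(gentle).all(), mechanizable_mask(steep).any())
```

The resulting Boolean mask could then be multiplied with a classification map so that pixels on steep terrain are never counted as soybean.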
2.3. LANDSAT Image Calibration
The Landsat-5 TM images were fully calibrated and corrected to generate reflectance values, according to the Landsat calibration documents [16,17]. Usually, correction includes atmospheric and sensor-related parameters and thus leads to the derivation of physical units such as reflectance [18]. In the strict sense, full absolute image correction involves applying both the absolute calibration coefficients of the sensor and the parameters of an atmospheric correction, in order to derive estimates of surface reflectance that produce a consistent temporal reflectance trajectory [18]. In this work, the first step was to convert the digital numbers (DN) into radiance and afterwards into reflectance, according to the calibration parameters of [17]. After the transformation of DN to reflectance, the images were atmospherically corrected according to the methodology of [19].
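The two conversion steps can be sketched as follows, using the standard linear DN-to-radiance relation and the usual top-of-atmosphere reflectance formula. The band-specific gain, bias and solar irradiance (ESUN) must be taken from the calibration documents [17]; the numbers in the example call are placeholders for illustration only, and the atmospheric correction of [19] is not reproduced here.

```python
import numpy as np

def dn_to_radiance(dn, gain, bias):
    """Linear rescaling of digital numbers to at-sensor spectral radiance."""
    return gain * np.asarray(dn, dtype=float) + bias

def radiance_to_toa_reflectance(radiance, esun, sun_elevation_deg,
                                earth_sun_dist_au):
    """Top-of-atmosphere reflectance: pi * L * d^2 / (ESUN * cos(zenith))."""
    zenith = np.deg2rad(90.0 - sun_elevation_deg)
    return np.pi * radiance * earth_sun_dist_au ** 2 / (esun * np.cos(zenith))

# Placeholder values, NOT real calibration coefficients.
dn = np.array([[20, 60], [120, 200]])
radiance = dn_to_radiance(dn, gain=0.8, bias=-1.5)
reflectance = radiance_to_toa_reflectance(
    radiance, esun=1000.0, sun_elevation_deg=50.0, earth_sun_dist_au=0.985
)
print(reflectance)
```

Keeping the two steps in separate functions makes it easy to swap in per-band coefficients or to chain an atmospheric correction after the reflectance step.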
2.4. Selected Bands
During soybean vegetation development, a rapid increase of near-infrared (NIR) reflectance is observed, reaching its maximum after a relatively short period. For Rio Grande do Sul, the maximum vegetation development period falls in the time window starting on 20 January and ending before 20 March [11,20]. During this stage, the soybean crop presents a particular spectral behavior when compared with other classes of regional land use cover [3]. Vegetation is then expected to present low reflectance in the red band (0.65 to 0.69 μm), high reflectance in the NIR band (0.76 to 0.90 μm) and little variation of reflectance in the Short-Wave Infrared (SWIR) band (1.55 to 1.75 μm), as described in [21] for irrigated and rainfed soybean crops. So, using a mathematical-computational rule, it is possible to establish operators able to identify soybean crop area characteristics that persist through time, even under different vegetation development conditions. This means that an algorithm based on an appropriate mathematical combination of bands 3, 4 and 5 can accurately select soybean crop areas.
Soybean sowing occurs from around early October to early December. Maize is usually sown earlier than soybean [20], which favors a calendar-based discrimination. Even when maize is sown during the same period as soybean, it would not be erroneously tagged as soybean, because the simultaneous use of bands 3–5 makes it possible to identify the particular spectral behavior associated with each crop.
2.5. Algorithm Development: Theoretical Approach
The soybean classification procedure developed in this study was named Reflectance-based Crop Detection Algorithm (RCDA); its flowchart is presented in Figure 2. RCDA was developed in the ERDAS 9.1 Modeler environment. Initially, surface reflectance values for soybean during maximum plant development were tested. Based on published spectral characteristics [22,23], soybean crop vegetation typically presents a calibrated reflectance of about 5%, 50% and 21% in bands 3, 4 and 5 of Landsat-5 TM, respectively. These reflectance values, usually available in the literature, are averaged from standardized conditions and represent soybean crop fields at full-pixel coverage.
It is well known that in drought-free years, well-developed vegetation reflects only a small part of the incident solar radiation in the visible part of the spectrum, owing to absorption by chlorophyll and other plant pigments. In the NIR, plants reflect much more, due to scattering caused by the internal structure of leaves and their water content [24]. Depending on the intensity of the water deficit caused by drought, seasonal heat waves or both effects combined [25], vegetation may remain green for some time after the onset of water stress [26]. Bands 3 (red) and 4 (NIR) are therefore not expected to show detectable changes during this time lag. Additionally, reflectance in band 5 of Landsat is closely associated with vegetation moisture [27,28], so its behavior through time needs to be investigated in more depth. Band 5 is also known to penetrate thin clouds because of its wavelength [28,29], which tends to be very useful in mapping and land use change studies.
The challenge is therefore to find accurate reflectance values at the lower limits of the spectral range (lower reflectance values) for each band, so as to include not only pure soybean pixels under normal conditions, but also soybean developing under water deficit and mixed pixels located at the border of soybean fields. These lower limit values were defined in this work as R3min, R4min and R5min.
2.6. RCDA Development: Test Sites
Because the representative reflectance values are important input parameters for RCDA, one way to fit them better was to obtain a set of soybean reflectance samples in each of bands 3–5 from selected test sites inside the 39 municipalities.
Direct visual inspection and mapping over the images were used to locate the test sites, which total 9,925 pixels across the crop years 1996/1997, 2000/2001, 2003/2004, 2005/2006, 2006/2007, 2007/2008, 2008/2009 and 2009/2010. It is important to note that at least two of the selected crop years (2003/2004 and 2006/2007) had quite different development conditions [30]. Thus, different Physically Driven Components (PDC) related to agricultural practices, weather or climatological forcings were acting on the agricultural system. By PDC we refer to the main physical dynamic processes involved, from one harvest to another, which lead to adjustments in the mathematical modeling of vegetation development. Investigating the PDC became necessary in order to identify more accurately a multi-temporal threshold for bands 3–5 that remained representative of soybean vegetation through time.
The soybean areas were mapped and selected using the false-color composition of bands (RGB-453) described in [3]. The minimum reflectance values for each band are also associated with the border regions between full-pixel soybean coverage and mixed pixels.
Pixels below R3min are typically associated with cloud shadows or water bodies. At the test sites, soybean vegetation reflectance, even under different development conditions, stayed below 0.07 in the red band (band 3), above 0.42 in the NIR band (band 4) and above 0.18 in the Short-Wave Infrared (SWIR) band (band 5). These reflectance values are crucial input parameters for soybean characterization.
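As a rough illustration of how such bounds can be read from test-site samples, the sketch below computes, for each crop year and across all years, the largest band 3 reflectance and the smallest band 4 and band 5 reflectances. The data layout (one array of shape pixels × 3 per crop year), the function names and the sample values are assumptions made for this example, not the procedure used in the study.

```python
import numpy as np

def envelope(samples):
    """Bounds of one sample set with columns (band 3, band 4, band 5)."""
    samples = np.asarray(samples, dtype=float)
    return {
        "b3_max": float(samples[:, 0].max()),  # upper bound in the red band
        "b4_min": float(samples[:, 1].min()),  # lower bound in the NIR band
        "b5_min": float(samples[:, 2].min()),  # lower bound in the SWIR band
    }

def multi_year_envelope(samples_by_year):
    """Bounds that hold for every crop year in the dictionary."""
    per_year = [envelope(s) for s in samples_by_year.values()]
    return {
        "b3_max": max(e["b3_max"] for e in per_year),
        "b4_min": min(e["b4_min"] for e in per_year),
        "b5_min": min(e["b5_min"] for e in per_year),
    }

# Hypothetical reflectance samples, two crop years.
samples_by_year = {
    "2003/2004": [[0.04, 0.45, 0.20], [0.06, 0.43, 0.19]],
    "2006/2007": [[0.05, 0.50, 0.22]],
}
print(multi_year_envelope(samples_by_year))
```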
2.7. RCDA Development: How Does That Work?
For a given crop year, all available Landsat-5 TM images from the maximum development period were fed into the algorithm. Five computational steps were established in order to make the best use of the PDC that govern soybean spectral behavior in bands 3–5.
Pixels with reflectance values below the defined R3min were tagged as soybean according to condition A; pixels with reflectance values above R4min were tagged as soybean according to condition B; pixels with reflectance values above R5min were tagged as soybean according to condition C; pixels for which the sum of the reflectances in bands 4 and 5 is above the corresponding threshold were tagged as soybean according to condition D; and pixels with NDVI values above NDVImin were tagged as soybean according to condition E (Figure 2). In the final step, all conditions are multiplied, and a pixel that is representative of a soybean area must have the value one. In other words, following a Boolean rule, a pixel is automatically classified as soybean only if it satisfies conditions A, B, C, D and E simultaneously.
Pixels whose calibrated reflectance violates at least one of the rules A, B, C, D or E are not selected, because they would include areas that are not soybean crops. Additionally, the four reflectance conditions are modulated by NDVI values greater than 0.6 in order to avoid background and/or cloud contamination, which usually have high reflectance values. Also, saturation of the Normalized Difference Vegetation Index (NDVI) when the Leaf Area Index (LAI) is greater than 3 can mask water stress [31].
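The five conditions and their product can be sketched in a few lines of NumPy. This is a minimal illustration, not the ERDAS model itself: the function name is hypothetical, the default thresholds are the tuned values reported in Section 2.8 together with the NDVI limit of 0.6, and the use of strict inequalities is an assumption.

```python
import numpy as np

def rcda_mask(b3, b4, b5, r3min=0.07, r4min=0.39, r5min=0.15,
              r45min=0.58, ndvi_min=0.6):
    """Return 1 where all five conditions hold, 0 elsewhere.

    b3, b4, b5 are surface reflectance arrays (0 to 1) for TM bands 3, 4, 5.
    """
    b3, b4, b5 = (np.asarray(b, dtype=float) for b in (b3, b4, b5))
    total = b4 + b3
    # NDVI = (NIR - red) / (NIR + red); pixels with a zero sum get NDVI = 0.
    ndvi = np.divide(b4 - b3, total, out=np.zeros_like(total),
                     where=total != 0)
    cond_a = b3 < r3min          # low red reflectance
    cond_b = b4 > r4min          # high NIR reflectance
    cond_c = b5 > r5min          # SWIR above its lower limit
    cond_d = (b4 + b5) > r45min  # sum of bands 4 and 5 above its limit
    cond_e = ndvi > ndvi_min     # dense green vegetation
    # Multiplying the 0/1 conditions is equivalent to a logical AND.
    product = (cond_a.astype(np.uint8) * cond_b * cond_c * cond_d * cond_e)
    return product.astype(np.uint8)

# Three hypothetical pixels: typical soybean, bright red band, low band sum.
b3 = np.array([0.05, 0.10, 0.05])
b4 = np.array([0.50, 0.50, 0.40])
b5 = np.array([0.21, 0.21, 0.16])
print(rcda_mask(b3, b4, b5))
```

Exposing the thresholds as keyword arguments keeps the rule itself fixed while allowing the same values to be applied to every image and crop year.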
2.8. Tuning Procedure of the RCDA
After a first run of the RCDA, a tuning phase follows, based on a stepwise procedure using specific Landsat images from Table 1. This step is necessary to interactively fine-tune the values of R3min, R4min and R5min in order to minimize the omission and commission errors with respect to the reference map from [3].
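For reference, the omission and commission errors of a binary classification against a binary reference map can be computed as in the sketch below. It uses the usual definitions (omission: reference soybean pixels missed by the classification; commission: classified soybean pixels that are not soybean in the reference); the function name and the toy arrays are hypothetical.

```python
import numpy as np

def omission_commission(classified, reference):
    """Return (omission error, commission error) for the soybean class."""
    c = np.asarray(classified, dtype=bool)
    r = np.asarray(reference, dtype=bool)
    hits = np.sum(c & r)      # soybean in both maps
    missed = np.sum(~c & r)   # soybean only in the reference
    false = np.sum(c & ~r)    # soybean only in the classification
    omission = missed / (hits + missed) if (hits + missed) else float("nan")
    commission = false / (hits + false) if (hits + false) else float("nan")
    return float(omission), float(commission)

# Toy example with four pixels.
classified = np.array([1, 1, 0, 0])
reference = np.array([1, 0, 1, 0])
print(omission_commission(classified, reference))
```

Lowering a threshold such as R4min generally trades omission for commission, which is why both errors are tracked together during tuning.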
A comparison of the Landsat images and the RCDA mapping was then performed. By overlaying the Landsat images and the classification maps, we observed that the soybean area was overestimated in the first run of the RCDA. Therefore, a second run was performed with adjusted values of R3min, R4min and R5min. For each combination, a new soybean classification was generated and visually compared with the corresponding available Landsat-5 TM images.
After several iterations over several crop years, comparing each new RCDA classification with the Landsat-5 TM images, the best-performing combination of final calibrated RCDA values was defined as R3min = 0.07, R4min = 0.39, R5min = 0.15 and R4min + R5min = 0.58, as shown in Figure 2.
We should emphasize that the RCDA tuning procedure is only complete when R3min, R4min and R5min, chosen to be representative of the predominant PDC of soybean, can be used as the same input for all analyzed crop years. Therefore, once the parameters R3min, R4min and R5min were identified, no post-adjustment was allowed, partly to constrain the dynamical adjustment of the algorithm. During this phase, whenever a further adjustment was needed to obtain a better fit of the crop area map for one or more crop years, the new parameter value was run for all tested crop years.
Since the RCDA map is a binary image where 1 indicates soybean and 0 indicates non-soybean, the next step is to combine the soybean area maps obtained from the individual Landsat-5 TM images. Ideally, three consecutive Landsat-5 TM images would be available for the maximum vegetation development period. According to [3], at least two good-quality images with low cloud contamination are mandatory inside the critical period. If cloud cover is severe over the area of interest, a delay of 16 days is expected before the next Landsat-5 TM image is acquired. Therefore, the soybean estimate can be released no later than early March if two or more Landsat-5 TM images are available. Even if a third image is necessary, a soybean map can still be released during March. However, if no useful images are found because of at least one of the following situations, no other computational rules are applicable and no crop area estimate is generated for the crop year: cloud contamination (situation Cloud), image quality or noise (situation Quality), or no satellite overpass (situation No overpassing). In the case of a crop area forecast, an RCDA map can be provided right after a second Landsat-5 TM image becomes available inside the time window, which normally occurs in mid to late February.
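The decision logic of this step can be sketched as below. The text does not state how the individual binary maps are merged, so the logical OR (union) used here is an assumption made only for illustration; the function name is hypothetical, while the flag strings follow the situations named above and the OK flag of Table 1.

```python
import numpy as np

def combine_rcda_maps(maps, flags, min_images=2):
    """Merge per-image binary RCDA maps for one crop year.

    maps  : list of 0/1 arrays, one per Landsat-5 TM image
    flags : list of strings, "OK", "Cloud", "Quality" or "No overpassing"
    Returns None when fewer than `min_images` usable images exist, meaning
    that no crop area estimate is produced for that crop year.
    """
    usable = [np.asarray(m, dtype=bool)
              for m, flag in zip(maps, flags) if flag == "OK"]
    if len(usable) < min_images:
        return None
    # ASSUMPTION: a pixel is kept if any usable image tags it as soybean.
    return np.logical_or.reduce(usable).astype(np.uint8)

# Three hypothetical one-row maps for the same crop year.
maps = [np.array([1, 0, 0]), np.array([0, 1, 0]), np.array([1, 1, 1])]
print(combine_rcda_maps(maps, ["OK", "OK", "Cloud"]))
print(combine_rcda_maps(maps, ["OK", "Cloud", "Quality"]))
```

Replacing `np.logical_or` with `np.logical_and` would give the stricter intersection rule, in which a pixel must be tagged as soybean in every usable image.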
Table 1 presents all available Landsat-5 TM images. All crop years for which the available Landsat-5 TM images are flagged as OK in Table 1 were used in the validation process. It is important to emphasize that the parameters defined in RCDA for detecting crop areas are constant, serving as a fixed criterion throughout the period studied (eight crop years between 1996/1997 and 2009/2010).