1. Introduction
Sugarcane is a key industrial crop for Brazil and a number of other countries (e.g., India, China, and Thailand) [1]. Monitoring the sugarcane harvest is important for scheduling and optimizing logistics as well as for forecasting crop productivity. Productivity forecasts are in turn used to predict the output of the sugar industry, biofuel (ethanol) production, etc. [2,3,4,5,6]. In Brazil, 60% of the sugarcane fields are located in the state of São Paulo [7]. Recently, Brazil has been moving rapidly from manual harvest, which is low-productive, costly, and environmentally unfriendly because the sugarcane leaves are burnt beforehand, to mechanized harvest without burning (green harvest) [5,8]. Mechanization considerably accelerates field operations and makes it possible to harvest throughout the year, including the rainy season. As the world population grows and global economic development accelerates, the demand for sugar industry products rises as well. Meeting it requires innovative technologies for sugarcane growing and harvest and, consequently, global harvest monitoring to evaluate the production output and to track changes in the fields.
Ground-based harvest monitoring is possible by installing navigation equipment on harvesters, tractors, and semi-trucks and receiving their GPS data [9]. The equipment transmits the machinery location (latitude and longitude) in real time to a remote receiving station. An operator converts the geographical coordinates into a spatial data file, visualizes it in a GIS viewer, and evaluates the harvested parcel area. Methods based on processing digital mosaics of optical and SAR (Synthetic Aperture Radar) images of sub-meter spatial resolution, obtained from an Unmanned Aerial Vehicle (UAV), make it possible to construct a detailed map of plants within a study area of several hectares [10,11]. Such approaches are efficient at the level of a field or a farming enterprise; however, they become very costly when monitoring at the regional, state, or national level. The solution is to use global satellite monitoring data with a spatial resolution compatible with the size of sugarcane fields [12].
To estimate sugarcane vegetation conditions and the content of nitrogen, moisture, and salts in the plant stalks, recent studies often use spectral indices such as NDVI (Normalized Difference Vegetation Index) [13], RVI (Ratio Vegetation Index) [14], EVI (Enhanced Vegetation Index) [15], and NDWI (Normalized Difference Water Index) [16]. Considerable attention is also paid to optical remote sensing of crop residue and tillage, which is closely related to the subject of this research [17]. Daughtry et al. [18] highlight considerable differences in the spectral reflectance of green vegetation, bare soil, and crop residue. In particular, they show that the residue has much higher spectral reflectance in the Short-Wave Infrared (SWIR) range than green vegetation or soil. Tuszynska et al. [19] stress that harvest is characterized by a sharp increase of spectral reflectance in the SWIR 1 and SWIR 2 bands of the Sentinel-2 satellite. With mechanized harvest, sugarcane is not burnt; thus, up to the harvest, the field is covered with thick green vegetation. When sugarcane is harvested, roots and parts of stalks are left in the soil, and new plants later grow from them (ratoon). The residual stalks differ considerably in their spectral characteristics from the plants before harvest, which allows spectral indices to be used to record the harvest date. For instance, cane growth and harvest correlate with NDVI, which increases smoothly during plant growth and decreases sharply after harvest. The difficulty is that, in non-irrigated fields, the cane leaves may dry out and then differ only slightly from harvested stalks in optical images. Moreover, with the development of mechanization, harvest may be carried out during rainy periods, when cloudiness makes optical images of the fields unavailable for weeks or even months. In those cases, SAR images are the only source of information.
Recent papers give examples of the successful use of SAR data to detect changes in the sugarcane state, including those related to harvest. C-band (Sentinel-1 and Radarsat-2), L-band (ALOS-2), and X-band (TerraSAR-X) SAR data have been used successfully to map the crop state and evaluate sugarcane productivity [7,10,12]. Harvest dates can be determined from SAR images because the images are sensitive to the changes in vegetation height and field surface texture caused by harvest. Cane may be harvested throughout the year. In addition, one cadastral field may contain several parcels planted and harvested at different times. Consequently, the main requirements for the selection of SAR data are sufficiently high spatial and temporal resolution, compatible with the dimensions of the field fragments and the time interval of harvest [12]. Open-access SAR images from the Sentinel-1 A/B satellite constellation make it possible to monitor harvest every 6–12 days with a spatial resolution of up to 10 m. The interaction of the radar signal with the surface is characterized by the backscattering coefficient (σ⁰) and by coherence, the modulus of the complex correlation coefficient between two Single Look Complex (SLC) images containing the amplitude and phase of the radar signal [20]. Harvest is identified by the accompanying abrupt decrease of σ⁰ [3]: as the cane stalks grow, σ⁰ values increase, and when the stalks are cut, σ⁰ falls sharply. Changes in σ⁰ related to the cutting of the above-ground part of a plant are the most characteristic feature of crop harvest in SAR data [21,22,23]. During that period, coherence is low since the SAR images before and after harvest correlate weakly with each other [24,25]. However, since SAR images are acquired at intervals of several days (6–12 days for Sentinel-1) and mechanized harvest may be carried out during rainy periods, there is a high probability of false detection of harvest dates owing to the high sensitivity of σ⁰ to precipitation at small and medium incidence angles, especially for seedlings (precipitation provokes σ⁰ bursts) [12]. Thus, such an approach requires additional precipitation data for the monitored parcels, which complicates its use. The precipitation effect can be eliminated, for instance, by averaging σ⁰ data over a 10 km × 10 km grid: if a σ⁰ burst is observed in all grid cells, it is caused by precipitation; otherwise, it is caused by crop growth [26]. Precipitation can also be detected from the data of local meteorological services and from Global Precipitation Measurement (GPM) data [27]. Baghdadi et al. [12] emphasize that the possibility of determining the harvest date from TerraSAR-X data is also influenced by the incidence angle. They claim that the best results of harvest mapping may be obtained with an incidence angle of
.
The Sentinel-1 SLC C-band SAR data have been used successfully to determine crop harvest dates [24]. However, the natural conditions and technologies of sugarcane growing differ considerably from those described in [24]. Thus, the applicability of the algorithm from [24] to sugarcane needs additional study.
Numerous papers apply segmentation, Machine Learning, and Data Mining methods to estimate the sugarcane state using classification maps [28,29,30,31]. The simplest approaches classify the sugarcane state and growth stages (including harvested areas) using statistical methods (such as maximum likelihood estimation) or Object-Based Image Analysis (OBIA) [32]. The σ⁰ values in different polarizations as well as polarimetric coefficients can be used as inputs in SAR classification tasks [33]. To single out sugarcane plantations at different growth stages, one can use Classification and Regression Trees (CART), support vector machine classifiers (Polynomial-SVM and RBF-SVM), a random forest classifier (RF), an artificial neural network classifier (ANN), and a decision tree classifier (CART-DT) [29,34]. The Random Forest and XGBoost methods can identify sugarcane plantations with high precision, including at early growth stages (which indirectly indicates a preceding harvest) [33]. Early identification of agricultural plantings (including early sugarcane emergence) is possible from Sentinel-1 time series using neural network classifiers such as convolutional neural networks (CNNs), long short-term memory recurrent neural networks (LSTM RNNs), and gated recurrent unit RNNs (GRU RNNs) [35].
The disadvantage of those methods is that they require a ground-based training sample, which is not always available for regional and global monitoring. Such methods are sensitive to the composition of reference objects and may need retraining when transferred to other parcels. They are often complicated to implement and require considerable computational resources.
The objective of this research is to develop algorithms for determining harvest dates that are suitable for monitoring large sugarcane areas. The algorithms should provide correct results without a ground-based training sample and be computationally simple, thereby overcoming the limitations of existing methods. To that end, open-access data from Sentinel-1 and Sentinel-2 were used. The main practical applications of the results are the control of harvesting operations and the identification of harvested fields to estimate the harvested crop.
This paper is organized as follows. Section 2 analyzes the climatic conditions and technologies of sugarcane growing in the state of São Paulo, Brazil, and describes the methods for constructing time series of optical (Sentinel-2 NDVI) and SAR (Sentinel-1 coherence) data. Section 3 formulates the algorithm for determining harvest dates from NDVI time series and the algorithm based on both SAR and optical data. The latter develops the algorithm proposed for grain crops in [24]. Section 4 presents the accuracy analysis of the determined harvest dates and harvested areas. Section 5 discusses examples of errors in harvest date determination and analyzes their causes and possible ways to eliminate them. In particular, the effect of clouds on the accuracy of harvest date determination from NDVI is investigated. Finally, possible further development of the proposed algorithms is outlined. Section 6 contains the conclusions of the research.
3. Methodology
Sugarcane harvest monitoring relies on two independent algorithms for harvest date determination, each based on a decision tree built from time series of optical and SAR data, respectively. Time series of the average Sentinel-2 NDVI values and of the average Sentinel-1 SAR coherence in VV polarization were calculated at the level of a grid cell for the test sugarcane fields. The NDVI data are also used to check and refine the results of the SAR algorithm.
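As a rough illustration of how a cell-average NDVI value can be obtained, the sketch below computes NDVI from near-infrared and red reflectances and averages it over the valid pixels of a cell. It is a minimal local sketch, not the workflow used in this study; the function names, the pixel list, and the handling of empty cells are assumptions made for illustration only.

```python
from statistics import mean


def ndvi(nir: float, red: float) -> float:
    """NDVI = (NIR - Red) / (NIR + Red); returns 0.0 if both bands are zero."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def cell_mean_ndvi(pixels):
    """Average NDVI over (nir, red) reflectance pairs of cloud-free pixels.

    Returns None when the cell has no valid pixels in the image
    (an assumed convention, e.g. for a fully clouded acquisition).
    """
    values = [ndvi(nir, red) for nir, red in pixels]
    return mean(values) if values else None


# Hypothetical reflectances for three pixels of one cell.
pixels = [(0.45, 0.05), (0.40, 0.10), (0.30, 0.10)]
print(cell_mean_ndvi(pixels))
```

Repeating this computation for every acquisition date yields the NDVI time series that the optical algorithm works on.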
3.1. Optical Data
Each parcel has a time series composed of N images; for the ith image, the series holds the parcel-average NDVI and the acquisition date.
The main stage of the sugarcane harvest is stalk cutting in separate strips according to the terrain features. Depending on the growth process and crop rotation plan, the cut stalks are either left in place to regrow (ratoon) or removed completely, after which the parcel is plowed. In both cases, harvest results in a sharp NDVI decrease (Figure 4).
As a rule, short-term periods of NDVI decrease followed by a return to the previous values are caused by clouds or cloud shadow. A modified median filter is used to smooth them.
A standard median filter [43] also smooths out time series fragments with high NDVI located among dates with dense clouds (note in Figure 5 the absence of the local NDVI peak in mid-November after median filtering). That may result in the loss of harvest dates. Hence, the filter was modified to use the median values only where they exceed the NDVI values of the initial time series.
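The modification described above amounts to taking, at each date, the larger of the original NDVI value and the running median. A minimal sketch follows; the window size of three images and the truncated windows at the series ends are assumptions, since they are not stated here.

```python
from statistics import median


def modified_median_filter(ndvi, window=3):
    """Replace a value by the running median only where the median is higher.

    Cloud-induced dips are lifted to the median level, while isolated
    high-NDVI dates between cloudy dates are preserved.
    """
    half = window // 2
    filtered = []
    for i, value in enumerate(ndvi):
        lo = max(0, i - half)               # window is truncated at the ends
        hi = min(len(ndvi), i + half + 1)
        filtered.append(max(value, median(ndvi[lo:hi])))
    return filtered


# Hypothetical series: a cloud dip (0.30) and a clear date (0.82)
# located between low-NDVI dates.
series = [0.80, 0.30, 0.82, 0.20, 0.25]
print(modified_median_filter(series))
```

In this example the dip at the second date is raised to 0.80, whereas the 0.82 value, which a standard three-point median would replace by 0.30, is kept.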
The following conditions should be met for a harvest date:
Condition (2) means that NDVI decreases sharply on the harvest date, by no less than a threshold value compared with the previous date. According to condition (3), the before-harvest NDVI should be rather high; otherwise, the NDVI decrease may be caused not by harvest but by field plowing. The NDVI threshold for the harvest date itself takes into account different harvesting technologies. The threshold values correspond to parcels with moderate and sparse vegetation [44]. Condition (4) reflects the fact that harvest causes a long-term NDVI decrease: if NDVI does not rise back to a reference level within a given period, then the harvest was carried out on that date. A coefficient in this condition compensates for insignificant NDVI fluctuations as well as for the effect of clouds during the period of decreased NDVI.
Thus, the algorithm for determining harvest dates from the optical data consists of:
preliminary filtering (1) of the NDVI time series;
checking whether conditions (2)–(4) are met.
The main algorithm parameters are the minimum NDVI drop in condition (2), the two NDVI thresholds in condition (3), and the time period and coefficient in condition (4).
To check whether condition (4) is met for harvest dates falling at the end of a calendar year, the NDVI time series should be extended by that time period beyond the harvest monitoring interval.
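The three checks can be sketched as a single pass over a filtered NDVI series. The sketch below is one possible reading of conditions (2)–(4), not the exact formulation: the parameter names and all threshold values are hypothetical, and the reference level in condition (4) is assumed to be a fraction k of the before-harvest NDVI.

```python
from datetime import date, timedelta


def detect_harvest_dates(dates, ndvi, min_drop=0.2, ndvi_before=0.5,
                         ndvi_after=0.4, recovery_days=30, k=0.8):
    """Return dates where NDVI drops sharply and stays low (assumed rules)."""
    harvests = []
    for i in range(1, len(ndvi)):
        prev, cur = ndvi[i - 1], ndvi[i]
        # Condition (2): sharp decrease relative to the previous date.
        sharp_drop = prev - cur >= min_drop
        # Condition (3): high NDVI before harvest, low NDVI on the harvest date.
        levels_ok = prev >= ndvi_before and cur <= ndvi_after
        # Condition (4): NDVI does not recover within the given period.
        horizon = dates[i] + timedelta(days=recovery_days)
        stays_low = all(v < k * prev
                        for d, v in zip(dates[i + 1:], ndvi[i + 1:])
                        if d <= horizon)
        if sharp_drop and levels_ok and stays_low:
            harvests.append(dates[i])
    return harvests


# Hypothetical 5-day series with a harvest on the third acquisition.
dates = [date(2018, 4, 1) + timedelta(days=5 * n) for n in range(9)]
ndvi = [0.75, 0.78, 0.30, 0.28, 0.32, 0.35, 0.40, 0.45, 0.50]
print(detect_harvest_dates(dates, ndvi))
```

A dip that recovers at the next acquisition, such as 0.75, 0.30, 0.76, fails the last check and is not reported, which is the role condition (4) plays against residual cloud effects.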
3.2. SAR + Optical Data
Satellite-derived coherence observations for sugarcane show dynamic patterns similar to those for wheat and other crops [24,45]:
Type 1. Coherence increases slowly while the crop gradually dries out (Figure 6a, 12 April–29 July 2018) and then drops due to harvest (Figure 6a, 29 July–22 August 2018). The subsequent growth of coherence (Figure 6a, 22 August–3 September 2018) shows that the field is harvested.
Type 2. Coherence is low and practically unchanged (Figure 6b, 6 January–11 June 2018), which is characteristic of dense, lush vegetation cover, and stays low through harvest. High coherence values (Figure 6b, 11 June–29 July 2018) show that the field is harvested.
It should be noted that optical sensors can detect harvesting operations at the moment of the survey itself, whereas the SAR data identify the completion of harvest through the high coherence typical of bare soil, sparse vegetation, and harvest residues. Nevertheless, both algorithms determine harvest dates.
The algorithm for determining the sugarcane harvest date from SAR data is based on the algorithm by Kavats et al. [24], which searches for type 1 and type 2 patterns in the coherence time series and assesses the vegetation cover density after a possible harvest from the backscattering coefficient σ⁰. In contrast to [24], this study uses NDVI to estimate the vegetation state. Using the Statistics API to generate NDVI time series saved the disk space and time previously required to store and process σ⁰ data. Moreover, other improvements reflecting the features of sugarcane cultivation and harvest are introduced into the algorithm from [24].
Consider the proposed algorithm in more detail. Take a set of time-ordered SAR SLC images with known acquisition dates. Calculate the coherence between the ith image and the following one as shown in Section 2.4, and generate a time series of the field-average coherence values.
To search for the time series patterns corresponding to harvest, calculate the differences between successive coherence values and determine the directions of their changes. The threshold is chosen so that a coherence change smaller than the threshold in absolute value is considered insignificant.
Harvest either results in a coherence drop (a negative direction of change) or leaves coherence unchanged compared with the previous period (a zero direction). After harvest, coherence grows (a positive direction). Thus, harvest corresponds to a transition of the direction of change from negative to positive or from zero to positive. These transitions are expressed through the second differences, which give the conditions of coherence change corresponding to harvest.
Condition (5) corresponds to the type 1 coherence time series, and condition (6) corresponds to type 2. The additional conditional check excludes transitions from a negative direction to zero, which may be caused by the beginning of harvest rather than by its completion.
When the harvest is over, one can usually observe considerable growth of coherence. Thus, an additional condition (7) on the magnitude of that growth is checked together with (5) and (6).
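The pattern search can be sketched by classifying each coherence difference as a drop, no change, or growth, and then looking for drop-to-growth and no-change-to-growth transitions that are followed by a sufficiently large increase. The sketch checks the transitions directly instead of through second differences; the threshold names and values (eps, min_growth) and the index that is returned are assumptions for illustration.

```python
def change_directions(coherence, eps=0.05):
    """Return -1, 0 or +1 for each difference between successive values."""
    directions = []
    for prev, cur in zip(coherence, coherence[1:]):
        diff = cur - prev
        if diff > eps:
            directions.append(1)       # significant growth
        elif diff < -eps:
            directions.append(-1)      # significant drop
        else:
            directions.append(0)       # change within the threshold
    return directions


def harvest_candidates(coherence, eps=0.05, min_growth=0.2):
    """Indices of coherence values reached by a harvest-like growth.

    A candidate needs a transition (-1 -> +1) or (0 -> +1) and a growth
    of at least min_growth; transitions (-1 -> 0) are ignored.
    """
    d = change_directions(coherence, eps)
    found = []
    for i in range(1, len(d)):
        pattern = d[i] == 1 and d[i - 1] in (-1, 0)
        growth = coherence[i + 1] - coherence[i]
        if pattern and growth >= min_growth:
            found.append(i + 1)
    return found


type1 = [0.20, 0.28, 0.36, 0.15, 0.60, 0.62]   # slow rise, drop, jump
type2 = [0.15, 0.16, 0.14, 0.15, 0.55, 0.58]   # flat and low, then jump
print(harvest_candidates(type1), harvest_candidates(type2))
```

Each coherence value belongs to a pair of SAR images, so converting a returned index into a calendar date requires the index correspondence given in Table 1.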
Table 1 shows the connection between the time series indices. Thus, the index i determined from conditions (5)–(7) corresponds to a harvest date.
Check the identified potential harvest dates using the NDVI time series. The check is based on the following assumptions:
harvest is carried out on a descending NDVI trend;
on harvest completion, NDVI should be low.
The NDVI trend construction consists of three stages (Figure 7):
Harvest condition on the descending NDVI trend:
where the two parameters are the numbers of optical images before and after the potential harvest date.
After the harvest, NDVI should be low.
That condition is similar to the second condition from (3) for the algorithm based on the optical data.
Gradual coherence growth before harvest (Figure 6a) may be caused not only by the plants drying out but also by partial harvest of the field. The harvested part is bare and has high coherence, while the rest of the field has low coherence. As the harvested area increases, the average field coherence grows gradually, so that no fast coherence growth is observed upon harvest completion. As a result, the harvest date is not determined from conditions (5)–(7). Nevertheless, high coherence is observed within the parcel upon harvest completion.
Single out the members of the time series with high coherence (10).
High coherence values correspond to sparse vegetation or bare soil. Consequently, the next harvest can be carried out only a long time after the last date with high coherence. For each potential harvest date, introduce the check (11) that the time elapsed since that last high-coherence date is not less than the minimum time period required for plant growth.
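The growth-period check can be sketched as a comparison of dates. The function name and the 180-day minimum growth period below are hypothetical; the actual parameter value is not given here.

```python
from datetime import date


def passes_growth_check(candidate, high_coherence_dates, min_growth_days=180):
    """True if enough time has passed since the last high-coherence date.

    Only high-coherence dates strictly before the candidate are considered;
    with no such date the candidate is accepted (assumed behaviour).
    """
    earlier = [d for d in high_coherence_dates if d < candidate]
    if not earlier:
        return True
    return (candidate - max(earlier)).days >= min_growth_days


high_dates = [date(2018, 3, 1), date(2018, 3, 13)]
print(passes_growth_check(date(2018, 10, 1), high_dates))  # 202 days later
print(passes_growth_check(date(2018, 6, 1), high_dates))   # 80 days later
```

A candidate that falls too soon after bare-soil conditions is thus attributed to other field operations rather than to harvest.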
Finally, if no harvest dates are determined from the coherence time series patterns (5)–(7) and conditions (8)–(11), check whether the dates of high coherence (10) meet conditions (8), (9), and (11).
Thus, the algorithm of harvest monitoring based on the SAR and optical data consists of the following steps:
1. Search for rapid changes in coherence corresponding to types 1 and 2 (5)–(7).
2. Check the dates thus determined against NDVI (8), (9).
3. Check whether the potential harvest dates left after step 2 follow the high coherence dates at a sufficient interval (11).
4. If no harvest dates are found in steps 1–3, perform steps 2–3 for the high coherence dates (10).
The algorithm parameters are: , , , , .
It should be noted that using the NDVI trend does not mean determining harvest dates from optical data. Compared with the data used in the optical algorithm, the NDVI trend may be derived from data of much lower spatial resolution and/or revisit frequency: MODIS, Landsat, etc. (Figure 8).
3.3. Quality Metrics
Metrics are needed to evaluate (1) the quality of harvest date determination and (2) the sugarcane area harvested per month.
Each complete or partial field harvest must be determined. Since the reference harvest dates are determined by visual interpretation of the same optical data that are used in the algorithm, it is incorrect to assess the error of harvest date determination with traditional error metrics such as the mean absolute error or root mean square error. Instead, Table 2 is used to assess the quality of harvest date determination. The table columns correspond to the harvest dates in the reference data; the rows correspond to the harvest dates determined by the algorithm.
True Match is the number of coinciding harvest dates. The algorithm error is the total of false responses (False Match) and omissions (False Not-match) of harvest dates. Two further metrics are introduced: the true match rate and the match predictive value. The true match rate is the proportion of the reference harvest dates that are matched by the algorithm. The match predictive value is the proportion of matching harvest dates among all dates found by the algorithm.
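Following the verbal definitions above, the two rates can be computed from the counts in Table 2 as shown in this small sketch. The function name, the zero-denominator convention, and the counts in the example are hypothetical.

```python
def match_metrics(true_match, false_match, false_not_match):
    """Return (true match rate, match predictive value).

    true match rate        = TM / (TM + False Not-match)
    match predictive value = TM / (TM + False Match)
    """
    reference_total = true_match + false_not_match   # all reference dates
    detected_total = true_match + false_match        # all dates found
    tmr = true_match / reference_total if reference_total else 0.0
    mpv = true_match / detected_total if detected_total else 0.0
    return tmr, mpv


# Hypothetical counts: 60 matches, 20 false responses, 40 omissions.
print(match_metrics(60, 20, 40))
```

The two rates play the same roles as recall and precision in classification tasks.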
Estimating the sugarcane area harvested per month is useful for practical applications. The harvested area is determined from the harvest end dates of the cells. The harvest end date is the last date in a sequence of harvest dates for a cell that are separated by time intervals of less than one month.
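The grouping of harvest dates into sequences and the monthly aggregation can be sketched as follows. The function names, the approximation of one month by 30 days, and the cell areas are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def harvest_end_dates(harvest_dates, max_gap_days=30):
    """Last date of each run of harvest dates spaced by less than max_gap_days."""
    ordered = sorted(harvest_dates)
    ends = []
    for cur, nxt in zip(ordered, ordered[1:] + [None]):
        # A date closes a run if it is the last one or the next is far away.
        if nxt is None or (nxt - cur).days >= max_gap_days:
            ends.append(cur)
    return ends


def harvested_area_by_month(cells):
    """Sum cell areas per (year, month) of each harvest end date.

    cells is an iterable of (area_ha, harvest_dates) pairs.
    """
    totals = defaultdict(float)
    for area_ha, dates in cells:
        for end in harvest_end_dates(dates):
            totals[(end.year, end.month)] += area_ha
    return dict(totals)


cells = [
    (2.5, [date(2018, 4, 3), date(2018, 4, 13), date(2018, 4, 18),
           date(2018, 10, 2)]),
    (1.0, [date(2018, 4, 8)]),
]
print(harvested_area_by_month(cells))
```

In the example, the three April dates of the first cell form one sequence that ends on 18 April, so the cell contributes its area once to April and once to October.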
5. Discussion
The causes of errors in determining harvest dates are considered below, and possible improvements to the algorithms are outlined.
5.1. Analysis of the Time Series
Consider the causes of the errors using the time series of several cells as examples.
Prolonged cloudiness impedes the construction of a regular NDVI time series during the harvest period. For the cell shown in Figure 12 and Figure 13 (cid is the cell id), the April harvest date (18 April 2018) is determined correctly by both algorithms. The December harvest date (4 December 2018) is preceded by dense clouds (Figure 13a,b), resulting in a false response of the optical algorithm in November. The real harvest date is omitted because the NDVI decrease required by the algorithm is absent. According to the SAR data, the harvest took place between early November and early December. The harvest date cannot be determined more accurately as there is no SAR image in December. However, the behavior of the SAR time series suggests that the date could be identified if such an image were available.
Cloudiness results in an NDVI drop in late March (29 March 2018), while the harvest actually takes place in early April (Figure 14 and Figure 15). The date of harvest completion in April (13 April 2018), determined from the optical data, may be corrected using the SAR data since 12 April 2018 already shows high coherence. The false response of the SAR algorithm in August is apparently connected with some agrotechnical operations within the field. The November harvest date is omitted by the SAR algorithm because no image is available. The optical algorithm responds early because of dense clouds.
As before, the early response of the optical algorithm is caused by clouds (Figure 16 and Figure 17). The SAR algorithm determines the harvest date as accurately as it can: the interval between the determined harvest date (23 June 2018) and the real harvest date (12 June 2018) does not exceed the interval between surveys. During late June–early August, certain agrotechnical works are carried out within the field, resulting in a coherence decrease.
In the case of minor cloudiness, the harvest dates determined from the optical data lag the actual dates by no more than 5 days, which is the characteristic interval between Sentinel-2 surveys of that area. A long cloudy period may result in two errors: a premature false response and the omission of the real harvest date. Clouds cause a rapid NDVI drop; during the subsequent harvest, NDVI does not decrease as it already has low values. Note that cloud cover of less than 40% (Figure 10a) provides a rather accurate evaluation of the harvested areas.
In this case, the harvest dates cannot be determined from the SAR data during the period of maximum cloudiness because no SAR image is available in November. Nevertheless, the SAR data make it possible to refine the harvest dates obtained by the optical algorithm. Thus, a harvest recorded by the optical algorithm on one of the early dates of a high-coherence period most probably took place several days earlier, within the period between optical images (Figure 14, Coherence VV). That allows moving the harvest date earlier by several days, but by no more than five days.
One of the reasons for erroneous algorithm responses is non-harvest field operations (plantation burning, plowing, etc.), which are detected as harvest.
Early September is characterized by a period of dense clouds resulting in a false response of the optical algorithm (Figure 18). The SAR-based algorithm correctly determines the harvest in October but falsely detects one in August. Apparently, the coherence decrease in August is caused by field works. Note the slow coherence growth during March–July, which corresponds to the slow NDVI decrease.
The next group of problems is connected with errors in field boundaries and the inclusion of nonagricultural lands in the field area. Such inclusions change the behavioral pattern of the NDVI and coherence time series, making it impossible to record the harvest.
For the spatially heterogeneous cell shown in Figure 19 and Figure 20, the harvest date determined from the optical data is 12 June 2018. According to the SAR data, the harvest took place in the period up to 11 June 2018, when the coherence growth begins. However, that date is omitted by the SAR algorithm because the coherence increase is insufficient. The latter is caused by the fact that the parcel includes nonagricultural lands (Figure 20).
Nevertheless, the observed coherence jump violates the behavioral pattern of the time series and results in the omission of the harvest date. The false response of the algorithm occurs in July. It appears that the further coherence drop in late July–early August is connected with post-harvest operations.
For the cell shown in Figure 21 and Figure 22, a part of the sugarcane parcel is covered with nonagricultural vegetation and has low coherence throughout the season. The first harvest date, in March, was omitted since it did not cause a coherence jump. High coherence appears when most of the parcel is harvested, which happens in June–July. The response of the optical algorithm in September is not confirmed: coherence remains low within that period.
Spatial heterogeneity can result from a digitization error or from inaccuracy in combining several cadastral parcels into a single agricultural field. Incorrect delineation of the field boundaries (Figure 23) has led to the field including a road and a part of a neighboring field with different harvest dates.
A sugarcane field is homogeneous if its plants are planted at roughly the same time. As a rule, the harvest of such a field takes no more than a month, and the next harvest comes only after a long period. Any NDVI jumps or SAR algorithm responses during that period are due to interference in the observations or to non-harvest agrotechnical operations within the field. Field homogeneity is thus a good support for identifying harvest dates.
Plants in a nonhomogeneous field, planted at different times, are at different development stages; they therefore differ considerably in NDVI value and in backscattering of the SAR signal. On such a field, the NDVI drop and the jump in coherence may be weakly expressed as the harvest covers only a part of the parcel.
Differences in planting dates result in different harvest dates for parts of the field. For the cell shown in Figure 24 and Figure 25, the optical algorithm demonstrates a premature false response caused by cloudiness. The SAR algorithm omits the harvest in April: since only a part of the parcel is harvested, the coherence growth does not reach its threshold. Instead, there are false responses in June and August. The October harvest date is identified correctly by the SAR algorithm.
Thus, the proposed algorithms function correctly for homogeneous fields that do not include nonagricultural lands. It is desirable to divide nonhomogeneous fields into homogeneous parts. This is especially important for the proper functioning of the algorithm based on coherence data.
5.2. Possible Development of the Algorithms
The division into small parcels has made it possible to reduce the nonhomogeneity of the fields and to record the harvest-related NDVI drop better. That division was performed from geometrical considerations without taking the vegetation state into account. Preliminary field segmentation based on optical images would make it possible to single out homogeneous parcels. In particular, according to Murugan and Singh [46], using k-means clustering to single out homogeneous fields, within which the harvest was monitored based on Sentinel-1 SAR and AVHRR NDVI time series, provided an overall accuracy of 82.17%. It should be emphasized that such segmentation would increase the volume of processed data.
By assuming a certain duration of the sugarcane vegetation period, the simplest model of its growth was used implicitly. The use of more advanced mathematical models would be a natural development of that approach. Such a model could forecast a trend, which is required for operational monitoring. A Gaussian function may be applied to model the NDVI change over time [47].
No data on the dates of planting or of the preceding harvest were available for the parcels under analysis. If information is available about the dates of growth onset, local climatic and meteorological parameters, and the state of the soils, it is possible to apply biophysical modeling of plant growth and development as well as of harvest at different vegetation stages. For instance, Marin and Jones [48] construct a dynamic simulation model of sugarcane growth in Brazil based on the analysis of biophysical processes in terms of 27 variables for Photosynthesis, Phenology, Leaf Development, Biomass Partitioning, Sucrose Partitioning, Plant Extension, and Root and Water Stress. An extreme learning machine was applied by Ghazvinei et al. [49] to predict sugarcane growth from ground measurements such as the daily irrigation amount, soil electrical conductivity, daily maximum temperature, evaporation, sunshine hours, rainfall, humidity percentage, and mean wind speed at 2 m above the ground. Such observations would help narrow down the time intervals for harvest date identification and refine the results of satellite monitoring by excluding knowingly false dates.
Time delays of the harvest dates detected by the SAR algorithm are mostly related to post-harvest operations. A calendar of agrotechnical operations that change the soil surface or vegetation cover state, together with data on the plant growth stage, would make it possible to single out harvest among the operations that change the nature of the radar signal reflection.
A much shorter revisit time for the study area is an important advantage of the optical data over the SAR data: the interval between surveys is 5 days for Sentinel-2 compared with 12 days for Sentinel-1. A higher revisit frequency may considerably improve the algorithm based on SAR data.
The use of textural features in harvest monitoring is one more direction of algorithm development. It will require machine learning methods capable of efficiently identifying and using such features. Significant advantages can be achieved with high-resolution data (about 1 m), which make it possible to track changes in field texture associated with the operation of agricultural machinery.
Classification methods and artificial neural networks used to detect harvested fields during the harvest monitoring within certain scenes are the alternative for the threshold methods and solution tree. Such algorithms require a large reference sample, which completeness and representativeness affect the accuracy of the results. However, once adjusted decision rule of a classifier will help reduce the following computational costs owing to the processing of individual scenes and no need to construct a long time series. For sugarcane fields in the state of São Paulo, a method of object-oriented classification is applied by Goltz et al. [
50] to discriminate different sugarcane harvest types (burnt, non-burnt, or no harvest) for different soil types using Landsat-TM data. The classifier inputs are the following image characteristics: Haralick texture features, features of the field shapes, and reflection coefficients in spectral bands. The classification accuracy, estimated in terms of the kappa index, ranged from 0.69 to 0.84. Classification methods including Support Vector Machine, Naive Bayes, and Artificial Neural Networks were applied by Rahmad et al. [
51] to assess the sugarcane suitability for harvest with an accuracy of up to 88%. Combining machine learning techniques with the methods proposed in this research may improve the overall result and increase the accuracy of detecting harvested fields. However, that assumption requires additional study.
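For readers comparing classifier results such as the kappa indexes quoted above, the following minimal sketch shows how Cohen's kappa can be computed from a confusion matrix. The function name `cohen_kappa` and the example matrix are hypothetical illustrations and are not taken from the cited studies.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: reference, columns: predicted)."""
    total = sum(sum(row) for row in confusion)
    n_classes = len(confusion)
    # Observed agreement: share of samples on the main diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Agreement expected by chance, from the row and column marginals.
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n_classes)) for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts for three classes (burnt, non-burnt, no harvest).
example = [
    [40, 5, 5],
    [4, 30, 6],
    [6, 5, 49],
]
print(round(cohen_kappa(example), 2))
```

Unlike overall accuracy, kappa discounts the agreement that would occur by chance given the class proportions, which is why the two figures are not directly comparable.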
6. Conclusions
Algorithms for sugarcane harvest monitoring based on time series of optical and SAR data have been developed.
The NDVI time series have been constructed from the data of the Sentinel-2 MSI optical sensor. A sharp and sustained decrease in the NDVI values is the main indicator of harvest.
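The idea of a sharp and sustained NDVI decrease can be illustrated with a minimal sketch. It is not the algorithm of this study: the function name `detect_ndvi_harvest` and the parameters `drop`, `low`, and `confirm` (with their default values) are assumptions chosen only for illustration, and cloudy observations are assumed to be marked as NaN.

```python
import math


def detect_ndvi_harvest(ndvi, drop=0.3, low=0.4, confirm=2):
    """Return the index of the first observation that starts a sharp,
    sustained NDVI decrease, or None if no such observation exists.

    drop    -- minimum decrease between consecutive cloud-free observations (assumed)
    low     -- NDVI level that must not be exceeded after the drop (assumed)
    confirm -- number of cloud-free observations that must stay low (assumed)
    """
    # Keep only cloud-free observations, remembering their original indices.
    valid = [(i, v) for i, v in enumerate(ndvi) if not math.isnan(v)]
    for k in range(1, len(valid) - confirm + 1):
        _, previous = valid[k - 1]
        index, current = valid[k]
        if previous - current >= drop:
            # The decrease counts only if NDVI stays low afterwards.
            following = [v for _, v in valid[k:k + confirm]]
            if all(v <= low for v in following):
                return index
    return None


# Hypothetical series with one cloudy scene (NaN) just before the harvest.
series = [0.80, 0.82, float("nan"), 0.30, 0.28, 0.35, 0.40]
print(detect_ndvi_harvest(series))
```

Requiring several consecutive low values is one simple way to reject a single-scene dip, for example from an undetected thin cloud.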
The NDVI time series make it possible to detect more than half of the harvest dates. Cloudiness is the main obstacle to accurate identification of harvest dates from the optical data. In the case of light cloudiness, the detected harvest dates may lag behind the real dates by no more than five days, which is the characteristic interval between Sentinel-2 surveys of the study area. Short-term dense clouds result in the omission of harvest dates. A long cloudy period may cause an early false response of the algorithm, followed by the omission of the real harvest date.
The best estimates of the monthly sugarcane harvested areas have been obtained for the period of March–August 2018, when the cloudy pixel percentage was less than 45% of the image area.
To monitor the harvest from the SAR data, coherence time series generated from Sentinel-1 IW SLC images have been used. Low coherence corresponds to the harvest period; the coherence grows sharply upon the harvest completion due to the bare soil or sparse vegetation. False responses of the SAR-based algorithm may be connected with non-harvest field operations that alter the signal reflection. To reduce the number of false responses, trends of the NDVI time series were used: the harvest is assumed to take place on a decreasing NDVI trend. Note that in this case the spatial resolution and/or survey frequency of the optical sensors used to construct the NDVI trends may be much lower than those used for detecting the harvest dates in the optical algorithm.
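The combination of a sharp coherence rise with a decreasing NDVI trend can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used in the study: the names `ndvi_slope` and `detect_coherence_harvest`, the `rise` threshold, and the `window` length in days are all hypothetical.

```python
def ndvi_slope(days, ndvi):
    """Least-squares slope of NDVI against time (NDVI units per day)."""
    n = len(days)
    mean_t = sum(days) / n
    mean_v = sum(ndvi) / n
    numerator = sum((t - mean_t) * (v - mean_v) for t, v in zip(days, ndvi))
    denominator = sum((t - mean_t) ** 2 for t in days)
    return numerator / denominator if denominator else 0.0


def detect_coherence_harvest(sar_days, coherence, ndvi_days, ndvi,
                             rise=0.3, window=30):
    """Return SAR acquisition days where coherence rises sharply
    while the surrounding NDVI trend is decreasing.

    rise   -- minimum coherence increase between consecutive pairs (assumed)
    window -- half-width in days of the NDVI trend window (assumed)
    """
    detections = []
    for k in range(1, len(coherence)):
        if coherence[k] - coherence[k - 1] < rise:
            continue
        day = sar_days[k]
        # NDVI observations close in time to the candidate date.
        nearby = [(d, v) for d, v in zip(ndvi_days, ndvi) if abs(d - day) <= window]
        if len(nearby) < 2:
            continue
        days, values = zip(*nearby)
        # Accept the candidate only on a decreasing NDVI trend.
        if ndvi_slope(days, values) < 0:
            detections.append(day)
    return detections


# Hypothetical example: 12-day SAR series and a sparser NDVI series.
sar_days = [0, 12, 24, 36, 48, 60]
coherence = [0.20, 0.25, 0.20, 0.70, 0.65, 0.70]
ndvi_days = [0, 20, 40, 60]
ndvi = [0.80, 0.75, 0.30, 0.25]
print(detect_coherence_harvest(sar_days, coherence, ndvi_days, ndvi))
```

Because only the sign of the trend is used, the NDVI series in this sketch can be much sparser than the SAR series, in line with the note above on resolution and survey frequency.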
The accuracy of harvest date identification from Sentinel-1 SAR data is clearly inferior to that of the algorithm based on Sentinel-2 optical data because of a considerably larger interval between surveys (12 days compared to 5 days). However, the SAR data may be used to refine the harvest dates obtained from the optical data and, in the longer term, to determine the harvest dates during periods of dense clouds.
Both algorithms have demonstrated the best performance on fields where sugarcane is planted at approximately the same time over the whole area. For fields consisting of several sugarcane parcels at different stages of growth, preliminary segmentation into homogeneous parcels is recommended.