2.3. Sentinel-1A Data Preprocessing and Smoothing
Sentinel-1A, launched in April 2014 by the European Space Agency (ESA), is an Earth observation satellite for the Copernicus Initiative. Copernicus, previously known as Global Monitoring for Environment and Security (GMES), is a European initiative for the implementation of information services dealing with environment and security. The revisit time of this satellite is 12 days at the equator. The Sentinel-1A satellite operates in the C-band (central frequency = 5.404 GHz) and provides VH and VV polarizations with a spatial resolution of 5 m by 20 m in the range and azimuth directions, respectively. In this study, the Sentinel-1A data were acquired in the interferometric wide swath (IW) acquisition mode with a swath width of 250 km. The data were in the level-1 ground range detected (GRD) format with a pixel spacing of 10 m by 10 m (
https://sentinel.esa.int/). The Sentinel-1A IW level-1 GRD data, with VH and VV polarizations and ascending and descending orbit modes, were downloaded from
https://search.asf.alaska.edu/#/. These Sentinel-1A data are open access and freely available from the website. Since this study mainly focused on the first-stage rice-field detection, data acquired from early February to late July 2017 were used. The complete acquisition dates of the Sentinel-1A research data are listed in
Table 2. The incidence angles of the ascending and descending orbital modes in the study area range from 31.5° to 36.3° and from 31.8° to 36.5°, respectively.
The Sentinel-1A data were preprocessed using the Sentinel Application Platform (SNAP) Sentinel-1 Toolbox software developed by ESA. Preprocessing comprised three main steps. First, radiometric calibration was performed to convert the pixel values to backscattering coefficients (sigma naught, in dB), so that the pixel values in the imagery can be directly related to the radar backscatter of the scene. Second, range Doppler terrain correction was applied to correct the geometric distortion in the range direction and to project the imagery onto the Taiwan Datum 1997 (TWD97) earth ellipsoid model. Finally, the refined Lee filter [
49] was applied to remove speckle noise in SAR data.
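The decibel scaling mentioned in the calibration step is the standard logarithmic conversion, sigma0_dB = 10 log10(sigma0). The following is a minimal sketch of that conversion only, assuming the calibrated backscatter is available as linear power values. The function name and the `floor` guard are illustrative assumptions and are not part of SNAP.

```python
import math


def sigma0_to_db(sigma0_linear, floor=1e-10):
    """Convert linear sigma naught (power) values to decibels.

    `floor` is an assumed guard that avoids log10(0) for empty pixels.
    """
    return [10.0 * math.log10(max(value, floor)) for value in sigma0_linear]
```

For example, a linear value of 0.1 corresponds to about -10 dB, which is in the range typically seen over vegetated surfaces.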
In this study, the proposed rice detection method is based on rice-growth-related features extracted from the time-series backscatter during rice growth. Because farming practices differ, for example in sowing dates, the heading and maturity dates also differ, and the times corresponding to the minimum and maximum backscatter values are different for each rice field. Therefore, instead of using the backscatter values directly, a model of the backscatter change is established first, and the rice features are then extracted from the model.
However, fluctuations of the backscattering coefficients distort the model and affect the extracted features, for example the maximum backscatter value and the growth time period. It is therefore difficult to model the rice growth curve effectively from the raw temporal data. To reduce the influence of these fluctuations, the SAR data are preprocessed by a smoothing approach [
50]. Since crops are cultivated in blocks and clusters, a spatial mask was applied to perform pixel convolution for spatial smoothing. For example, the image in
Figure 3a is the original SAR image in the study area. After the spatial smoothing, the output image was smoother, as shown in
Figure 3c. Subsequently, temporal smoothing is performed on each pixel by a convolution mask to alleviate the randomness of temporal evolution. As shown in
Figure 4a, the original SAR data fluctuate greatly. After the temporal convolution, the temporal backscatter profile is smoother and the randomness is reduced.
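The two smoothing steps described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the masks, so the 3 x 3 mean mask, the (0.25, 0.5, 0.25) temporal weights and the edge padding are all hypothetical choices, as are the function names.

```python
import numpy as np


def spatial_smooth(image, size=3):
    """Mean-filter a 2-D backscatter image with an assumed size x size mask."""
    image = np.asarray(image, dtype=float)
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")  # repeat border pixels
    out = np.zeros(image.shape, dtype=float)
    rows, cols = image.shape
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + rows, dx:dx + cols]
    return out / (size * size)


def temporal_smooth(series, weights=(0.25, 0.5, 0.25)):
    """Convolve one pixel's time series with an assumed symmetric mask."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                            # keep the mean level unchanged
    pad = len(w) // 2
    padded = np.pad(np.asarray(series, dtype=float), pad, mode="edge")
    return np.convolve(padded, w, mode="valid")  # same length as the input
```

Applying `spatial_smooth` to each acquisition and then `temporal_smooth` to each pixel's series reproduces the order of operations described in the text. Larger masks smooth more strongly but also blur field boundaries and short growth stages.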
2.4. Modeling and Feature Extraction
After smoothing, a model of the rice growth curve was established based on the complete time-series data of the rice growth period. As observed from the temporal data in
Figure 4, the times corresponding to the minimum and maximum backscattering values can represent the sowing and heading dates of rice, respectively. Therefore, the date corresponding to the minimum backscatter value of the time-series data can be used to locate the starting date of the rice growth period, and the complete time-series data were collected from this date.
Due to the different sowing dates, the length of the complete time-series data of each rice-field is different. To overcome this problem in the study, a cubic polynomial function was used to fit the collected time-series data, and a model of the rice growth trend curve was established.
Figure 5 shows the fitting curves of rice and non-rice crops. In terms of the electromagnetic interaction mechanisms between radar waves, vegetation canopy, soil and water, there is a high correlation between the backscatter coefficient of paddy rice and its growth stage [
32,
51,
52]. At the sowing stage, the fields are flooded, so the backscatter from a rice field is dominated by the water surface rather than by double-bounce or volume scattering from the young rice plants. Because the water layer is smooth and homogeneous, the reflected radar pulses are weak and the backscatter coefficient is low at this stage. During the vegetative to heading period, the backscatter coefficient increases significantly owing to volume scattering within the rice canopy and multiple reflections between the plants and the water surface. The backscatter then decreases slightly during the reproductive to harvest period because the water content of the plants decreases, as do the stem and leaf densities [
41,
42,
46]. In the study area, the non-rice food crops include maize, peanut, onion and wheat.
Figure 5 shows the temporal variation of backscatter coefficient of rice and the above non-rice food crops under VH and VV polarizations. These backscatter variation curves were obtained by using 50 crop fields randomly selected from the ground truth data of each crop. In order to show the deviation between the backscattering curve and the real data, the curve in
Figure 5 was represented by the mean value and the corresponding standard deviation bar for each food crop. The backscattering coefficient of rice changes markedly during the growing period, whereas the backscattering of the other crops (such as peanut and onion) varies more smoothly.
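The model construction described above (locate the minimum backscatter as the start of the growth period, then fit a cubic polynomial to the data from that date onward) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The function name and the use of `numpy.polyfit` are assumptions, and dates are assumed to be given as day numbers.

```python
import numpy as np


def fit_growth_curve(days, sigma_db):
    """Fit a cubic polynomial to the series starting at its minimum backscatter.

    days: acquisition dates as day numbers; sigma_db: smoothed backscatter (dB).
    Returns (start_day, coeffs). The coefficients apply to t = day - start_day
    and are ordered from the highest power down (numpy.polyfit convention).
    At least four samples must remain after the minimum.
    """
    days = np.asarray(days, dtype=float)
    sigma_db = np.asarray(sigma_db, dtype=float)
    start = int(np.argmin(sigma_db))      # minimum taken as the start date
    t = days[start:] - days[start]        # time relative to the start date
    coeffs = np.polyfit(t, sigma_db[start:], deg=3)
    return days[start], coeffs
```

Because the fitted curve is expressed relative to each field's own start date, fields with different sowing dates and different series lengths are described by the same four coefficients. The curve can then be evaluated on a dense time grid with `numpy.polyval` before features are extracted.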
According to the growth and the cycle of rice crops, five features are extracted from the constructed rice growth trend model, as shown in
Figure 5.
(1) Backscatter Difference (BD)
Compared with the non-rice crops in
Figure 5, rice plants change markedly during the growth period, and the backscattering coefficient of rice changes correspondingly [
30,
33,
34,
36,
37,
38,
40,
41,
42,
43,
44,
46,
47]. From the fitting curves of rice and non-rice in
Figure 5, it can be observed that the difference between the maximum and minimum backscatter values of rice is greater than that of the non-rice crops during the rice growing season (early February to late July in this study). Thus, the backscatter difference (BD) between the maximum and minimum values of the time-series data during the rice growing season was further examined for rice and non-rice crops, as shown in
Figure 6.
Figure 7a shows the probability density functions (pdfs) of BD for different crops. The BD values of rice fall within 4–7 dB, while those of maize and wheat fall within 1–5 dB and 2–5 dB, respectively, and those of the other crops fall within 1–4 dB. These results indicate that BD can help distinguish rice from the non-rice crops in the study area, so BD was selected as one feature for rice-field detection. In the subsequent rice detection, the BD threshold was set based on the 95% confidence interval of the pdf from the training data (i.e., 4.3 dB ≤ BD ≤ 6.6 dB in the experiments for VH polarization with the ascending orbit mode).
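As an illustration of how BD and a 95% decision interval could be computed, the sketch below takes the difference between the maximum and minimum of a sampled curve and derives an interval from training values. The text does not state how the interval is obtained from the pdf, so the Gaussian form (mean ± 1.96 standard deviations) is an assumption, as are the function names.

```python
import statistics


def backscatter_difference(sigma_db):
    """BD: maximum minus minimum backscatter (dB) of a sampled growth curve."""
    return max(sigma_db) - min(sigma_db)


def gaussian_interval(values, z=1.96):
    """Assumed 95% interval: mean +/- z sample standard deviations."""
    mean = statistics.mean(values)
    spread = z * statistics.stdev(values)
    return mean - spread, mean + spread
```

In use, `backscatter_difference` would be evaluated for every training rice field, and `gaussian_interval` applied to the resulting list to obtain the lower and upper BD thresholds. If the training distribution is clearly skewed, a percentile-based interval may be a better choice.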
(2) Time Interval (TI)
In addition to the characteristics of BD, the growth period of crops is also an important feature [
31,
33,
36,
37,
40,
41]. Each crop has its own specific growth period. The growth cycle of rice proceeds through sowing, vegetative growth, heading and maturity, which takes about four months in Taiwan. According to the rice growth model in
Figure 6, the minimum backscatter point, midpoint a, the maximum backscatter point and midpoint b are used to represent these four rice growth stages, respectively. Midpoints a and b are the positions on the rice growth curve where the backscatter equals the average of the maximum and minimum backscatter values. $t_a$ and $t_b$ are the corresponding dates of midpoints a and b, as shown in
Figure 6. In the study, the time difference between the two midpoints, $t_b - t_a$, of the rice growth curve, which denotes the time interval (TI) between vegetative growth and maturity, was used as one of the characteristics of rice growth. The TI distributions of rice and non-rice crops are given in
Figure 7b. It can be observed that the average TI values of the non-rice crops are 54, 58, 62 and 65 days for onion, peanut, maize and wheat, respectively, which are smaller than the 76-day average TI of rice. In the experiment, the TI decision interval is between 69.0 days and 82.5 days, based on the 95% confidence interval of the TI distributions.
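The TI definition above can be sketched on a sampled growth curve as follows. The mid-level is the average of the maximum and minimum backscatter. Midpoint a is taken as the crossing of that level on the rising limb and midpoint b as the crossing on the falling limb. The function names and the linear interpolation between samples are illustrative assumptions.

```python
def crossing_time(t, y, level, i_start, i_end):
    """Linearly interpolated time at which y crosses `level` in [i_start, i_end]."""
    for i in range(i_start, i_end):
        y0, y1 = y[i], y[i + 1]
        if (y0 - level) * (y1 - level) <= 0 and y0 != y1:
            return t[i] + (level - y0) * (t[i + 1] - t[i]) / (y1 - y0)
    return None


def time_interval(t, y):
    """TI: time between midpoints a and b of a sampled growth curve."""
    i_min = min(range(len(y)), key=y.__getitem__)         # sowing (minimum)
    i_max = max(range(i_min, len(y)), key=y.__getitem__)  # heading (maximum)
    level = 0.5 * (y[i_min] + y[i_max])                   # mid-level
    t_a = crossing_time(t, y, level, i_min, i_max)        # rising limb
    t_b = crossing_time(t, y, level, i_max, len(y) - 1)   # falling limb
    if t_a is None or t_b is None:
        return None  # the curve does not return to the mid-level
    return t_b - t_a
```

For a curve that rises from -22 dB to -16 dB over 60 days and falls back symmetrically, the mid-level is -19 dB and this sketch gives a TI of 60 days. A return value of `None` signals that the sampled season ends before the curve falls back to the mid-level.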
(3) Backscatter Variation Rate (BVR)
During tillering, the variation rate of backscattering in rice accelerates noticeably. After the time corresponding to midpoint a of the curve in
Figure 6, the backscatter of rice increases and reaches its peak at heading. Therefore, the slope from midpoint a to the maximum backscatter point was used to represent the variation rate of backscattering at tillering in the study, which is calculated by
$$\mathrm{BVR} = \frac{\sigma_{\max} - \sigma_a}{t_{\max} - t_a}$$
where $\sigma_{\max}$ and $\sigma_a$ are the backscatter values at the maximum backscatter point and midpoint a, respectively, and $t_{\max}$ and $t_a$ represent the corresponding dates of the maximum backscatter point and midpoint a, respectively.
Figure 7c shows the pdfs of backscatter variation rate (BVR) for rice and non-rice crops. It can be observed that the BVR values of rice are larger than those of non-rice crops. According to the 95% confidence interval of the pdf, the values between
dB/d and
dB/d were chosen as the decision interval of BVR in the subsequent rice detection experiments.
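The slope described above, from midpoint a to the maximum backscatter point, can be written as a one-line computation. The sketch below is illustrative only, and the function and argument names are assumptions.

```python
def backscatter_variation_rate(sigma_max, sigma_a, t_max, t_a):
    """BVR in dB/day: slope from midpoint a to the maximum backscatter point."""
    if t_max <= t_a:
        raise ValueError("the maximum must occur after midpoint a")
    return (sigma_max - sigma_a) / (t_max - t_a)
```

With hypothetical values of -19 dB at midpoint a (day 30) and -16 dB at the maximum (day 60), the rate is 3 dB over 30 days, that is, 0.1 dB/day. These numbers are chosen for illustration and are not taken from the experiments.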
(4) Average Normalized Backscatter (ANB)
The trend of rice growth formed by the backscatter coefficients is consistent with the rice growth cycle from sowing through heading to maturity. When the maximum backscatter value is normalized to one for each crop in
Figure 5, it can be observed that the average normalized backscatter is close to one for a flatter curve, while it becomes smaller for a curve with greater variation. The ANB is calculated by:
$$\mathrm{ANB} = \frac{1}{t_e - t_{\min}} \int_{t_{\min}}^{t_e} \hat{\sigma}(t)\, dt$$
where $\hat{\sigma}(t)$ is the normalized growth curve, and $t_{\min}$ and $t_e$ represent the corresponding dates of the minimum backscatter point and the end point of the rice growth season, respectively. In the study, the average normalized backscatter (ANB) is the ratio of the integrated area of the normalized growth curve (with the maximum value normalized to 1) to the growth period, as shown in
Figure 6. According to the pdfs of ANB shown in
Figure 7d, the ANB of rice is much smaller than that of other crops. Therefore, ANB was also selected as an indicator of rice detection. The decision interval of ANB was chosen to be from
to
in the rice detection experiment.
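The ANB described above, the area under the normalized growth curve divided by the length of the growth period, can be approximated numerically. The sketch below uses the trapezoidal rule on a sampled curve. It assumes the curve has already been normalized so that its maximum equals 1, since the normalization itself is not detailed in the text. The function name is illustrative.

```python
def average_normalized_backscatter(t, y_norm):
    """ANB: trapezoidal area under the normalized curve over the period length.

    t: sample times from the minimum backscatter point to the end of the season;
    y_norm: normalized curve values at those times (maximum assumed to be 1).
    """
    if len(t) != len(y_norm) or len(t) < 2:
        raise ValueError("need at least two matching samples")
    area = sum(
        0.5 * (y_norm[i] + y_norm[i + 1]) * (t[i + 1] - t[i])
        for i in range(len(t) - 1)
    )
    return area / (t[-1] - t[0])
```

A perfectly flat normalized curve yields an ANB of 1, while a triangular curve that rises from 0 to 1 and falls back to 0 yields 0.5. This matches the observation that curves with greater variation have smaller ANB values.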
(5) Maximum Backscatter (MB)
Each crop has its backscatter distribution range, and the corresponding maximum backscatter (MB) value is also different, as shown in
Figure 5. The pdfs of the MB values are given in
Figure 7e, which shows that the MB distribution of rice differs from those of the other crops. To distinguish rice from non-rice objects, MB is used as one of the rice identification features in the study [
33,
36,
37,
40,
45]. In the experiment, the decision interval of MB was chosen between −16.0 dB and −14.5 dB for rice detection.
The above five features can be extracted from the fitting models. Then the distribution of each feature is estimated from the training data of rice and non-rice, respectively. To detect rice crops, a feature-based decision method is introduced based on five extracted features. The decision threshold is determined by the 95% confidence interval of rice feature distribution, shown in
Figure 7. Moreover, it can be observed from
Figure 7 that the distributions of the extracted rice features overlap with those of the non-rice crops, which makes rice detection difficult. If only one or two features are used for rice detection, some non-rice crops will be misclassified as rice, which lowers the detection accuracy for non-rice. Therefore, this study proposes a decision method based on all five features, BD, TI, BVR, ANB and MB, to map rice fields. When all five extracted features of a test field meet the decision conditions, the field is classified as rice; otherwise, it is classified as non-rice. In the experiments, a rice field is identified by the following decision rules:
$$\text{field} = \begin{cases} \text{rice}, & \text{if } F^{L} \le F \le F^{U} \text{ for all } F \in \{\mathrm{BD}, \mathrm{TI}, \mathrm{BVR}, \mathrm{ANB}, \mathrm{MB}\} \\ \text{non-rice}, & \text{otherwise} \end{cases}$$
where $F^{L}$ and $F^{U}$ are the lower and upper bounds of the decision interval of feature $F$.
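The all-features rule described above can be sketched as a simple interval test. The dictionary below contains only the three intervals stated numerically in the text (BD for VH polarization with the ascending orbit, TI and MB). The bounds for BVR and ANB would have to be added by the user. The names `STATED_INTERVALS` and `is_rice` are illustrative assumptions.

```python
# Decision intervals quoted in the text; BVR and ANB bounds must be added.
STATED_INTERVALS = {
    "BD": (4.3, 6.6),      # dB
    "TI": (69.0, 82.5),    # days
    "MB": (-16.0, -14.5),  # dB
}


def is_rice(features, intervals):
    """Return True only if every listed feature lies inside its interval."""
    return all(
        low <= features[name] <= high
        for name, (low, high) in intervals.items()
    )
```

A field is rejected as soon as a single feature falls outside its interval. This is what reduces the false detections that arise when only one or two overlapping features are used.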
2.5. Classification Algorithms
To evaluate the performance of the proposed method, the detection results were compared with four other classification algorithms: decision tree (DT) [
53], support vector machine (SVM) [
54], K-nearest neighbor (KNN) [
55] and quadratic discriminant analysis (QDA) [
56]. DT, SVM and KNN are non-parametric supervised learning methods for classification. The DT classifier infers decision rules from the data features and divides the input dataset into classes by recursive partitioning based on splitting rules. The SVM classifier constructs a separating hyperplane that has the largest distance to the nearest training data point of any class. The KNN classifier predicts the target label from the classes of the nearest neighbors, which are determined by a distance measure. QDA is a statistical classifier that uses a quadratic decision surface to separate measurements of two or more classes. The decision boundary is generated by fitting class-conditional densities to the data and applying Bayes’ rule.
Moreover, for the DT, SVM and KNN classifiers, three scales provided by the MATLAB Machine Learning Toolbox, namely Fine, Medium and Coarse, were considered in the experiment. Rice detection performance was verified by 5-fold cross-validation [
57] in MATLAB. The 5-fold cross-validation splits the dataset into five equal parts. In each run, four parts were used as training data and the remaining part was used as test data. This process was repeated five times, each time with a different part as the test data, and the results were averaged.
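The splitting scheme of k-fold cross-validation can be illustrated in a few lines. The experiments used MATLAB, so the Python sketch below is not the authors' code. The function name, the shuffling and the seed are assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # assumed random partition
    folds = [indices[i::k] for i in range(k)]   # k nearly equal parts
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test
```

Each sample appears in the test set exactly once across the five runs. The accuracy averaged over the runs is therefore computed on data that was not used to train the corresponding model.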
In the rice-field mapping experiment in
Section 3.3, all image pixels are used instead of only the samples of the dataset. Owing to the clustering characteristics of crop planting, the image was first partitioned into clusters by the fuzzy c-means algorithm [
58], so that pixels with similar temporal characteristics were grouped into a cluster. The corresponding model and rice-growth features of each cluster are then used in the subsequent rice detection.
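For reference, the standard fuzzy c-means iteration can be sketched as follows, with one row of `X` per pixel (for example, its backscatter time series). This is a generic textbook form and not necessarily the configuration used in the study. The fuzzifier `m = 2`, the random initialization, the stopping tolerance and the function name are all assumptions.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Cluster the rows of X; return (centers, U) with U[i, k] the membership."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)            # memberships sum to 1 per row
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]   # weighted cluster means
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)           # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)   # membership update
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U
```

A hard cluster label per pixel can be taken as `U.argmax(axis=1)`. The mean time series of each cluster can then be passed to the curve fitting and feature extraction steps, so that one model is built per cluster rather than per pixel.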