Mapping a Cloud-Free Rice Growth Stages Using the Integration of PROBA-V and Sentinel-1 and Its Temporal Correlation with Sub-District Statistics

Monitoring rice production is essential for securing food security against climate change threats, such as drought and flood events becoming more intense and frequent. The current practice to survey an area of rice production manually and in near real-time is expensive and involves a high workload for local statisticians. Remote sensing technology with satellite-based sensors has grown in popularity in recent decades as an alternative approach, reducing the cost and time required for spatial analysis over a wide area. However, cloud-free pixels of optical imagery are required to produce accurate outputs for agriculture applications. Thus, in this study, we propose an integration of optical (PROBA-V) and radar (Sentinel-1) imagery for temporal mapping of rice growth stages, including bare land, vegetative, reproductive, and ripening stages. We have built classification models for both sensors and combined them into 12-day periodical rice growth-stage maps from January 2017 to September 2018 at the sub-district level over Java Island, the top rice production area in Indonesia. The accuracy measurement was based on the test dataset and the predicted cross-correlated with monthly local statistics. The overall accuracy of the rice growth-stage model of PROBA-V was 83.87%, and the Sentinel-1 model was 71.74% with the Support Vector Machine classifier. The temporal maps were comparable with local statistics, with an average correlation between the vegetative area (remote sensing) and harvested area (local statistics) is 0.50, and lag time 89.5 days (n = 91). This result was similar to local statistics data, which correlate planting and the harvested area at 0.61, and the lag time as 90.4 days, respectively. Moreover, the cross-correlation between the predicted rice growth stage was also consistent with rice development in the area (r > 0.52, p < 0.01). This novel method is straightforward, easy to replicate and apply to other areas, and can be scaled up to the national and regional level to be used by stakeholders to support improved agricultural policies for sustainable rice production.


Introduction
Rice (Oryza sativa L.) is one of the main crops grown in the tropical and subtropical area, with more than half the world population depending on rice as a staple food [1]. However, global production is close to its limits while the dependent population is expected to grow up to 9.26 billion by 2050 [2]. Moreover, urbanization, depleting water resources, climate change, and natural disasters have been threatening the sustainable production of rice despite its increased production by introducing new cultivars, chemical fertilizer, and better irrigation. Near-real-time and accurate information about rice growth stages is vital to support stakeholders to make better decisions to maximize production volumes and secure food production.
However, the existing monitoring approach is incapable of temporal tracking using local statistics due to missing data in some periods, and the results are not consistent.
This study aims to overcome these challenges by proposing a new workflow to integrate ground observations and high-frequency revisit optical sensors with PROBA-V and radar sensors such as Sentinel-1. Furthermore, images from both sensors are available on the Google Earth Engine (GEE), which is freely accessible for public use. Thus, the objectives of this study are as follows: building classification models for mapping the rice growth stages using the integration of PROBA-V and Sentinel-1; and measuring the correlation between the area of rice growth stages with local statistics at sub-district levels. This study will provide a foundation for mapping the rice growth stages accurately and making these available to the stakeholders for making better decisions when manual observations are limited.

Rice Growth Stages
The rice production cycle usually takes about 3-4 months, depending on the variety and environmental conditions, to grow from seed to mature plants. They experience three general phases of growth: vegetative, reproductive, and ripening [40]. Figure 1 illustrates the rice growth phases and surface condition in the rice fields [41]. First, the seed is planted in a small bed and, after 25 days, is transplanted into the main rice field to produce a higher yield and reduce weed occurrences. The vegetative stage spans from seed germination to maximum tillering. The next stage is the reproductive stage, where the plant grows from panicle initiation to heading. The last phase is ripening. The young grain in the panicle starts to develop starch. The grain colour becomes gold and then it is harvested. Within this study, we added a bare land class to capture the dynamics of the changing land surface within rice fields.
(http://katam.litbang.pertanian.go.id/sc/, accessed date: 15 January 2021). However, the existing monitoring approach is incapable of temporal tracking using local statistics due to missing data in some periods, and the results are not consistent.
This study aims to overcome these challenges by proposing a new workflow to integrate ground observations and high-frequency revisit optical sensors with PROBA-V and radar sensors such as Sentinel-1. Furthermore, images from both sensors are available on the Google Earth Engine (GEE), which is freely accessible for public use. Thus, the objectives of this study are as follows: building classification models for mapping the rice growth stages using the integration of PROBA-V and Sentinel-1; and measuring the correlation between the area of rice growth stages with local statistics at sub-district levels. This study will provide a foundation for mapping the rice growth stages accurately and making these available to the stakeholders for making better decisions when manual observations are limited.

Rice Growth Stages
The rice production cycle usually takes about 3-4 months, depending on the variety and environmental conditions, to grow from seed to mature plants. They experience three general phases of growth: vegetative, reproductive, and ripening [40]. Figure 1 illustrates the rice growth phases and surface condition in the rice fields [41]. First, the seed is planted in a small bed and, after 25 days, is transplanted into the main rice field to produce a higher yield and reduce weed occurrences. The vegetative stage spans from seed germination to maximum tillering. The next stage is the reproductive stage, where the plant grows from panicle initiation to heading. The last phase is ripening. The young grain in the panicle starts to develop starch. The grain colour becomes gold and then it is harvested. Within this study, we added a bare land class to capture the dynamics of the changing land surface within rice fields.

Study Area
The study area is located on Java Island, the leading rice producer in Indonesia, with a combination of irrigated and low land areas. The area consists of three regencies: Karawang, Subang, and Indramayu, with 309,046 ha in West Java Province (Figure 2), which is split into 91 sub-districts administratively [42]. The leading rice production areas are usually within the sub-districts in Indramayu (116,869 ha) and Subang (90,474 ha). Karawang (101,703 ha) is a regency with high land-use change rates due to industrialization and housing construction. The most significant paddy field area is Losarang, Indramayu, with 7244 ha, and the smallest is Cikampek, Karawang (416 ha). The average rice field area, over 91 sub-districts, is 3373 ha.

Study Area
The study area is located on Java Island, the leading rice producer in Indonesia, with a combination of irrigated and low land areas. The area consists of three regencies: Karawang, Subang, and Indramayu, with 309,046 ha in West Java Province (Figure 2), which is split into 91 sub-districts administratively [42]. The leading rice production areas are usually within the sub-districts in Indramayu (116,869 ha) and Subang (90,474 ha). Karawang (101,703 ha) is a regency with high land-use change rates due to industrialization and housing construction. The most significant paddy field area is Losarang, Indramayu, with 7244 ha, and the smallest is Cikampek, Karawang (416 ha). The average rice field area, over 91 sub-districts, is 3373 ha.
lite Agency (GEE id: VITO_PROBAV_C1_S1_TOC_100M). The product contains five bands, e.g., red (658 nm), near-infrared (NIR) (834 nm), blue (460 nm), short-wave infrared (SWIR) (1610 nm), and the NDVI values were calculated from the red band and NIR band [44]. This dataset comes from a composite of 300 m spatial resolution every two days, and 100 m every five days, which has been corrected at the atmospheric and radiometric level [45]. The Sentinel-1 dataset comes from the Copernicus project by the European Space Agency (ESA) as one of the space missions to monitor land on a global scale. Sentinel-1 has a dual-polarization C-band Synthetic Aperture Radar (SAR) with two satellites in the The majority of the land area is irrigated and dominated by alluvial soils which are most suitable for rice cultivation. This comes from deposition from the Cimanuk, Citarum, and Cilamaya rivers. A state-owned company maintains the water distribution from the Jatiluhur dam in the south of the study area. The climate is monsoon with two seasons: the wet and dry seasons, and is classified as tropical rainforest based on Köppen-Geiger climate classifications [43]. The paddy cultivation includes the use of short-duration rice varieties such as IR64, Ciherang, glutinous rice, and other varieties [26,27]. Fertilizer is applied twice during growth, and chemical pesticides used to remove pests. Crops are harvested using manual tools and labor. The rice crop typically cultivates twice a year, with farmers starting sowing during the rainy season in November-December and harvest in January-February. The second sowing is undertaken in March-April and harvest from June to July. However, some areas have scheduled water irrigation which allows rice cultivation even in the dry season. The complete cropping pattern of the study area is well described in the previous study [26].

Satellite Imagery
This study uses two multitemporal imagery sources: Project for On-Board Autonomy-Vegetation (PROBA-V) and Sentinel-1 satellite imagery was downloaded directly from GEE storage within the period from January 2017 to August 2018. PROBA-V Top of Canopy dataset comes from Flemish Institute for Technological Research/European Satellite Agency (GEE id: VITO_PROBAV_C1_S1_TOC_100M). The product contains five bands, e.g., red (658 nm), near-infrared (NIR) (834 nm), blue (460 nm), short-wave infrared (SWIR) (1610 nm), and the NDVI values were calculated from the red band and NIR band [44]. This dataset comes from a composite of 300 m spatial resolution every two days, and 100 m every five days, which has been corrected at the atmospheric and radiometric level [45].
The Sentinel-1 dataset comes from the Copernicus project by the European Space Agency (ESA) as one of the space missions to monitor land on a global scale. Sentinel-1 has Remote Sens. 2021, 13, 1498 5 of 21 a dual-polarization C-band Synthetic Aperture Radar (SAR) with two satellites in the same orbit to have shorter revisit times (6-12 days). The dataset has been processed with the Sentinel-1 toolbox to remove thermal noise, calibrate radiometric problems, and correct the terrain (GEE id: COPERNICUS_S1_GRD). The dataset contains vertical-horizontal (VH) and vertical-vertical (VV) cross-polarization with Interferometric Wide Swath mode in order to have the largest area in one swath with 10 m resolution from a descending orbit dataset. A revised Lee filter was run to decrease speckle noise [46]. In the next step, the Sentinel-1 (VH only) images were resampled to 100 m to match the PROBA-V's resolution.
The total number of images of PROBA-V and Sentinel-1 processed from 1 January 2017 to 23 July 2018 were 598 and 132 images, respectively. All the PROBA-V and Sentinel-1 processed images were masked with existing rice field maps from the official Indonesian Ministry of Agriculture, which come from high-resolution images obtained in 2010. The masking process was carried out to ensure only rice fields remain within the image.

Local Statistics
The agriculture agencies at the regency level have been collecting rice statistics, including planting area, harvesting area, and productivity for each month at the sub-district level since the 1980s. The planting and harvesting data are from farmer-provided information to the local agriculture statistics group. The collated data report would be sent to the agricultural division at the regency, province, and national level for the Indonesian Central Bureau of Statistics and Indonesia Ministry of Agriculture.
The overall statistics of rice planting and harvested area are illustrated in Figure 3. The highest rice planting area between January 2017 to July 2018 was 94,428 ha in December 2017, and the harvested area was 76,300 ha in March 2018 for all three regencies [47][48][49].
Remote Sens. 2021, 13, x FOR PEER REVIEW 5 of 21 same orbit to have shorter revisit times (6-12 days). The dataset has been processed with the Sentinel-1 toolbox to remove thermal noise, calibrate radiometric problems, and correct the terrain (GEE id: COPERNICUS_S1_GRD). The dataset contains vertical-horizontal (VH) and vertical-vertical (VV) cross-polarization with Interferometric Wide Swath mode in order to have the largest area in one swath with 10 m resolution from a descending orbit dataset. A revised Lee filter was run to decrease speckle noise [46]. In the next step, the Sentinel-1 (VH only) images were resampled to 100 m to match the PROBA-V's resolution.
The total number of images of PROBA-V and Sentinel-1 processed from 1 January 2017 to 23 July 2018 were 598 and 132 images, respectively. All the PROBA-V and Sentinel-1 processed images were masked with existing rice field maps from the official Indonesian Ministry of Agriculture, which come from high-resolution images obtained in 2010. The masking process was carried out to ensure only rice fields remain within the image.

Local Statistics
The agriculture agencies at the regency level have been collecting rice statistics, including planting area, harvesting area, and productivity for each month at the sub-district level since the 1980s. The planting and harvesting data are from farmer-provided information to the local agriculture statistics group. The collated data report would be sent to the agricultural division at the regency, province, and national level for the Indonesian Central Bureau of Statistics and Indonesia Ministry of Agriculture.
The overall statistics of rice planting and harvested area are illustrated in Figure 3. The highest rice planting area between January 2017 to July 2018 was 94,428 ha in December 2017, and the harvested area was 76,300 ha in March 2018 for all three regencies [47][48][49].

Methods
There are four steps to find the cross-correlation between rice growth stages and rice planting and harvested areas. The first step was collecting data from the field campaign to generate the ground-truthing dataset. The second step is to build a statistical learning model to classify rice growth stages using images from PROBA-V and Sentinel-1 and ground-truthing dataset. In the third step, the time-series rice growth stages prediction maps were generated, and sub-district level maps are calculated. The last step was to calculate cross-correlation and map the correlation value to each sub-district to generate a correlated and lag-day distribution map, as shown in Figure 4.
planting and harvested areas. The first step was collecting data from the field campaign to generate the ground-truthing dataset. The second step is to build a statistical learning model to classify rice growth stages using images from PROBA-V and Sentinel-1 and ground-truthing dataset. In the third step, the time-series rice growth stages prediction maps were generated, and sub-district level maps are calculated. The last step was to calculate cross-correlation and map the correlation value to each sub-district to generate a correlated and lag-day distribution map, as shown in Figure 4.

Data Sampling
The purposive random sampling was undertaken during a field campaign based on rice field area from Indonesia Ministry of Agriculture from 4 July to 1 August 2018, yielding 316 points. The surveyor visited the designated points to take field photos and record the rice field surface conditions such as land preparation, bare land, flooding, vegetative, reproductive, ripening, or harvested using a GPS-enabled smartphone. The designated points should have uniform and wide enough to represent the distribution of the area for reducing the mixed pixel effect [50]. An example of a field survey under various conditions is shown in Figure 5. tinel-1.

Data Sampling
The purposive random sampling was undertaken during a field campaign based on rice field area from Indonesia Ministry of Agriculture from 4 July to 1 August 2018, yielding 316 points. The surveyor visited the designated points to take field photos and record the rice field surface conditions such as land preparation, bare land, flooding, vegetative, reproductive, ripening, or harvested using a GPS-enabled smartphone. The designated points should have uniform and wide enough to represent the distribution of the area for reducing the mixed pixel effect [50]. An example of a field survey under various conditions is shown in Figure 5.

Building Classification Models
The resulting data from the field surveys were labelled into four classes: bare land, vegetative, reproductive, and ripening, and then synchronized with the PROBA-V and Sentinel-1 dataset to the closest date. Thus, the dataset was used to build a prediction model using a machine learning classifier with the caret package in the R statistical program [51,52] with leave-one-out cross-validation (LOOCV). The total dataset was divided into training (70%) and test (30%) datasets for building a training model and testing, respectively. Here, we used the Support Vector Machine (SVM) with the radial basis function kernel (SVM-RBF) classifier, which is one of the most used classifiers for solving multi-class problems developed by Vapnik [53,54] and suitable for this application due to its flexibility of high variability and complex dataset [25,55]. Additionally, Griffiths, et al. [56] suggested that SVM have high accuracy when used with a small dataset. Moreover, the previous study shows that SVM has better performance than the neural network and random forest classifier [57]. It could be used to create the automatization of rice growth stages map [39]. The SVM-RBF has two hyperparameters to increase separability between classes: Cost and Gamma, which need to be found using a grid search with initial values.
A classification model was built using the five predictors of PROBA-V bands with 223 points as a training dataset and 103 points as the test dataset. Another classification model was built using time series Sentinel-1 images. In contrast to the PROBA-V model, S-1 VH1, VH2, and VH3 images were collected from three consecutive dates as predictors to build the S-1 model. The predictor VH1 refers to the sampling date, VH2 is the previous 12-days VH data, and VH3 is the previous 24-days of VH data from the sampling date, respectively. The VH has better sensitivity to detect rice phenology than vertically emitted and vertically receiving (VV) due to cross-polarization having more signal depolarization in the rice canopy with multiple reflections [15,25].

Accuracy Assessment
The accuracy of PROBA-V and Sentinel-1 models were assessed using the comparison of predicted values from the training dataset and test dataset within pixels using the confusion matrix. The overall accuracy (OA), user's accuracy (UA), producer's accuracy (PA) can be calculated as suggested by Foody [58].

Building Classification Models
The resulting data from the field surveys were labelled into four classes: bare land, vegetative, reproductive, and ripening, and then synchronized with the PROBA-V and Sentinel-1 dataset to the closest date. Thus, the dataset was used to build a prediction model using a machine learning classifier with the caret package in the R statistical program [51,52] with leave-one-out cross-validation (LOOCV). The total dataset was divided into training (70%) and test (30%) datasets for building a training model and testing, respectively. Here, we used the Support Vector Machine (SVM) with the radial basis function kernel (SVM-RBF) classifier, which is one of the most used classifiers for solving multi-class problems developed by Vapnik [53,54] and suitable for this application due to its flexibility of high variability and complex dataset [25,55]. Additionally, Griffiths, et al. [56] suggested that SVM have high accuracy when used with a small dataset. Moreover, the previous study shows that SVM has better performance than the neural network and random forest classifier [57]. It could be used to create the automatization of rice growth stages map [39]. The SVM-RBF has two hyperparameters to increase separability between classes: Cost and Gamma, which need to be found using a grid search with initial values.
A classification model was built using the five predictors of PROBA-V bands with 223 points as a training dataset and 103 points as the test dataset. Another classification model was built using time series Sentinel-1 images. In contrast to the PROBA-V model, S-1 VH1, VH2, and VH3 images were collected from three consecutive dates as predictors to build the S-1 model. The predictor VH1 refers to the sampling date, VH2 is the previous 12-days VH data, and VH3 is the previous 24-days of VH data from the sampling date, respectively. The VH has better sensitivity to detect rice phenology than vertically emitted and vertically receiving (VV) due to cross-polarization having more signal depolarization in the rice canopy with multiple reflections [15,25].

Accuracy Assessment
The accuracy of PROBA-V and Sentinel-1 models were assessed using the comparison of predicted values from the training dataset and test dataset within pixels using the confusion matrix. The overall accuracy (OA), user's accuracy (UA), producer's accuracy (PA) can be calculated as suggested by Foody [58].

Integration Map of PROBA-V and Sentinel-1, and Time Series Modulator
Initially, rice growth stage maps were generated using PROBA-V, and then cloud pixels were filled with the S-1 prediction map. Consequently, time series maps were generated from the integration of both images. To increase the consistency of time series maps, a time series modulator was applied. The modulator's work is to check whether the current map is consistent with the previous period map and correct it automatically. For example, if the previous map had a bare land class and the current map shows a ripening class, which is not consistent, the current map's value was changed into a bare land class. This modulator also applied with ripening (the previous map)-reproductive class (the current map) and vegetative (the previous map)-ripening class (the current map).
The prediction maps for rice growth stages were overlayed with sub-district maps. The intersect maps were calculated to obtain an area of rice growth stages over a 12-day period for each sub-district (using Geographic Information System software) to compare with the local statistical records.

Cross-Correlation
The similarity between rice growth stages and rice planting and harvesting area in time series was calculated using cross-correlation. Cross-correlation allows finding the best correlation and lag days between two-time series data. The time series pair datasets are from the generated rice growth-stage maps (vegetative, reproductive, and ripening) and time series from monthly locally collected statistics (planting area and harvested) as shown in Table 1. The lag days information with correlation index shows how strong the relationship between the two classes has the same temporal pattern but at different times. The cross-correlation (r) on five pair datasets can be calculated as follows [59]: where r xy k = cross-correlation coefficient for a k period lag for x and y time series, y = mean of y time series, y t = value of time series y on period t, x t−k = value of time series k periods before period t x = mean of x time series, SD x and SD y are the standard deviation of the x and y time series, respectively. The p-value with the two-tailed test was also calculated using the highest correlation value on a specific range for each pair of time series.
In addition, distribution maps of the correlation and the lag time among sub-districts to show the spatial information were created to understand the spatial distribution. Moreover, the classification of correlation value can be grouped as follows: (1) high (0.6 < r ≤ 1.0), medium (0.4 < r ≤ 0.59), and low (r ≤ 0.39). Figure 6a illustrates the distribution of surface reflectance of different rice growth stages. The graph shows that bare land and rice growth stages have distinctive spectral features. The bare land is significantly different from rice growth stages in the SWIR region due to its high surface reflectance, while the vegetative stage has lower values. On the other hand, reproductive and ripening phases have overly similar reflectance values.

Spectral Bands of PROBA and VH Backscattering
The backscattering of VH signatures over three consecutive acquisitions show that the vegetative stage has the lowest value (<−22 dB) due to water scattering from wet soil [60] ( Figure 6b). The reproductive stage tends to decrease from −18 to −21 dB, which is the same as the ripening phase but in the higher value range from −16 to −23 dB. Additionally, the bare land has a steady increase in backscattering value from −19 to −16 dB due to less biomass on the ground [61]. The SAR data show a larger separation between the reproductive and ripening phases than the optical data.
[60] (Figure 6b). The reproductive stage tends to decrease from −18 to −21 dB, which is the same as the ripening phase but in the higher value range from −16 to −23 dB. Additionally, the bare land has a steady increase in backscattering value from −19 to −16 dB due to less biomass on the ground [61]. The SAR data show a larger separation between the reproductive and ripening phases than the optical data.  Table 2 shows the accuracy of rice growth stages of the PROBA-V model has higher OA (83.87%) than the Sentinel-1 model (71.74%). The majority of rice growth stages were predicted with high accuracy (>80%), except for the ripening stage with PA (50.00%) for the PROBA-V model. The UA of the ripening class shows the least accuracy than other classes.

Accuracy of the Machine Learning Model
On the other hand, the rice growth stages of the Sentinel-1 model shows an acceptable PA and UA (>66%) for all classes except the ripening class. The highest PA was noticed with the bare land (75.00%) and the highest UA with the reproductive stage (80.77%). The vegetative stages are more likely to overlap with bare land due to limitations of Sentinel-1 detecting the wet soil from land preparation and vegetative stage.   Table 2 shows the accuracy of rice growth stages of the PROBA-V model has higher OA (83.87%) than the Sentinel-1 model (71.74%). The majority of rice growth stages were predicted with high accuracy (>80%), except for the ripening stage with PA (50.00%) for the PROBA-V model. The UA of the ripening class shows the least accuracy than other classes. On the other hand, the rice growth stages of the Sentinel-1 model shows an acceptable PA and UA (>66%) for all classes except the ripening class. The highest PA was noticed with the bare land (75.00%) and the highest UA with the reproductive stage (80.77%). The vegetative stages are more likely to overlap with bare land due to limitations of Sentinel-1 detecting the wet soil from land preparation and vegetative stage.   (Figure 7). The Sentinel-1 based maps have a high similarity of PROBA-V on 17-28 August 2017, which show the prediction model can estimate the rice growth stages with acceptable accuracy. However, the downside of the Sentinel-1 rice growth stage map is less sensitive to the ripening phase ( Figure S4). Nonetheless, the integration of two satellites data can significantly improve data continuity of the rice growth stages every 12-days (more maps are available on Supplementary Data (Figures S1-S16)).  Figure 8 shows the temporal pattern of the rice growth stage area predicted from remote sensing images. In the Indramayu regency, the bare land area increased steadily until April 2017 and then decreased. The second planting time was also captured in April-May 2017 as the vegetative stages area increased on Indramayu and Subang Regency. Conversely, the Karawang Regency had a different peak of planting time in July 2017 ( Figure 8). The harvested area in the three regencies also fluctuates and is lower than the vegetative area, except for Indramayu on April-May 2018. Figure 9 shows the planting and harvested area calculated from local statistics and remote sensing data (vegetative and the ripening regions). The temporal trend was closely aligned with both methods. However, remote sensing data overestimated the vegetative area, which was almost double the actual rice planting area. This phenomenon is due to double counting of the vegetative stage from flooding to maximum tillering, lasting for four months. The ripening phase also has a similarity of paddy harvested area in three regencies in February 2017, August 2017, March 2018, and May 2018. However, it has less area than the paddy harvested area due to the ripening stage during the last 30 days.  Figure 9 shows the planting and harvested area calculated from local statistics and remote sensing data (vegetative and the ripening regions). The temporal trend was closely aligned with both methods. However, remote sensing data overestimated the vegetative area, which was almost double the actual rice planting area. This phenomenon is due to double counting of the vegetative stage from flooding to maximum tillering, lasting for four months. The ripening phase also has a similarity of paddy harvested area in three regencies in February 2017, August 2017, March 2018, and May 2018. However, it has less area than the paddy harvested area due to the ripening stage during the last 30 days.       Figure 10 shows the comparison of rice growth stages in temporal at the six subdistricts with >7000 ha. It shows that every sub-district has its unique temporal data, but some sub-districts had similar trend and cropping pattern, such as Gantar, Kroya, and Terisi in the Indramayu Regency. Those Figure 11 shows the temporal comparison between vegetative and ripening stages and rice planting and harvested area in the sub-districts level. Most of the sub-district indicates that there is a temporal relationship between predicted data and observation data. There was some overlap between the rice planting area and vegetative stage in March-April 2017 on sub-districts such as Gantar, Kroya, and Terisi. Moreover, the vegetative stage peak was higher than the rice planting area, and the ripening stage also less than the rice harvested area such as Ciasem, Subang Regency. More details of each sub-district temporal plots can be seen on https://github.com/FadhlullahRamadhani/Remote-sensed-correlationstatistics/tree/master/Results_PROBA_S1/sd, accessed date: 13 April 2021.

Results of Cross-Correlation Analysis
Multitemporal rice growth stage maps were compared with local statistics, especially at the sub-districts level (n = 91). Figure 12a shows the correlation values of predicted values to the local statistics. The paired comparison between rice growth stages have a high correlation (>0.6), and the lag time is similar to rice cultivation time (Figure 12b). The vegetative stage area has a 5.27 day lag time with rice planting (r = 0.52, p < 0.01) and 89.47 day lag time with the harvested area (r = 0.50, p < 0.01), which is similar to the rice planting area and harvested area (Table S1). The correlation between reproductive and harvested area has a medium relationship (r = 0.57, p < 0.01) and the lag day (44.04 days), also consistent with the rice farming in the study area. However, the lag time for ripening and harvested pair is 12 days (r = 0.60, p < 0.01), which is less than the 30 days for ripening stages, indicating that the model only predicted half of the period of ripening. Figure 11. The comparison of temporal rice growth stages area (vegetative and ripening) with local statistics area (planting and harvested) for six sub-districts with rice area >7000 ha on Indramayu, Karawang, and Subang Regency from 1 January 2017-23 July 2018.
Most of the sub-districts have high correlation values such as Cikedung, Cilamaya Wetan, and Cipunagara (r > 0.7, p < 0.01). Nevertheless, some sub-districts have the least correlation value such as Gantar, Klari, and Cipeundeuy (r < 0.3, p > 0.05) (Table S1). Figure 13 shows the map of the correlation coefficient's distribution in the study area from cross-correlation analysis. It shows that most of the north of the study area has a high to medium correlation for five paired time-series, especially reproductive-harvested and ripening-harvested pair. Moreover, many sub-districts have more high correlation value on the three pair dataset, such as the reproductive-harvested, ripening-harvested, and planting-harvested dataset. However, the south part of Subang Regency has the lowest correlation for all comparison except the planting-harvested stage area, especially Cijambe, Cisalak, Ciater, Kasomalang, Jalancagak, Sagalherang, and Serapanjang sub-district.
The distribution map of lag time for three regencies is illustrated in Figure 14. The lag time for vegetative-planting is well distributed through three regencies with a lag time of 0-12 days. The distribution of lag days on vegetative-the harvested area is also similar to the distribution of local statistics (96-108 days). Only seven sub-districts are less than 72 days, particularly the Cikampek sub-district.
However, other comparisons have varied lag days, indicating the variability of the relationship between rice growth stages and local statistics tabulation. For example, the reproductive stage-harvested area pair has a varied lag time (24-60 days), where it should be around 45-60 days. Moreover, some inconsistency in lag time (0-12 days) in the reproductive stage-the harvested area should be 24-36 days in 15 subdistricts, mainly located north of Karawang and Subang.  Figure 13 shows the map of the correlation coefficient's distribution in the study area from cross-correlation analysis. It shows that most of the north of the study area has a high to medium correlation for five paired time-series, especially reproductive-harvested and ripening-harvested pair. Moreover, many sub-districts have more high correlation value on the three pair dataset, such as the reproductive-harvested, ripening-harvested,

Discussion
This study has demonstrated the capability to integrate PROBA-V and Sentinel-1 satellite images to produce a cloud-free multitemporal map from 1 July 2017-23 July 2018.
The approach was one solution to increase data availability, certainty, and consistency compared with the locally-derived statistics. The integration approach between optical and radar sensor has been demonstrated to increase the accuracy and data accessibility in other studies [39,62]. The radar sensor can generate maps in the wet season when the optical sensor failed to provide the required information. However, the accuracy of the Sentinel-1 model in the dry season is lower than the wet season due to scattering on dry land being indistinct from the vegetative stage, especially in July-August on the north of Subang and Indramayu regencies. Conversely, the PROBA-V model will give a more accurate result on the dry season, which has less cloud.
The results of this study show the integration of PROBA-V and Sentinel-1 can be one alternative to deliver rice growth stage maps in the near-real-time with high accuracy of each rice growth stage models with cloud-free data, compared with a previous study [57]. Figure 15 shows the fluctuation of the composition of the integration of two sensors. It illustrates that PROBA-V based maps have >60% in the dry season (April-September), and Sentinel-1's rice growth stages maps are more dominant in the wet season (January-February, and October-December). The advantage of fusion at the decision level is an easily implemented monitoring system than pixel-level or features-level fusion, which required pixel co-registration and high memory requirements [63].

Discussion
This study has demonstrated the capability to integrate PROBA-V and Sentinel-1 satellite images to produce a cloud-free multitemporal map from 1 July 2017-23 July 2018. The approach was one solution to increase data availability, certainty, and consistency compared with the locally-derived statistics. The integration approach between optical and radar sensor has been demonstrated to increase the accuracy and data accessibility in other studies [39,62]. The radar sensor can generate maps in the wet season when the optical sensor failed to provide the required information. However, the accuracy of the Sentinel-1 model in the dry season is lower than the wet season due to scattering on dry land being indistinct from the vegetative stage, especially in July-August on the north of Subang and Indramayu regencies. Conversely, the PROBA-V model will give a more accurate result on the dry season, which has less cloud.
The results of this study show the integration of PROBA-V and Sentinel-1 can be one alternative to deliver rice growth stage maps in the near-real-time with high accuracy of each rice growth stage models with cloud-free data, compared with a previous study [57]. Figure 15 shows the fluctuation of the composition of the integration of two sensors. It illustrates that PROBA-V based maps have >60% in the dry season (April-September), and Sentinel-1's rice growth stages maps are more dominant in the wet season (January-February, and October-December). The advantage of fusion at the decision level is an easily implemented monitoring system than pixel-level or features-level fusion, which required pixel co-registration and high memory requirements [63]. The correlation analysis on the sub-districts level shows a high similarity of the vegetative-harvested area from generated maps with planting-harvested area (Figure 12b). Other similarities also applied to vegetative-planting area and reproductive-harvested area. However, the pair of ripening and harvested areas with low similarity may be due to remote sensing maps failing to capture the harvested area in a specific time. The other explanation is that the many farmers are still harvested with manual labor, only cut half The correlation analysis on the sub-districts level shows a high similarity of the vegetative-harvested area from generated maps with planting-harvested area (Figure 12b). Other similarities also applied to vegetative-planting area and reproductiveharvested area. However, the pair of ripening and harvested areas with low similarity may be due to remote sensing maps failing to capture the harvested area in a specific time. The other explanation is that the many farmers are still harvested with manual labor, only cut half of the rice canopy to get the grain rather than all rice stem with a combined harvester, leading to false classification to ripening class in the model. Moreover, the correlation value distribution map shows that south of Subang Regency has low correlation values ( Figure 13). Based on the ground truth, the low correlation between those areas is due to different irrigation schemes with the north Subang Regency. The south of Subang Regency is mostly a rainfall-dependent area where the model can have false classification predominantly vegetative with bare land class and ripening to bare land. Moreover, the rice area on those areas is small patches in the valley or hillside where remote sensing area is difficult to capture due to interference with other canopies, such as trees on Sentinel-1 and limited pixel resolution PROBA-V.
Nevertheless, our rice growth stage maps are similar to previous studies on the north of West Java island, where rice cultivation season starts from the south of the Indramayu region in September and end on the north of Karawang Regency February [27]. Moreover, Sianturi, Jetten and Sartohadi [26] investigated that the north of the study area is prone to flood due to sea-level rise every wet season.
Our study can be compared with the work of Rudiyanto, Minasny, Shah, Che Soh, Arif and Indra Setiawan [27] using time-series Sentinel-1 data with the unsupervised method to produce rice area and monthly rice growth stages area in the same study location. The advantage of their work on rice growth stage maps is the ability to distinguish secondary non-rice patterns where our study assumed that all crop cultivation on the area is rice cultivation. However, this proposed method indicates the simplicity of the implementation classification procedure. It can be used for the near-real-time application due to high accuracy in two rice growth stage models where their work depends on local knowledge expertise. The temporal resolution of our study is 12-days, representing a significant improvement compared to Rudiyanto, Minasny, Shah, Che Soh, Arif and Indra Setiawan [27], which have a monthly period. A shorter period is preferred by stakeholders as the rice cultivation period is a short duration farming, and where the change of rice growth stages is imminent and easy to verify, it can also be complemented with crop modeling to produce rice production estimation [64].
The overall result of the present work shows that a 100 m spatial and 12-day temporal resolution period can be one of the methods for filling the data gap with other information that has been available from MODIS (250 m, 16 days revisited time), Landsat-8 OLI (30 m, 16 days revisited time), and Sentinel-2 (10 m, five days revisited time). The three mentioned methods have difficulties compared with the existing local statistics data due to cloud interference in temporal space.
This study proved that remote sensing data obtained from multiple platforms could have a beneficial impact on the prediction model accuracies (Table 2). Therefore, it can be integrated into rice growth stage mapping efforts from regional to country scales. Our methodology can be evaluated and deployed elsewhere, with other crops using new training and test datasets. The source code can be viewed on https://github.com/ FadhlullahRamadhani/Remote-sensed-correlation-statistics, accessed date: 1 April 2021. Additionally, the machine learning classifier can be changed to deep learning classifiers where previous studies may increase the accuracy and the speed of image processing [65][66][67][68].
Despite the positive result of this study, some limitations need to be acknowledged in the future. The overall accuracy still depends on the capability and size of the groundtruthing data. The classification error could be caused by the presence of mixed pixels, where some regions commonly grow two-three crops at a time. Another limitation is the official rice field area which we used as the masking area, may not be accurate in some areas, especially in the Karawang Regency, where it has a high land-use change frequency [69]. In the future, this study can be combined with a scene-based classification of rice areas using an area sampling framework to increase the accuracy of the machine learning model, thus increasing the consistency over time [70]. Furthermore, the climate can shift the cropping pattern to some extent where the farmer is unable to cultivate rice in a few years, especially in strong El-Nino season [71].

Conclusions
The rice production stakeholders for the public and private sectors need accurate information to provide supply and trade information efficiently. In this paper, we have developed a cloud-free method of mapping rice growth stages with a spatial resolution of 100 m and a 12-day periodic time with a high correlation with local statistics, making the outputs more likely to be utilized by the agriculture stakeholder due to acceptable accuracy. Additionally, this method can be implemented or combined in near-real-time and automatic rice monitoring system such as S-2 RGS to fill the missing information.
This project was developed to increase the crop information availability of the Indonesian Ministry of agriculture to provide better information for more reliable policy, especially for climate change adaptation and mitigation action planning. In the future, the integration of multiple sensors (MODIS, PROBA-V, Landsat-8 OLI, Sentinel-1, Sentinel-2, or Sentinel-3) with different resolution data can be applied to provide a cloud-free map for rice production or other crops. Moreover, the application development in the GEE environment is a preferable option due to the free, fast, and less infrastructure needed to download the images, analyses processing, and information dissemination. The future launch of Landsat 9, Sentinel-1 C and D, and freely available satellite data such as Hyperspectral Precursor of the Application Mission (PRISMA) can be used to increase data volume and variety, especially for the countries prone to crop failure due to natural climate variability and extreme weather events.