Comparing Sentinel-1 and -2 Data and Indices for Agricultural Land Use Monitoring

: Agricultural vegetation development and harvest date monitoring over large areas requires frequent remote sensing observations. In regions with persistent cloud coverage during the vegetation season this is only feasible with active systems, such as SAR, and is limited for optical data. To date, optical remote sensing vegetation indices are more frequently used to monitor agricultural vegetation status because they are easily processed, and the characteristics are widely known. This study evaluated the correlations of three Sentinel-2 optical indices with Sentinel-1 SAR indices over agricultural areas to gain knowledge about their relationship. We compared Sentinel-2 Normalized Di ﬀ erence Vegetation Index, Normalized Di ﬀ erence Water Index, and Plant Senescence Radiation Index with Sentinel-1 SAR VV and VH backscatter, VH / VV ratio, and Sentinel-1 Radar Vegetation Index. The study was conducted on 22 test sites covering approximately 35,000 ha of four di ﬀ erent main European agricultural land use types, namely grassland, maize, spring barley, and winter wheat, in Lower Saxony, Germany, in 2018. We investigated the relationship between Sentinel-1 and Sentinel-2 indices for each land use type considering three phenophases (growing, green, senescence). The strength of the correlations of optical and SAR indices di ﬀ ered among land use type and phenophase. There was no generic correlation between optical and SAR indices in our study. However, when the data were split by land use types and phenophases, the correlations increased remarkably. Overall, the highest correlations were found for the Radar Vegetation Index and VH backscatter. Correlations for grassland were lower than for the other land use types. Adding auxiliary data to a multiple linear regression analysis revealed that, in addition to land use type and phenophase information, the lower quartile and median SAR values per ﬁeld, and a spatial variable, improved the models. Other auxiliary data retrieved from a digital elevation model, Sentinel-1 orbit direction, soil type information, and other SAR values had minor impacts on the model performance. In conclusion, despite the di ﬀ erent nature of the signal generation, there were distinct relationships between optical and SAR indices which were independent of environmental variables but could be stratiﬁed by land use type and phenophase. These relationships showed similar patterns across di ﬀ erent test sites. However, a regional clustering of landscapes would signiﬁcantly improve the relationships.


Introduction
Area-wide monitoring of agricultural activities provides information of great benefit for different policy sectors. The intensity of use, productivity, and environmental status of agricultural areas are important regarding the environment and changing global conditions such as climate change and population growth. Furthermore, the Common Agricultural Policy (CAP) is the most important policy initiative of the European Union (EU) in financial terms [1]. Farmers receive subsidies if they comply with specific regulations of this policy, and national agencies must verify whether these farmers comply with the obligations. Currently, at least five percent of the payees need to be controlled on site. At a large scale, encompassing and prompt monitoring of agricultural activities via on-site inspections are not feasible, but remote sensing is a practicable option [2]. Therefore, for the upcoming CAP funding period from 2021-2027, EU Member States are encouraged to make use of satellite data to verify payment applications and establish area-wide satellite-based CAP monitoring [3].
To date, satellite-based agricultural monitoring has been hampered by the availability of dense and gap-free time series of sufficient spatial resolution. With the Copernicus program of the European Space Agency (ESA), new optical and Synthetic Aperture Radar (SAR) satellite data with a high temporal and spatial resolution are now freely available and offer a unique opportunity to support CAP monitoring. The ESA program comprises different sensor types for land monitoring including Sentinel-1 (S1), a C-band SAR satellite, and Sentinel-2 (S2), an optical satellite.
The utilization of spectral and SAR remote sensing data in vegetation monitoring is based on the fact that the electromagnetic radiation in their respective spectral range is sensitive to specific bio-physical and/or eco-physiological parameters of the plants, such as chlorophyll and water content, or the structure of leaves and plants. Due to the different spectral ranges of optical and SAR radiation, the detected biophysical plant parameters are completely different. Optical data originates from reflection and refraction characteristics of the canopy surface, and soil if the vegetation cover is below 100%, providing insights into vegetative biophysical processes. Measured radar radiation results from the backscattering of transmitted signals, which is influenced by the geometrical structure and the dielectric characteristics of the canopy and, depending on the frequency, below the canopy down into the soil [4]. Nevertheless, both types of sensors allow assessment of the same plant parameters, for example, biomass [5][6][7], crop type [8][9][10], or phenophase [11,12].
Optical remote sensing is more frequently used in agricultural vegetation monitoring than SAR remote sensing because optical data-particularly as vegetation indices-are easily approachable and have proven to be suitable for agricultural vegetation monitoring [2,13]. In many agricultural applications, solely using optical data outperforms solely using SAR data when enough images are available [14][15][16]. Unfortunately, optical remote sensing depends on cloudless skies and solar illumination, which often leads to fragmented time series. Although active SAR remote sensing images are harder to comprehend because of the ill-posed nature of SAR signal interpretation and higher noise interference than in optical remote sensing images, the biggest advantage of SAR radiation is its ability to penetrate clouds and work without illumination. Therefore, the temporal resolution is high. Furthermore, the use of SAR remote sensing data for agricultural applications is well established [17]. For different SAR satellites, radar vegetation indices have been established and used in the agricultural domain [18,19]. Efforts have been made to combine optical and radar data [9]. This was for example done by treating, S1 and S2 data independently in the first step and then feeding both into analysis frameworks such as land use classification algorithms [20][21][22][23] or together in regression analysis for biomass, leaf area index (LAI) or yield estimation or soil moisture retrieval [24][25][26][27][28][29][30][31]. By assessing the relationships between optical and SAR data, we can determine the extent to which the two data sets are interchangeable or complementary.
So far, most studies compared SAR backscatter or optical indices with crop parameters like biomass or plant height e.g., [7,32,33]. Due to the ubiquitous lack of ground truth data in remote sensing, the comparison of SAR with optical indices, which have been proven to correlate with these plant parameters, can help to generate more knowledge about the behaviour of SAR backscatter. Only few Remote Sens. 2020, 12, 2919 3 of 27 studies performed this for agricultural land use [34][35][36][37]. Veloso et al. [34] compared temporal profiles of the Normalized Difference Vegetation Index (NDVI) with S1 backscatter in single polarization and the ratio of vertical transmit, horizontal receive (VH) and vertical transmit, vertical receive (VV) backscatter with each other and with vegetation biomass. They thus worked towards a direct comparison between optical and SAR indices. They found that SAR backscatter and ratios are complementary in their temporal behaviour. Other studies only analysed the optical and SAR index relationships without comparing them to plant parameters: Filgueiras et al. [38] examined different algorithms to directly model NDVI for maize and soybeans from S1 data but did not address the underlying mechanisms. Gonenc et al. [39] compared a Radar Vegetation Index (RVI) from fully polarized Radarsat-2 with NDVI from Landsat 8 and found a good correlation. Each of the three studies focused on one study site within one region, which is not enough for area-wide applications.
With this study, we aimed to thoroughly investigate the relationship between S2 optical reflection and S1 SAR backscatter of agricultural fields to better understand the similarities and differences of the two data types for the intra-annual monitoring of large agriculturally used areas. In contrast to a case study, this experimental study shall obtain a large, generic overview of the relationships between optical and SAR indices for agricultural areas. To consider several aspects of agricultural monitoring we included four very different agricultural land use types at different phenophases by using observations of the whole growing period. We used several optical indices, and the so far rarely used S1 RVI for a very large area to be able to make statements that are as generally valid as possible. We used additional explanatory variables to understand the impact of soil type, topography, and other parameters on the said relationship and investigated the importance of the different explanatory variables. In addition to three major crop types (winter wheat, spring barley, maize), we extended the analysis to grassland, which has not been considered in such studies to date; however, with a 13% area share in Germany and 26% share globally, is an important type of agricultural land use [40,41]. We examined whether there is a generic correlation between optical indices and SAR for agricultural areas and which additional information could have an impact on this relationship. In the assessment process, we used different indices from S2 data that respond to different plant parameters: NDVI, Normalized Difference Water Index (NDWI), and Plant Senescence Radiation Index (PSRI). We examined their relationship with S1 VH and VV backscatter, the VH/VV ratio, and a S1 dual-polarization RVI.

Study Sites
The study sites are located in the federal state of Lower Saxony, Germany. For area selection, a 10 × 10 km grid of Lower Saxony was created and only grid cells fully within the land area of Lower Saxony were considered. Then, for each cell we tested whether there were at least two S2 scenes in 2018 with no clouds at the same day as a S1 scene. Of those cells, 30 were randomly selected. Within the test sites, we concentrated on four important types of land use in Central European agriculture. These were winter wheat (WW) (Triticum aestivum), spring barley (SB) (Hordeum vulgare), maize (MZ) (Zea mays), and permanent grassland (GL). These four types were chosen as they differ markedly with respect to their phenological development. We selected parcels larger than 0.5 ha for these land use types and buffered those inwards by 15 m to avoid edge effects. Test sites with less than 50 parcels were excluded. Thus, 22 study sites were finally selected. Figure 1 shows the test sites and the relative area share of the selected land use types. GL and MZ dominate most study sites in the western part of Lower Saxony. WW dominates some test sites in the eastern part of Lower Saxony. In some test sites of the eastern region, area shares of all land use types are almost equally distributed. The final data set covered 17,857 parcels on roughly 35,000 ha of agricultural land.
The study sites are located between −1.6 and 282.8 m (36.5 m mean) above sea level on mostly flat terrain with slopes between 0 and 19 • (0.8 • mean). The sites are mostly in old drift morainic landscapes. In the north, where the North Sea delimits Lower Saxony, the so-called German lowland is characterised by a maritime climate and flat topography. In the south lie low mountain ranges with a decreasing maritime influence [42]. Thus, the test sites are heterogeneous in terms of climate, elevation, and slope. In the study year of 2018, the weather was warmer (+2.1 • C) and dryer (−240 mm) than the 30-year long-term average of Lower Saxony (avg. 8.6 • C and 745 mm). To that date, 2018 was the warmest and sunniest year since the beginning of the weather records [43].

Satellite Data
The S1 and S2 images were downloaded from the Sentinel Hub for January to December 2018 [44]. For S1 A and B satellites, Interferomic Wide Swath (IW) is the pre-defined mode over land and has a swath width of 250 km. It provides dual polarization imagery in VV and VH. S1 data was downloaded as Ground Range Detected (GRD) gamma nought backscatter. We used the orthorectified option in 10 m resolution. Pre-processing steps of the Sentinel Hub include radiometric calibration and thermal noise reduction. Orthorectification is conducted with the MapZen's digital elevation model (DEM) [45]. S1 images on the Sentinel Hub are neither terrain-flattened, nor specklefiltered, nor co-registered. To minimize the resulting imprecisions, we used an object-based approach later for field-level analysis.
S2 provides 13 bands in the optical and infrared domains. The pixel size varies between 10 and 60 m depending on the spectral bands. We used visible bands 2 and 4, vegetation red-edge band 6, and near-infrared band 8. These bands have a resolution of 10 m. S2 was downloaded as Level-2A products, which were already atmospherically corrected with the Sen2Cor processor and PlanetDEM digital elevation model. We utilized only cloudless or near cloudless (<5%) S2 scenes and corresponding S1 scenes at the same date to minimize changes in external influences such as soil moisture, harvest, etc. This resulted in between 2 and 12 S1 and S2 scenes for the different test sites. Figure 2 summarizes the acquisition dates for each test site including the S1 relative orbit number and orbit direction, and the sum of acquisition dates per test site. Overall, 35 different dates were used. We call the combination of a

Satellite Data
The S1 and S2 images were downloaded from the Sentinel Hub for January to December 2018 [44]. For S1 A and B satellites, Interferomic Wide Swath (IW) is the pre-defined mode over land and has a swath width of 250 km. It provides dual polarization imagery in VV and VH. S1 data was downloaded as Ground Range Detected (GRD) gamma nought backscatter. We used the orthorectified option in 10 m resolution. Pre-processing steps of the Sentinel Hub include radiometric calibration and thermal noise reduction. Orthorectification is conducted with the MapZen's digital elevation model (DEM) [45]. S1 images on the Sentinel Hub are neither terrain-flattened, nor speckle-filtered, nor co-registered. To minimize the resulting imprecisions, we used an object-based approach later for field-level analysis.
S2 provides 13 bands in the optical and infrared domains. The pixel size varies between 10 and 60 m depending on the spectral bands. We used visible bands 2 and 4, vegetation red-edge band 6, and near-infrared band 8. These bands have a resolution of 10 m. S2 was downloaded as Level-2A products, which were already atmospherically corrected with the Sen2Cor processor and PlanetDEM digital elevation model. We utilized only cloudless or near cloudless (<5%) S2 scenes and corresponding S1 scenes at the same date to minimize changes in external influences such as soil moisture, harvest, etc. This resulted in between 2 and 12 S1 and S2 scenes for the different test sites. Figure 2 summarizes the acquisition Remote Sens. 2020, 12, 2919 5 of 27 dates for each test site including the S1 relative orbit number and orbit direction, and the sum of acquisition dates per test site. Overall, 35 different dates were used. We call the combination of a single parcel and date an "observation". The combination of parcels and satellite images resulted in a total of 92,328 observations. Remote Sens. 2020, 12, x FOR PEER REVIEW 5 of 29 single parcel and date an "observation". The combination of parcels and satellite images resulted in a total of 92,328 observations. Figure 2. Acquisition dates of S1 scenes that coincide with S2 scenes (<5% clouds) for the test sites.

Land Use Types and Phenological Information
We included the agricultural land use type and the phenophases of the vegetation in our analysis to account for the characteristic differences between land use types and phenophases: GL, WW and SB are narrow-leafed, and MZ is broad-leafed. WW is a winter crop, SB and MZ are spring crops, and GL indicates permanent cropland. The phenological information gives implicit insight into the intraannual biophysical changes within the plants and information about the soil cover.
The Land Parcel Identification System (LPIS) [46] from the Integrated Administration and Control System (IACS) provided the information on the agricultural land use and parcel delineations for this study. IACS is the management tool of CAP. LPIS is a corresponding system for the geospatial identification of agricultural plots in the EU. Some of the agricultural land use types in this study comprised several IACS codes: MZ includes maize, maize for biogas, and silage maize. GL includes non-permanent grassland (field grass) and permanent grassland (meadows and hay meadows).
The German Meteorological Service (Deutscher Wetterdienst, DWD) provides phenological observations throughout Germany for many agricultural land use types. Using the nearest DWD phenological observation for each parcel, we derived the vegetation period of the observed land use types for 2018 [47]. The date of the phenological stage "emergence" determined the start; the first harvest date plus an extra 14 days determined the end of the growing period except for GL, for which we considered the whole year (see Table 1). Furthermore, we assigned all other phenophases to each parcel-date observation. For crops, we considered the phase between sprout and flowering as "growing", the phase between flowering and ripening as "green", and the phase between ripening and harvest as "senescent biomass". For GL, we assigned "growing" to the time before 1 June 2018. The reference date of 1 June was chosen because by then the first cut had been made almost everywhere in Lower Saxony according to the phenological observations. Unfortunately, the observations only cover the first but not the following cuts. Additionally, 2018 was an extremely dry year from about mid-April, which meant in many places no significant growth could be observed after the first cut. Therefore, the remaining time of the vegetation period of GL was aggregated under "senescent biomass" instead of distinguishing between green and senescent biomass. It is possible that this categorization did not fit locally. Acquisition dates of S1 scenes that coincide with S2 scenes (<5% clouds) for the test sites.

Land Use Types and Phenological Information
We included the agricultural land use type and the phenophases of the vegetation in our analysis to account for the characteristic differences between land use types and phenophases: GL, WW and SB are narrow-leafed, and MZ is broad-leafed. WW is a winter crop, SB and MZ are spring crops, and GL indicates permanent cropland. The phenological information gives implicit insight into the intra-annual biophysical changes within the plants and information about the soil cover.
The Land Parcel Identification System (LPIS) [46] from the Integrated Administration and Control System (IACS) provided the information on the agricultural land use and parcel delineations for this study. IACS is the management tool of CAP. LPIS is a corresponding system for the geospatial identification of agricultural plots in the EU. Some of the agricultural land use types in this study comprised several IACS codes: MZ includes maize, maize for biogas, and silage maize. GL includes non-permanent grassland (field grass) and permanent grassland (meadows and hay meadows).
The German Meteorological Service (Deutscher Wetterdienst, DWD) provides phenological observations throughout Germany for many agricultural land use types. Using the nearest DWD phenological observation for each parcel, we derived the vegetation period of the observed land use types for 2018 [47]. The date of the phenological stage "emergence" determined the start; the first harvest date plus an extra 14 days determined the end of the growing period except for GL, for which we considered the whole year (see Table 1). Furthermore, we assigned all other phenophases to each parcel-date observation. For crops, we considered the phase between sprout and flowering as "growing", the phase between flowering and ripening as "green", and the phase between ripening and harvest as "senescent biomass". For GL, we assigned "growing" to the time before 1 June 2018. The reference date of 1 June was chosen because by then the first cut had been made almost everywhere in Lower Saxony according to the phenological observations. Unfortunately, the observations only cover the first but not the following cuts. Additionally, 2018 was an extremely dry year from about mid-April, which meant in many places no significant growth could be observed after the first cut. Therefore, the remaining time of the vegetation period of GL was aggregated under "senescent biomass" instead of distinguishing between green and senescent biomass. It is possible that this categorization did not fit locally. We used a DEM with a resolution of 10 × 10 m provided by the Federal Agency for Cartography and Geodesy, from which slope and aspect were calculated. Elevation was measured in meters above sea level, slope in degrees, and aspect was classified into 9 non-ordered classes (one class is "flat"). Elevation, aspect, and slope have indirect influences on plant growth and development due to temperature and solar radiation differences and run-off [48,49]. The S1 data was not terrain flattened and thus DEM-induced distortions of the backscatter might be diminished with the presence of DEM information.
Furthermore, soil types from a small-scale soil map (1:1,000,000 [50]) were reclassified to amalgamate different soil types to broader classes based on soil type and geological parent material. The respective properties of the soil type, such as bulk density, water holding capacity, hydraulic conductivity, porosity, and pore size distribution, can influence plant development. In addition, organic soils are frequently characterized by a high degree of water saturation, which might explain site-specific differences between SAR and optical data [51,52]. By adding named variables that influence vegetation parameters, we aimed to investigate their relevance by comparing optical and SAR indices.

Pre-Processing
From S2 data, we calculated three well-established vegetation indices because it is advised to use different indices depending on the vegetation development itself [7,53]. The NDVI is the most commonly used vegetation index that is related with various plant parameters, including green biomass, LAI, and canopy photosynthesis [54]. NDVI is a function of the green chlorophyll content and the cell structure of the plant leaves [55]. For S2 data, it was calculated with the red band 4 (0.665 µm) and near infrared band 8 (0.842 µm) (see Table 2 for original formulas, adaptations, and origin). NDWI is sensitive to liquid water content of the vegetation canopy or soil moisture [55][56][57]. It is composed of the near infrared and the middle infrared bands. For S2 data it was calculated using band 8 (0.665 µm) and band 11 (1.610 µm). In addition to NDVI, PSRI is sensitive to senescent biomass [58,59]. The blue band 2 (0.550 µm), red band 4, and the red edge band 6 (0.740 µm) of S2 were used for calculation. NDVI, NDWI, and PSRI are all related to biophysical processes of the vegetation.
We used the radar vegetation index (RVI) for comparison with the optical indices. Kim and van Zyl [60,61] developed the RVI for quad-polarized SAR data. Trudel et al. [62] modified the index to dual-polarized SAR data with the assumption that σ 0 HH ≈ σ 0 VV and σ 0 HV ≈ σ 0 HH [63]. Kumar et al. [19] found that this RVI is less sensitive to environmental condition changes such as soil moisture than single polarization backscatter and therefore can be advantageous for vegetation monitoring. Nasirzadehdizaji et al. [64] recently proposed an adaptation of the modified index for S1 data based on the assumptions of Charbonneau et al. [63]: (1) Remote Sens. 2020, 12, 2919 In addition, we examined the single polarization backscatter in VV and VH polarizations individually, in addition to the ratio of both (VH/VV). The latter was found to be applicable for crop monitoring [34].
Merzlyak et al. [58] For an object-based approach at field level, we calculated the following zonal statistics from all pixels within each parcel for every index at every date: median, lower quartile (25th percentile, Q25), upper quartile (75th percentile, Q75), and standard deviation (stdv). The single polarizations VH and VV were afterwards converted to decibel (dB). For elevation and slope, we used the median value, and for aspect and soil type, we calculated the modal value per parcel. Afterwards we excluded parcels containing NDVI values below −0.1 because negative NDVI values normally correspond to water, ice or snow, and thus are regarded as outliers.
For a first overall inspection of the input data, we calculated the average value per index or backscatter value for each date and land use type over all areas. Additionally, we calculated the lower and upper quartile.

Correlations and Regressions
We applied a stepwise approach to investigate the relationships between the SAR and optical data by first conducting a simple linear correlation and then adding auxiliary data in a multiple linear regression analysis (MRA) with R software (version 4.0.0).
In the first step, we tested the correlations of all optical and SAR index pairs. We applied the bivariate Pearson correlation model to the median value at field level. The Pearson correlation coefficient (r) was used to evaluate the strength and direction of the relationship between S1 and S2 parameter pairs. We evaluated r according to Cohen [66]: r < ±0.1 mean no correlation, r < ±0.3 means low correlation, r < ±0.5 means medium correlation and ≥ ±0.5 means high correlation. The sample sizes were always large (n > 250), so even low correlations yielded significant results [67]. In addition, as we could not assume that our models were completely and correctly specified, and statistical tests such as the t-test or p-value only test on sampling errors, we did not perform any significance tests [68].
We conducted two runs of correlation: (i) Correlation on data set split by land use type and (ii) correlation on data set split by land use type and phenophase. For comparison, we also ran exponential functions in the form of y = a·e x·b . The parameters a and b were determined with the nls function from the stats package (version 4.0.0) in R.
In the second step, we conducted an MRA adding auxiliary data as regressor variables to implement regression analyses. We added the test site id as a distinct spatial variable, land use type, phenophase, elevation, slope, aspect, soil type, and S1 sensing direction (ascending or descending). Of these, test site id, land use type, phenophase, aspect, soil type, and S1 sensing direction were nominally scaled and treated as factors. Additionally, Q25, median, Q75, and stdv of the pixels of each parcel for the SAR indices were included: Median optical~M edian SAR + Q25 SAR + Q75 SAR + Stdv SAR + Test Site ID + Elevation + Slope + Aspect + Soil Type + Land Use Type + Phenophase + Orbit Direction. The MRA was evaluated with the adjusted R 2 to include the number of useless variables in the evaluation.
To gain knowledge about the importance of variables, we calculated the contribution of each variable to R 2 when included last (i.e., usefulness) [69]. This means that all other variables were already included in the MRA and the change in R 2 due to adding the considered variable last was able to be observed. Usefulness does not express the same concept as relative importance or the decomposition of R 2 but is a reliable method when the regressor variables include factor variables [70]. We implemented this with the relaimpo package in R [70,71]. For better comprehension of variable importance, we first grouped all SAR variables (Q25, Q75, median, stdv), all bio-physical variables (elevation, aspect, slope, soil type), and land use type and phenophase. Test site id and orbit direction were treated individually. Then we disaggregated the most important groups to identify the most relevant variables. Finally, we reduced the model to the most relevant variables according to the explanatory value.
The MRA was conducted in three runs: 1.
Run: MRA including all observations from all land use types and phenophases.
a. Assessment of the variable importance analysis for grouped variables. b.
Assessment of the variable importance analysis for individual variables. c.
Reduction of the model to the most relevant variables.

2.
Run: Reduced MRA for data split by land use type.

3.
Run: Reduced MRA for data split by land use type and phenophase.
The R 2 values of the three runs could not be compared directly because, due to variance decomposition, dividing data into groups can change the mean value of the respective groups in such a way that this alone is sufficient to improve the R 2 . However, to compare the different runs and to clarify whether dividing the data by land use type, or by land use type and phenophase, improved the results, we calculated the Residual Sum of Squares (RSS) for each model for all runs. Summing the RSS of all models by index pair for each run made it possible to compare the performance of the runs directly.

Temporal Profiles of Optical and SAR Indices by Land Use Type
We investigated the relationship of different S1 and S2 indices over agricultural areas. The temporal profiles of the indices give an overview about the development of all tested parameters for each land use type (Figure 3).
In general, the profiles show temporal patterns that can be clearly linked to the phenophases. The crops (MZ, SB, WW) show typical bell-shaped curves for NDVI and NDWI over the growing period. They begin to increase in April for MZ and SB starting from very low values (NDVI ≈ 0.2 and NDWI ≈ −0.2). Due to a lack of cloudless images before April in 2018, WW could not be studied over the complete growing season. Thus, NDVI and NDWI start at higher values in April (NDVI ≈ 0.5 and NDWI ≈ 0.0). NDVI and NDWI increase until July, mid-June, or June for MZ, SB, and WW respectively. level off or start to decrease gradually. With the beginning of the senescent phase, NDVI and NDWI decrease rapidly for all crops until they reach the initial value of NDVI of ≈0.2 and NDWI of ≈−0.2. The PRSI of the crops runs conversely to NDVI and NDWI. It decreases during the growing phases of the crops from PSRI ≈ 0.2, plateaus during the green phase at ≈ 0 and rapidly increases during the senescent phase up to PSRI ≈ 0.4. Ratio and RVI curves generally develop similarly to NDVI and NDWI for the crops. They increase during the growing phase, level off or slightly decrease during the green phase, and then strongly decrease during senescence. For MZ, ratio and RVI fluctuations are more pronounced. During the growing phase, there is a harsh decrease in both series, which is not apparent in the optical indices. In addition, ratio and RVI decrease more rapidly during the green phase than the optical indices for MZ. Converse to this pattern, for SB they level off far into the senescence phase. For MZ, the VH evolution is very similar to NDVI. It does not show the harsh decrease during growing like ratio and RVI. The SB VH pattern shows similarities with the optical indices during growing and senescence but not during the green phase. VH of WW even decreases during growing and only increases during the green phase. It also decreases strongly during senescence. For MZ, the VV curve shows no similarities with the optical indices during the growing season. For SB and WW, the VV values also decrease similarly to PSRI during the growing phase, temporarily increase during the green phase, and then, like NDVI and NDWI, decrease during senescence.
The first acquisitions of GL in April start at NDVI ≈ 0.6 and NDWI ≈ 0.1, respectively, and increase from this date. After the first cut, which occurred from around mid-May to 1 June, NDVI and NDWI tend to decrease with short-term fluctuations, whereby NDWI decreases strongly. PSRI again shows a reverse pattern compared to NDVI and NDWI. The ratio and RVI values for GL also increase until the end of the growing phase but fluctuate more often and show some short-term peaks during senescence. Visually, their fluctuations are not in line with the optical indices. The highest peaks in VH of GL (beginning of July, beginning of September, and beginning of October) overlap with peaks in the optical indices but otherwise VH is not like the courses of the optical indices; instead the general course is more comparable to RVI and ratio. The peaks of VH for GL area are also detected in VV backscatter and are much more prominent.

Correlation Analysis between Optical and SAR Indices
3.2.1. Correlations between Optical and SAR Indices by Land Use Type

Correlations between Optical and SAR Indices by Land Use Type
We correlated all optical indices with all SAR indices differentiated by land use type resulting in 12 index pairs per type. For all index pairs, we calculated the Pearson correlation coefficients (r) ( Table 3). Depending on the land use type, the correlation heights differed strongly. The hierarchy from highest to lowest correlation was MZ, WW, SB, and GL. The height and direction of the correlations between optical and SAR indices also differed depending on the index pairings. RVI or VH always had the highest linear correlations with the optical indices. Pairings with VV backscatter were always markedly worst. The relationships between the SAR indices with NDVI and NDWI were positive, while the relationships with PSRI were negative. There were no major differences in r between the optical indices for the same SAR index.
For each optical index and its respective best correlating SAR index, Figure 4 presents the scatterplots by land use type. Using data from the complete growing season of the land use types leads to a wide scattering of optical and SAR values. Visually, the relationships of optical and SAR index pairs appear to be non-linear, but the results of the exponential model calculations show that in most cases exponential models did not improve the relationship between the index pairs (Table A1). Since the exponential models could not better describe the relationships between optical and SAR indices, we do not consider them further in the following.

Correlations between Optical and SAR Indices by Land Use Type and Phenophase
In a second step of the correlation analysis, the data were further separated based on the phenophase. This increased most correlations except for SB and WW during the green phase (Table 4). Again, RVI or VH always showed the highest correlations with the optical indices for all land use types and phenophases.
The best optical and SAR index pairs in terms of r are shown as scatterplots Figure 5. Although the data was split by phenophase, the ranges of the data values of both optical and SAR indices are still wide and mostly cover the whole data range. Saturation effects of the SAR indices can visually be observed for these land use type-phenophase combinations: growing phase of MZ and senescent phase of all land use types except for WW.   There are no data for grassland (GL) during the green phase. Plots with RVI have a header in light grey and plots with VH in dark grey. NDVI has a header in light blue, NDWI in dark blue, and PSRI in green.

Multiple Regression Analysis
For the MRA, multiple linear regression models were fitted to the median values of the three optical indices. The models contained four SAR backscatter variables and different auxiliary variables as regressors. This allows us to understand which variables influence the regression models.

Multiple Linear Regression Analysis for All Optical and SAR Indices
Results of run 1a, which included all observations from all land use types and phenophases, are depicted in Figure 6. It shows the total R 2 per model and the share of the total R 2 of the variable groups for the MRA models of each optical and SAR index pair. For most of the models, the total R 2 of all MRA ranged between 36% and 42%. For these models, the SAR variables improved the R 2 the most by an increase of 75% to 86%, followed by the land use type and phenophase information and then the test site id. For these models, biophysical parameters and the orbit direction improved the R 2 only slightly. The exceptions were regressions with VV, with an overall R 2 below 18%, which was mostly explained by land use type and phenophase information.
Dissecting the models further by splitting the variable groups into individual variables in run 1b revealed that Q25 is the most relevant SAR variable, followed by the median backscatter ( Figure  A1). Subsequent to these, the phenophase information was more relevant than the land use type information. For most index pairs, the test site id was more relevant than land use type information. There are no data for grassland (GL) during the green phase. Plots with RVI have a header in light grey and plots with VH in dark grey. NDVI has a header in light blue, NDWI in dark blue, and PSRI in green.

Multiple Regression Analysis
For the MRA, multiple linear regression models were fitted to the median values of the three optical indices. The models contained four SAR backscatter variables and different auxiliary variables as regressors. This allows us to understand which variables influence the regression models.

Multiple Linear Regression Analysis for All Optical and SAR Indices
Results of run 1a, which included all observations from all land use types and phenophases, are depicted in Figure 6. It shows the total R 2 per model and the share of the total R 2 of the variable groups for the MRA models of each optical and SAR index pair. For most of the models, the total R 2 of all MRA ranged between 36% and 42%. For these models, the SAR variables improved the R 2 the most by an increase of 75% to 86%, followed by the land use type and phenophase information and then the test site id. For these models, biophysical parameters and the orbit direction improved the R 2 only slightly. The exceptions were regressions with VV, with an overall R 2 below 18%, which was mostly explained by land use type and phenophase information.
Dissecting the models further by splitting the variable groups into individual variables in run 1b revealed that Q25 is the most relevant SAR variable, followed by the median backscatter ( Figure A1). Subsequent to these, the phenophase information was more relevant than the land use type information. For most index pairs, the test site id was more relevant than land use type information. The stdv and Q75 were not relevant. Therefore, in the last step, we reduced the regressors to median, Q25, land use type, phenophase, and test site id, which worsened the R 2 by at most 2%. The stdv and Q75 were not relevant. Therefore, in the last step, we reduced the regressors to median, Q25, land use type, phenophase, and test site id, which worsened the R 2 by at most 2%.

Multiple Linear Regression Analysis for All Optical and SAR Indices by Land Use Type
Based on the results of the first run, in run 2, the MRA model was reduced to Medianoptical ~ MedianSAR + Q25SAR + Phenophase + Test Site ID. The land use type parameter was not used in the model because the data was split by land use type. The explanatory power of the regression model improved for all index pairs in this run compared to the first run (see Table 5). The allocation of the respective best SAR index did not change when compared to the simple linear correlation analysis (compare Tables 3 and 5): VH and RVI always led to the highest regressions. Considerable improvements could be found for WW and VV backscatter. The MRA model could explain up to 42% of the variance of PSRI, whereas there was no correlation for the simple linear model (see Table 3).
As in the previous run, we also examined the variable importance for this run ( Figure A2). We looked at the variables individually and not in groups. The phenophase information improved the R 2 for all optical and SAR index pairs. Additionally, the test site id was highly relevant for GL and MZ models.

Multiple Linear Regression Analysis for All Optical and SAR Indices by Land Use Type
Based on the results of the first run, in run 2, the MRA model was reduced to Median opticalM edian SAR + Q25 SAR + Phenophase + Test Site ID. The land use type parameter was not used in the model because the data was split by land use type.
The explanatory power of the regression model improved for all index pairs in this run compared to the first run (see Table 5). The allocation of the respective best SAR index did not change when compared to the simple linear correlation analysis (compare Tables 3 and 5): VH and RVI always led to the highest regressions. Considerable improvements could be found for WW and VV backscatter. The MRA model could explain up to 42% of the variance of PSRI, whereas there was no correlation for the simple linear model (see Table 3). As in the previous run, we also examined the variable importance for this run ( Figure A2). We looked at the variables individually and not in groups. The phenophase information improved the R 2 for all optical and SAR index pairs. Additionally, the test site id was highly relevant for GL and MZ models.

Multiple Linear Regression Analysis for All Optical and SAR Indices by Land Use Type and Phenophase
For run 3 the MRA model was further reduced to Median optical~M edian SAR + Q25 SAR + Test Site ID, and the data was split by land use type and phenophase. R 2 again increased for most of the phenophases compared to the data set only split by land use type. Table 6 shows the R 2 of all models by optical and SAR index pair, land use type, and phenophase. The highest R 2 was achieved for the senescent phase of MZ (R 2 = 71%) with NDWI~VH. Table 6. Adjusted R 2 in percent from multiple linear regression for optical and SAR vegetation indices differentiated by land use type and phenophase. The highest R 2 per optical index is in bold, the highest R 2 per phenophase and land use type is underlined. The analysis of the variable importance confirmed the relevance of the test site id. It was even more important than the SAR variables for GL during the growing phase and SB during the green phase (see Figure A3). For the other models, the SAR variables were the most relevant. However, for each model, it was always the case that only one of the two SAR variables-Q25 or median-explicitly improved the model. Adding the other variable added little information because median and Q25 were correlated. The described patterns of variable importance did not differ between the optical indices. Generally, the higher the R 2 of the final model the more useful were the SAR variables, and the lower the R 2 the more useful was the test site id.

Land Use
To compare the three runs of the MRA directly, we calculated the RSS for each model and summed them by index pair and run. We took the first run as the baseline and calculated the relative changes of the other runs. When splitting the data by land use type in run 2, RSS was reduced by 5% to 12% depending on the index pair compared to the first run. The third run reduced RSS by 13% to 26% (see Table 7). Thus, splitting the data by land use type, or splitting by land use type and phenophase, improved the MRAs.

Discussion
The overall correlations between optical and SAR indices showed large differences and variability depending on the considered conjugated indices. Besides, the strength of the correlations varied according to the examined land use type or phenophase and generally increased with a higher level of data splitting. For the interpretation of results, we first discuss the differences and similarities of the physicochemical nature of signal origin in optical and SAR data and their impact on the correlations. Then we discuss the variability in correlations between SAR and optical indices. Furthermore, we discuss the differences in correlations between different land use types and phenophases. Finally, we discuss the extent to which auxiliary variables can support the explanation of relations between optical and SAR data.
The signals that are recorded by optical and SAR systems evolve from overall different principles and interaction of electromagnetic radiation with objects on the ground. Thus, a purely statistical correlation, rather than a causal relationship, between them can be determined and assessed. Each of the three optical indices analysed was relatively strongly correlated with the SAR indices in our study. The correlation of the SAR indices with NDVI can be related to biomass as the confounding factor. Although optical and SAR radiation are influenced by different chemical and physical parameters of vegetation, the back radiation allows conclusions to be drawn about the same superordinate parameters. Both the optical NDVI and SAR signals are sensitive to vegetation biomass. The sensitivity of the NDVI to biomass is due to the sensitivity of the radiation to the green chlorophyll content. Furthermore, although optical radiation cannot penetrate the vegetation stand and is only influenced by the upper part of the stand, many studies show high correlations of NDVI and biomass for agricultural vegetation stands [7,72]. SAR radiation, in turn, penetrates the plant stand and is scattered back depending on, amongst other factors, polarization and biomass, and therefore also allows conclusions to be drawn about the biomass of the stand [32,33]. Since both NDVI and SAR signals are influenced by biomass, NDVI correlated with the SAR indices. The confounding factor between NDWI and SAR backscatter is the VWC. Different studies have shown the correlation of both NDWI and SAR indices with VWC [32,33,73,74]. The middle infrared radiation of the NDWI reflects changes in VWC and the near infrared reflects leaf dry matter content [75]. SAR data can be used for vegetation water content and soil moisture estimation because it is sensitive to the dielectric constant of the irradiated surface [76][77][78]. Because both types of radiation are sensitive to VWC, they are correlated for the land use types considered. The relationships of PSRI with SAR indices are less specific. The PSRI represents the degradation of chlorophyll over carotenoids, resulting in yellowing of the plants during senescence [58]. Also, senescence also sees a loss in VWC [79]: fresh biomass and its associated water content converge to dry biomass without water in senescent plants [80]. This VWC change is detectable with SAR backscatter. The outer plant structure changes little due to senescence and can therefore not explain a change in SAR backscatter. Although the underlying mechanisms of PSRI and SAR backscatter have fundamentally different origins, the superior parameter of the senescent biomass is nevertheless captured by both, which leads to their correlation. A comparison of the S1 indices with the complete combination of S2 spectral band ratios in the future could help to understand the relationships between SAR and optical data [81].
In this study, either the RVI or VH backscatter had the strongest correlation with the optical indices. SAR backscatter from vegetated surfaces is a combination of three scattering mechanisms: Canopy scattering or direct scattering from the vegetation, backscatter from the soil, and backscatter from the soil-plant interaction [82]. VH backscatter is dominated by stem-ground double scattering and volume scattering and is therefore sensitive to crop structure within the total canopy [83]. This is because VH backscatter measures the repolarization from the vertical linear wave into the orthogonal horizontal polarization [84]. Cross-polarizations are also less sensitive to planting row direction or look direction than co-polarizations and might, therefore, be more robust for vegetation monitoring [85]. In dry conditions, such as those occurring in 2018, the contribution of ground-vegetation interaction to VH backscatter may decrease compared with the volume contribution leading to an increase in VH backscatter [34].
In our study, VV backscatter was least correlated with the optical indices. VV backscatter of vegetated surfaces is mostly dominated by canopy and ground contribution and is also influenced by soil moisture [83,86]. Furthermore, VV is affected by the interaction of incidence angle and crop row-direction-at least for crop residue-which might also be true for the growing phase of crops [87,88]. Other studies also found less value of VV backscatter for retrieval of plant parameters such as biomass, LAI, or plant height compared to other polarizations [59,[89][90][91]. Generally, however, there is no consensus about the correlation between VV and biomass [11].
The combination of several SAR polarizations can improve the performance of agricultural biomass, LAI, or crop height monitoring [18,33,34,60,[92][93][94]. Several polarizations were either included in a physical model or were added to a ratio or an index due to its reduced sensitivity for environmental influences such as soil moisture. According to these studies, both backscatter signals, VH, and VV, are attenuated with increasing biomass or crop height, and VH is attenuated less than VV. Therefore, the ratio of VH and VV increases with increasing biomass. In our study, the RVI always outperformed the simple VH/VV ratio. Nasirzadehdizaji et al. [63] found no significant differences in the relationship of VH/VV ratio or S1 RVI with crop height. Kim et al. [18] stated that the quad-polarization RVI introduced by Kim and van Zyl [64] is correlated with NDVI and VWC, but they did not test and compare the correlation of NDVI and VWC with other SAR indices. We assume that one reason for the advantage of the RVI over the VH/VV ratio could be the principle of the RVI formula, which puts a stronger weight on the VH backscatter.
The results of the correlation analysis improved most when the data was split by land use type and by phenophase. Depending on the considered land use type, different SAR indices had the highest correlations with the optical indices. This was also found by Nasirzadehdizaji et al. [63]. However, within one land use type or phenophase, the same SAR indices always showed the highest correlation with the optical indices. The differences in the relationships by land use type can be attributed to differences in canopy structure: First, MZ is a broad-leaf crop and SB and WW are narrow-leaf crops-thus, the contribution of the leaves to the backscatter is higher for MZ, leading to a higher sensitivity of VH than VV to changes [6]. Furthermore, broad-leaf crops are more sensitive to VWC [33]. Second, the land use types assessed in our study differ substantially in growth height. VV backscatter attenuates more with increasing height than VH and is therefore less sensitive to large MZ stands. Third, WW and SB have a predominantly vertical structure in comparison to MZ, therefore causing less volume scattering [6]. In conclusion, VV is less sensitive to stand changes in MZ. With decreasing relevance of VV for MZ, the relevance of the RVI also decreases [95]. Compared to the three crop types, correlations between optical and SAR indices were weak for GL. There are several reasons for this: Cultivated GL, like the plots in our study, does not have a pronounced seasonal cycle. Pronounced changes in GL stands are normally due to anthropogenic management rather than different natural phenophases. Therefore, time series of remote sensing signals can look very different, even for adjacent GL parcels, due to different management practices or intensities. Moreover, 2018 was a very dry year. In many places, there was no relevant biomass accumulation after the first cut. Therefore, we assume that soil influences in the SAR signal were very high during the senescent phase. Small precipitation events had temporary and substantial influences on the soil conditions. This could explain short-term peaks in VH and VV time series for GL [78]. These peaks were not as prominent in ratio and RVI. Other studies confirmed that SAR indices are less affected by fluctuations in soil moisture compared to single polarization data [11,19].
Optical and SAR indices showed different levels of correlation depending on phenophase. High correlations between optical and SAR indices could be found for most crops and phenophases, except for the green phases of SB and WW. This could be since differences in vegetation development, such as heading or flag leave development, changes the structure of the vegetation canopy surface without a major impact on the biomass. Therefore, although the SAR signal is influenced, the optical signal remains relatively constant [32,83]. Furthermore, the dry weather in 2018 could have resulted in an untypical green phase with little greening and many dry areas in WW and SB. For the senescent phase of all land use types, VH backscatter had the highest correlations with optical indices. With decreasing water content, the backscattered signal contains more scattering from lower parts of the plants and the soil contribution increases, which leads to a general decrease in backscatter values [96]. This is in line with decreasing NDVI and NDWI and increasing PSRI. Additionally, VV backscatter is more influenced by the soil in general. The increasing presence of soil during the senescent phase, therefore, reduces the explanatory power of the VV backscatter. Vegetation development of MZ did not appear to hamper the relationships of optical and SAR indices.
In our study, using multiple linear regressions always improved the performance of the relationship between SAR and optical data compared to simple linear regressions. Generally, the most important regressors were, in descending relevance, the SAR variables, the phenophase and land use type, and the test site id. The test site ID is a site-specific variable. Its importance for the model implies that spatially close observations behave similarly. Furthermore, statistical methods can be site-specific [97,98] and thus adding a site-variable can improve remote sensing models [99]. The relevance of the agricultural land use type is in line with the paper of Kim et al. [100], who found that the land use type plays a crucial role in the relationship between VWC and the quad-polarization RVI they investigated. Other studies investigating SAR backscatter behaviour in agricultural landscapes also consider crops separately e.g., [32][33][34]. The high relevance of the phenophase is due to major changes in the plant structure, biomass volume, and differences in VWC and chlorophyll content. Veloso et al. [34] found a good agreement between NDVI and the VH/VV ratio in their study on the temporal behaviour of crops in general, but depending on the phenophase, the signals were not always congruent; for example, during the senescent phase of maize and the growing season of barley and wheat. Other studies also found that specific phenological developments, such as flowering without major biomass increase, lead to differences in the backscatter signal, which can have a stronger influence on the signal than biomass growth [33,92,94]. In addition, in optical remote sensing the relationship between biomass and the optical signal has been reported to change depending on the phenophase [101]. The relevance of the lower quartile of SAR values for some land use types and phenophases shown in our study is like the results of Harfenmeister et al. [32] who found that minimum backscatter values show higher sensitivity to vegetation parameters than maximum values. They reasoned that maximum backscatter values are continuously increasing without a distinct maximum value, whereas the minimum values had a defined lowest value. Unlike initially expected, information on soil, aspect, slope, and elevation did not improve the regressions. Nevertheless, the topography of most test sites in our study is relatively flat and thus the causal influences of aspect, slope, and elevation on the indices assessed might be minor. Furthermore, the S1 orbit direction information did not improve the relationships despite the results of other studies that proved the impact of incidence angle and acquisition time on the backscatter signal of land use types [11,83,97,[102][103][104][105]. However, the best results were achieved with VH or the RVI, which are less affected by incidence angles and planting direction.
Other studies suggested using exponential regression instead of linear regression models when relating SAR indices with vegetation parameters, and report saturation effects of the SAR indices at certain crop heights or certain LAIs e.g., [32,34]. In our study, exponential correlations were not able to better describe the relationships between optical and SAR indices compared to linear correlations. Nevertheless, certain saturation of the SAR values in scatterplots could be detected visually for a few land use types and phenophases. Because optical indices tend to saturate at certain biomasses e.g., [106], the saturation effect of the SAR indices might have been levelled out.
Even with the auxiliary data in the MRA, our models could only explain about 37% to 41% of the variance of the optical indices, when all land use types and phenophases were considered together. Thus, the generic correlation between optical and SAR indices for the land use types was low. A problem in our study might have been the large data set that induced high variances of values and low control of what is happening in the fields. For example, it is known that the LPIS parcel boundaries and information can be flawed [107]. Visual spot tests, for example, revealed that some LPIS parcels used in this study consisted of more than one cultivation or management practice. Additionally, the DWD phenological observations have a low spatial resolution and can only be used as approximations of the real phenology of our test sites. This is especially true for GL, because we had no information on management practices and intensity of the parcels.

Conclusions
The objective of this study was to enhance the understanding of the relationship between optical and SAR indices over agricultural areas represented by winter wheat, spring barley, maize, and grassland. We explored the correlations between three optical indices (NDVI, NDWI, and PSRI) and SAR indices (VH and VV backscatter, VH/VV ratio, and RVI) of four different agricultural land use types over 22 10 × 10 km sites in Lower Saxony considering growing, green and senescent phenophases.
There was no generic correlation between optical and SAR indices in our study. However, when the data were split by land use type and phenophase, the correlations increased remarkably but differed between land use type and phenophases. The relationships were based on the fact that each radiation type reacts to the same superordinate vegetation parameters, such as biomass or VWC, although the underlying mechanisms of radiation backscatter are different. The correlations of the indices for grassland and the green phases of VV and SB were lower than those for the other land use types and phenophases. This was likely due to the abnormally dry weather conditions in 2018. RVI and VH both outperformed backscatter ratio and VV backscatter in this study. Therefore, the RVI, which has been little explored in previous research, showed a high potential for applications of agricultural vegetation monitoring.
The results of our study were independent of external environmental conditions (e.g., soil type, topography). However, relationships were site-specific. Hence, a transferability of results to other regions or crop types needs more in-depth research.
In the future, the high correlations between the optical and SAR indices could be exploited to model optical indices from SAR indices, for example, to fill gaps in optical satellite time series. Although the results of such modelling might not be perfect, they could still have the potential to represent relative changes in vegetation stands accurately enough that abrupt changes become visible. This would be enough, for instance, to identify harvest dates. At the same time, however, the optical and SAR indices do not correlate to such an extent that they would be completely interchangeable. The joint use of both types of data still offers many opportunities for agricultural monitoring, because it allows more information to be acquired on the ground than would be possible using only one data source.

Conflicts of Interest:
The authors declare no conflict of interest.
Appendix A Table A1. R 2 in percent of exponential regression calculations. The highest R 2 per optical index is in bold, the highest R 2 per phenophase and land use type is underlined.       Figure A3. Variable importance of the multiple linear regression analysis for all optical index and SAR index pairs by land use type and phenophase.