Evaluation of the Consistency of Simultaneously Acquired Sentinel-2 and Landsat 8 Imagery on Plastic Covered Greenhouses

Remote sensing techniques based on medium resolution satellite imagery are being widely applied for mapping plastic covered greenhouses (PCG). This article aims at testing the spectral consistency of surface reflectance values of Sentinel-2 MSI (S2 L2A) and Landsat 8 OLI (L8 L2 and the pansharpened and atmospherically corrected product from L1T product; L8 PANSH) data in PCG areas located in Spain, Morocco, Italy and Turkey. The six corresponding bands of S2 and L8, together with the normalized difference vegetation index (NDVI), were generated through an OBIA approach for each PCG study site. The coefficient of determination (r2) and the root mean square error (RMSE) were computed in sixteen cloud-free simultaneously acquired image pairs from the four study sites to evaluate the coherence between the two sensors. It was found that the S2 and L8 correlation (r2 > 0.840, RMSE < 9.917%) was quite good in most bands and NDVI. However, the correlation of the two sensors fluctuated between study sites, showing occasional sun glint effects on PCG roofs related to the sensor orbit and sun position. Moreover, higher surface reflectance discrepancies between L8 L2 and L8 PANSH data, mainly in the visible bands, were always observed in areas with high-level aerosol values derived from the aerosol quality band included in the L8 L2 product (SR aerosol). In this way, the consistency between L8 PANSH and S2 L2A was improved mainly in high-level aerosol areas according to the SR aerosol band.


Introduction
The rapid growth of the world population over the past 70 years is requiring higher agricultural yields [1]. This increase in agricultural production is being partially supported by plastic covered agriculture (also known as plasticulture or protected agriculture) [2], including greenhouse, medium or low tunnels and plastic mulch. Plasticulture has shown to be a reliable option to move up the first harvest and increase the yield of crops [2,3]. These advantages have increased the use of plastic materials worldwide during the past decades [4]. The total area of greenhouse-based agriculture in the world was around of 30,190 km 2 in 2016, mainly localized in China (91.4%), Korea (1.9%),

Study Sites
The four study sites (SS) are located in agricultural greenhouse areas from Spain (SS1), Morocco (SS2), Italy (SS3) and Turkey (SS4) (Figure 1a). All of them have a rectangular shape with an area of ca. 8000 ha (8000 × 10,000 m) and the topography is quite flat.  SS1 study site is in the province of Almeria (Southern Spain). It is centered on the geographic coordinates (WGS84) 36.7824 • N and 2.6867 • W (Figure 1b), being just at the core of the greatest concentration of PCG in the world. The most representative greenhouse horticultural crops are tomato, pepper, cucumber, aubergine, melon and watermelon.
SS2 is located in the Souss-Massa plain, in Agadir (Morocco), where more than 10,000 ha are dedicated to under greenhouse horticultural crops, being the tomato the most representative crop (50% of the total PCG area). The center of this study site is on the geographic coordinates (WGS84) 30.1478 • N and 9.4386 • W (Figure 1c).
SS3 is in the Apulia Region (Southern Italy), close to Bari. It is centered on the geographic coordinates (WGS84) 41.0166 • N and 16.9119 • E (Figure 1d). In this area, there is a monoculture in vineyard, where they grow using the traditional grape cultivation system characterized by a supporting structure covered with plastic sheets in spring and summer.
SS4 is in Kumluca district of Antalya, Turkey, which includes intensive greenhouse areas. It is centered on the geographic coordinates (WGS84) 36.3622 • N and 30.2931 • E (Figure 1e). SS4 presents a similar production model as SS1 and SS2, being addressed to the horticultural crops.

Datasets and Pre-Processing
Sixteen cloud-free multi-temporal and simultaneously acquired L8 OLI (Landsat Collection 1 Level-1 TOA (L1T) and Landsat Collection 1 Level-2 BOA (L2) products) and S2 MSI L2A (BOA) images (S2 data includes both Sentinel-2A and 2B images [20]) taken in 2019 were freely downloaded from the U.S. Geological Survey (USGS) website through the EarthExplorer tool [29] and the European Space Agency (ESA)-Copernicus Scientific Data Hub tool [31], respectively (Table 1). Six pairs of these L8 and S2 scenes were taken on the study site SS1, three on SS2, two on SS3 and five on SS4. These were the only pairs taken the same day during 2019 in the study areas. The information about the L8 (Path and Row) and S2 (tile and orbit) used for each study site can be found in Figure 1a. Each of the pairs of L8 and S2 scenes were taken the same day, with the difference between their acquisition times ranging from 11 min and 5 s to 48 min and 36 s. All the L8 (L1T and L2) and S2 L2A scenes were downloaded with a dynamic range of 12 bits.
The L8 OLI L2 product is considered as the official surface reflectance product for L8 images. It is generated through the LaSRC code [30] at a 30 m Ground Sample Distance (GSD), containing seven MS bands (B1-Coastal aerosol (C, 430-450 nm), B2-Blue (B, 452-512 nm), B3-Green (G, 533-590 nm), B4-Red (R, 533-590 nm), B5-Near infrared (NIR, 851-879 nm), B6-Shortwave infrared-1 (SWIR1, 1566-1651 nm) and B7-Shortwave infrared-2 (SWIR2, 2107-2294 nm)). It also contains three quality bands consist of the pixel quality, radiometric saturation and Surface Reflectance aerosol (SR aerosol). The first two quality bands are copied verbatim from the L8 L1T product, while SR aerosol band is delivered with L8 L2 product to provide low-level detail  pixel values) about the aerosol retrieval that may have influenced the final product result. The SR aerosol bit values higher than 194 are qualified as high-level aerosol [42]. L8 L2 product is not run for a scene with a solar zenith angle greater than 76 • , and the efficacy of surface reflectance correction will be likely reduced in areas where atmospheric correction is affected by adverse conditions such as hyper-arid or snow-covered regions (bright areas), low sun angle conditions, coastal regions where land area is small relative to adjacent water and areas with extensive cloud contamination [42].
The S2 L2A is the official product from S2 images. It provides surface reflectance images divided in 100 × 100 km 2 UTM/WGS84 projected tiles. This product has been systematically generated using Sen2Cor processor [28] over Europe since March 2018, and was extended to a global scale in December 2018. The S2 MSI sensor collects up to thirteen bands with three different geometric resolutions ranging from 60 m to 10 m GSD. In this study, we mainly used the six original image bands downloaded with the S2 L2A product at 20 m resolution that match the spectral configuration of the OLI sensor, i.e., B2-Blue (B, 458-523 nm), B3-Green (G, 543-578 nm), B4-Red (R, 650-680 nm), B8a-Narrow NIR (NNir, 848-881 nm), B11-shortwave infrared-1 (SWIR1, 1565-1655 nm) and B12-shortwave infrared-2 (SWIR2, 2100-2280 nm). The RGB S2 images at 10 m GSD were also used to extract ground control points to perform the co-registration process. The other product from L8 OLI used in this work was the L8 L1T one, with the same MS 30 m GSD bands aforementioned for L8 L2 (but referred to TOA values) and one panchromatic band (B8: PAN, 503-676 nm) with 15 m GSD. Considering that the original 30 m spatial resolution of L8 is not suitable to resolve individual fields in many agricultural landscapes [43], particularly in PCG areas, the fusion of the information of PAN and MS L8 L1T bands through applying the PANSHARP module implemented into Geomatica v. 2018 (PCI Geomatics, Richmond Hill, ON, Canada) was applied. The L8 L1T pansharpened images were generated at 15 m GSD and 12-bit depth for the six bands that match the spectral configuration of S2 MSI (B, G, R, NIR, SWIR1 and SWIR2). The next processing step consisted of performing an atmospheric correction to the L8 L1T pansharpened images by applying the ATCOR module, based on the MODTRAN (MODerate resolution atmospheric TRANsmission) radiative transfer code [44] included in Geomatica v. 2018. It is important to note that the cloud masking, water masking and haze removal processes were omitted at this step. The goal was to attain pansharpened L8 15 m GSD images containing surface reflectance values (L8 PANSH) in order to be compared with L8 L2 or S2 L2A data. More details about this step can be found in Aguilar et al. [10].
The images used in this work (Table 1) were clipped according to their corresponding study area. After that, the simultaneously acquired products (i.e., L8 L2, S2 L2A and L8 PANSH) were co-registered using a first order polynomial transformation computed on 10 planimetric ground control points evenly distributed over each working area and extracted from the S2 L2A 10 m GSD images. The alignment between L8 (L2 and PANSH) and S2 images was performed for each acquisition time. Bearing in mind that the study sites were areas with a fairly flat topography, the co-registration turned out to be excellent. In fact, the residual offsets ranged between 2.52 and 4.20 m in terms of planimetric root mean square error (RMSExy), similar values to those reported by Stumpf et al. [45].

Inter-Sensor Radiometric Coherence
Atmospheric correction is especially important in those cases where multi-temporal or multi-sensor images are analyzed [35]. On this basis, a study of inter-sensor radiometric (BOA) coherence between S2 (S2 L2A 20 m GSD) and L8 (L8 L2 30 m GSD and L8 PANSH 15 m GSD) products simultaneously acquired for sixteen dates and four study sites (Table 1) was performed.
The inter-sensor radiometric coherence was evaluated using 100 well distributed polygons for each study site, all of them located inside of individual PCG. This number of polygons was already successfully used by Aguilar et al. [10], being representative of the kind of existing greenhouses in each study area. Furthermore, each polygon was manually digitized, adapting to the shape of each greenhouse and leaving an inside buffer of at least 10 m with respect to the boundaries of the corresponding greenhouse. This technique tried to avoid potentially mixed pixels located at the borders of the sampled PCG, which is a very usual issue when working on medium-resolution satellite imagery. Moreover, the digitization process was performed looking at the three products tested and trying to be representative for all of them. For instance, Figure 2 shows three of the polygons (P1, P2 and P3) digitized in Agadir (SS2) on L8 PANSH (Figure 2a for SS1, SS2, SS3 and SS4, respectively (i.e., covering more than two pixels of the L8 L2 30 m product). It is noteworthy that the explained process forced one to choose only large enough greenhouses to digitize a polygon inside. In this sense, it was difficult to find 100 samples, especially in SS4, where the average size of the greenhouses was quite small.
The mean surface reflectance values for the six common L8 and S2 bands were extracted from the 100 polygons and the three types of data tested (i.e., S2 L2A, L8 L2 and L8 PANSH) for each study site. In addition, and in the same way as in other investigations [34,38,46], the effects of image consistency on the normalized difference vegetation index (NDVI) [47] values were also studied in this work, since it is the most used spectral index in agriculture applications. At this point, it is important to note Remote Sens. 2020, 12, 2015 7 of 22 that the S2 MSI sensor is capable of capturing two NIR bands, Near Infrared (B8) and Narrow NIR (B8a), with the Relative Spectral Response Function (RSRF) of B8a being much more similar to the L8 NIR band [33,35,36]. Because of that, the S2 B8a band was used in this work for comparison with the L8 NIR band (B5).
Remote Sens. 2020, 12, x FOR PEER REVIEW 7 of 30 (20 m and 15 m GSD, respectively) were halved four times to obtain a final GSD of 1.25 m and 0.937 m, respectively. In the case of L8 L2 30 m GSD, five halvings were needed to obtain a final GSD of 0.937 m. In this way, the segmentation for each image fit better to the original polygons regardless of the original product GSD. Finally, the mean surface reflectance values, expressed as a percentage, for each of the six L8 and S2 common bands and for the three tested products were computed using all the pixels (with an enhanced spatial resolution of about 1 m) within the OBIA segments. NDVI was also computed for each polygon, date and study site, using the mean values attained from Red (B4 for both for S2 and L8 datasets) and NIR bands (B5 for L8 and B8a for S2).

Comparison Metrics
The use of coefficient of determination (r 2 ) or coefficient of correlation (r), and root mean square error (RMSE) or root mean square difference (RMSD), is very usual when dealing with radiometric consistency studies between satellite sensors [34][35][36][37][38][39]46]. In this case, and according to Padró et al. [35], the comparison between surface reflectance values from the different satellite data simultaneously acquired was carried out by computing r 2 and RMSE, the last expressed in reflectance units (%) when comparing bands. The two radiometric coherence measures were computed for each date and study site according to two procedures: (i) Including all the bands together to provide a synthetic view of the consistency between products; (ii) band-by-band to count on a detailed view of the particular behavior of each band.  Object based image analysis (OBIA) Trimble eCognition Developer v. 9.4 software was employed for the extraction of the mean surface reflectance values in each polygon from L8 and S2 products. To this end and for each study site, the chessboard segmentation algorithm included in eCognition was applied on a thematic layer composed of a previously digitized vector file with 100 polygons representing PCGs. Using this approach, the software only projects the vector file onto the images in order to obtain a boundary adapted to the pixels (sample) to be extracted inside the polygons. One eCognition project was conducted for each image product in the sixteen dates showed in Table 1, halving recursively (i.e., dividing each original pixel into four derived pixels) the original GSD on the created project tab in order to increase the spatial resolution of the images without altering the original surface reflectance digital values. This method, devised to apply a very detailed segmentation included in a vector file for medium resolution images using an OBIA approach, was already carried out by Aguilar et al. [10]. In this way, the S2 L2A and L8 PANSH original resolutions (20 m and 15 m GSD, respectively) were halved four times to obtain a final GSD of 1.25 m and 0.937 m, respectively. In the case of L8 L2 30 m GSD, five halvings were needed to obtain a final GSD of 0.937 m. In this way, the segmentation for each image fit better to the original polygons regardless of the original product GSD.
Finally, the mean surface reflectance values, expressed as a percentage, for each of the six L8 and S2 common bands and for the three tested products were computed using all the pixels (with an enhanced spatial resolution of about 1 m) within the OBIA segments. NDVI was also computed for each polygon, date and study site, using the mean values attained from Red (B4 for both for S2 and L8 datasets) and NIR bands (B5 for L8 and B8a for S2).

Comparison Metrics
The use of coefficient of determination (r 2 ) or coefficient of correlation (r), and root mean square error (RMSE) or root mean square difference (RMSD), is very usual when dealing with radiometric consistency studies between satellite sensors [34][35][36][37][38][39]46]. In this case, and according to Padró et al. [35], the comparison between surface reflectance values from the different satellite data simultaneously acquired was carried out by computing r 2 and RMSE, the last expressed in reflectance units (%) when comparing bands. The two radiometric coherence measures were computed for each date and study site according to two procedures: (i) Including all the bands together to provide a synthetic view of the consistency between products; (ii) band-by-band to count on a detailed view of the particular behavior of each band.

Inter-Sensor Radiometric Coherence between L8 Products
The first inter-sensor radiometric comparison was performed between the L8 PANSH 15 m GSD images and the official L8 L2 30 m GSD product. Figure 3 depicts an excellent goodness-of-fit (r 2 ) between the surface reflectance values from these two L8 products, showing that the pre-processing of L8 L1T images (i.e., pansharpening, co-registration and ATCOR atmospheric correction) provided a high coherence level with the official L8 L2 product in our unique study landscape constituted of PCG land cover. While the band-by-band comparison for each date yielded variable results, the best correlation was usually attained for SWIR1 and SWIR2 bands. On the other hand, the most important discrepancies were found in the visible region, mainly in the blue band.
Analyzing the dates with the worst consistency between L8 L2 and L8 PANSH products, such as SS1 January 9, SS2 July 16 or SS4 December 18, it was found that those polygons presenting higher surface reflectance discrepancies were always in areas with high-level aerosol (pixel values ranging from 194 to 228) derived from the aerosol quality band included in L8 L2 product (SR aerosol). Figure 4 shows this clear correspondence in a detailed area extracted from SS1 on January 9. Note that the polygon below (P5 in Figure 4), located in a zone of high aerosol (mean value of 225.44 inside this polygon), presented mean reflectance values of 10.01% and 18.83% for the blue band in L8 L2 and L8 PANSH, respectively. By contrast, the polygon above (P4 in Figure 4), located in a low-level aerosol area (98.63 inside this polygon), showed very similar surface reflectance values in the blue band for the two studied products (25.96% and 24.60% for L8 L2 and L8 PANSH, respectively). In the case of the L8 L2 product, the decrease mainly affected the blue reflectance values, but also the green and red bands, pointing to how high SR aerosol values can explain the abnormal display of the RGB image ( Figure 4b). These anomalous reflectance values are not present in L8 PANSH ( Figure 4d) and S2 L2A products ( Figure 4a). Moreover, significantly better r 2 values are achieved if only the polygons with SR aerosol values lower than 194 are considered (i.e., free of high-level of aerosol according to SR aerosol band) ( Figure 5). The cases depicted in Figure 5 were exclusively those having at least one polygon with high-level of aerosol. It is worth highlighting that the particular distribution of the polygons over each test site can quantitatively affect the outcome of comparison results. However, the main goal here was to detect why some PCGs showed reduced values of reflectance for the visible bands of the L8 L2 images. In that sense, the OBIA approach based on a reduced sample of 100 polygons (corresponding to individual greenhouses) for each study site allowed us to study, in a very detailed way, their radiometric characteristics, giving us the opportunity to detect unique coherence problems.
between the surface reflectance values from these two L8 products, showing that the pre-processing of L8 L1T images (i.e., pansharpening, co-registration and ATCOR atmospheric correction) provided a high coherence level with the official L8 L2 product in our unique study landscape constituted of PCG land cover. While the band-by-band comparison for each date yielded variable results, the best correlation was usually attained for SWIR1 and SWIR2 bands. On the other hand, the most important discrepancies were found in the visible region, mainly in the blue band.     Analyzing the dates with the worst consistency between L8 L2 and L8 PANSH products, such as SS1 January 9, SS2 July 16 or SS4 December 18, it was found that those polygons presenting higher surface reflectance discrepancies were always in areas with high-level aerosol (pixel values ranging The worse results in NIR, SWIR1 and SWIR2 bands in SS3 (Figure 5c) could be related to the high spectral homogeneity of the PCGs in an area dedicated to a vineyard monoculture where the reflectance values varied in a very short range. In fact, the best consistency statistics in terms of RMSE (0.860%) and r 2 (0.995) were achieved in SS3 when all the surface reflectance values (L8 PANSH vs. L8 L2) for the six common bands and the total dates per study site over the 100 polygons (i.e., 3600 data points in SS1, 1800 in SS2, 1200 in SS3 and 3000 in SS4) were used together. RMSE values of 1.635%, 1.348% and 1.277% and r 2 of 0.989, 0.989, 0.987 were attained for SS1, SS2 and SS4, respectively. Figure 6 shows the representation of all surface reflectance values for each study site (3600 data entries in SS1, 1800 in SS2, 1200 in SS3 and 3000 in SS4) from simultaneous official products of S2 (S2 L2A 20 m GSD) and L8 (L8 L2 30 m GSD). The coefficient of determination (r 2 ) indicates the goodnessof-fit between S2 L2A and L8 L2 surface reflectance data (black color straight line), while the corresponding RMSE estimates the disagreement between the surface reflectance position represented by grey crosses, and the theoretically perfect fit (1:1 line in red). In this case, both the RMSE and r 2 values turned out to be quite worse than those obtained for the consistency study between L8 PANSH and L8 L2 products. Obviously, two different sensors (S2 and L8) were compared in this section, but the results turned out to be very variable depending on the study site. The best results were achieved in SS3 (Figure 6c), where the PCG land cover was very homogeneous, without   The worse results in NIR, SWIR1 and SWIR2 bands in SS3 (Figure 5c) could be related to the high spectral homogeneity of the PCGs in an area dedicated to a vineyard monoculture where the reflectance values varied in a very short range. In fact, the best consistency statistics in terms of RMSE (0.860%) and r 2 (0.995) were achieved in SS3 when all the surface reflectance values (L8 PANSH vs. L8 L2) for the six common bands and the total dates per study site over the 100 polygons (i.e., 3600 data points in SS1, 1800 in SS2, 1200 in SS3 and 3000 in SS4) were used together. RMSE values of 1.635%, 1.348% and 1.277% and r 2 of 0.989, 0.989, 0.987 were attained for SS1, SS2 and SS4, respectively. Figure 6 shows the representation of all surface reflectance values for each study site (3600 data entries in SS1, 1800 in SS2, 1200 in SS3 and 3000 in SS4) from simultaneous official products of S2 (S2 L2A 20 m GSD) and L8 (L8 L2 30 m GSD). The coefficient of determination (r 2 ) indicates the goodness-of-fit between S2 L2A and L8 L2 surface reflectance data (black color straight line), while the corresponding RMSE estimates the disagreement between the surface reflectance position represented by grey crosses, and the theoretically perfect fit (1:1 line in red). In this case, both the RMSE and r 2 values turned out to be quite worse than those obtained for the consistency study between L8 PANSH and L8 L2 products. Obviously, two different sensors (S2 and L8) were compared in this section, but the results turned out to be very variable depending on the study site. The best results were achieved in SS3 (Figure 6c), where the PCG land cover was very homogeneous, without any bright areas with very high reflectance values as there were in the other three study sites (see Figure 1d). The following two best-fit study sites were SS2 and SS4 (Figure 6b,d), while the worst consistency between the official products of S2 and L8 was registered in SS1 (Figure 6a). In fact, several reflectance values from S2 L2A greatly overestimated the L8 L2 ones in SS1, even having a couple of grey crosses with surface reflectance values higher than 100%. However, there are also greenhouses presenting the opposite effect (i.e., much higher reflectance values in L8 L2 than in S2 L2A).

Consistency between L8 and S2 Products
Remote Sens. 2020, 12, x FOR PEER REVIEW 11 of 30 any bright areas with very high reflectance values as there were in the other three study sites (see Figure 1d). The following two best-fit study sites were SS2 and SS4 (Figure 6b,d), while the worst consistency between the official products of S2 and L8 was registered in SS1 (Figure 6a). In fact, several reflectance values from S2 L2A greatly overestimated the L8 L2 ones in SS1, even having a couple of grey crosses with surface reflectance values higher than 100%. However, there are also greenhouses presenting the opposite effect (i.e., much higher reflectance values in L8 L2 than in S2 L2A). A more in-depth band-by-band analysis can be performed using Tables 2-5 (study sites SS1, SS2, SS3 and SS4, respectively). Overall, the higher the band wavelength, the higher the coherence between S2 L2A and L8 L2 surface reflectance data. Thus, better consistency (low RMSE values) was achieved for SWIR1 and SWIR2 bands, while the lowest consistency was attained, in this order, for blue, green, red and NIR. It is noteworthy that the RSRF of the SWIR bands are the most similar ones for L8 and S2A [35]. However, there were two exceptional dates in SS1 in which the coherence for SWIR bands turned out to be extremely low in terms of both RMSE and r 2 (June 18, 2019) or only r 2 (June 25, 2019). This unexpected result seemed to be related to the sun glint effects as we will discuss later.
It is important to point out that the values of r 2 were not indicative of coherence between surface reflectance values since they only referred to relative adjustments between data (precision A more in-depth band-by-band analysis can be performed using Tables 2-5 (study sites SS1, SS2, SS3 and SS4, respectively). Overall, the higher the band wavelength, the higher the coherence between S2 L2A and L8 L2 surface reflectance data. Thus, better consistency (low RMSE values) was achieved for SWIR1 and SWIR2 bands, while the lowest consistency was attained, in this order, for blue, green, red and NIR. It is noteworthy that the RSRF of the SWIR bands are the most similar ones for L8 and S2A [35]. However, there were two exceptional dates in SS1 in which the coherence for SWIR bands turned out to be extremely low in terms of both RMSE and r 2 (18 June 2019) or only r 2 (25 June 2019). This unexpected result seemed to be related to the sun glint effects as we will discuss later.
It is important to point out that the values of r 2 were not indicative of coherence between surface reflectance values since they only referred to relative adjustments between data (precision estimation). In fact, high r 2 values could be associated with high RMSE figures if bias is present [34]. For instance, Figure 6a,d depicts a quite good relative fit (black trend line). However, the coherence (adjustment to 1:1 line in red) between surface reflectance values from the two sensors, quantitatively given by the RMSE values, was not so good.  Figure 7 shows the coherence between the L8 PANSH and S2 L2A for all surface reflectance values in each study site. The results were slightly better than those depicted in Figure 6, though they point to a similar trend. L8 PANSH images used in our PCG study areas yielded surface reflectance values slightly more coherent with the S2 L2A ones than the L8 L2 official product. This fact is mainly linked to the removal, in the case of L8 PANSH images, of the surface reflectance artifacts detected in the L8 L2 product associated with points with high-level aerosol values provided by the L8 L2 SR aerosol quality band. Furthermore, the better spatial resolution of L8 PANSH (15 m GSD) against L8 L2 (30 m GSD) may also have increased the improvement achieved.    Figure 7 shows the coherence between the L8 PANSH and S2 L2A for all surface reflectance values in each study site. The results were slightly better than those depicted in Figure 6, though they point to a similar trend. L8 PANSH images used in our PCG study areas yielded surface reflectance

NDVI Comparison
The compared results of NDVI values for L8 L2 vs. S2 L2A and L8 PANSH vs. S2 L2A are plotted for SS1 (Figure 8a,e), SS2 (Figure 8b,f), SS3 (Figure 8c,g) and SS4 (Figure 8d,h). They show that S2 and L8 NDVI values are generally consistent, especially when L8 PANSH images were used. Indeed, L8 PANSH images corrected the low values of surface reflectance in the red band observed in the L8 L2 product just in areas affected by high-level of aerosol. In this way, the NDVI values derived from L8 PANSH were more coherent with those from S2 L2A. When L8 PANSH vs. S2 L2A NDVI values were compared, r 2 values ranging from 0.982 to 0.992 were attained. Moreover, extremely good values were also achieved in terms of RMSE, ranging from 0.046 to 0.018.
values slightly more coherent with the S2 L2A ones than the L8 L2 official product. This fact is mainly linked to the removal, in the case of L8 PANSH images, of the surface reflectance artifacts detected in the L8 L2 product associated with points with high-level aerosol values provided by the L8 L2 SR aerosol quality band. Furthermore, the better spatial resolution of L8 PANSH (15 m GSD) against L8 L2 (30 m GSD) may also have increased the improvement achieved.

NDVI Comparison
The compared results of NDVI values for L8 L2 vs. S2 L2A and L8 PANSH vs. S2 L2A are plotted for SS1 (Figure 8a,e), SS2 (Figure 8b,f), SS3 (Figure 8c,g) and SS4 (Figure 8d,h). They show that S2 and L8 NDVI values are generally consistent, especially when L8 PANSH images were used. Indeed, L8 PANSH images corrected the low values of surface reflectance in the red band observed in the L8 L2 product just in areas affected by high-level of aerosol. In this way, the NDVI values derived from L8 PANSH were more coherent with those from S2 L2A. When L8 PANSH vs. S2 L2A NDVI values were compared, r 2 values ranging from 0.982 to 0.992 were attained. Moreover, extremely good values were also achieved in terms of RMSE, ranging from 0.046 to 0.018.

Sunglint Effects on PCG Areas
Turning to the specific cloud-free scenes taken in SS1 on June 18 and 25, they showed extremely poor coherence statistics (e.g., r 2 values in Table 2 for all the six bands of 0.340 and 0.600, respectively). Regarding these particular cases, it can be verify that several sun glint effects could be detected when sunlight was reflected off specific plastic sheet surface covering greenhouses at approximately the same angle as the satellite was looking at this surface. Thus, the component of sensor-received radiance due to specular reflection of light from the plastic surface could be much greater than the real one. In this regard, the two pairs taken in SS1 during June 2019 were affected by sun glint effects, which can be appreciated in Figure 9.

Sunglint Effects on PCG Areas
Turning to the specific cloud-free scenes taken in SS1 on June 18 and 25, they showed extremely poor coherence statistics (e.g., r 2 values in Table 2 for all the six bands of 0.340 and 0.600, respectively). Regarding these particular cases, it can be verify that several sun glint effects could be detected when sunlight was reflected off specific plastic sheet surface covering greenhouses at approximately the same angle as the satellite was looking at this surface. Thus, the component of sensor-received radiance due to specular reflection of light from the plastic surface could be much greater than the real one. In this regard, the two pairs taken in SS1 during June 2019 were affected by sun glint effects, which can be appreciated in Figure 9. Remote Sens. 2020, 12, .97)). The red and blue circles point to a greenhouse showing a high glint effect (also marked in the same colors in Figure  10). The range (maximum to minimum) and the mean values (in brackets) of reflectance (%) for RGB in the marked greenhouse are shown.
For June 18, 2019, sun glint problems can be made out over a few greenhouses in the S2 L2A image (Figure 9b), while these specular reflections were totally avoided in the case of L8 PANSH (Figure 9a). The reflectance values corresponding to this date for S2 L2A and L8 PANSH can be seen in Figure 10a. The greenhouse marked with the red circle in Figure 9b, affected by sunlight reflection, presented the highest reflectance value in the S2 L2A product corresponding to 138.79% for the SWIR2 band (Figure 10a), while this same greenhouse had SWIR2 reflectance values of 25.42% and 25.82% for L8 L2 and L8 PANSH, respectively. While the greatest differences were detected in the SWIR1 and SWIR2 bands, there were also errors in the visible bands (see RGB reflectance values in Figure 9). These radiometric differences between scenes due to sun glint led to poor coherence results. By contrast, on June 25, 2019 (Figure 10b), the SWIR2 reflectance values for this greenhouse were  Figure 10). The range (maximum to minimum) and the mean values (in brackets) of reflectance (%) for RGB in the marked greenhouse are shown.
For 18 June 2019, sun glint problems can be made out over a few greenhouses in the S2 L2A image (Figure 9b), while these specular reflections were totally avoided in the case of L8 PANSH (Figure 9a). The reflectance values corresponding to this date for S2 L2A and L8 PANSH can be seen in Figure 10a. The greenhouse marked with the red circle in Figure 9b, affected by sunlight reflection, presented the highest reflectance value in the S2 L2A product corresponding to 138.79% for the SWIR2 band (Figure 10a), while this same greenhouse had SWIR2 reflectance values of 25.42% and 25.82% for L8 L2 and L8 PANSH, respectively. While the greatest differences were detected in the SWIR1 and SWIR2 bands, there were also errors in the visible bands (see RGB reflectance values in Figure 9). These radiometric differences between scenes due to sun glint led to poor coherence results. By contrast, on 25 June 2019 (Figure 10b), the SWIR2 reflectance values for this greenhouse were 87.50% for L8 L2 (blue circle in Figure 9c) and 35.57% in the S2 L2A case. It is important to note that the sun glint effect over plastic sheets was related to the sensor viewing geometry and its relationship with the sun position, always located southeast of the area of interest for the L8 and S2 images used here ( Table 1). In that sense, the L8 PANSH image shown in Figure 9a was acquired from the Landsat path 199 (see Figure 1), leaving the sun on its back and, therefore, minimizing the sun glint effects. On the contrary, the S2 L2A scene (Figure 9b) was captured just in front of the sun position from the orbit R094 (see Figure 1), thus causing sun glint problems in some plastic roofs with a specific tilt in relation to the sun position and elevation. Just the opposite happened in Figure 9c,d, where the S2 image was acquired from orbit R051 (leaving the sun on its back) while the L8 product was taken in front of the sun from the Landsat path 200 (Figure 1). In all the other study sites, the simultaneously acquired pairs were downloaded from only a centered path in the case of L8 and from two S2 orbits (the sun in front and the sun behind orbits). In SS2, only the S2 L2A image taken in January was acquired from the bad orbit (i.e., theR037 orbit situated in front of the sun) (see Figure 1). For SS4, the S2 image taken in November was the only one acquired from the orbit situated in front of the sun (R107). In the case of the SS3 study site, there was one S2 image taken on July from the bad orbit (R079). Clear effects of radiometric differences in PCGs due to solar reflections could not be seen in SS2, SS3 and SS4. However, it was possible to make out a few radiometric changes in bright areas, for instance, in some industrial buildings located in the industrial estate of Carpuso (Bari, SS3) in the pair taken on July 1, 2019.
In light of these findings, we strongly recommend using scenes taken from orbits located on the same side as the sun (i.e., R051 and 199 for L8 or S2A in SS1, R137 in SS2, R036 in SS3 and R064 in SS4), especially when working with high sun elevations (from April to September). Figure 11 depicts a direct relationship between the RMSE values (S2 L2A vs. L8 PANSH) for all bands computed for each of the sixteen simultaneous scenes and the sun elevation at each date. Note that the highest RMSE values indicating poor coherence were obtained for the scenes taken from April to September in all the study sites, except in SS3 (Bari). In this period the sun elevations ranged from 52 to 70 degrees. the sun glint effect over plastic sheets was related to the sensor viewing geometry and its relationship with the sun position, always located southeast of the area of interest for the L8 and S2 images used here ( Table 1). In that sense, the L8 PANSH image shown in Figure 9a was acquired from the Landsat path 199 (see Figure 1), leaving the sun on its back and, therefore, minimizing the sun glint effects. On the contrary, the S2 L2A scene (Figure 9b) was captured just in front of the sun position from the orbit R094 (see Figure 1), thus causing sun glint problems in some plastic roofs with a specific tilt in relation to the sun position and elevation. Just the opposite happened in Figure 9c,d, where the S2 image was acquired from orbit R051 (leaving the sun on its back) while the L8 product was taken in front of the sun from the Landsat path 200 (Figure 1). In all the other study sites, the simultaneously acquired pairs were downloaded from only a centered path in the case of L8 and from two S2 orbits (the sun in front and the sun behind orbits). In SS2, only the S2 L2A image taken in January was acquired from the bad orbit (i.e., theR037 orbit situated in front of the sun) (see Figure 1). For SS4, the S2 image taken in November was the only one acquired from the orbit situated in front of the sun (R107). In the case of the SS3 study site, there was one S2 image taken on July from the bad orbit (R079). Clear effects of radiometric differences in PCGs due to solar reflections could not be seen in SS2, SS3 and SS4. However, it was possible to make out a few radiometric changes in bright areas, for instance, in some industrial buildings located in the industrial estate of Carpuso (Bari, SS3) in the pair taken on July 1, 2019.
In light of these findings, we strongly recommend using scenes taken from orbits located on the same side as the sun (i.e., R051 and 199 for L8 or S2A in SS1, R137 in SS2, R036 in SS3 and R064 in SS4), especially when working with high sun elevations (from April to September). Figure 11 depicts a direct relationship between the RMSE values (S2 L2A vs. L8 PANSH) for all bands computed for each of the sixteen simultaneous scenes and the sun elevation at each date. Note that the highest RMSE values indicating poor coherence were obtained for the scenes taken from April to September in all the study sites, except in SS3 (Bari). In this period the sun elevations ranged from 52 to 70 degrees.
The worst consistency was registered in SS1, just in the simultaneous images taken on June 18, 2019, showing an RMSE value of 13.717% ( Figure 11) for a sun elevation of 67.73 (L8) to 71.63 (S2) degrees. In this date, the high sun elevation, together with taking the image from the bad S2 orbit and an extremely bright study area (Figure 1b), led to these remarkably poor results. In contrast, SS3 had a very homogeneous PCG land cover with a single monoculture and without plastic sheets painted white (whitewashed) during summer to protect plants from excessive radiation and to reduce heat inside the PCG [48], so being considered as a low bright PCG area (Figure 1d). The worst consistency was registered in SS1, just in the simultaneous images taken on 18 June 2019, showing an RMSE value of 13.717% ( Figure 11) for a sun elevation of 67.73 (L8) to 71.63 (S2) degrees. In this date, the high sun elevation, together with taking the image from the bad S2 orbit and an extremely bright study area (Figure 1b), led to these remarkably poor results. In contrast, SS3 had a very homogeneous PCG land cover with a single monoculture and without plastic sheets painted white (whitewashed) during summer to protect plants from excessive radiation and to reduce heat inside the PCG [48], so being considered as a low bright PCG area (Figure 1d).
Remote Sens. 2020, 12, x FOR PEER REVIEW 17 of 30 Figure 11. The relationship between the sun elevation and the coherence of S2 L2A vs. L8 PANSH measured as RMSE (all the six bands) for each date and study site.

Discussion
While an excellent overall radiometric coherence between L8 L2 and L8 PANSH (r 2 > 0.987, RMSE < 1.635%) was achieved, some important discrepancies were found in the visible region (mainly in the blue band). These discrepancies were related to areas with high-level aerosol values flagged by the SR aerosol L8 L2 quality band. Aerosols can alter the signal intensity that is reflected or emitted from a surface and eventually received by the satellite sensor [49]. In general, the shorter the wavelength, the stronger the scattering by aerosols and gaseous molecules [49,50]. Thus, the uncertainties of the visible bands are much greater than the NIR and SWIR bands as reported by Li et al. [46]. According to the product guide of L8 LaSRC [42], the L8 OLI Band 2 (Blue) and Band 1 (Coastal aerosol) should not be used for analysis, since they are already used within the algorithm to perform aerosol inversion tests, making them potentially unreliable.
When the official surface reflectance products from S2 and L8 (S2 L2A vs. L8 L2) were compared, the differences resulted to be bandwidth dependent (they were higher in the blue band and gradually decreased for longer wavelengths). This finding was already reported in several works [35,46,51]. Similar r 2 values of 0.900 and 0.929 were reported by Vuolo et al. [52] and Padró et al. [35], assessing the same official products from S2 and L8. The reflectance values were usually overestimated in S2 L2A with respect to those provided by L8 L2, especially for high surface reflectance values. Note that, after the first results of ACIX initiative, a new version of the S2A RSRF was released by ESA in December 2017, particularly changing the responses of bands B1 (Coastal aerosol), B2 (Blue) and B8 (NIR) [40]. According to Zhang et al. [36], data from L8 and S2 have different spectral resolution and, therefore, their data cannot be reliably used together. In addition, the works dealing with sensors consistency usually compared L8 observations collected within ±1 day of the S2 [46,51] even more [34]. This factor was crucial considering that the higher was the time distance between compared images, the weaker was the correlation between S2 and L8 [34]. We tried to avoid this problem only using simultaneously acquired pairs of both sensors.
There were four main factors that increased the radiometric discrepancies registered by the two sensors in PCG areas: (i) Aerosol level, (ii) solar elevation, (iii) sun glint and, (iv) bright areas.
A highly accurate atmospheric correction is required for deriving the surface reflectance products from L8 and S2 TOA reflectance data, due to the scattering and absorbing effects of atmospheric aerosols and gases, such as water vapor, ozone and carbon dioxide [46]. PCG areas presented low consistency related to high-level aerosol. While Sen2Cor and LaSRC consider the aerosol optical thickness (AOT), they produce different surface reflectance values especially for the blue band. Usually, L8 L2 underestimated the surface reflectance values within the blue band in areas with high values of aerosol provided by the SR aerosol band. These errors can be avoided by

Discussion
While an excellent overall radiometric coherence between L8 L2 and L8 PANSH (r 2 > 0.987, RMSE < 1.635%) was achieved, some important discrepancies were found in the visible region (mainly in the blue band). These discrepancies were related to areas with high-level aerosol values flagged by the SR aerosol L8 L2 quality band. Aerosols can alter the signal intensity that is reflected or emitted from a surface and eventually received by the satellite sensor [49]. In general, the shorter the wavelength, the stronger the scattering by aerosols and gaseous molecules [49,50]. Thus, the uncertainties of the visible bands are much greater than the NIR and SWIR bands as reported by Li et al. [46]. According to the product guide of L8 LaSRC [42], the L8 OLI Band 2 (Blue) and Band 1 (Coastal aerosol) should not be used for analysis, since they are already used within the algorithm to perform aerosol inversion tests, making them potentially unreliable.
When the official surface reflectance products from S2 and L8 (S2 L2A vs. L8 L2) were compared, the differences resulted to be bandwidth dependent (they were higher in the blue band and gradually decreased for longer wavelengths). This finding was already reported in several works [35,46,51]. Similar r 2 values of 0.900 and 0.929 were reported by Vuolo et al. [52] and Padró et al. [35], assessing the same official products from S2 and L8. The reflectance values were usually overestimated in S2 L2A with respect to those provided by L8 L2, especially for high surface reflectance values. Note that, after the first results of ACIX initiative, a new version of the S2A RSRF was released by ESA in December 2017, particularly changing the responses of bands B1 (Coastal aerosol), B2 (Blue) and B8 (NIR) [40]. According to Zhang et al. [36], data from L8 and S2 have different spectral resolution and, therefore, their data cannot be reliably used together. In addition, the works dealing with sensors consistency usually compared L8 observations collected within ±1 day of the S2 [46,51] even more [34]. This factor was crucial considering that the higher was the time distance between compared images, the weaker was the correlation between S2 and L8 [34]. We tried to avoid this problem only using simultaneously acquired pairs of both sensors.
There were four main factors that increased the radiometric discrepancies registered by the two sensors in PCG areas: (i) Aerosol level, (ii) solar elevation, (iii) sun glint and, (iv) bright areas.
A highly accurate atmospheric correction is required for deriving the surface reflectance products from L8 and S2 TOA reflectance data, due to the scattering and absorbing effects of atmospheric aerosols and gases, such as water vapor, ozone and carbon dioxide [46]. PCG areas presented low consistency related to high-level aerosol. While Sen2Cor and LaSRC consider the aerosol optical thickness (AOT), they produce different surface reflectance values especially for the blue band. Usually, L8 L2 underestimated the surface reflectance values within the blue band in areas with high values of aerosol provided by the SR aerosol band. These errors can be avoided by performing our own atmospheric corrections without using cloud/water masking and haze removal processes. Moreover, the filter designed by Zhang et al. [36] to exclude outliers by comparing L8 and S2 blue band reflectance (Equation (1)) could be used [51], with Blue L8 being the reflectance value of the L8 OLI band 2 and Blue S2 the S2 MSI band 2 reflectance.
The sun glint effect has been extensively discussed in aquatic satellite scenes [53][54][55], being previously reported for PCG roofs in a study about the quality of digital surface models extracted from very high resolution satellite stereo pairs (WorldView-2 and WorldView-3) [56]. In this case, the stereo pairs viewing geometry and its relationship with the sun position could induce specular reflection of sun light, thus causing unusually bright pixel digital values (sun glint effect) and thus contributing to increase the number of missing image matching points. The sun glint effect observed in the present work was related to the sun elevation, appearing with sun elevation higher than 52 degrees (i.e., mainly in summer season). Working in urban areas (generally bright and anthropic areas such as PCG ones), Nie et al. [38] reported worse consistency in summer data pairs, presenting the major discrepancies in the SWIR2 band. In addition, an interesting relationship between the sun glint on some plastic sheets and the satellite orbit (orbits leaving the sun on their back minimized this solar reflections) was found.
For detecting the outliers due to the sun glint effect observed in Almeria (SS1), a similar filter to the one proposed by Zhang et al. [36] was tested on June 18 and 25 (Equation (2)). In that case, since SWIR2 was the band most affected by sun glint, the filter was based on the radiometric difference for this specific band. We used the mean value of L8 PANSH SWIR2 band (Mean SWIR2 L8 ) and the mean value of S2 L2A SWIR2 band (Mean SWIR2 S2 ), both extracted from each polygon corresponding to an individual greenhouse.
Mean SWIR2 L8 − Mean SWIR S2 Using Equation (2), 28 and 5 outliers (greenhouses/polygons with sun glint effect) were detected on June 18 and 25, respectively. The new data without these outliers notably improved the r 2 and RMSE values shown in Figure 10. In fact, values of 0.892 (r 2 ) and 7.272% (RMSE), and 0.808 (r 2 ), 7.858% (RMSE), were achieved for the pairs taken on June 18 and June 25, respectively. However, both the cut-off value (0.4) and the equation based on SWIR2 reflectance values should be developed more carefully by including more pairs of images around the world presenting glint effect.
Li et al. [46] reported that the S2 L2A surface reflectance values were generally overestimated in bright areas. In these zones, the high AOT has a strong extinction effect and leads to the overestimated surface reflectance [49,50]. Moreover, the accuracy of surface reflectance correction for L8 L2 is likely reduced in bright areas [42]. When these bright areas are composed of non-Lambertian surfaces such as plastic sheets, the surface reflectance discrepancies registered between L8 and S2 can be even more important.
Regarding the NDVI comparison statistics, they were generally better than those reported by other authors, especially when L8 PANSH images were used. Lessio et al. [34] reported RMSE values of NDVI ranging from 0.10 to 0.21 even after applying a bias removal method proposed by them. However, they did not use simultaneously acquired image pairs from L8 and S2. Better NDVI consistency statistics between L8 and S2 were reported by Nie et al. [38] in an urban area in which RMSE values from 0.035 to 0.081 were computed. Li et al. [46] reported that the NDVI calculated using the Sen2Cor surface reflectance of bands B8A and B04 was also reliable. The main problems found in this work affected the blue band (aerosol) and SWIR bands (sun glint), and any of these bands participate in the calculation of NDVI values.
As mentioned in the introductory section, there is a great interest in the spectral harmonization [41] by minimizing the differences between the surface reflectance data acquired from the OLI and MSI sensors. A recently published work has reported on a new method for spectral harmonization using separate transformation functions for various spectral clusters to adequately transform the acquired L8 signal into the S2 spectral multivariate linear regression domain [57]. This method, available as a Python package in a scientific software [58], could be tested in PCG areas in future works.
So far, and to the best knowledge of the authors, this is the first research work dealing with the consistency between L8 and S2 using an OBIA approach. This methodology can be more useful in some specific landcovers (e.g., PCG areas) than the classical pixel by pixel comparison. The applied OBIA approach turned out to be very suitable since the reduced sample of 100 polygons for each study site (all of them corresponding to individual greenhouses) allowed us to study their radiometric characteristics in great detailed.

Conclusions
The inter-sensor radiometric coherence study in PCG areas between surface reflectance values from the L8 and S2 simultaneously acquired products used in this work provided good scores for both their six common bands (r 2 > 0.840, RMSE < 9.917%) and NDVI derived values (r 2 > 0.917, RMSE < 0.048). Overall, the L8 reflectance values were lower than those provided by S2, especially for objects presenting high reflectance values. Regarding the L8 tested products, an excellent fit (r 2 > 0.987, RMSE < 1.635%) between L8 PANSH and the official L8 product (L8 L2) was observed, though some important discrepancies related to areas with high-level aerosol values flagged by the SR aerosol L8 L2 quality band were found in the visible region (mainly in the blue band). In this sense, the L8 PANSH product showed better consistency with S2 L2A than L8 L2 data.
Comparing almost simultaneous scenes from L8 and S2, a very interesting relationship was found between occasional sun glint effects on plastic greenhouse roofs, the orbit of the sensor, sun position and sun elevation. In order to minimize these undesirable effects, especially frequent when sun elevation presented high values (particularly in summer), we strongly recommend to use scenes captured from orbits where the sun is located behind the sensor with respect to the target area (i.e., leaving the sun on their back). The discrepancies between surface reflectance values from both sensors were also related with the bright areas present at each study site.
The main discrepancies in this radiometric consistency study were due to (i) the reflectance attenuation observed in the visible bands of L8 L2 images in in areas with high-level aerosol, and (ii) the sun glint effects detected on plastic sheets. Both sources of error could be minimized removing outliers by using filters based on blue and SWIR bands reflectance values.
While the conclusions are based on a thorough work carried out in four PCG study sites around the world, further works are needed to learn much more about the consistency between the most commonly used medium-resolution satellites to enhance their proper use in PCG areas. The application of methods for spectral homogenization between L8 and S2 as a step prior to the classification of greenhouses based on information coming from time series of both sensors could be a promising research line to explore in further works. Funding: This research was funded by Spanish Ministry for Science, Innovation and Universities (Spain) and the European Union (European Regional Development Fund, ERDF) funds, grant number RTI2018-095403-B-I00.