Preliminary Comparison of Sentinel-2 and Landsat 8 Imagery for a Combined Use

The availability of new generation multispectral sensors of the Landsat 8 and Sentinel-2 satellite platforms offers unprecedented opportunities for long-term high-frequency monitoring applications. The present letter aims at highlighting some potentials and challenges deriving from the spectral and spatial characteristics of the two instruments. Some comparisons between corresponding bands and band combinations were performed on the basis of different datasets: the first consists of a set of simulated images derived from a hyperspectral Hyperion image, the other five consist instead of pairs of real images (Landsat 8 and Sentinel-2A) acquired on the same date, over five areas. Results point out that in most cases the two sensors can be well combined; however, some issues arise regarding near-infrared bands when Sentinel-2 data are combined with both Landsat 8 and older Landsat images.


Introduction
The new generation of medium-resolution multispectral sensors on board the Landsat 8 and Sentinel-2 satellite platforms opens new opportunities for long-term high-frequency monitoring applications. With their 12-bit quantization, these sensors also provide improved radiometric quality, which can expand the range of applications for ocean and inland water monitoring (e.g., [1]).
Landsat 8 was launched in 2013, and its Operational Land Imager (OLI) provides high-quality multispectral images at a resolution of 30 m (15 m for the panchromatic band) with a revisit time of 16 days [2,3]. It aims to provide data continuity to the Landsat Earth observation programme, started in the 1970s.
The Sentinel-2 mission comprises two satellites (Sentinel-2A and Sentinel-2B) equipped with identical Multispectral Instruments (MSI) capable of acquiring data in 13 bands at different spatial resolutions (between 10 m and 60 m). It is also intended to provide continuity to the SPOT missions [4]. The orbits are designed to ensure a revisit time of about five days at the equator, considering both satellites. Sentinel-2A was launched in June 2015, and it is now operational, while the launch of Sentinel-2B is scheduled for 2017. The MSI has a very wide field of view (290 km swath width, which is significantly larger than the 185 km of the OLI).
Even if the acquisition of usable data is highly affected by the local meteorological conditions (it has been estimated that Sentinel-2 will supply one cloud-free image per month on average [5]), the joint use of these constellations offers the opportunity to build time series with an unprecedented frequency [6]. However, the combined use of different sensors poses a number of conceptual and technical challenges. The platform and sensor combinations differ in their orbital, spatial, and spectral configuration. As a consequence, measured physical values and radiometric attributes of the imagery are affected [7]. For example, a root mean square error (RMSE) greater than 8% in the red band was found when comparing MSI and Landsat-7 simulated data, due to the discrepancies in the nominal relative spectral response functions (RSRF) [7].
Regarding the spectral configuration, the specifications of the new sensors are designed in such a way that there is a significant match between the corresponding spectral bands (see Figure 1 for near-infrared bands); however, the RSRFs of the instruments are not identical, so some differences are expected in the recorded radiometric values. Clearly, the importance of these differences depends on the application and on the approach adopted to perform time series analyses or change detection. Methods based on physical quantities retrieved by remote sensing reflectance or empirical approaches based on multispectral indices are more affected by the problem [8]. Conversely, methods based on separate classification of every image are less affected, if the training is also independent [9,10]. Additionally, the different spatial resolution of the two sensors affects their combined use for time series analyses, especially when a resampling procedure is required and the observed surfaces are heterogeneous.
The present paper aims at pointing out the spectral differences between the Landsat 8 OLI and Sentinel-2 MSI sensors, in the perspective of a combined use of the two for time series analyses. Furthermore, the differences with respect to a predecessor in the Landsat series, the Thematic Mapper 5 (TM5), are outlined in order to evaluate the integration of the new data into the existing and established time series. TM5 was chosen because of the huge amount of data collected in more than 25 years of operation. The free availability of Sentinel data, Landsat 8, and older Landsat archives constitutes an obvious advantage in the adoption of these data for long time series analysis at medium spatial resolution.

Materials and Methods
Landsat 8 and Sentinel-2 corresponding bands and some band combinations which can be used for land and water monitoring were compared.
The analyses were performed in six areas, located in Australia, Bolivia, China, Iraq, and Italy (Figure 2). The areas show different land covers and climatic conditions. For five of the areas, a pair of real Landsat 8 and Sentinel-2A images was collected. For the sixth area, in Italy, the comparison was performed on simulated images derived from hyperspectral data.

Simulated Data
For a test area in Italy, one hyperspectral Hyperion image was collected to generate a coherent set of simulated OLI and MSI images, under identical conditions of illumination, geometry of acquisition, and atmospheric conditions. The Hyperion scene was acquired on 4 July 2005 between 9:46 and 9:51 am, under clear sky conditions. It covers an area of approximately 610 km², including a portion of the lagoon of Venice and the nearby inland (Figure 2), characterised by a wide variety of land cover types, including agricultural fields, inland, and open shallow water.
The simulated images were produced in the ENVI-IDL™ environment. The original Hyperion image was pre-processed to mitigate some artefacts which are typical of this pioneering hyperspectral sensor, in particular the occurrence of bad pixels and the streaking effect. The pre-processing was performed with a software package developed by CSIRO [13].
The atmospheric correction was realised by means of the Visual SixS software [14], which is an implementation of the 6SV code (version 1.1) [15]. MODIS atmospheric products MOD07 and MOD04 were used to set water vapour and ozone contents and the aerosol optical thickness.
The corrected Hyperion image was used to simulate Landsat 8 and Sentinel-2 images by means of a spectral convolution of the hyperspectral data cube with the relative spectral response functions (RSRFs) of the instruments.
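The spectral convolution described above can be sketched as a weighted average of the hyperspectral channels, with weights given by the RSRF of the target band. The following is a minimal illustration and not the ENVI-IDL procedure actually used: the function name, the array layout, and the simplification of sampling the RSRF at the channel centres (ignoring the individual channel bandwidths) are all assumptions.

```python
import numpy as np

def simulate_band(cube, wavelengths, rsrf_wavelengths, rsrf_values):
    """Simulate one broadband image from a hyperspectral reflectance cube.

    cube             : array (rows, cols, n_channels) of surface reflectance
    wavelengths      : array (n_channels,) of channel centres, in nm
    rsrf_wavelengths : wavelengths (nm, increasing) at which the RSRF is tabulated
    rsrf_values      : RSRF values of the target band at those wavelengths
    """
    # Sample the RSRF at the hyperspectral channel centres
    # (zero outside the tabulated range). This is a simplifying assumption.
    weights = np.interp(wavelengths, rsrf_wavelengths, rsrf_values,
                        left=0.0, right=0.0)
    total = weights.sum()
    if total == 0:
        raise ValueError("RSRF does not overlap any hyperspectral channel")
    # RSRF-weighted mean along the spectral axis.
    return np.tensordot(cube, weights, axes=([2], [0])) / total
```

Calling the function once per OLI or MSI band, each time with the corresponding RSRF table, would yield the two simulated multispectral images from the same corrected cube.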

Real Images
A Sentinel-2 image and a Landsat 8 image, both acquired on the same day, were collected for each of five areas of interest (Table 1). For each pair, a test area of about 1500 km², including different land cover types, was selected to perform the analyses. The short time gaps between the two acquisitions (from 3 to 17 min only) minimise the differences in radiometry due to varying atmosphere and illumination conditions.
The first test site was located in South Australia, in a semi-arid area including the ephemeral salt lakes Gairdner and Macfarlane. A second site was located in South America, and encompassed a portion of the Bolivian plateau from the south-eastern bank of Lake Titicaca to the snow-capped peaks in the northern section of the Cordillera Real (Andes). The third zone was located at the northern border of the Taklamakan Desert in China, and included a segment of the Tarim River. The fourth site encompassed a portion of the Tigris valley in Iraq, in the area of the artificial lake near Mosul. Unfortunately, a consistent haze over the lagoon of Venice (Italy) hampered the analyses over exactly the same test area of the Hyperion image. Therefore, a different test area was selected as the fifth case in a very similar environment, located north-east of the first one, including a portion of the Adriatic coast from Caorle to the lagoon of Marano.
The pre-processing of the multispectral images included atmospheric correction and geometric co-registration.
As is well known, the signal recorded by a sensor is a function of the radiance coming from the surface and the atmospheric effects. Both contributions to the recorded signal depend on the RSRF, which controls broadband quantities. Furthermore, over low-reflectance surfaces such as water, the contribution of atmospheric scattering may be dominant [16]. The comparisons were therefore performed on surface reflectance values obtained by atmospheric correction.
Visual SixS software was also used for the real images. Contemporary MSI and OLI images were corrected assuming exactly the same atmosphere and aerosol models, setting the main atmospheric parameters (water vapour content, ozone, and optical depth) for each test area on the basis of MODIS atmosphere second level products (MOD04 and MOD07).
It is worth mentioning that, for this study, the accuracy of the atmosphere characterisation is not particularly relevant, because residual errors systematically affect all data sets and do not influence the validity of the comparisons among simulated data. What is important here is that both images underwent the same process, to minimise possible discrepancies stemming from the correction.
Recent studies [17] investigated the standard geolocated Landsat 8 L1T and Sentinel-2 L1C products, finding a misalignment of several pixels between the two. Contemporary images were therefore co-registered geometrically, using ten tie-points and a conformal transformation; the resulting global RMSE on the tie-point coordinates was about 6 metres (lower than the MSI pixel size). Finally, MSI data were resampled to the resolution of the OLI images (30 m) with an averaging algorithm.
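For the 10 m MSI bands, averaging to 30 m amounts to taking the mean of each 3 × 3 block of pixels. The sketch below illustrates only this integer-factor case and assumes the MSI grid is already aligned with the OLI grid; the function name is hypothetical, and the averaging algorithm actually used may differ. Bands at 20 m or 60 m are not related to 30 m by an integer factor and would need an area-weighted resampling, which is not shown.

```python
import numpy as np

def block_average(image, factor):
    """Downsample a 2-D image by averaging non-overlapping factor x factor blocks."""
    image = np.asarray(image, dtype=float)
    rows_out = image.shape[0] // factor
    cols_out = image.shape[1] // factor
    # Drop trailing rows/columns that do not fill a complete block.
    trimmed = image[:rows_out * factor, :cols_out * factor]
    # Expose each block on its own pair of axes, then average over them.
    blocks = trimmed.reshape(rows_out, factor, cols_out, factor)
    return blocks.mean(axis=(1, 3))
```

With `factor=3`, a 10 m band is reduced to the 30 m grid of the OLI image.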

Sensor Comparison
The corresponding bands of both simulated and real images were compared (bands 1, 2, 3, 4, 5, 6, and 7 for OLI, and bands 1, 2, 3, 4, 8A, 11, and 12 for MSI), and three spectral indices were also computed. The evaluated indices, chosen to explore specific bands of interest and common land cover types, are the NDVI, the NDWI, and the FII. The first two take the usual normalised-difference form
$$\mathrm{NDVI} = \frac{\rho_{NIR} - \rho_{red}}{\rho_{NIR} + \rho_{red}}, \qquad \mathrm{NDWI} = \frac{\rho_{green} - \rho_{NIR}}{\rho_{green} + \rho_{NIR}},$$
where $\rho$ denotes the surface reflectance in the band indicated by the subscript; NDVI is the well-known normalised difference vegetation index; NDWI is the normalised difference water index, intended for open water feature delineation [18]; and FII is the ferrous iron index, as defined in [19]. Band and index correlations were evaluated by computing the coefficients of a linear regression and the Pearson correlation coefficient. Very low reflectance values, which sometimes occur over water surfaces, may cause numerical stability problems when computing band ratios, thus generating outlier pixels. These pixels were excluded from the computations by setting thresholds based on the 2nd and 98th percentiles of the distribution of index values.
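The index comparison can be sketched as follows, assuming the usual normalised-difference forms NDVI = (NIR − red)/(NIR + red) and NDWI = (green − NIR)/(green + NIR). The function names are hypothetical, and applying the 2nd to 98th percentile filter to the values of both sensors is an assumption, since the text does not state on which distribution the thresholds were set. FII is left out because its formula is not given here.

```python
import numpy as np

def normalised_difference(a, b):
    """(a - b) / (a + b), returning NaN where the denominator is zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.full(denom.shape, np.nan)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out

def ndvi(nir, red):
    return normalised_difference(nir, red)

def ndwi(green, nir):
    return normalised_difference(green, nir)

def compare(x, y, low=2.0, high=98.0):
    """Linear regression y = slope * x + intercept and Pearson r,
    after discarding pixels outside the [low, high] percentiles."""
    x = np.ravel(x).astype(float)
    y = np.ravel(y).astype(float)
    finite = np.isfinite(x) & np.isfinite(y)
    x, y = x[finite], y[finite]
    keep = np.ones(x.size, dtype=bool)
    for values in (x, y):  # assumption: filter on both sensors
        lo, hi = np.percentile(values, [low, high])
        keep &= (values >= lo) & (values <= hi)
    x, y = x[keep], y[keep]
    slope, intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return slope, intercept, r
```

For instance, `compare(ndvi(oli_nir, oli_red), ndvi(msi_nir, msi_red))` would return the regression coefficients and the correlation for a pair of co-registered images, where the four band arrays are placeholders for the user's own data.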
For the NDWI analyses, a mask was generated to distinguish water from land pixels on the basis of a simple threshold on the near-infrared (NIR) reflectance.
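A thresholded water mask of this kind takes only a few lines. In the sketch below the threshold value of 0.05 is purely hypothetical, since the value actually adopted is not reported, and the helper names are illustrative.

```python
import numpy as np

# Hypothetical value: the threshold actually used is not reported in the text.
NIR_WATER_THRESHOLD = 0.05

def water_mask(nir, threshold=NIR_WATER_THRESHOLD):
    """True where the NIR surface reflectance is below the threshold (assumed water)."""
    return np.asarray(nir, dtype=float) < threshold

def pearson_by_class(x, y, mask):
    """Pearson r computed separately for water (mask True) and land (mask False)."""
    x, y, mask = np.ravel(x), np.ravel(y), np.ravel(mask)
    r_water = np.corrcoef(x[mask], y[mask])[0, 1]
    r_land = np.corrcoef(x[~mask], y[~mask])[0, 1]
    return r_water, r_land
```

Splitting the correlation by class in this way makes it possible to report NDWI agreement over water pixels separately from the agreement over the whole scene.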
Finally, since the resampling procedure applied to MSI data creates discrepancies in the spectral values of corresponding pixels (especially over heterogeneous surfaces), the local coefficient of variation was computed for each pixel, considering a kernel of 5 × 5 neighbour pixels. A threshold was set on the coefficient of variation (0.07) to include only pixels belonging to the most homogeneous areas in the scenes in the comparison.
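The local coefficient of variation (standard deviation divided by mean) over a 5 × 5 neighbourhood can be computed with a sliding window. This is a minimal sketch: the reflective padding at the image borders and the choice of the band on which the statistic is computed are assumptions, as neither is specified in the text. It requires NumPy 1.20 or later for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_cv(image, size=5):
    """Coefficient of variation in a size x size window centred on each pixel."""
    image = np.asarray(image, dtype=float)
    pad = size // 2
    # Reflective padding keeps the output the same shape as the input (assumption).
    padded = np.pad(image, pad, mode="reflect")
    windows = sliding_window_view(padded, (size, size))
    mean = windows.mean(axis=(-2, -1))
    std = windows.std(axis=(-2, -1))
    # Where the local mean is zero the ratio is undefined; use +inf so that
    # such pixels never pass the homogeneity threshold.
    cv = np.full(image.shape, np.inf)
    np.divide(std, np.abs(mean), out=cv, where=mean != 0)
    return cv

def homogeneous_mask(image, threshold=0.07, size=5):
    """True for pixels whose local coefficient of variation is within the threshold."""
    return local_cv(image, size) <= threshold
```

The resulting boolean mask can be used to restrict the band and index comparisons to the most homogeneous parts of a scene.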

Results and Discussion
Comparisons based on the simulated data obtained from the Hyperion image and on the real image pairs demonstrate potentials and challenges of the combined use of multispectral Sentinel-2 and Landsat 8 images. As pointed out by recent works [8], although a clear similarity between the image products of Landsat and Sentinel sensors can be observed by visual inspection, some band combinations show differences when numerical values are compared, and these differences need to be evaluated.
Correlation and regression coefficients between the corresponding bands are reported in Table 2 for all the studied areas. All of the compared bands show a good linear correlation, and the regression lines depart only slightly from the identity line. For example, a reflectance of 10% in the red band of the OLI sensor is expected to correspond to a value between 9.6% and 10.2% in the corresponding MSI band, according to the regression lines computed from the real images analysed here. In some cases, correlation coefficients appear slightly lower for band 1, likely because of the lower spatial resolution (60 m for MSI). In general, however, the presented results are in good agreement with those of Vuolo et al. [20], who found determination coefficients ranging from 0.90 to 0.96 for the six homologous bands (B1 was not considered).
The performed tests highlight that one issue is the choice of the near-infrared band. Comparisons of reflectance and index values confirm that MSI band 8A is the optimal choice from the radiometric point of view when Sentinel-2 images are to be coupled with Landsat 8 ones. MSI band 8 is instead to be preferred for joint use with older Landsat series, such as Landsat-5 (Table 3). However, band 8A has a different spatial resolution (20 m) from the other visible and near-infrared bands (10 m); thus, issues related to resampling procedures must be considered. As Figure 3 shows, NDVI values obtained from a Sentinel-2 scene change depending on the choice of the NIR band. This may be relevant when analysing a long time series which includes both OLI images and older Landsat TM data.
Results of all the comparisons between spectral indices computed with different sensors are reported in Table 4. In all cases, results from real images are in good agreement with simulated ones, even if variance is slightly larger, as could be easily foreseen.
When comparing Sentinel-2 and Landsat 8 data, this larger variance can be explained by residual effects of spatial heterogeneity generated by the resampling procedure (which are not present in simulated data), residual co-registration errors, bi-directional reflectance problems (especially for vegetation), and specular reflections for water, which arise from the different azimuth and elevation of the sensors and are not removed by the adopted calibration process. Correlations are poorer for NDWI if only water pixels are considered; otherwise, they are similar to the NDVI ones. This is probably caused by the low reflectance of water, which produces a decrease in the signal-to-noise ratio. In the presented analyses, the only exceptions were observed in Australia and China: in the first case, there is a strong signal from the floor of the salt lakes; in the second one, waters are more turbid than in the other sites, so reflectance values are higher and more homogeneous. The poor correlation over clear water surfaces is also confirmed by the scatter plots shown in Figure 4 for the Bolivian case, where water pixels exhibit a noisier pattern in spectral indices (negative values for NDVI, positive values for NDWI). In general, as the number of bands involved increases, the correlation coefficient may decrease, because the noise contributions of the individual bands add up. This may explain the slightly lower coefficient found for FII.
The problem of resampling related to the different spatial resolution of MSI and OLI sensors is certainly a relevant source of errors for all the procedures that evaluate changes on a pixel basis. Some tests were performed in order to analyse the spectral discrepancies generated by the resampling procedure over spatially heterogeneous surfaces. The pixels in the images were sliced according to the local coefficient of variation, setting the thresholds on the basis of the deciles of its distribution. The Pearson correlation coefficient between corresponding bands or indices was then computed considering only one slice at a time. As an example, the behaviour of the NDVI correlation against the heterogeneity of the scene, represented by the coefficient of variation, is reported in Figure 5. In some applications this problem may be overcome using an object-oriented approach [21], especially when the change detection is performed in post-classification.
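The decile slicing can be illustrated with the sketch below, which assigns every pixel to one of ten slices according to its local coefficient of variation and computes the Pearson coefficient within each slice. The function name and the minimum of three pixels per slice are assumptions introduced for the example.

```python
import numpy as np

def correlation_by_decile(x, y, cv):
    """Pearson r between x and y within each decile slice of cv.

    Returns the 11 decile edges and an array of 10 coefficients
    (NaN for slices with fewer than three pixels).
    """
    x, y, cv = (np.ravel(a).astype(float) for a in (x, y, cv))
    edges = np.percentile(cv, np.arange(0, 101, 10))
    # Slice index 0..9 for every pixel; the maximum value falls in the last slice.
    idx = np.clip(np.searchsorted(edges, cv, side="right") - 1, 0, 9)
    coefficients = np.full(10, np.nan)
    for k in range(10):
        selected = idx == k
        if selected.sum() >= 3:
            coefficients[k] = np.corrcoef(x[selected], y[selected])[0, 1]
    return edges, coefficients
```

Plotting the returned coefficients against the slice edges gives a curve of correlation versus scene heterogeneity of the kind described for the NDVI.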
When problems related to spatial heterogeneity are avoided, the linear correlations are generally good, confirming the potential of a combined use of Landsat and Sentinel products. However, the regression lines diverge slightly from the identity line, probably as a consequence of the differences in the RSRFs. The opportunity to compensate for this effect in time series analyses should be evaluated for each specific application. This compensation can be accomplished with different models, such as univariate or multivariate regression models [7]. The regression coefficients retrieved from real images in the five test areas presented here vary slightly, likely because of the differences in land cover types and consequently in target reflectance values. For this reason, a site-specific model might be the most appropriate choice, regardless of the mathematical model adopted.
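As an illustration of the simplest option, a site-specific univariate model can be fitted by least squares on a pair of contemporary images and then applied to other acquisitions of the same site. The sketch below is only one possible form of such a compensation; the function names are hypothetical, and it is not the model proposed in [7].

```python
import numpy as np

def fit_linear_adjustment(source, target):
    """Least-squares fit of target = slope * source + intercept.

    source and target are co-registered reflectance (or index) values of the
    same band from the two sensors, e.g. MSI and OLI, over homogeneous pixels.
    """
    slope, intercept = np.polyfit(np.ravel(source), np.ravel(target), 1)
    return slope, intercept

def apply_adjustment(source, slope, intercept):
    """Map values of the source sensor onto the scale of the target sensor."""
    return slope * np.asarray(source, dtype=float) + intercept
```

Whether such an adjustment is worthwhile depends on the application: when the expected change signal is much larger than the offset between the two sensors, the uncorrected values may already be adequate.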

Conclusions
The availability of new-generation multispectral sensors on board the Landsat 8 and Sentinel-2 satellite platforms offers an unprecedented possibility to perform high-frequency time series analyses, which greatly expands the opportunities to carry out multi-temporal change detection studies on phenomena showing a significant dynamic behaviour (for example, high-frequency mapping for disaster management) or on locations facing frequent cloud cover problems. However, the radiometric characteristics of these new sensors, though similar, are not identical, and can produce appreciable differences in the retrieved radiometric quantities.
Some tests performed on simulated data and on real images, acquired with a time gap lower than 20 min, demonstrate the very good correlation between corresponding bands (Pearson coefficient generally higher than 0.98); however, the regression lines diverge slightly from the identity line. The impact of the radiometric differences between the images acquired by the two sensors is to be carefully evaluated, in order to determine whether the discrepancies in reflectance values are relevant for each specific application, depending on the methodology adopted and the aim of the study.
Author Contributions: Emanuele Mandanici: work design and data processing. Gabriele Bitelli: supervision of research activity and overall paper editing.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: