Mosaicking Landsat and Sentinel-2 Data to Enhance LandTrendr Time Series Analysis in Northern High Latitude Permafrost Regions

: Permafrost is warming in the northern high latitudes, inducing highly dynamic thaw-related permafrost disturbances across the terrestrial Arctic. Monitoring and tracking of permafrost disturbances is important as they impact surrounding landscapes, ecosystems and infrastructure. Remote sensing provides the means to detect, map, and quantify these changes homogeneously across large regions and time scales. Existing Landsat-based algorithms assess different types of disturbances with similar spatiotemporal requirements. However, Landsat-based analyses are restricted in northern high latitudes due to the long repeat interval and frequent clouds, in particular at Arctic coastal sites. We therefore propose to combine Landsat and Sentinel-2 data for enhanced data coverage and present a combined annual mosaic workﬂow, expanding currently available algorithms, such as LandTrendr, to achieve more reliable time series analysis. We exemplary test the workﬂow for twelve sites across the northern high latitudes in Siberia. We assessed the number of images and cloud-free pixels, the spatial mosaic coverage and the mosaic quality with spectral comparisons. The number of available images increased steadily from 1999 to 2019 but especially from 2016 onward with the addition of Sentinel-2 images. Consequently, we have an increased number of cloud-free pixels even under challenging environmental conditions, which then serve as the input to the mosaicking process. In a comparison of annual mosaics, the Landsat+Sentinel-2 mosaics always fully covered the study areas (99.9–100 %), while Landsat-only mosaics contained data-gaps in the same years, only reaching coverage percentages of 27.2 %, 58.1 %, and 69.7 % for Sobo Sise, East Taymyr, and Kurungnakh in 2017, respectively. The spectral comparison of Landsat image, Sentinel-2 image, and Landsat+Sentinel-2 mosaic showed high correlation between the input images and mosaic bands (e.g., for Kurungnakh 0.91–0.97 between Landsat and Landsat+Sentinel-2 mosaic and 0.92–0.98 between Sentinel-2 and Landsat+Sentinel-2 mosaic) across all twelve study sites, testifying good quality mosaic results. Our results show that especially the results for northern, coastal areas was substantially improved with the Landsat+Sentinel-2 mosaics. By combining Landsat and Sentinel-2 data we accomplished to create reliably high spatial resolution input mosaics for time series analyses. Our approach allows to apply a high temporal continuous time series analysis to northern high latitude permafrost regions for the ﬁrst time, overcoming substantial data gaps, and assess permafrost disturbance dynamics on an annual scale across large regions with algorithms such as LandTrendr by deriving the location, timing and progression of permafrost thaw disturbances.


Introduction
Permafrost is warming in the northern high latitudes [1], which has considerable impacts on Arctic terrestrial landscapes. Permafrost is ground that remains at or below 0 • C for at least two consecutive years and it covers roughly 25% of the northern hemisphere [2]. Being defined as a thermal state of the subsurface makes permafrost directly vulnerable to changes in the energy balance of the land surface, in particular the long-term changes in mean annual air temperature, precipitation regimes, and snow distribution and thickness that are currently observed across the high latitudes [3,4]. Continued warming causes widespread permafrost thaw, resulting ultimately in near-surface permafrost loss at local to regional scales [5]. Wide landscape changes can be associated with permafrost thaw and loss and have impacts on ecosystems [6,7], hydrological systems [8,9], soil carbon accumulation and decomposition [10], greenhouse gas emissions [11], and stability of infrastructure in permafrost areas [12]. Considering the vast area permafrost covers across the northern hemisphere and its strong linkages to the Arctic biophysical system [13], changes in the distribution of permafrost due to thaw is of high relevance for the Arctic amplification of global climate change [4,14].
To make more precise estimations of the impact of permafrost thaw on landscapes and biogeochemical cycles, a more detailed remote sensing-based assessment of permafrost disturbances is necessary. Many thaw-related permafrost disturbances in the northern latitudes such as thermokarst formation, thermokarst lake drainage, flooding, rapid soil erosion, and wildfires [15,16] can be tracked with remote sensing methods that provide key information on the distribution, magnitude, duration, and impacts of disturbances [17,18]. However so far, the extent of remotely sensed permafrost disturbance assessments are limited spatially and temporally. Abrupt permafrost disturbance features [15] have been mainly mapped on local to regional scales, such as retrogressive thaw slumps [18][19][20], fires [21], and lake changes [18]. These studies either derived disturbance dynamic parameters from few individual time snapshots [22][23][24][25], or determined trends across a period of time [18,19]. However, knowledge on the annual disturbance dynamics is required on regional to pan-arctic scales. This is crucial for identifying main drivers of abrupt thaw events and including these processes in large-scale climate models.
A number of remote sensing algorithms that were developed for land cover change and forest disturbance applications to detect landscape changes at regional to global scales, capture the temporal dynamics of disturbance features. Since the start of the open access policy of the Landsat data archive in 2008 [26], Landsat is often chosen for high spatial and temporal resolution land cover and change detection assessments. For example, Hansen et al. [27] presented a global forest gain and loss algorithm and Pekel et al. [28] a surface water assessment and its changes with Landsat data. Landsat has the longest available data record, covering more than 48 years, which presents a unique opportunity for long-term assessment studies. Landsat satellites are multispectral sensor systems, providing a spatial resolution of 30 m at nominal revisit times of 16 days covering the whole globe [29]. The Landsat archive therefore allows to assess processes at high spatial resolution and with a reliable continuous time series for more than several decades by now.
The most widely-used algorithms are the Landsat-based detection of Trends in Disturbance and Recovery (LandTrendr) [30], the vegetation change tracker (VCT) [31], the Break detection For Additive Season and Trend (BFAST monitor) [32], the Continuous Change Detection and Classification model (CCDC) [33], and the forest gain and loss algorithm by Hansen et al. (2013) [27], according to a compilation by Zhu (2017) [34] based on algorithm characteristics, such as temporal frequency, change indices, and gradual or abrupt change detection. They all have in common the smallest assessment unit as a pixel but differ in the required temporal data frequency, change index application, the form of disturbance they assess (gradual or abrupt), and whether they are used for near real-time monitoring or past change detection. Since data availability in the northern high latitudes is limited drastically by frequent cloud cover [35], most of the widely-used algorithms that require a very high temporal frequency of input data are not suitable for Arctic applications.
An exception is LandTrendr, as it creates yearly mosaics from cloud-free Landsat images by aggregating cloud-free pixels from a defined period of time, e.g., the summer months [30]. With this approach, LandTrendr ensures a continuous disturbance assessment with an annual resolution. Using aggregated data as input is a promising method to bypass data limitations in the northern high latitudes [36,37]. Moreover, LandTrendr is flexible in change index choice and captures both abrupt and gradual changes. LandTrendr applies a segmentation algorithm for change and trend detection to the time series of annual Landsat mosaics. The segmentation algorithm splits the time series into vertices, points of abrupt spectral change, and time stretches in between of vertices with change, no change or gradual trends, which then represent abrupt and gradual land cover change, respectively. The algorithm results provide information on the disturbance year, magnitude of change, and recovery duration [30]. Since 2018, LandTrendr is incorporated into Google Earth Engine (GEE) [38] and is therefore easily transferable to new application areas, benefiting from GEEs processing power and easily available large-scale satellite data sets [38]. The application of LandTrendr to the high latitudes would allow assessing permafrost disturbance processes for the first time with a continuous time series, deriving temporal dynamics on multiyear to annual scales, derived from more than 20 years of data, at high spatial resolution, which has not been achieved before.
However, even the annual LandTrendr mosaics, based on the moderate repeat frequency of Landsat images, contain considerable gaps due to lack of cloud-free images and pixels in some years, in particular at higher Arctic and coastal sites. Additional disadvantages include high latitude environmental factors such as frequent snow and ice cover, a short growing season length, and a challenging solar geometry [29,39]. As annual cloud-and gap-free mosaics are the prerequisite of a reliable and high-quality time series implementation such as LandTrendr, the single-sensor based algorithm remains prone to local and regional observation gaps in the high latitudes [40]. These are major challenges for permafrost disturbance studies as these same regions are also underlain by thaw vulnerable, ice-rich and organic-rich permafrost [2,41], where first local studies already show extensive permafrost thaw [24,25]. Hence, it is crucial to enhance the input database for a LandTrendr time series analysis for northern high latitude applications.
Active remote sensing sensors such as Synthetic Aperture Radar (SAR) sensors can penetrate clouds and do not rely on sun illumination. Therefore, they are less restricted by weather and illumination conditions, which is very useful in northern high latitudes [42]. Varying SAR methods have been applied to northern high latitudes for land cover mapping [43], surface deformation assessments in permafrost regions [44][45][46] and active layer thickness estimations [47]. While the range of application possibilities of SAR data is broad and promising, one major drawback is its overall sparse data availability due to irregular acquisition schemes from historical SAR sensors such as ERS-1/2 SAR, ALOS-1 PALSAR-1, TerraSAR-X and Radarsat-2 [48]. This has only changed recently since the launch of the Copernicus Sentinel-1 satellites in 2014 and 2016, and their short image acquisition rate in polar areas every 6 to 12 days [49]. With five years of continuous temporally high-resolution data, the available time series for landscape analysis from Sentinel-1 is comparably short and hence the longer optical remote sensing time series of Landsat is still the preferred choice for long time series analysis assessments as exemplified here.
We here propose to combine Landsat and Sentinel-2 images in the mosaicking workflow to ensure data coverage for a continuous time series and a reliable input basis for change assessment algorithms, such as LandTrendr. The Sentinel-2 system has a comparable sensor to Landsat but consists of two near-identical satellites, which have a combined revisit time of only five days [50]. Combining Landsat 8 and Sentinel-2 shortens the overall revisit time in the northern high latitudes drastically, increasing the likelihood of obtaining good quality, cloud-free images [51] and potentially enhancing the database for large scale mosaics [52]. The two sensors are overall compatible and spectral band adjustment coefficients ensure high spectral comparability [53,54], which are also available for Arctic-Boreal regions specifically [55]. Therefore, with the launch of the first Sentinel-2 satellite (Sentinel-2A) there is large potential to improve the annual cloud-and gap-free mosaics from the second half of 2015 onward.
Despite the availability of already existing harmonised Landsat/Sentinel-2 products, such as the Harmonized Landsat and Sentinel-2 surface reflectance data set [54], the Landsat+Sentinel-2 analysis ready data [56] or Sen2Like, a tool to generate Sentinel-2 harmonised surface reflectance products with other sensors [57], these global data sets often show discrepancies to regionally derived and adjusted products [53,55,58]. Hence, a regionally fitted harmonisation approach is favoured for regional applications.
The overall objective of this study is to assess the combination of Landsat and Sentinel-2 data to create good quality mosaics for northern high latitudes as input for high temporal assessment applications such as LandTrendr in GEE. To address this main objective, we focus on three sub-objectives: (1) adapting a mosaicking workflow to incorporate both Landsat and Sentinel-2 images; (2) evaluating the improved data availability as a basis for annual mosaics when using Landsat and Sentinel-2 data; and (3) demonstrating the Landsat+Sentinel-2 mosaic coverage improvements and the preservation of good mosaic quality. We apply and test this framework to twelve diverse study sites in the high latitude permafrost regions across northern Siberia.

Study Sites
This assessment focuses on the terrestrial northern high latitudes and in particular on northern Siberia. We selected twelve study sites, spanning from Taymyr Peninsula in the west (80 • E) to Chukotka Peninsula in the east (172 • W) and from the northern margins of the Siberian mainland (77 • N) to central Yakutia in the south (61 • N) ( Figure 1). The study sites exemplify a variety of landscapes and cover the range of climatic, geologic, geomorphologic and vegetational conditions in the North Siberian permafrost regions. While all sites except Chukotka are within the continuous permafrost zone [2], we included sites which are inland or coastal, with various terrain specifications and elevation levels, within the taiga or tundra ecoregion [59], and with different ground ice contents and thus thaw vulnerability (Table 1). Climatically the study sites lie within the arctic to boreal climate zones and in both maritime and continental areas which defines the rather broad climate region of northern Siberia [60]. The mean annual air temperature (MAAT) for January and July and mean annual total precipitation (MATP) values were extracted from ERA5 data for each study site [61]. Furthermore, specific study sites were chosen as they accommodate long field data records and represent frequently visited research areas, such as the Lena Delta, Batagay, Chokurdakh and the Pleistocene Park near Cherskiy.
For our analysis, we defined each study site by determining a centre point and a 25 × 25 km square area around the centre point. From here on, the term study site refers to the general study location while the term AOI refers to the coordinates of the centre point in our processing workflow (Section 2.3) and AOIarea to the 25 × 25 km areal (Section 2.5) extent of every study site. So, every study site consists of an AOI and an AOIarea.

Data
In our study, we worked with Landsat and Sentinel-2 data. The Landsat program is the longest running civilian optical remote sensing operation, continuously acquiring images of our Earth since 1972. Temporally overlapping missions ensured data continuity for more than 48 years already [62]. The missions' sensors changed moderately over the years but general specifications sustained. Landsat satellites are sun-synchronous, polar-orbiting and have a repeat cycle of 16 days at the equator. The Landsat sensors are multi-spectral imagers, covering a wide range of the electromagnetic wavelength spectrum with their bands, from visible to thermal infrared, and acquiring images with 30 m spatial resolution [63]. In 2015 and 2017, the European Space Agency (ESA) launched the Sentinel-2 satellites from their Copernicus program. The Sentinel-2 mission, consisting of two nearly identical satellites, Sentinel-2A and Sentinel-2B, also operates multi-spectral instruments. Like Landsat, the Sentinel-2 satellites are sun-synchronous and polar-orbiting but have a repeat cycle of 10 days individually and both satellites together of 5 days. Thirteen bands cover the visible to shortwave infrared wavelengths at varying spatial resolutions (10, 20 and 60 m) [50,64]. Landsat 8's Operational Land Imager (OLI, since 2013) and Sentinel-2's Multi-Spectral Instrument (MSI) are overall comparable and their simultaneous operation provides unique opportunities for Earth Observation [51,63,65]. Combined Landsat and Sentinel-2 studies benefit foremost from Landsat's multi-decadal, continuous acquisition scheme and Sentinel-2 high temporal resolution, decreasing gaps in time series from 2015 onward.
We accessed and processed the satellite data on GEE. GEE is a cloud-based platform, providing access to an extensive remote sensing data catalogue, containing a wide range of pre-processed, calibrated, and corrected products, and enabling planetary-scale geospatial analysis supported by Google's computational capacities [66]. For our analysis we accessed the Top-Of-Atmosphere (TOA) image collections from Landsat 5, 7, 8 and Sentinel-2 on GEE (USGS Landsat 5 TM Collection 1 Tier 1 TOA Reflectance; USGS Landsat 7 Collection 1 Tier 1 TOA Reflectance; USGS Landsat 8 Collection 1 Tier 1 TOA Reflectance; Sentinel-2 MSI: MultiSpectral Instrument, Level-1C). The Landsat TOA Tier 1 collections on GEE contain data with calibrated reflectance values [67] of the highest geometric and radiometric quality and thus are considered suitable for time series analysis [68]. Likewise, the Sentinel-2 Level-1C product is also calibrated, geometrically corrected, high quality data [64]. While GEE also incorporates the surface reflectance (SR) products from the Landsat missions, the Sentinel-2 surface reflectance product (Level-2A) is not yet processed globally for the whole Sentinel-2 image archive [69]. ESA has started to produce Sentinel-2 Level-2A images with their atmospheric correction processor Sen2Cor ( [70,71]) but has not processed the full image archive yet. Currently, only Level-2A images from the most recent acquisitions until March 28th, 2017 are provided with regional differences. For Siberia, only Sentinel-2 Level-2A images from late February, 2019 until now are processed and available on GEE (last checked on July 13th, 2020). As we target to develop an automated workflow taking advantage of easy filtering of continental-scale data sets in GEE, we conducted our analysis with the TOA data for now. While the radiometric quality is improved in the SR products [72], the TOA approach is still applicable and relevant, since the TOA spectral properties reflect the ground characteristics well and capture disturbances reliably with prominent spectral signatures. Once the SR product from Sentinel-2 is available, the same workflow can be transferred easily in GEE.

Data Processing and Mosaicking Workflow
The LandTrendr library, provided by Kennedy et al. 2018 [38] on GEE, was the foundation of this work and was successively adapted to the specific needs. In the following, we describe the mosaicking workflow, adaptations and subsequent analysis steps.
We adopted the LandTrendr library and modified the script from image collection filtering, image harmonisation, cloud and cloud shadow masking to mosaic generation. Our code is publicly available: https://code.earthengine.google.com/8ff1ee9ad801549df413184820bd7f39.
The biggest modification to the adopted mosaic workflow is the change of input collections. We use the Landsat TOA collections and additionally the Sentinel-2 Level-1C collection as input ( Figure 2). Following this adjustment, a harmonisation step for the Sentinel-2 images was included to obtain a consistent spectral database. The remaining overall workflow was maintained and only modified where necessary. The Landsat and Sentinel-2 input was processed in parallel and only combined in the last step before creating a combined mosaic ( Figure 2, 'buildMosaic: medoid').
The image collections were filtered by AOI, by date and by cloud cover to obtain a subset of the image collection representing the database for the annual mosaics. The AOI was the centre point coordinates of the study sites (Table 1), the date filter was set to the summer months (1st July-31st August) to restrict deficiencies due to seasonality and the cloud cover to less than 80%. All collections, Landsat and Sentinel-2, contain pre-processed metadata information on cloud cover extent per scene (Landsat metadata property: cloud cover, Sentinel-2 metadata property: cloud coverage assessment) and cloud and cloud shadow information per pixel (Landsat band: BQA, Sentinel-2 band: QA60), which were used first for general cloud image filtering and later for cloud and cloud shadow pixel masking, respectively [73,74].
We implemented Kennedy et al. [38] harmonizationRoy function, applied to the filtered image collections, to harmonise Landsat 8 OLI images to Landsat 7 ETM+, ensuring consistent spectral properties across the Landsat sensors in the time series. In this case, the TOA spectral transformation coefficients were applied [75] but otherwise the harmonizationRoy function remained unchanged. Six bands are included in the band transformation, blue (B1), green (B2), red (B3), near infrared (B4), shortwave infrared 1 (B5) and 2 (B7). The band selection is based on Landsat 7's bands and the band availability across the different Landsat sensors. Landsat 8's corresponding bands are selected and renamed in the harmonizationRoy function. Following the example of the harmonizationRoy function, we implemented a similar harmonisation scheme for Sentinel-2 data to guarantee consistent spectral reflectances across Landsat and Sentinel-2 data (harmonizationS2L8). First, we harmonised the Sentinel-2 images to Landsat 8, by applying spectral transformation coefficients for Arctic-Boreal regions for TOA data and resampled them to 30 m spatial resolution, resembling Landsat data. We derived Sentinel-2 to Landsat-8 spectral transformation coefficients for TOA data for Arctic-Boreal regions from the same study set up and workflow as described by Runge and Grosse 2019 [55] but used not the SR same-day images but the same TOA images as data input. We applied GEEs' reproject() and resample() functions for images and reprojected the Sentinel-2 images to Landsats' projection (WGS 84, UTM), a scale of 30 m and applied the bicubic resampling method. Secondly, we transformed the adjusted Sentinel-2 images further to Landsat 7 with the harmonizationRoy function, as described above for TOA Landsat data. Until here, we operated on the image level. In the next steps, we processed the images in the image collections further on the pixel level. The cloud and cloud shadow masking were achieved by using the provided quality bands. For Landsat we used the quality assessment bitmask (BQA) [73] and for Sentinel-2 we used the QA60 Bitmask (QA60) [74]. Both quality bands are bitmasks that contain information on the presence of clouds and cloud shadows at the pixel level. Consequently, we masked out all pixels with a cloud or cloud shadow, based on the information in the quality band bitmask. The remaining cloud-free pixels form the input database for the mosaic process described next.
After image filtering and pixel masking the last step was to generate annual mosaics based on our processed database, which can be applied to every study site and assessment year. Kennedy et al. 2010 and 2018 [30,38] create medoid mosaics in their LandTrendr workflow. A medoid is similar to a median and represents the point with the minimal summed distance to all points in a data set but considers multiple dimension in its selection, which are the different bands in this case [76]. Medoid mosaics are resilient against extreme values, such as outliers from unmasked cloud or cloud shadow pixels, and provide representative mosaics of a defined time period [76]. As the cloud and cloud shadow masking is not impeccable [77] and our assessment strongly relies on a specific narrow time period, the summer months July and August, we maintained the medoid mosaic approach and applied the medoid mosaic function from the LandTrendr library [38]. Input of the medoid mosaic function are the annual Landsat and Sentinel-2 image collections that already are filtered, harmonised and cloud masked at this point. Both collections are merged and then the medoid mosaic function identifies the medoid for every pixel within the merged collection. Output of the function is a mosaic image, which is cloud-free and spectrally representative for the summer months as we used the medoid function.
The above described workflow runs systematically for every year of the assessment period, generating a cloud-free mosaic. These annual mosaics are combined in a mosaic collection, representing a mosaic time series for a study site. For our assessment we created a 21-year mosaic collection from 1999 to 2019. This time period includes seventeen years of Landsat-only mosaics, from 1999 to 2015, when Landsat data started to be reliably and consistently available for Siberia, and the last four years of combined Landsat and Sentinel-2 mosaics. We extended the assessment period beyond the overlapping Landsat and Sentinel-2 period to draw comparisons between Landsat-only and combined Landsat+Sentinel-2 mosaics.

Data Availability Assessment
Prerequisite of good quality annual mosaics are a good input database. The above described processing workflow defines the framework for high quality mosaic outputs, minimising artefacts from seasonality, merging of multi-sensors or clouds and cloud shadows. We further assess the quality of the combined Landsat+Sentinel-2 mosaic output by looking at the input data availability and the spectral characteristics of the mosaic outputs.
We assessed the data availability on both image and pixel level. As a general rule, the bigger and more robust the input database is in form of number of images and number of cloud-free pixels, the better the mosaic output. Therefore, we derived for every study site the image list of available images with the GEE filterCollection function, for Landsat and Sentinel-2 separately. The image list contains all images fulfilling the filter requirements from the mosaicking workflow (AOI, date and cloud cover). The result is an available image count for the specific study site per year, specifying images that cover the AOI, were acquired in July and August, and are not completely cloudy. The image count number illustrates the general input database for the mosaicking workflow.
The previously derived image list is input for a cloud-free pixel analysis. The filtered image list might still contain images with a high cloud or cloud shadow cover across the AOIarea, as the filtering is based on metadata information (Landsat metadata property: cloud cover, Sentinel-2 metadata property: cloud coverage assessment), which were derived for the full Landsat (185 × 185 km) or Sentinel-2 (290 × 290 km) scenes. Input to the medoid mosaicking process should be cloud-free pixels only, hence we apply pixel-based cloud and cloud shadow masking as described in Section 2.3. Therefore, not only the available image count is of interest but more importantly the number of cloud-free pixels. To assess this, we applied the buildClearPixelCountCollection function from the LandTrendr library to every study site [38]. The buildClearPixelCountCollection function takes the filtered image collection as input, applies the cloud and cloud shadow masking to every image in the collection, which operates on the pixel level, and counts the remaining number of clear pixels for every pixel at the study site, for every year of the assessment period. The output of the function is an image collection, containing one band image per year (1999-2019) with the cloud-free pixel count for every location. For a more representative picture, we did not look at the clear pixel count at every pixel but calculated the average number of clear pixels for every study site, aggregated across the AOIarea. Within the 25 × 25 km square of a study site the cloud-free pixel count slightly varies spatially, mainly because of overlapping sensor swaths, the Landsat 7 ETM+ scan line corrector failure and partial cloud cover, but not significantly.
Hence, an aggregated average per study site is a good representation. This averaged cloud-free pixel count is derived for every year in the assessment period.

Mosaic Coverage and Quality Assessment
Eventually, the main goal is to generate good quality mosaics. We assessed the quality of the resulting combined Landsat+Sentinel-2 mosaics by study area coverage and spectral characteristics. For the four years that we have both Landsat and Sentinel-2 data available, 2016 to 2019, we created two mosaics for every year and study site. One mosaic with Landsat TOA data only (Landsat-only mosaic) and the second one with Landsat and Sentinel-2 data together (Landsat+Sentinel-2 mosaic), following the above mosaic workflow. For both mosaics we calculated the percentage they cover of the study site area (AOIarea). The annual mosaics can contain holes from data gaps, given that no cloud-free pixel was available at a location for the mosaic. With the area calculation we can compare which mosaicking result, Landsat-only or Landsat+Sentinel-2 mosaic, more reliably produces a full study site coverage [78].
As described in Section 2.3 we ensure spectral consistency across the different sensors by applying harmonisation functions, including spectral transformations. Nevertheless, to confirm that the combined Landsat and Sentinel-2 data workflow produces representative and radiometric consistent good quality mosaics, we conducted a spectral comparison between cloud-free images and the Landsat+Sentinel-2 mosaic [37].
For every study site we selected individual cloud-free Landsat 8 and Sentinel-2 images, fulfilling the previously stated filtering requirements, by AOI and date range, except lowering the maximum cloud cover value to less than 30%. We selected the images with no cloud or cloud shadow across the AOIarea. For every study site we found for one year a cloud-free Landsat 8 and Sentinel-2 image fulfilling these requirements ( Table 2) and conducted the spectral comparison for that years' Landsat+Sentinel-2 mosaic and the selected Landsat 8 and Sentinel-2 images. For comparability, we applied the same harmonisation approach that is also part of the mosaic workflow, to the individual Landsat 8 and Sentinel-2 images. The Landsat 8 images were transformed with the harmonizationRoy function and the Sentinel-2 images in two steps, first with the Sentinel-2 to Landsat 8 band transformation (harmonizationS2L8) and then also with the harmonizationRoy function. We then sampled 5000 random pixel locations (GEE sample function) per study site (AOIarea) and extracted the spectral band values of the Landsat image, the Sentinel-2 image and the Landsat+Sentinel-2 mosaic. The random pixel locations were only taken from the areas where both images and the mosaic overlapped. We downloaded the data sets with the spectral pixel values from GEE and conducted a spectral band comparison in Jupyter Notebook. For every band we compared the corresponding spectral values between the Landsat image and the Landsat+Sentinel-2 mosaic and the Sentinel-2 image and the Landsat+Sentinel-2 mosaic separately and calculated the correlation coefficients (r-value), indicating the strength of linear relationship between image and mosaic pixel spectral values [79]. Six bands were included in this assessment, blue (B1), green (B2), red (B3), near infrared (B4), shortwave infrared 1 (B5) and 2 (B7).

Data Availability Assessment
We assessed the number of available Landsat and Sentinel-2 images by study site for 1999 to 2019 (Figure 3). For 1999 to 2015, only Landsat images from the sensors Landsat 5 TM, 7 ETM+, and 8 OLI are available. The absolute number of available Landsat images varies across study sites but overall the number of Landsat images increases within the study period. The lowest number of images available for most study sites is in 1999, ranging from 0 Landsat images in East Taymyr  Overall, the number of available images increases drastically with the addition of Sentinel-2 images and likewise the number of cloud-free pixels, enhancing the database for the mosaicking workflow greatly.

Mosaic Coverage and Quality Assessment
For the predefined 25 × 25 km study site areas (AOIarea) we calculated the mosaic coverage percentages for 2016-2019 and created the Landsat-only mosaics and also the Landsat+Sentinel-2 mosaics for all twelve study sites. Table 3 shows the area coverage results exemplarily for Kurungnakh for 2016-2019, and for those study sites where one or both mosaics did not fully cover the AOIarea. Figure 4 illustrates the difference between the Landsat-only and Landsat+Sentinel-2 mosaic for Kurungnakh in 2019. Mosaic gaps are clearly visible in the Landsat-only mosaic and discolouring from cirrus and haze. Contrary to this, the Landsat+Sentinel-2 mosaic shows full coverage and a homogeneous mosaic result. The study sites and years not shown in Table 3 have an area coverage of >99.9% for both mosaics. We consider a difference of <0.1% between the Landsat-only and Landsat+Sentinel-2 mosaic coverage of a study site as negligible.
While the Landsat+Sentinel-2 mosaics consistently cover the whole study areas, the Landsat-only mosaics frequently contain holes and gaps (Table 3). Sobo Sise has the lowest Landsat mosaic coverage with 27.2% in 2017 and East Taymyr a considerably low coverage of 58.1% in 2017 as well. For the same years, the combined Landsat+Sentinel-2 mosaic achieves a full coverage of the study sites.
While Table 3 clearly points out the benefits of the Landsat+Sentinel-2 mosaics in terms of area coverage for several sites, it also shows that there are study sites with only very small area coverage discrepancy between Landsat-only and Landsat+Sentinel-2 mosaic. We conducted the area comparison in mosaics for four years (2016-2019) for each study site, which results in 48 area comparisons in total. Out of the twelve study sites, six study sites (Kurungnakh, Bykovsky, Sobo Sise, East Taymyr, Krestovaya, Chukotka) show differences in area coverage of >0.1% for thirteen years of area comparison together ( Table 3). The other six study sites (West Taymyr, Yakutsk, Verkhoyansk, Batagay, Chokurdakh, and Pleistocene Park) show no differences in the same four years. Our analysis clearly shows that combining Landsat and Sentinel-2 images in the mosaic workflow always reliably produces a full study area coverage of 99.9-100%.  To assess the quality of the Landsat+Sentinel-2 mosaics we conducted spectral band comparisons between the mosaics and single cloud-free Landsat and Sentinel-2 images (Table 2), assessing the radiometric consistency of the mosaics. We here depict the spectral band comparisons as scatter plots from Kurungnakh as a representative example of all study sites ( Figure 5). In Table 4 we also display the correlation coefficients (r-values) of the spectral band comparisons of the other study sites.
There is a strong agreement between the Kurungnakh Landsat+Sentinel-2 2018 mosaic and the Landsat 8 image as well as the mosaic and the Sentinel-2 image, only showing minor scatter along the one-to-one line ( Figure 5). The spectral band comparisons show a strong correlation between both images and the mosaic, across all six bands. The r-values illustrate this as well with the lowest r-value of 0.91 and 0.92 (Band 1) and the highest of 0.97 and 0.98 (Band 4) for the Landsat and Sentinel-2 image, respectively.
We found similar close correlations for the other study sites as well (Table 4). An exception is the Krestovaya spectral comparison of Band 1 of the Sentinel-2 image and the Landsat+Sentinel-2 mosaic with an r-value of 0.28, indicating poor correlation. However, all other Sentinel-2 band correlations for Krestovaya are much higher, ranging from 0.60 (Band 2) to 0.93 (Band 5), signalling medium to good agreement. It is noticeable, that the general correlation between the images and mosaics for Band 1 are the lowest, for both Landsat and Sentinel-2 comparisons, followed by Band 2 and Band 3. The comparisons for Band 4-7 are overall much higher and indicate strong correlations. The noticeable lower correlation coefficients for Band 1, 2 and 3, the blue, green and red wavelength spectrum respectively, are likely on account of the poorer quality of TOA products [80] as well as the stronger effect of atmospheric conditions, clouds and cirrus, on the visible wavelength ranges [37,81,82].
There are no distinct differences or trends across the twelve study sites. All study sites show a good correspondence between the mosaic, the Landsat, and Sentinel-2 image. Furthermore, there is also no noticeable difference between the Landsat -mosaic and Sentinel-2 -mosaic comparison. The correlation coefficient values between these two comparisons for one study site are always very close to one another. This underlines that there is no specific trend or sensor bias in the Landsat+Sentinel-2 mosaic as neither Landsat image nor Sentinel-2 image shows higher agreement to the output mosaic. Table 4. The correlation coefficients (r-values) from the spectral band comparisons for each study site between the Landsat image and the Landsat+Sentinel-2 mosaic and between the Sentinel-2 image and the Landsat+Sentinel-2 mosaic (see Table 2).

Discussion
The mosaic results and the study area coverage highly depend on the available images and cloud-free pixels in one year, which in turn is dependent not only on the scene acquisition rate but also on climatic and local weather conditions during July and August. In general, both Landsat and Sentinel-2 are highly influenced by local weather conditions as they are both optical sensor systems. Yet, the combined Landsat 8 and Sentinel-2 revisit time is very short, at around 2.9 days at the equator and even shorter in the northern high latitudes [51]. Their combined use highly increases the likelihood of obtaining cloud-free images during July and August in a year, potentially providing a more robust database for the mosaics, which is the most important prerequisite for large scale mosaics [36,37]. Our results confirm this, as not only the number of images but especially also the number of cloud-free pixels increases greatly for all twelve study sites when combining Landsat and Sentinel-2, enhancing the input database for the mosaic workflow.
When the usable image database is sufficient, the Landsat-only mosaics can achieve a full coverage of a study area. In 2018 for example this is possible for Kurungnakh. The same applies for the six study sites mentioned in Section 3.2 where the Landsat-only mosaics achieve a coverage of 99.9-100%. These findings suggest that the existing Landsat-only approach may be sufficient in some cases. However, at the same time half of our study sites were not fully covered by Landsat-only mosaics and the mosaic coverage strongly benefited from the adapted Landsat and Sentinel-2 mosaic workflow. The increased number of images and especially also the increased number of cloud-free pixels when adding Sentinel-2 data to the mosaicking workflow ensured a full coverage of the study areas. Hence, a full study site coverage can most reliably be achieved by the combined Landsat+Sentinel-2 mosaics.
In our assessment we can observe a north to south gradient in both Landsat image and cloud-free pixel availability as well as in the study area coverage. The total number of Landsat images for every study site extends from Sobo Sise in the north with the least number of images to Yakutsk, the most southern study site, with the highest number of usable Landsat images. This north-south gradient in usable image numbers is observable across the whole Landsat acquisition period due to the differing data acquisition strategies of Landsat sensors, satellite on-board memory capacities, and satellite and sensor geometries [35,40]. In addition, there is a south to north gradient in cloudiness from more drier continental sites farther inland with less cloudiness to sites with more maritime influence at the Arctic coast with more frequent clouds and fog. Hence, predominantly the most northern study sites are not fully covered by the Landsat-only mosaics in certain years, such as Kurungnakh, Bykovsky, Sobo Sise, East Taymyr, Krestovaya and Chukotka. These areas are especially prone to cloud cover and fog, which decreases the number of usable images considerably, making the mosaicking more challenging [37]. The two platforms of Sentinel-2 help counterbalancing this effect along the north-south gradient, ensuring a more stable database across the latitudes. Thus, assessments in the northern high latitudes especially benefit from the Landsat+Sentinel-2 mosaic workflow as it ensures full spatial coverage and a reliable time series.
Our analysis primarily focused on the spatial dimension of mosaic coverage but the spatial completeness of annual mosaics also directly impacts time series assessments. In time series analysis the complete annual mosaics ensure a temporally consistent database without data gaps across years. In contrast, spatially incomplete mosaics result in missing years and thus temporal gaps in the time series. For example, there are no images and therefore also no cloud-free pixels available for East Taymyr in 1999 and 2002, and in six more years (2001,2003,2004,2005,2008,2010) there are only few images available and therefore on average less than 1 cloud-free pixel across the AOIarea. A Landsat-only time series analysis for the high Arctic coastal site East Taymyr is therefore questionable since the yearly database is irregular and unreliable. Only from 2016 onward with the additional availability of Sentinel-2 data does the number of cloud-free pixels increase substantially, allowing for spatially complete annual mosaics and indicating the first stretch of temporal consistent data useful for a potential time series analysis. Our findings underline the advantage of combining data from multiple sensors on the overall data availability for continuous, good quality time series analyses [40].
Being able to create a reliable annual mosaic database starting as early as 1999 for time series analysis is already an extensive improvement. However, for certain ecosystem or biophysical dynamics, such as changes in vegetation phenology, an annual time stamp is not sufficient but a much higher intra-annual temporal resolution is required [83,84]. While we do not assess this here specifically, we assume that this combined mosacking workflow of Landsat and Sentinel-2 data has the potential to produce intra-annual mosaics. As the combined repeat cycle of Landsat and Sentinel-2 data is less than 2.9 days in the northern high latitudes [51], the enhanced Landsat and Sentinel-2 image database may allow to create multiple mosaics for a year, covering for example spring, summer and autumn time. While we cannot obtain a very high temporal resolution as for example MODIS does yet, we however achieve a much higher spatial resolution with Landsat and especially Sentinel-2 (10 to 30m), delivering new insights into small-scale changes in heterogeneous permafrost landscapes.
In this study we closely follow the workflow from the GEE LandTrendr algorithm. However, changing the input data sets from the SR product to the TOA collections and adding another sensor, potentially introduces inaccuracy to the mosaic output. One step to minimise errors is to apply a stringent workflow and apply per-pixel compositing rules [36], which was already established in the LandTrendr workflow and is maintained in this adaptation. The overall spectral comparisons between Landsat+Sentinel-2 mosaic and Landsat and Sentinel-2 images show close agreement between the input images and the resulting mosaic. The strong correlation to both independent, single-date Landsat and Sentinel-2 images underlines that there are no systematical artefacts introduced in the mosaicking process. The applied image transformation and harmonisation steps, which were further extended to the Sentinel-2 images, result in a compatible database. Since the correlation coefficients of both comparisons do not show a better fit to either Landsat or Sentinel-2 image, the medoid mosaic output evidently manages to create representative, sensor-unbiased annual output mosaics [37,76]. Overall, surface reflectance images are of higher quality than TOA images, because of their advanced calibration and radiometric corrections. In order to detect and monitor more subtle changes, that are reflected in smaller spectral change signatures such as shifts in vegetation, it is necessary to use SR data instead of TOA [85]. We are confident that once Sentinel-2 SR data is available the workflow can be easily adapted to SR data and deliver mosaic results with similarly good spectral characteristics, as the TOA spectral comparison results are already very positive.
The huge potential of earth observation applications combining Landsat and Sentinel-2 have been comprehensively discussed even before the launch of Sentinel-2 [50,86,87]. An important step is the harmonisation and transformation of the sensor products that enables the combined use of Landsat and Sentinel-2 [53][54][55]58,88]. The possibilities of sensor combination range from data fusion approaches [65], temporal aggregations [89,90] to combined spatial annual mosaics for large area coverage, such as Canada [37], Germany [52], or Africa [91]. Creating large scale mosaics, especially in areas with data limitations, remains challenging [36,37,78]. Especially the northern high latitudes benefit from earth observation approaches combining multiple platforms and sensors, as they are otherwise very restricted in data availability, limiting remote sensing time series analysis hugely.
This study presents an approach to achieve full spatial coverage of northern high latitude permafrost regions with remote sensing data on an annual basis for the first time. The combined Landsat and Sentinel-2 mosaic workflow accomplishes both a high spatial and also temporal data coverage, needed for studying a range of ecosystem and landscape dynamics in rapidly changing Arctic and Boreal regions [13]. A high temporal resolution of data is important for understanding disturbance evolution, frequency, and impacts [92]. An annual temporal resolution of time series analysis allows to assess highly dynamic permafrost pulse disturbances and possibly even identify the main disturbance drivers. Thermokarst lake dynamics and particularly lake drainage events have currently only been identified and described based on multiple-date images or trend analysis [93][94][95][96]. An annual time series analysis will help to narrow down change drivers and identify inter-annual lake dynamics, such as partial or full drainage, or intermittent infilling, which otherwise remain undiscovered in multi-year trend data. This similarly applies to retrogressive thaw slumps, which occur widely across the terrestrial Arctic [18]. They are highly dynamic permafrost disturbances but an annual monitoring was not possible so far across large regions, limiting understanding of potential links to climatic and weather conditions that drive the slumping dynamics [19,[22][23][24][25]. We therefore anticipate that these processes can be detected and monitored in greater detail when applying LandTrendr with Landsat+Sentinel-2 annual mosaics. The LandTrendr analysis output will deliver information on the location and timing of thaw disturbances, the temporal progression on an annual basis since 1999, and possibly forms a foundation for volumetric change analysis in areas where multi-year digital elevation models are available. These results will give new estimations of the impact of permafrost thaw on landscapes, biogeochemical cycles, and the biophysical systems. Since this approach is fully scalable, the more detailed and homogeneous remote sensing-based assessment of permafrost disturbances and their annual dynamics on regional to pan-arctic scales will provide crucial information for identifying main drivers of abrupt thaw events in the permafrost domain. The results will also help including these processes in large-scale climate models, something that has been lacking so far and was identified as an important knowledge gap [16,97,98].

Conclusions
The overall objective of this study was to assess the combination of Landsat and Sentinel-2 data in a mosaic workflow to create good quality annual mosaics for the growing season (July and August) as input for high temporal time series assessments in northern high permafrost latitudes. We found that the number of available images, as well as the number of cloud-free pixels, increases drastically when combining Landsat and Sentinel-2 images. The combined Landsat and Sentinel-2 input database for mosaics is thus highly enhanced, which improves the mosaic results as the combined Landsat+Sentinel-2 mosaics cover the spatial extent of study areas reliably without data-gaps. Especially northern, coastal sites benefit from the combined Landsat+Sentinel-2 approach, where Landsat-only mosaics show poorer results. A stringent mosaic workflow, including spectral harmonisation functions between Landsat and Sentinel-2, ensures data consistency across sensors and achieves homogeneous, good quality mosaics from Landsat and Sentinel-2 as shown by spectral comparison between images and mosaic outputs. Applying the Landsat+Sentinel-2 mosaic approach by aggregating image data annually for a set time period accomplishes not only high spatial but also continuous temporal coverage enabling large scale time series assessments. Based on our findings we conclude that our approach combining Landsat and Sentinel-2 data makes LandTrendr in principle suitable for permafrost disturbance assessments also in challenging environments of northern high latitudes, where frequent clouds, fog, and snow and ice cover limit the acquisition of good quality multispectral imagery. The application of LandTrendr with our combined Landsat-and Sentinel-2 mosaic approach to the high latitudes will allow assessing permafrost disturbance processes for the first time with a continuous time series, deriving temporal dynamics, such as disturbance location, timing and temporal progression since 1999, on an annual scale which has not been achieved before. Following large-scale or circumpolar permafrost disturbance assessments will deliver highly needed consistent and scalable input for climate models. Funding: This research was supported by BMBF KoPf (grant 03F0764B) and ESA CCI+ Permafrost.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study, in the analyses and interpretation of data, and in the writing of the manuscript.

Abbreviations
The following abbreviations are used in this manuscript: