Large-Area Gap Filling of Landsat Reflectance Time Series by Spectral-Angle-Mapper Based Spatio-Temporal Similarity ( SAMSTS )

Landsat time series commonly contain missing observations, i.e., gaps, due to the orbit and sensing geometry, data acquisition strategy, and cloud contamination. A spectral-angle-mapper (SAM) based spatio-temporal similarity (SAMSTS) gap-filling algorithm is presented that is designed to fill small and large area gaps in Landsat data, using one year or less of data and without using other satellite data. Each gap pixel is filled by an alternative similar pixel that is located in a non-missing region of the image. The alternative similar pixel locations are identified by comparison of reflectance time series using a SAM metric revised to be adaptive to missing observations. A time series segmentation-and-clustering approach is used to increase the search efficiency. The SAMSTS algorithm is demonstrated using six months of Landsat 8 Operational Land Imager (OLI) reflectance time series over three 150 × 150 km (5000 × 5000 30 m pixels) areas in California, Minnesota and Kansas. The three areas contain different land cover types, especially crops that have different phenology and abrupt changes due to agricultural harvesting, which make gap filling challenging. Fillings on simulated gaps, which are equivalent to 36% of 5000 × 5000 images in each test area, are presented. The gap filling accuracy is assessed quantitatively, and the SAMSTS algorithm is shown to perform better than the simple closest temporal pixel substitution gap filling approach and the sinusoidal harmonic model-based gap filling approach. The SAMSTS algorithm provides gap-filled data with five-band reflective-wavelength root-mean-square differences less the 0.02, which is comparable to the OLI reflectance calibration accuracy.


Introduction
Since the free availability of Landsat data, there has been a rapid increase in the use of Landsat data for time series analyses, typically for change detection, but also for classification and biophysical parameter retrieval [1,2].The Landsat satellites provide the longest, 45+ years, moderate spatial resolution record of the terrestrial surface.Since the 1982 launch of Landsat 4, each Landsat mission has acquired 30 m data with a 16-day repeat cycle and, for extended periods, there have been two Landsat satellites in orbit nominally providing an eight-day repeat cycle [3].Landsat data are provided in a global path/row World Reference System (WRS-2), and each approximately 180 × 180 km WRS-2 path/row location is overpassed by a Landsat sensor 22 or 23 times per year [4].However, a number of sensor, ground station and data communication issues, and variable mission acquisition strategies, reduce the acquisition frequency [3,[5][6][7].These effects, combined with cloud obscuration at the time of Landsat overpass [8,9], result in Landsat reflectance time series that have missing observations at various aperiodic times of any year.Gap filling of reflectance satellite time series is complex because of a number of factors, including surface cover and condition changes, residual clouds, shadows, atmospheric contamination, and temporal reflectance variations caused by surface reflectance anisotropy and directional illumination and viewing variations [10][11][12].Landsat-specific methods that fill gaps at individual acquisition dates have been developed.For convenience, they are categorized as temporal interpolation (TI) and alternative similar pixel (ASP) gap filling approaches and are reviewed in Section 2.
In this paper, a new ASP gap filling algorithm, which is designed specifically to fill large gaps in Landsat data and work reliably over complex (spatially-heterogeneous and temporally dynamic) surfaces and using one year or less of Landsat data, is described and demonstrated.It uses a similarity metric based on a form of the spectral-angle-mapper (SAM) metric that is adaptive to missing observations in time series and has been demonstrated over temporally-dynamic agricultural areas [13].It searches for alternative similar segments using a time-series-based segmentation-and-clustering method, and the ASP search window size is not spatially constrained, which makes it suitable for large-area gap filling.The proposed SAM based spatio-temporal similarity (SAMSTS) gap-filling algorithm is demonstrated using 26 weeks (i.e., six months) of Landsat 8 time series data projected into a tiled georeferenced coordinate system.Three 150 × 150 km (5000 × 5000 30 m pixels) test areas over agricultural regions in Minnesota, Kansas and California are considered, which include various crops and other land cover types, including grassland, shrub, forest and urban areas.Simulations removing different pixel areas followed by gap-filling are undertaken to provide insights into the gap-filling algorithm performance.The results are compared qualitatively and quantitatively with the more-straightforward closest temporal pixel substitution gap filling approach and the sinusoidal harmonic model temporal interpolation gap filling approach.

Satellite Gap-Filling Literature Review
Methodologies to "fill" gaps in satellite-retrieved reflectance time series were developed initially for systematically-acquired coarse-resolution satellite data acquired by the Advanced Very High Resolution Radiometer (AVHRR) and the Moderate Resolution Imaging Spectroradiometer (MODIS) that provide near-daily global coverage due to their polar low earth orbit and wide (110 • ) field of view.Compositing procedures were developed to provide reflectance and vegetation index datasets that represent the surface over consecutive n-day time periods [14][15][16].Compositing approaches have also been developed for Landsat data, but are typically less reliable because the cloud-free observation frequency is lower than provided by near-daily coarse resolution polar orbiting satellite data [17][18][19].Compositing approaches that invert a bi-directional reflectance distribution function (BRDF) model against n-days of reflectance observations to estimate the reflectance at nadir view and for a consistent solar zenith have been developed as well [20,21].However, these more physically-based compositing approaches do not work reliably with Landsat data because of the narrow (15 • ) Landsat field of view that precludes representative sampling of the surface BRDF [22,23].
Coarse-resolution satellite data have been used to help fill gaps in Landsat time series.For example, Roy, D.P. et al. [24] developed a semi-physical fusion approach that used the MODIS BRDF 500 m product to predict 30 m Landsat spectral reflectance for a desired date.Other empirical approaches have been developed based on the spatial and temporal adaptive reflectance fusion model (STARFM) that blends 16-day 30 m Landsat with near-daily or daily 500 m MODIS data to generate synthetic daily Landsat-like 30 m reflectance data [25].These include methods that are somewhat adaptive to land surface change such as the spatial temporal adaptive algorithm for mapping reflectance change [26], and the spatial and temporal data fusion approach (STDFA) [27].The enhanced STARFM [28] was developed to handle complex and heterogeneous landscapes by incorporating spectral unmixing and has been demonstrated over agricultural areas [29][30][31].However, coarse-resolution satellite data that may not always be available or cloud-free, for example, prior to the launch of MODIS in 1999, global coverage near-daily coarse-resolution data suitable for gap-filling Landsat are not available.
Temporal interpolation (TI) gap-filling approaches have been developed that fit time series statistical models to predict reflectance or vegetation index values on a given day.Linear, logistic, or sum of sinusoidal models have been used [32][33][34][35][36][37][38][39].The model fits are conducted on individual pixel time series.Model fitting is sensitive to the quality, number of available observations, and the seasonality of missing observations.TI methods are typically less reliable for surfaces that have abrupt changes, for example, due to land cover change, flooding, fire, or for surfaces with complex phenology, such as double or triple agricultural cropping [35,40,41].Time series change detection methods that account for both abrupt and gradual trends in coarse spatial resolution data have been developed [41], but are less well suited for Landsat application as they require a higher observation frequency than provided by Landsat.
The ASP gap-filling approach fills a gap pixel with the values of one or more alternative pixels selected from non-missing pixels found usually in the same image.Alternative similar pixel locations are identified from a reference image that may be the same, a previous, or a subsequent image that has no gap at the location to be filled [42][43][44][45][46][47][48][49][50].The ASP method was first developed to fill gaps in Landsat 7 Enhanced Thematic Mapper Plus (ETM+) images introduced by the Scan Line Corrector (SLC) failure that occurred in 2003.Landsat 7 ETM+ images with the SLC failure, usually referred to SLC-off images, have 22% less ETM+ image pixels with missing wedge-shaped gaps [5].The earliest ASP approach used a segmentation map generated from a pre-2003 Landsat 7 ETM+ image without gaps that was overlain on the spatially-coincident SLC-off image [42].For each band, the missing pixel values were replaced by the majority non-missing pixel values falling in the segment from the same SLC-off image.This was later improved by using a multi-scale segmentation [43].The strength of this approach is that the gaps are filled with coincident spectral values from the same Landsat image.However, it requires a segmentation that should reflect the surface conditions of the image to be gap filled, which is difficult to obtain everywhere and for all times, and so this is not usually a reliable method.Other ASP methods find alternative similar pixels in local search windows centered on the gap pixel location.For example, Pringle et al. [44] derived the gap pixel value by kriging alternative similar pixel values and locations in the gap image, and also by co-kriging the alternative similar pixel values in both the gap image and previous and/or subsequent reference image(s) when insufficient alternative similar pixels were available.Similarly, Chen et al. [45] derived the gap pixel value as the weighted average of the alternative similar pixels on the gap image.The weight was determined by the spatial distances of each alternative similar pixel to the gap pixel location and the root-mean-square difference of the alternative similar pixel values in the gap image and the reference image.More recent approaches have used Landsat time series.Malambo and Heatwole [49] described a pixel-time-series-comparison-based ASP method to estimate missing spectral-index data and provide gap filling under both gradual and abrupt surface change conditions.They used Landsat time series of a spectral index suitable for burned area discrimination, and alternative similar pixels were identified using a similarity metric that measured the maximum difference between two pixels' index time series after removing their mean index difference.The filling was conducted iteratively within a search window no more than 41 pixels around the gap pixel location, with gaps progressively "eroded" until they were completely filled.Other researchers have noted that progressive filling in this manner may propagate filling errors and so is less suitable for filling spatially-extensive gaps with heterogeneous land covers [50][51][52].
Large gaps frequently occur in Landsat time series due to extensive regions of cloud obscuration [8,9] and swaths of missing observations caused by the Landsat sensing and orbit geometry [53].Development of gap-filling methods that can fill large gaps is expected to be particularly difficult over heterogeneous and rapidly-changing areas.Heterogeneous areas have a greater likelihood of containing pixels with fewer available alternative similar pixels [43,47].For example, adjacent agricultural fields may contain the same crop, but may have a variety of spatial surface variations associated with soils, drainage, irrigation, topography, sowing, harvest date, and growth stage differences that are evident at moderate resolution [54,55].Similarly, if time series reference data are used, then the reference images are less likely to be similar to the image to be gap filled if the surface land cover and/or condition changes [26,28].Despite these issues, the ASP approach does not require contemporaneous coarser resolution satellite data that may not be available or cloud-free.Further, unlike TI approaches, the ASP approach makes no assumptions concerning the surface temporal variation and does not require long-term (i.e., annual or multiple year) time series.These observations motivate the SAMSTS algorithm that uses an ASP approach and is described below.

Landsat 8 Data
Landsat 8 Operational Land Imager (OLI) data, which were processed in a similar manner to the recently-available Landsat Analysis Ready Data (ARD) data [56], were used in this study.Images for 26 weeks (six months) sensed from 28 April to 26 October 2013 were used.The reflectance band data were converted to surface reflectance using the established Landsat Ecosystem Disturbance Adaptive Processing System (LEDAPS) code that uses aerosol characterizations derived independently from each Landsat acquisition and assumes a fixed continental aerosol type and uses ancillary water vapor data [57].The Landsat 8 quality assessment band was used to liberally remove cloud-contaminated pixels, defined in this study as medium and high confidence cloud pixels and high confidence cirrus pixels.The Landsat 8 OLI data are acquired over a 15 • field of view and so small bi-directional reflectance effects (due to viewing and solar geometry variations and surface reflectance anisotropy) are present [23,58], but they were not corrected for in this study.Only the OLI surface reflectance for the green (0.56 µm), red (0.66 µm), near-infrared (NIR) (0.87 µm), and two shortwave infrared (SWIR) (1.61 µm and 2.20 µm) bands were used; the shorter wavelength Landsat 8 OLI blue bands were not used because atmospheric correction is considerably less reliable at shorter blue wavelengths [59,60].
Gridded atmospherically-corrected weekly Landsat 8 OLI data were generated using the Web-Enabled Landsat Data (WELD) processing system [17].The WELD data are defined in the Albers Equal Area projection in 5000 × 5000 30 m pixel tiles, which was adopted to generate the ARD [56,61].These data have been used previously for large-area time series land cover and surface change mapping applications [62][63][64][65][66].At CONUS latitudes, the sensed Landsat 8 image swaths do not spatially overlap in any calendar week [53].In the weekly CONUS WELD time series, like the ARD, a fixed pixel location may be sensed seven or nine days apart due to the orbit and sensing geometry.

Cropland Data Layer
The United States Department of Agriculture (USDA) National Agricultural Statistics Service (NASS) Cropland Data Layer (CDL) product was used to provide insights into the results and to understand the land surface dynamics in terms of the dominant land cover classes.The CDL is generated annually using moderate-resolution satellite imagery and extensive agricultural ground truth via a supervised non-parametric classification approach.It defines about 110 land cover and crop type classes at 30 m for all the CONUS in the same Albers Equal Area conic projection as the WELD data and the ARD [67,68].The CDL for 2013 was used as it is the same year as the Landsat 8 OLI data used in this study and was obtained from [69].

Test Areas
Three WELD/ARD 150 × 150 km (5000 × 5000 30 m pixel) tiles located in California, Minnesota and Kansas were selected as test areas.The tiles are representative of different land cover types (Figure 1) with seasonally dynamic and different land covers (Figure 2), and have representative amounts of missing data including large spatial gaps (Table 1).The California tile (Figure 1a) is in Northern California, encompassing Colusa and San Joaquin counties, and has the greatest number of land cover classes, but relatively less pronounced phenological variations due to its more temperate climate.The Minnesota tile (Figure 1b) encompasses the southwest corner of Minnesota and parts of Northern Iowa.The Kansas tile (Figure 1c) is in the southwest of Kansas and encompasses parts of Northern Oklahoma.The Minnesota and Kansas tiles are predominantly agricultural and are located in the U.S. corn belt [70,71] and wheat belt [72], respectively.Figure 2 illustrates the mean weekly Normalized Difference Vegetation Index (NDVI) for the major CDL classes in each test area.Phenological variations among the CDL classes occur due to a number of factors, including geographic location, weather and site conditions, crop planting and harvesting dates, and land cover differences.The NDVI profiles typically have unimodal distributions with pronounced spring and summer NDVI peaks.In the California tile, except for the CDL corn and rice classes, there is relatively less pronounced phenological variation.The summary values illustrated in Figure 2 do not clearly reflect gaps, except for weeks 18, 20, 23, and 39 (California), weeks 21, 22, 24, and 34 (Minnesota), and weeks 19, 21, 25, 32, 37, and 42 (Kansas), when there were no Landsat acquisitions due to the location of the tiles relative to the Landsat 8 ground track.
Table 1 summarizes the number of missing pixel observations (i.e., cloudy or not sensed) over the three tiles for the 26 weeks of 2013.The left column summarizes the number of weeks with at least one valid 30 m pixel in the tile, denoted by n.The percentage of missing weekly 30 m pixels is summarized in two ways: as the sum of the number of missing 30 m pixels observations in each weekly tile divided by the product of 26 and the tile spatial dimensions (Table 1, middle column); and as the sum of the number of missing 30 m pixels observations in each weekly tile divided by the product of n and the tile spatial dimensions (Table 1, right column).Over the 26 weeks, about 50% of the study area data were missing but, when considering only those weeks with at least one valid pixel, the percentage is 41% (California), 46% (Minnesota), and 30% (Kansas).

Overview
Figure 3 illustrates the workflow of the SAMSTS algorithm.It can be summarized by several steps.First, the input Landsat multispectral time series images are considered together and segmented (across time and space) into a single segmentation image.The resulting segments are divided into two groups (≥3 pixels and <3 pixels) for the subsequent processing.The segments (in either group) are clustered, and each segment is associated to M clusters.Then, the gap filling is undertaken for a specified target image in the time series that has missing pixels.For each segment (S gap ) with one or more missing pixels in the target image, an alternative similar segment (S alt gap ), which has valid non-missing pixel values in the target image, is identified using the cluster information.Then, for each missing pixel in segment S gap , its alternative similar pixel p alt gap is identified within S alt gap and is used to fill the gap pixel's values.A refined spectral angle mapper similarity metric (SAM r ) is incorporated in the steps of the time series segmentation, segment clustering, and identification of alternative similar segments and pixels in the target image.This measure is based on the spectral angle mapper (SAM) that is used conventionally to determine the spectral similarity between two pixels by calculating the cosine of the angle subtended between their points in feature space and the feature space origin [73][74][75].Previously, we refined SAM to allow the comparison of two single-pixel Landsat multispectral time series with missing temporal observations [13] as: 1, i f x a z not mis sin g and x b z not mis sin g 0, otherwise where a and b are two vectors whose feature space values are defined by x z∈[1 . . .n] , and n (≥2) is the number of feature space dimensions, i.e., the product of the number of spectral bands and the number of images in the time series; χ(x a z , x b z ) is a step-function returning 1 if both x a z and x b z are non-missing observations and returning 0 otherwise.Thus, SAM r0 measures the similarity between two pixels' time series considering only their temporally-corresponding non-missing observations.SAM r0 is bounded in the range [0,1], tends to 1 as the time series values of the two pixels become similar, and is 1 when the time series values are identical.However, where there is a high proportion of missing observations, then only a small number of temporally-corresponding observations will be available, which will reduce the reliability of SAM r0 .To minimize the impact of this issue, the SAM r is used: where SAM r0 ( a , b ) is defined by Equation ( 1), n is the number of temporally-corresponding observations between a and b , n is the number of feature space dimensions, i.e., the product of the number of spectral bands and the number of images in the time series (5 × 26), and the term is the mean difference between the two time series' temporally-corresponding observations.Like SAM r0 , the SAM r is bounded in the range [0,1] and equals 1 when the time series are identical.
In this study, obs 50% in Equation ( 2) is set as the percentage of non-missing tile 30 m pixel observations over the 26 weeks multiplied by the product of the number of weeks (26) and the number of bands (five Landsat bands, Section 3.1) times 0.5.For example, for the California study area, 47.5% of the tile 30 m pixel observations over the 26 weeks are missing (Table 1, middle column) and so obs 50% is 34 (derived as ((100 − 47.5)/100) × 26 × 5 × 0.5).Thus, the misalign( a , b , obs 50% ) penalty term occurs for the California gap-filling when the five-band 26-week vectors a and b have less than 34 temporally-corresponding values.If there are more temporally-corresponding values, i.e., n is high, then the penalty term is not invoked.
The SAM r is used to compare not only single-pixel time series, but also to compare the mean time series values of pixels that are associated together after the application of a segmentation algorithm.For brevity, we define the term "segment signature" as the mean reflectance values for each of the five Landsat bands over the 26 weeks (i.e., a vector of 130 × 1 mean values).The mean values are derived considering all the non-missing pixels in the segment.The SAM r is then calculated as Equation ( 2) with a and b defined by segment signatures or single pixel time series.We also define the term "cluster signature" as the mean of the segment signatures in a cluster.The cluster signature is a vector of 130 × 1 reflectance values.

The Spectral-Angle-Mapper-Based Spatio-Temporal Similarity (SAMSTS) Gap-Filling Algorithm
The SAMSTS gap filling algorithm utilizes a segment-and-clustering approach that is illustrated in Figure 4.The approach is undertaken as follows: (1) the input time series images are considered together and segmented (across time and space) into a single segmentation (e.g., six segments, Figure 4a); (2) the segments are clustered in an unsupervised manner (e.g., into two clusters denoted by hatched lines in Figure 4b); (3) the SAM r time series calculations are undertaken comparing the segment containing the gap pixel (cross, falling in the red segment with diagonal hatching) with alternative similar segments that belong to the same cluster (diagonal hatching), and the most similar alternative segment that maximizes SAM r is selected; and (4) for each gap pixel in the gap segment, its corresponding alternative pixel is individually searched only among the candidate pixels in the corresponding most similar alternative segment.These four steps are described below in detail through Sections 5.2.1-5.2.3

Time Series Image Segmentation
The input time series images, i.e., 26 weeks of five bands, each 5000 × 5000 pixels, are segmented into a single segmentation map composed of 5000 × 5000 pixels.Conventionally, segmentation algorithms cannot work if there are missing data, or the missing data are not segmented.As Landsat time series contain arbitrarily-distributed gaps across space and time, the segmentation should be adaptive to them.The SAM r is used with a simple region-growing image segmentation method [76].Each pixel is associated with a spatially-adjacent pixel (considering an 8-connected pixel neighborhood) if the SAM r value of the two pixels' time series is >0.9995.The 0.9995 threshold is purposefully set high to ensure that pixels belonging to the same segment are highly similar, i.e., have a nearly identical 26 week temporal evolution in the five-band reflectance.This threshold was empirically determined through tests on the input data for the three test areas.Since the agricultural study areas typically have a higher degree of spatio-temporal complexity than most other landscapes [31,51], we anticipate that this threshold will be suitable for general cases when six months of Landsat data are used.
Over-segmentation may occur that reduces the computational efficiency of the algorithm.Consequently, spatially-adjacent similar segments are merged to reduce the number of segments.For each segment, the segment signature is derived as the mean non-missing reflectance for each band over the 26 weeks where the mean is derived considering all the pixels in the segment.Then, for each pixel in the segment, the SAM r between the pixel time series and the segment signature is derived, and for segments composed of more than one pixel, the standard deviation of these SAM r values is derived.Two neighboring segments are considered sufficiently similar to be merged if either segment's SAM r standard deviation value is less than twice the value of 1 -SAM r derived between their segment signatures (as 1 -SAM r quantifies the degree of disimilarity of two signatures).The merging is undertaken in an iterative manner.In each iteration, only unmerged segments are eligible to be merged, and the merged segments are taken as a new segment in the next iteration.A maximum of three iterations was found to be sufficient to avoid over-merging.Large numbers of small-sized segments may exist in the merged segmentation results, primarily due to mixed pixel effects occurring along object edges in the Landsat 30 m data.The segment signatures of small segments are more likely to be affected by mixed pixel effects than large ones.In particular, segments composed of one, two, or three pixels have long perimeters relative to their areas, and so are usually mixed.Therefore, as illustrated in Figure 3, the merged segments are divided into two size groups (>3 pixels and ≤3 pixels) and the groups are treated separately in the following clustering and gap-filling steps to speed up the processing.

Segment Clustering
The segments are clustered, i.e., each segment is associated to a unique cluster, using the unsupervised K-means clustering approach that has, for example, been used previously to cluster satellite time series [77].There are many clustering algorithms [78] and other algorithms developed for high-dimensional data; for example, mean shift [79], expectation maximization [80], or fuzzy clustering [81] could also potentially be used.However, we use the K-means approach as it is straightforward to incorporate the SAM r measure that can handle time series with missing observations.The K-means approach requires initialization with K observations, i.e., K observations that can be treated as candidate clusters.In this study, the candidate clusters are selected in an iterative manner starting with one randomly-selected segment.Then in each iteration, every unselected segment's signature is compared with the signature(s) of the currently selected segment(s), and if the SAM r values are smaller than 0.96, the segment is selected as a new candidate cluster.In this way, the number of candidate clusters typically increases with each iteration.If no new candidate clusters are found and there are fewer than 300 candidate clusters, the process is stopped.If there are more than 300, the process is restarted, but the SAM r 0.96 threshold is decreased by 0.01 so that fewer candidate clusters will be selected.We note that 300 is considerably larger than the number of CDL classes in the 5000 × 5000 30 m pixel test areas (Figure 1), but this is needed to account for subtle land cover and surface differences, and the thresholds set ensure that close to (but less than) 300 clusters can be obtained.All the segments are then clustered using the K-means clustering approach initialized with the K candidate clusters.The SAM r metric is used as the K-means distance measure as it can handle missing Landsat time series observations [13].
After the K-means clustering, an additional merging step is implemented to reduce the number of clusters.Specifically, the SAM r standard deviation of each cluster is derived based on the SAM r value between a cluster signature and the signature of each segment in the cluster; the SAM r between each pair of clusters is also calculated.Two clusters are merged conservatively if they are mutually the most similar cluster to each other and their cluster SAM r standard deviations are relatively small.Here, small is defined as less than 1/2 of the 1 -SAM r value derived between their cluster signatures (note that 1 -SAM r reflects the dissimilarity between two cluster signatures).The merging is repeated up to five times until no merging takes place.After this step, the number of clusters are typically reduced by more than 50%, i.e., to fewer than 150 clusters.
Finally, each segment in the segmentation map is associated to the M most similar clusters with ranked similarities, defined by SAM r .For Landsat time series, M should be sufficiently large to capture perturbations in the time series due to non-surface variations imposed by the remote sensing process (e.g., differences in illumination and observation angles, atmospheric contamination, sensor calibration/degradation changes, sensor noise) while being small enough to be computationally efficient.In this study, M was set as 10 as we found M higher than this did the not provide improved results and reduced the computational efficiency.

Identification of Alternative Segments and/or Pixels to Fill Gaps
As illustrated in Figure 3, after the segmentation and segment clustering steps, the gaps are filled individually for each image in the time series that has missing pixels.For each target image with missing pixels, the gap filling is conducted considering the large-size-segment group (≥3 pixels) and the small-size-segment group (<3 pixels) independently, and the final gap-filling results are obtained by combining the gap-filling results from the two groups together.The detailed algorithms for alternative similar segment identification and alternative pixel identification are provided in Appendixs A and B, respectively.

Gap-Filling Assessment Metrics
Gap filling performance was assessed using simulated gaps generated by artificially removing pixels from a single target image in the time series.The spectral root-mean-square difference (RMSD) between the filled and the original non-missing pixel values was calculated with respect to the five Landsat 8 OLI bands as: where RMSD fill (i, j) is the spectral root-mean-square difference at pixel location (i, j) between the original pixel reflectance p original k (i, j) and the gap-filled p f illed k (i, j) reflectance values considering five Landsat bands, i.e. the green (0.56 µm), red (0.66 µm), near-infrared (NIR) (0.87 µm), and the two shortwave infrared (SWIR) (1.61 µm and 2.20 µm) bands.
In order demonstrate the algorithm performance with respect to surface changes (similar to the gap-filling assessment approach described in [24]), the spectral root-mean-square difference (RMSD) between the reflectance observed on the preceding temporally-closest non-missing pixel observation, and the subsequent temporally-closest non-missing pixel observation was also assessed as: where RMSD preceding (i, j) and RMSD subsequent (i, j) are the spectral root-mean-square differences at pixel location (i, j) between the original pixel reflectance value p (i, j), respectively.As some gap-filling methods may simply take the temporally-closest non-missing pixel observations, regardless of whether the closet observation occurred before or after the gap, the closest RMSD was also derived as: where the equation terms are defined as for Equations (4) and ( 5) and p temporally closest k (i, j) is the temporally-closest observation reflectance value that occurred either before or after the target gap week pixel to be filled.If the preceding and subsequent pixel non-missing observations are equally temporally close, then the preceding observation was selected.

SAMSTS Gap Filling
First, to demonstrate the SAMSTS gap-filling algorithm performance in detail, a 500 × 500 30 m pixel (15 × 15 km) cloud-free area was removed from the target week of the Kansas tile, which was selected because it had pronounced agricultural harvesting.Next, to examine the algorithm performance when large and different portions of the data are missing across the tile, 25 simulated gaps, each 600 × 600 30 m pixels (18 × 18 km), were removed from across the target week of the California, Minnesota and Kansas tiles.This is equivalent to removing 36% of the 5000 × 5000 tile pixels, an so is comparable to the average annual Landsat CONUS cloud cover observed by Landsat 8 [9] or Landsat 7 [4].In each experiment, the SAMSTS algorithm was applied to fill the gaps in the target week considering the surrounding tile pixels and all of the 26 weeks of data, except for the pixels removed from the target week.Target weeks 29 (California), 34 (Minnesota), and 38 (Kansas) were selected as the greater majority (>98%) of their tile data were not missing or cloudy.In addition, they were acquired in different periods of surface changes.The Minnesota week 34 and the Kansas week 38 data were acquired near the peak and in the senescence/harvest seasons, respectively, whereas the California week 29 data were acquired in approximately the middle of the growing season (Figure 2).

Comparison with Sinusoidal Harmonic Model-Based Gap-Filling
The SAMSTS gap-filling algorithm performance was compared with the established sinusoidal harmonic model TI gap-filling approach that has been demonstrated with Landsat time series [34,37].The sinusoidal harmonic model was defined as: where f (t) is the modelled satellite reflectance for a single-pixel location; a 0 is a coefficient describing the mean of f (t) over the time series; a m and b m are coefficients for harmonic component m with M ≥ 1 components; L is the length of the time period; and t is time.The model can reproduce periodic time series, but higher model orders (larger M) are required to fit more complex time series and may result in spurious oscillations, particularly with noisy data and/or if there are gaps in the time series where change occurred [35,82,83].Zhu et al. [38] applied the sinusoudal model as shown in Equation (7) to several years of Landsat data, but used an additional coefficient to model inter-annual changes.
In this study, as only six months of Landsat data were used, the inter-annual change coefficient was not used.The six-months of Landsat data were sensed from 28 April 2013 (DOY = 118) to 26 October 2013 (DOY = 299) and so L was set to 182 (299 -118 + 1).Zhu et al. [38] recommended that there should be at least three times more valid surface observations than the number of model coefficients, and did not run the model but output the median observation value if there were fewer than six observations.Therefore, in this study, the sinusoidal model shown in Equation ( 7) was implemented with M = 2 (i.e., five Fourier coefficients) when the number of valid observations was ≥15 and with M = 1 when there were 5 to 14 observations, and the median observation value was output if there were fewer than five observations.The sinusoidal harmonic model was applied independently at each pixel location in the 25 600 × 600 30 m pixel simulated gaps over the California, Minnesota and Kansas tiles, and was applied to the Landsat time series independently for each of the five Landsat bands and used to predict the reflectance on the dates of the simulated gaps.

Detailed Gap-Filling Example Demonstration
Figure 5 illustrates the gap-filling results for the 500 × 500 30 m pixel subset removed from the 5000 × 5000 30 m pixel Kansas tile.The original 500 × 500 30 m pixels that were removed from the target week 38 are illustrated (b) to enable detailed comparison with the SAMSTS gap-filling results (e).The temporally-closest preceding (d) and subsequent (f) weeks of Landsat data are also shown with the same contrast stretches as the gap-filling results (e).The 2013 CDL image is illustrated (a) to provide the land cover context.The comparison of the original data (b) with the gap-filled results (e) illustrates that the methodology works well, especially considering that such a relatively large agricultural area (15 km × 15 km) in the harvest season is gap filled.The five-band spectral root-mean-square difference (RMSD) histograms are illustrated in (c) and are colored as blue (RMSD fill ), red (RMSD preceding ) and green (RMSD subsequent ).The SAMSTS RMSD fill histogram has a greater frequency of smaller RMSD values than the RMSD preceding and RMSD subsequent histograms.Thus the SAMSTS performs better than simple closest temporal pixel substitution.
The bottom row of Figure 5 shows maps of RMSD preceding (g), RMSD fill (h) and RMSD subsequent (i), and their comparisons illustrate gap-filling errors and sensitivities.This subset is dominated by circular center-pivot irrigation agricultural fields and within some of the circular fields, "pie slice" circular sectors are evident that are associated with harvesting [84].The majority the RMSD preceding pixel values (78.1%) and the RMSD subsequent pixel values (65.3%) are greater than the corresponding RMSD fill values.The mean RMSD fill value is 0.022 (median is 0.016), and 7.2% of the pixels have RMSD fill values > 0.05, and 0.7% have RMSD fill values > 0.1.The largest SAMSTS gap-filling errors occur where there were abrupt surface changes due to agricultural harvesting that is evident when the preceding (d) and subsequent (f) weeks are compared.The mean RMSD subsequent value (0.045) is slightly greater than the mean RMSD preceding value (0.042) due to the greater amount of harvesting that occurred between weeks 38 and 40, compared to the harvesting between weeks 36 and 38.However, despite the rapid surface changes, much of the Kansas SAMSTS gap-filled image (e) is consistent with the original image (b).
The locations in Figure 5, which changed between weeks 36 and 38 and that also changed between weeks 38 and 40, typically have the greatest gap-filling errors.For example, two of the largest gap-filling errors occurred in circular center-pivot irrigation fields (denoted by red circles) where partial harvesting occurred between weeks 36 and 38 and then complete harvesting occurred between weeks 38 and 40.Since the week 38 observations were removed by the gap simulation, information on when the abrupt harvest change occurred was not present in the time series.When short-term abrupt changes occur, gap-filling errors may be introduced if the obtained similar alternative segments do not also contain the abrupt changes.This is not always the case.For example, a flooded pixel example (denoted by the yellow arrow) exhibited flooding effects that lasted from weeks 34 to 40, but was gap filled coherently.Figure 6 illustrates the NDVI time series of the flooded pixel and its corresponding alternative similar pixel from which the gap-filling pixel value was selected.The alternative similar pixel was located 17.5 km (583 30 m pixels) away, which is relatively far from the flooded gap pixel, but quite closely resembles both the NDVI (Figure 6) and the reflectance (Figure 5) of the gap pixel.5, located at 37.70121869 • N, 100.82220237 • W), and the corresponding NDVI time series for the selected alternative similar pixel.For clarity, the NDVI, rather than five-band reflectance time series, is shown.

Large Area Gap-Filling Results
Figures 7-9 illustrate the results of the gap-filling where 25 large 600 × 600 30 m pixel simulated gaps distributed across each 5000 × 5000 30 m pixel tile, denoted by the white square outlines in (b), were removed from week 29 (California), week 34 (Minnesota), and week 38 (Kansas).The gap-filled pixels are shown in (e).The gaps were also independently filled by simple temporal pixel substitution from the temporally-closest preceding (d) and subsequent (f) available pixels.The temporally-closest available non-missing preceding and subsequent observations were acquired from one to five weeks before (a) and after (c) the target week.The bottom rows show RMSD preceding (g), RMSD fill (h), and RMSD subsequent (i) maps.
These results illustrate that the gap-filling using the temporally-closest preceding or subsequent available pixel can be relatively inaccurate.The gaps that were independently filled by simple temporal pixel substitution (Figure 7d,f, Figure 8d,f and Figure 9d,f) are sometimes spatially quite incoherent with obvious gap-filling errors observed as squares in the gap-filled images and high RMSD preceding (g) and RMSD subsequent (i) values.As discussed above, these errors are primarily due to surface changes before or after the target image acquisition.In addition, residual atmospheric contamination and bi-directional effects due to differences in illumination and observation angles [23,57] may also cause temporal differences.
In contrast to the temporally-closest gap-filling results, the gap-filled images provided by the SAMSTS method (Figures 7e, 8e and 9e) have more natural looking because the filled pixels were selected from elsewhere in the target week.Interestingly, despite the quite variable availability of pixel observations in the time series, the RMSD fill values do not obviously increase further from the spatially-closest valid observations, as observed in other ASP gap filling methods [50].The majority of the RMSD fill values are smaller than the corresponding RMSD preceding tile pixels values (62.5%, 69.6%, 72.3% for California, Minnesota and Kansas, respectively), RMSD subsequent tile pixel values (68.8%, 75.6%, 55.2% for California, Minnesota and Kansas, respectively), and RMSD temporally_closest pixel values (61.4%, 69.1%, 71.6% for California, Minnesota and Kansas, respectively).Figure 10 illustrates histograms of the RMSD values, including the RMSD temporally_closest , for each study site.The SAMSTS gap filling method (blue line) has generally smaller errors than the closest temporal pixel substitution approaches.As with the detailed gap-filling results, the gap-filling errors are greatest for the Kansas, then Minnesota, then California tile data.The majority of the RMSD fill errors are less than 0.05 with mean values less than 0.02, i.e., less than the 3% OLI reflectance calibration accuracy [85].The mean tile RMSD values provide a straightforward way to assess the gap filling.The smallest mean RMSD values are consistently observed for the SAMSTS gap-filling method with mean RMSD fill values of 0.014 (California), 0.016 (Minnesota) and 0.018 (Kansas).The mean RMSD preceding , RMSD subsequent , and RMSD temporally_closest values vary considerably, but are greater than the mean RMSD fill values by at least 61% (California), 69% (Minnesota), and 55% (Kansas).For the California and Minnesota tiles, the mean RMSD temporally_closest value is smaller than the mean RMSD preceding and RMSD subsequent values, which is expected as the closest observation is selected from either the subsequent or preceding observation.The Kansas tile has a higher mean RMSD temporally_closest value (0.031) than the mean RMSD subsequent value (0.027), but is only marginally higher than the RMSD preceding value (0.030).This is because the Kansas target week (Figure 9b) is in the harvesting season with considerable preceding and subsequent surface reflectance changes (Figure 2). Figure 11 illustrates the results of the sinusoidal harmonic model gap-filling for the three tiles.Figure 11a-c show the false color gap-filling results.The California gap-filled image appears natural, like the SAMSTS results (Figure 7e), but the Minnesota and particularly the Kansas gap-filled results have obvious gap-filling errors observed as squares in the gap-filled images.Figure 11d-f   Table 2 summarizes the RMSD values of the SAMSTS and the sinusoidal harmonic model gap-filling methods for the three tiles with respect to all gap-filled pixels, and with respect to only the crop and only the non-crop gap-filled pixels defined by the 2013 CDL.For both gap-filling methods, larger RMSD errors were observed for the crop than non-crops pixels over the California and Kansas tiles.This is expected because the Landsat reflectance was observed to change more rapidly over crops than non-crops in these tiles (Figure 2).Over the Minnesota tile, there was less difference between the crop and non-crop RMSD values for either gap-filling method.The SAMSTS method had slightly smaller RMSD values for the crop than non-crop pixels over the Minnesota tile.This is because the non-crop pixels were primarily grassland/pasture, developed and herbaceous wetlands (defined by the CDL classes) that were small and sparsely distributed (Figure 1b), which reduced the availability of similar alternative non-crop segments/pixels in the SAMSTS algorithm.Despite this, for all three tiles, considering crop, non-crop, and both crop and non-crop pixels, the SAMSTS method had smaller RMSD gap-filling values than the sinusoidal harmonic model method.7h, 8h and 9h) and the sinusoidal harmonic model gap-filling results (Figure 11d-f), with respect to all the pixels, the crop pixels, and the non-crop pixels as defined by the 2013 CDL (Figure 1).

Discussion
To date, no ASP approaches have been demonstrated to be capable of filling large-area Landsat gaps over agricultural areas that are spatially heterogeneous and temporally dynamic.Current approaches (reviewed in Section 2) are limited by the requirement for cloud-free reference image(s) and/or the need to search for alternative similar pixels in a relatively small spatial window.These constraints are reduced in the SAMSTS gap-filling algorithm.The factors affecting the SAMSTS gap-filling performance are complicated, and they include the spatio-temporal distribution of gaps relative to surface changes and the reflectance magnitude of the missing and non-missing pixels through time.It is difficult to unambiguously identify and quantify these individual factors because they function in a combined manner.The gap filling experiments described in this paper suggest larger gap-filling errors over surfaces with rapid change, e.g., due to agricultural harvesting or flooding.In addition, errors occur when the gap pixels have few similar pixels in the tile, i.e., in this study, for certain crop classes that occupy a minority of the tile.Similarly, urban pixels in the California tile were less accurately gap filled because they encompassed a variety of land cover types and land uses that were spatially mixed within a pixel.Despite these issues, the SAMSTS gap-filling algorithm demonstrably has the potential to generate gap-free Landsat 8 OLI time series when only six months of Landsat 8 reflectance data are used.
The SAMSTS algorithm was demonstrated for large-area gap filling using six months of Landsat 8 OLI reflectance time series over three 150 × 150 km (5000 × 5000 30 m pixels) tiles.Using larger tiles (or groups of adjacent tiles) could provide more opportunities for finding alternative similar pixels but at a cost of increased computation.The actual location of the alternative similar pixels is not relevant if the tile is sufficiently large to contain alternative similar pixels with non-missing reflectance time series, and so the issues of sensitivity to tile location [86] are expected to be unimportant.Arguably, given a suitable data structure, there is no reason why a gap pixel should not be filled with a value that is very far distant.Conceptually this is similar to supervised land cover classification where a sample of training data are used to classify very large area Landsat time series [32,87,88].Using larger tiles (or groups of tiles) may help overcome the main limitation of the SAMSTS algorithm: if there are no existing alternative similar pixels, then a gap cannot be filled reliably.
The SAMSTS algorithm was compared to the sinusoidal harmonic model per-pixel temporal interpolation approach [35,38].For all three tiles, considering crop, non-crop, and both crop and non-crop pixels, the SAMSTS method had smaller mean five-band reflective wavelength RMSD values than the sinusoidal harmonic model method.The majority of the sinusoidal harmonic model RMSD values were larger than the corresponding SAMSTS gap-filled values (65.1%, 72.9%, and 57.9% of the gap-filled pixels for the California, Minnesota and Kansas tiles, respectively).We note that the sinusoidal harmonic model may perform more reliably at locations with smoother temporal changes.In addition, the sinusoidal harmonic model may perform more reliably when multiple years of Landsat data are used [38], although the likelihood of land surface condition changes that reduce sinusoidal harmonic model gap-filling reliability will increase, for example, due to crop rotation over agricultural areas between years.
In this study, the SAMSTS algorithm was implemented in the C language in Visual Studio 2013 (from Microsoft Corporation located at Redmond, Washington, USA) run on a Windows-7 64-bit operating system with 3.6 GHz CPU and 16 GB memory.The large-area gap-filling experiments over the California, Minnesota and Kansas tiles took 4-5 hours to run per tile.The majority (>90%) of the computation was on the alternative similar segment identification (Section 5.2.3) that was CPU, rather than memory, limited.The search for alternative pixels is initially conducted on a segment basis, and then on a pixel-basis within the identified segments to improve computational efficiency.The time series segmentation-and-clustering step may not be needed if the data had only a small proportion of gaps to be filled (e.g., Landsat SLC-off gaps only).The algorithm computational efficiency could be improved further by parallel processing the similar segment identification across different processors.
The SAMSTS algorithm uses a number of parameters that were set to provide reliable gap-filling over the agricultural study areas.As agriculture typically has a higher degree of spatio-temporal complexity than many other landscapes [30], we expect the algorithm to perform similarly elsewhere.However, in the cases where phenology is more complex or where cloud contamination is very persistent at the time of satellite overpass, the algorithm may perform less well.Further research to analyze the sensitivity of the gap-filling to the algorithm parameterization for other locations may be necessary.Inclusion of other Landsat-like data, such as from the Sentinel-2 Multi Spectral Instrument (MSI), should enable improved SAMSTS gap-filling; this is subject to further research to accommodate sensor spectral resolution differences, provided that the MSI data can be well registered with Landsat which is currently an issue [89,90].
As an ASP (alternative similar pixel) gap-filling method, the SAMSTS algorithm is based on the selection of a similar pixel observation and does not use temporal interpolation or reflectance prediction approaches.Consequently, there is no reason why the two Landsat 8 Thermal Infrared Sensor (TIRS) bands [91] could not be gap filled by copying the TIRS band pixels from the alternative similar pixel found by the current algorithm.However, this would need to be subject to further research and is likely to be challenging due to the greater surface temporal variation at thermal wavelengths compared to reflectance wavelengths [16,92].

Conclusions
This paper presented and assessed a new algorithm for Landsat reflectance time series gap filling that is designed to fill both small and large-area gaps in Landsat data, using one year or less of Landsat data and without using other satellite data.This spectral-angle-mapper (SAM) based spatio-temporal similarity (SAMSTS) gap-filling algorithm follows the approach of alternative similar pixel (ASP) gap filling, whereby a gap pixel value is filled by an alternative similar pixel that is located in a non-missing region of the image.The search is based on comparison of reflectance time series using a revised SAM metric adapted to accommodate missing time series observations.Three 150 × 150 km (5000 × 5000 30 m pixels) test tiles in California, Minnesota and Kansas were considered, which contained heterogeneous land cover types, including those with different phenology and abrupt changes due to agricultural harvesting that make gap-filling challenging.The tests involved the simulation of 36% missing data that are comparable to the average annual CONUS cloud cover in Landsat data.The SAMSTS algorithm performed better than the closest temporal pixel substitution and the sinusoidal harmonic model temporal interpolation gap-filling methods.The SAMSTS algorithm was demonstrated to be capable of filling large gaps in heterogeneous areas in Landsat time series and provided mean five-band reflective wavelength RMSD values less than 0.02, i.e., less than the 3% Landsat 8 OLI reflectance calibration accuracy.

Figure 2 .
Figure 2. Mean CDL class-specific weekly (weeks 18-43, 2013) NDVI values derived from the three WELD/ARD test tiles (Figure 1) California (top), Minnesota (middle), and Kansas (bottom).Only values for the CDL classes that cover more than 2% of the tile are shown.Weeks with no data are not illustrated, but the plotted lines are shown dashed.

Figure 4 .
Figure 4. Overview of the SAMSTS gap filling algorithm.(a) The segmentation map obtained from the time series; (b) The clustered segment (denoted by different hatching); (c) Given a segment with gaps (e.g., the red segment), its alternative segment is searched for considering only the segments in the same cluster (i.e., the purple, cyan and blue segments).The missing-observation-adaptive similarity metric SAM r is used in the segmentation, clustering, and alternative similar segment search.
j) and the preceding temporally-closest observation reflectance value p preceding k (i, j), and the subsequent temporally-closest observation reflectance value p subsequent k

Figure 6 .
Figure 6.Example NDVI time series for the flooded pixel (denoted by the yellow arrows in Figure 5, located at 37.70121869 • N, 100.82220237 • W), and the corresponding NDVI time series for the selected alternative similar pixel.For clarity, the NDVI, rather than five-band reflectance time series, is shown.

Figure 7 .
Figure 7. California tile (5000 × 5000 30 m pixels) gap filling experiments, 25 600 × 600 30 m pixel areas, shown by white squares in the target week (b), were removed to simulate gaps and then filled by the SAMSTS method (e), and by closest preceding (d) and subsequent (f) pixel substitution.Associated RMSD images are shown (g-i) colored as for Figure 5.The temporally-closest non-missing preceding and subsequent observations did not always belong to the same image and were acquired from one to five weeks before (a) and after (c) the target week, colored as black (1), dark gray (2), light gray (3), and white (4 or 5).

Figure 8 .
Figure 8. Minnesota tile (5000 × 5000 30 m pixels) gap filling experiments, 25 600 × 600 30 m pixel areas, shown by white squares in the target week (b), were removed to simulate gaps and then filled by the SAMSTS method (e), and by closest preceding (d) and subsequent (f) pixel substitution.Associated RMSD images are shown (g-i) colored as for Figure 5.The temporally-closest non-missing preceding and subsequent observations did not always belong to the same image and were acquired from one to five weeks before (a) and after (c) the target week, colored as black (1), dark gray (2), light gray (3), and white (4 or 5).

Figure 9 .
Figure 9. Kansas tile (5000 × 5000 30 m pixels) gap filling experiments, 25 600 × 600 30 m pixel areas, shown by white squares in the target week (b), were removed to simulate gaps and then filled by the SAMSTS method (e), and by closest preceding (d) and subsequent (f) pixel substitution.Associated RMSD images are shown (g-i) colored as for Figure 5.The temporally-closest non-missing preceding and subsequent observations did not always belong to the same image and were acquired from one to five weeks before (a) and after (c) the target week, colored as black (1), dark gray (2), light gray (3), and white (4 or 5).
Figure 11  illustrates the results of the sinusoidal harmonic model gap-filling for the three tiles.Figure11a-cshow the false color gap-filling results.The California gap-filled image appears natural, like the SAMSTS results (Figure7e), but the Minnesota and particularly the Kansas gap-filled results have obvious gap-filling errors observed as squares in the gap-filled images.Figure11d-fshow the RMSD values between the gap-filled data and the corresponding original data.The majority of the sinusoidal harmonic model RMSD values are larger than the corresponding SAMSTS gap-filled values (Figure7h, Figure8h, and Figure9h), specifically 65.1%, 72.9%, and 57.9% of the gap-filled pixels for the California, Minnesota and Kansas tiles, respectively.

Figure 11 .
Figure 11.Sinusoidal harmonic model gap-filling results for the 25 600 × 600 30 m pixel simulated gap areas for the California (a), Minnesota, (b) and Kansas (c) tiles.The original images before gap removal are illustrated in Figure 7b, Figure 8b, and Figure 9b.The associated RMSD gap-filling values are shown in (d-f) colored as for Figures 7-9.

Table 1 .
Summary of missing weekly pixel observations (26 weeks from 28 April to 26 October 2013) in the three test tiles.Note that in the same week different Landsat-8 orbits can overpass the eastern and western sides of a 5000 × 5000 30 m tile.

Table 2 .
Mean five-band RMSD values of the SAMSTS (Figures