FORCE—Landsat + Sentinel-2 Analysis Ready Data and Beyond

Ever increasing data volumes of satellite constellations call for multi-sensor analysis ready data (ARD) that relieve users from the burden of all costly preprocessing steps. This paper describes the scientific software FORCE (Framework for Operational Radiometric Correction for Environmental monitoring), an ‘all-in-one’ solution for the mass-processing and analysis of Landsat and Sentinel-2 image archives. FORCE is increasingly used to support a wide range of scientific to operational applications that are in need of both large area, as well as deep and dense temporal information. FORCE is capable of generating Level 2 ARD, and higher-level products. Level 2 processing is comprised of state-of-the-art cloud masking and radiometric correction (including corrections that go beyond ARD specification, e.g., topographic or bidirectional reflectance distribution function correction). It further includes data cubing, i.e., spatial reorganization of the data into a non-overlapping grid system for enhanced efficiency and simplicity of ARD usage. However, the usage barrier of Level 2 ARD is still high due to the considerable data volume and spatial incompleteness of valid observations (e.g., clouds). Thus, the higher-level modules temporally condense multi-temporal ARD into manageable amounts of spatially seamless data. For data mining purposes, per-pixel statistics of clear sky data availability can be generated. FORCE provides functionality for compiling best-available-pixel composites and spectral temporal metrics, which both utilize all available observations within a defined temporal window using selection and statistical aggregation techniques, respectively. These products are immediately fit for common Earth observation analysis workflows, such as machine learning-based image classification, and are thus referred to as highly analysis ready data (hARD). FORCE provides data fusion functionality to improve the spatial resolution of (i) coarse continuous fields like land surface phenology and (ii) Landsat ARD using Sentinel-2 ARD as prediction targets. Quality controlled time series preparation and analysis functionality with a number of aggregation and interpolation techniques, land surface phenology retrieval, and change and trend analyses are provided. Outputs of this module can be directly ingested into a geographic information system (GIS) to fuel research questions without any further processing, i.e., hARD+. FORCE is open source software under the terms of the GNU General Public License v. >= 3, and can be downloaded from http://force.feut.de.


Introduction
We are currently experiencing an exciting new era of Earth observation, wherein multiple, freely available remote sensing systems provide us data at unprecedented spatial, temporal, and spectral resolutions. The Landsat mission occupies a prominent role in this development: The opening of the Landsat archive in 2008 [1] has fundamentally changed the usage of Earth observation data [2]

Product Level and Data Cube Definition
Remote sensing products are grouped in a hierarchical classification scheme [15]. The lowest available level is commonly Level 1, i.e., radiometrically calibrated and georectified data. Level 2 data most notably include some sort of atmospheric correction. Level 3 data are temporal Level 2 aggregates that are provided in a different spatial reference, commonly a grid system with a single coordinate system. Level 4 products are model output, often derived from multi-temporal or multi-sensor measurements. In this paper, Levels 1 and 2 are referred to as lower-level products and Levels 3 and above as higher-level products, respectively. Several modifications to this scheme are commonly used. As an example, Level 3 products are the first that are mapped on a regular grid, whereas the lower-level products are still in georectified swath geometry (e.g., the Landsat Worldwide Reference System 2 (WRS-2) path/row system). In contrast, the key element of ARD is to provide gridded data [13,16,17]-regardless of product level. This is for e.g., reflected in ESA's production and distribution strategy of Sentinel-2 data as they already include gridding on Level 1 [6]-although still using local Universal Transverse Mercator (UTM) zones with a substantial amount of redundant data between overlapping and neighboring tiles. As such, FORCE adapts gridding on Level 2, i.e., all generated products are reprojected into one coordinate system (e.g., a continental projection as in [13] or [18]), and organized in smaller tiles. The following terms are defined; see Figure 1 for a graphical representation of these concepts: • The 'grid' as the regular spatial subdivision of the land surface in the target coordinate system. • The 'grid origin' is the location, where the tile numbering starts with zero. Tile numbers increase toward the South and East. Although not recommended, negative tile numbers may be present if the tile origin is not North-West of the study area. • The 'tile' is one entity of the grid, i.e., a grid cell with a unique tile identifier, e.g., X0003_Y0002. The tile is stationary, i.e., it always covers the same extent on the land surface. • The 'tile size' is defined in target coordinate system units (most commonly in meters). Tiles are square. • Each 'original image' is partitioned into several 'chips', i.e., any original image is intersected with the grid and then tiled into chips.

•
Chips are grouped in 'datasets', which group data according to acquisition date and sensor. Each dataset contains several 'products'. At minimum, a reflectance product and an accompanying quality product are generated.

•
The 'data cube' groups all datasets within a tile in a time-ordered manner. The data cube may contain data from several sensors and different resolutions. Thus, the pixel size is allowed to vary, but the tile extent stays fixed. The data cube concept allows for non-redundant data storage and efficient data access, as well as simplified extraction of data and information.
Remote Sens. 2019, 11, x FOR PEER REVIEW 3 of 21 As such, FORCE adapts gridding on Level 2, i.e., all generated products are reprojected into one coordinate system (e.g., a continental projection as in [13] or [18]), and organized in smaller tiles. The following terms are defined; see Figure 1 for a graphical representation of these concepts: • The 'grid' as the regular spatial subdivision of the land surface in the target coordinate system. • The 'grid origin' is the location, where the tile numbering starts with zero. Tile numbers increase toward the South and East. Although not recommended, negative tile numbers may be present if the tile origin is not North-West of the study area. • The 'tile' is one entity of the grid, i.e., a grid cell with a unique tile identifier, e.g., X0003_Y0002. The tile is stationary, i.e., it always covers the same extent on the land surface. • The 'tile size' is defined in target coordinate system units (most commonly in meters). Tiles are square. • Each 'original image' is partitioned into several 'chips', i.e., any original image is intersected with the grid and then tiled into chips. • Chips are grouped in 'datasets', which group data according to acquisition date and sensor. Each dataset contains several 'products'. At minimum, a reflectance product and an accompanying quality product are generated. • The 'data cube' groups all datasets within a tile in a time-ordered manner. The data cube may contain data from several sensors and different resolutions. Thus, the pixel size is allowed to vary, but the tile extent stays fixed. The data cube concept allows for non-redundant data storage and efficient data access, as well as simplified extraction of data and information.

Processing Capability
FORCE is organized in several software components. Figure 2 summarizes all available modules and their placement in the level system. A typical FORCE workflow as depicted in Figure 2 consists of following main steps: (i) Level 1 images are acquired from the space agencies, and are (ii) converted to Level 2 ARD, which are (iii) aggregated and analyzed using several higher-level modules. More detailed descriptions of the individual components will be given in the following sections, mainly ordered by level.
Remote Sens. 2019, 11, x FOR PEER REVIEW 4 of 21 FORCE is organized in several software components. Figure 2 summarizes all available modules and their placement in the level system. A typical FORCE workflow as depicted in Figure 2 consists of following main steps: (i) Level 1 images are acquired from the space agencies, and are (ii) converted to Level 2 ARD, which are (iii) aggregated and analyzed using several higher-level modules. More detailed descriptions of the individual components will be given in the following sections, mainly ordered by level. Figure 2. Overview of FORCE, general workflow. ARD-analysis ready data; hARD-highly analysis ready data; hARD+-highly analysis ready data plus; DEM-digital elevation model; CSO-clear sky observation; LSP-land surface phenology; CF-continuous field; CR-coarse resolution; MRmedium resolution; WVDB-water vapor database; ESA-European Space Agency; USGS-U.S. Geological Survey (USGS); NASA-National Aeronautics and Space Administration.

Level 1
The FORCE Level 1 Archiving Suite (L1AS) assists in acquiring and managing Level 1 data. L1AS has two different routines for Landsat and Sentinel-2, respectively ( Figure 3). The main difference is that Landsat data need to be downloaded manually, while Sentinel-2 images are automatically retrieved by FORCE.
Once Landsat data were downloaded from USGS, L1AS ingests new images into local data holdings. L1AS keeps track of data versioning and tiers, which means outdated/lower-ranked data is Overview of FORCE, general workflow. ARD-analysis ready data; hARD-highly analysis ready data; hARD+-highly analysis ready data plus; DEM-digital elevation model; CSO-clear sky observation; LSP-land surface phenology; CF-continuous field; CR-coarse resolution; MR-medium resolution; WVDB-water vapor database; ESA-European Space Agency; USGS-U.S. Geological Survey (USGS); NASA-National Aeronautics and Space Administration.

Level 1
The FORCE Level 1 Archiving Suite (L1AS) assists in acquiring and managing Level 1 data. L1AS has two different routines for Landsat and Sentinel-2, respectively ( Figure 3). The main difference is that Landsat data need to be downloaded manually, while Sentinel-2 images are automatically retrieved by FORCE. ignored next time.
The Sentinel-2 routine is similar to the one above, but ESA provides an application programming interface (API) for data query and automatic download. Based on a coordinate string list, cloud cover, and date range, a metadata report is pulled from the Copernicus API Hub. Each hit is compared with the local data holdings, and missing images are downloaded. A file queue is generated and updated accordingly.

Level 2: Analysis Ready Data
The FORCE Level 2 Processing System (FORCE L2PS) generates harmonized, standardized, and radiometrically consistent Level 2 products with per-pixel quality information, i.e., analysis ready data. L2PS pulls each enqueued Level 1 image and processes it to ARD specification. Each image (box in Figure 4) is processed independently using multiprocessing [19]. The pipeline is memory resident to minimize input/output (I/O), i.e., input data are read once, and only the final, gridded data products are written to disc.  Once Landsat data were downloaded from USGS, L1AS ingests new images into local data holdings. L1AS keeps track of data versioning and tiers, which means outdated/lower-ranked data is replaced with newer/improved data, thus preventing data redundancy. On successful ingestion, the image is appended to a file queue, which controls Level 2 processing. The file queue is a text file that holds the full path to the image, as well as a processing-state flag. This flag is either QUEUED or DONE, which means that it is enqueued for Level 2 processing or was already processed and will be ignored next time.
The Sentinel-2 routine is similar to the one above, but ESA provides an application programming interface (API) for data query and automatic download. Based on a coordinate string list, cloud cover, and date range, a metadata report is pulled from the Copernicus API Hub. Each hit is compared with the local data holdings, and missing images are downloaded. A file queue is generated and updated accordingly.

Level 2: Analysis Ready Data
The FORCE Level 2 Processing System (FORCE L2PS) generates harmonized, standardized, and radiometrically consistent Level 2 products with per-pixel quality information, i.e., analysis ready data. L2PS pulls each enqueued Level 1 image and processes it to ARD specification. Each image (box in Figure 4) is processed independently using multiprocessing [19]. The pipeline is memory resident to minimize input/output (I/O), i.e., input data are read once, and only the final, gridded data products are written to disc.
The FORCE Level 2 Processing System (FORCE L2PS) generates harmonized, standardized, and radiometrically consistent Level 2 products with per-pixel quality information, i.e., analysis ready data. L2PS pulls each enqueued Level 1 image and processes it to ARD specification. Each image (box in Figure 4) is processed independently using multiprocessing [19]. The pipeline is memory resident to minimize input/output (I/O), i.e., input data are read once, and only the final, gridded data products are written to disc.

Processing
The processing is based on the methodology described by Frantz et al. [18], amended by several improvements. Most prominently, support for Sentinel-2 was implemented.
Cloud masking is based on a modified version of the Fmask code [20], incorporating most updates [21] and the changes detailed by [18,22]. For Sentinel-2, the Cloud Displacement Index was developed to compensate missing thermal information employing parallax effects [23].
The spatial resolution of the 20 m Sentinel-2 bands can be improved, using the native 10 m bands as prediction targets. Three algorithms were implemented, which are listed with increasing prediction quality and processing time: (i) Spectral-only setup of the STARFM code [24], (ii) spectral-only setup of the ImproPhe code [25], and (iii) window regression [26].
Radiometric correction includes radiative-transfer-based atmospheric correction [27,28]. Aerosol optical depth is estimated over dark water and dense dark vegetation objects [29,30] using multiple scattering [18,31]. The usage of the Dark Object Database [18] to restrain aerosol optical depth estimation to permanent dark targets, was deprecated. Water vapor is estimated for each Sentinel-2 pixel; auxiliary data are used for Landsat (next section). Topographic correction is performed with an enhanced C-correction, based on the principle outlined by [32]. The C-factor is estimated for each pixel in the image and then propagated through the spectrum using radiative transfer theory. Three kernels of increasing size are used to approximate the background reflectance for environment correction [33]. Nadir BRDF-adjusted reflectance is retrieved using a global set of MODIS-derived (Moderate Resolution Imaging Spectroradiometer) BRDF kernel parameters [34][35][36].
Aerosol optical depth estimation, topographic correction effectiveness, and surface reflectance consistency was assessed for a Southern African study area [18]. The effectiveness of the topographic correction for improved forest-type classification was recently assessed in the Caucasus mountains [37]. Extended, global validation of aerosol optical depth and water vapor as well as surface reflectance were performed in the Atmospheric Correction Inter-comparison Exercise (ACIX) [38]. The parallax-based cloud detection was recently assessed in [39].
The data are reprojected to a custom projection and are then split to image chips using a custom grid with rectangular tiles, thus representing data cubes. Redundancy is prevented by aggregation of same-day/same-sensor data on output, i.e., redundant Level 1 data are not carried to Level 2. An example is shown in Figure 5.
reflectance were performed in the Atmospheric Correction Inter-comparison Exercise (ACIX) [38]. The parallax-based cloud detection was recently assessed in [39].
The data are reprojected to a custom projection and are then split to image chips using a custom grid with rectangular tiles, thus representing data cubes. Redundancy is prevented by aggregation of same-day/same-sensor data on output, i.e., redundant Level 1 data are not carried to Level 2. An example is shown in Figure 5.

Auxiliary Data
A digital elevation model (DEM) mosaic covering the complete study area is used for enhanced cloud shadow detection, scaling optical depths with altitude, and to perform the topographic correction. A precompiled water vapor database is used for atmospheric correction of Landsat data. The database holds water vapor values for the central coordinates of each WRS-2 frame. If available, day-specific values are used. If not, a monthly long-term climatology is used instead. The FORCE water vapor database component (FORCE WVDB, see Figure 2) can be used to generate and maintain such a database or a ready-to-use dataset may be freely downloaded [40]. The effect of using the water vapor climatology as a fallback option was globally assessed in [41].

Output Format
The gridded data are provided as compressed GeoTiff or flat binary format, accompanied by metadata. For each dataset, multiple products are stored as different files. Bottom-of-Atmosphere (BOA) reflectance (multi-band, same resolution) and quality assurance information (QAI; single band) are standard output. To homogenize and simplify usage of multi-sensor data, original band names are not carried to Level 2. Instead, specific bands can be addressed using their wavelength designation in the higher-level FORCE routines (see Table 1). The QAI product collects a number of quality-relevant status flags in bit notation ( Table 2).
Note: Level 1 bands, which are mainly intended for atmospheric characterization are used internally, but are not output.

General Concept
All higher-level FORCE routines follow the same general concept and act on the Level 2 ARD data cubes. The processing is tile based, i.e., the tiles are processed in sequential order (see Figure 6). Parallelization is implemented within the tile using multithreading. FORCE reads and processes necessary information only. The square extent needs to be defined (red rectangle in Figure 6). Additionally, a tile white-list can be provided to restrict the number of tiles for non-square areas of interest (colored tiles in Figure 6). The data fusion functionalities (see Section 3.3.5) require additional data from neighboring tiles to produce seamless products; only pixels on the edge of the tiles are read (in dependence on the prediction radius). Only relevant products are pulled (in most cases, these are BOA and QAI products). The same applies to sensors, i.e., any combination of Landsat 4, 5, 7, 8, Sentinel-2A, and -2B can be chosen. A waveband mapping procedure is used to generate multi-sensor products, i.e., only matching bands are used (Table 1; for details see [42]). Spectral bands are only read when required (e.g., red and near-infrared bands for the Normalized Difference Vegetation Index (NDVI)). Temporal filters restrict the amount of data to the time period (and/or season) that is required. Output products can freely be selected, which in turn trigger the respective processing. Output is tile based; the FORCE auxiliary module (FORCE AUX) includes a tool for mosaicking generated products using the Geospatial Data Abstraction Library (GDAL) Virtual Format.
As multiple spatial resolutions are permitted within a data cube, the target resolution must be defined. Resolution adjustment can be performed using nearest-neighbor resampling (pixel decimation/replication) or reduction using approximated point spread functions (PSF). On-the-fly resolution enhancement is not implemented, but the spatial resolution of ARD can be improved beforehand (see Section 3.3.5).
Quality control is completely under the user's control. All provided quality flags (  (Figure 7), e.g., to make informed decisions about the parameterization or applicability of a specific method, or to identify areas where commissions errors reduce the amount of usable data [43]. Clear sky observations are FORCE reads and processes necessary information only. The square extent needs to be defined (red rectangle in Figure 6). Additionally, a tile white-list can be provided to restrict the number of tiles for non-square areas of interest (colored tiles in Figure 6). The data fusion functionalities (see Section 3.3.5) require additional data from neighboring tiles to produce seamless products; only pixels on the edge of the tiles are read (in dependence on the prediction radius). Only relevant products are pulled (in most cases, these are BOA and QAI products). The same applies to sensors, i.e., any combination of Landsat 4, 5, 7, 8, Sentinel-2A, and -2B can be chosen. A waveband mapping procedure is used to generate multi-sensor products, i.e., only matching bands are used (Table 1; for details see [42]). Spectral bands are only read when required (e.g., red and near-infrared bands for the Normalized Difference Vegetation Index (NDVI)). Temporal filters restrict the amount of data to the time period (and/or season) that is required. Output products can freely be selected, which in turn trigger the respective processing. Output is tile based; the FORCE auxiliary module (FORCE AUX) includes a tool for mosaicking generated products using the Geospatial Data Abstraction Library (GDAL) Virtual Format.
As multiple spatial resolutions are permitted within a data cube, the target resolution must be defined. Resolution adjustment can be performed using nearest-neighbor resampling (pixel decimation/replication) or reduction using approximated point spread functions (PSF). On-the-fly resolution enhancement is not implemented, but the spatial resolution of ARD can be improved beforehand (see Section 3.3.5).
Quality control is completely under the user's control. All provided quality flags (Table 2) can be used individually.

Clear Sky Observations
FORCE clear sky observations (FORCE CSO) mines data availability (Figure 7), e.g., to make informed decisions about the parameterization or applicability of a specific method, or to identify areas where commissions errors reduce the amount of usable data [43]. Clear sky observations are defined in response to the quality control settings. For a given time period (in years) and interval (months), the number of clear sky observations are counted, and statistics on the temporal difference between clear sky observations are calculated; currently available statistics are the average, standard deviation, minimum, maximum, range, skewness, kurtosis, median, 25/75% quantiles, and interquartile range. The beginning and end of the intervals act as boundaries for this assessment. This processing scheme reflects the fact, that a single measure of data availability might not yield representative results. As an example (Figure 7), data availability for the first and second half of 2018 is equal in terms of the number of observations and the average time between observations. However, there are large differences in the maximum difference as data are clumped in the first half. This has important implications, e.g., for the detectability of harvesting events.
Remote Sens. 2019, 11, x FOR PEER REVIEW 10 of 21 deviation, minimum, maximum, range, skewness, kurtosis, median, 25/75% quantiles, and interquartile range. The beginning and end of the intervals act as boundaries for this assessment. This processing scheme reflects the fact, that a single measure of data availability might not yield representative results. As an example (Figure 7), data availability for the first and second half of 2018 is equal in terms of the number of observations and the average time between observations. However, there are large differences in the maximum difference as data are clumped in the first half. This has important implications, e.g., for the detectability of harvesting events.

Level 3: Highly Analysis Ready Data
The FORCE Level 3 Processing System (FORCE L3PS) temporally condenses multi-temporal observations into a more controllable amount of spatially complete data, which lowers the usage barrier compared to Level 2 ARD. Thus, these data are referred to as highly analysis ready data (hARD). hARD products have undergone the necessary processing required for many machinelearning-based land cover/change classification purposes, which put spatial completeness before temporal exactness. Acquisition dates and quality flags of Level 2 ARD are retained as suggested by [44].
FORCE L3PS is capable of producing best-available-pixel composites [45] and spectral temporal metrics [46] (Figure 8). Both concepts utilize all available observations within a defined temporal window; best-available-pixel composites are produced by selecting the optimal observation with respect to defined criteria, whereas spectral temporal metrics are produced by a statistical description of all available spectral observations. Composites are optimal to preserve spectra for physical interpretation, but are often noisier than spectral temporal metrics. Spectral temporal metrics are produced band wise, thus physical interpretability is limited. However, they provide rich information on temporal variability and data distribution and are thus ideal predictors for machinelearning techniques that require independent features. However, their quality is closely related to data availability as a sufficient number of clear sky observations (in dependence of the statistical moment) are required to produce reliable statistics.
FORCE employs a parametric weighting scheme [45] as implemented in [47]. For each pixel, the observation with the highest total score is selected for the best-available-pixel composite. Only highest-quality pixels are considered, i.e., observations with very low cloud or haze score are discarded. Similarly, observations with very low seasonal score are discarded, which ensures that Level 3 products are representative of the season of interest (can be switched off to produce annual products). The best-available-pixel composite composites can either be parameterized using a static target date [45] or by inputting a land surface phenology dataset to dynamically adapt the target dates for each pixel [47] (example: Figure 9). Over persistent water, the compositing algorithm is switched to minimum shortwave-infrared (SWIR2 band) compositing, as the parametric weighting

Level 3: Highly Analysis Ready Data
The FORCE Level 3 Processing System (FORCE L3PS) temporally condenses multi-temporal observations into a more controllable amount of spatially complete data, which lowers the usage barrier compared to Level 2 ARD. Thus, these data are referred to as highly analysis ready data (hARD). hARD products have undergone the necessary processing required for many machine-learning-based land cover/change classification purposes, which put spatial completeness before temporal exactness. Acquisition dates and quality flags of Level 2 ARD are retained as suggested by [44].
FORCE L3PS is capable of producing best-available-pixel composites [45] and spectral temporal metrics [46] (Figure 8). Both concepts utilize all available observations within a defined temporal window; best-available-pixel composites are produced by selecting the optimal observation with respect to defined criteria, whereas spectral temporal metrics are produced by a statistical description of all available spectral observations. Composites are optimal to preserve spectra for physical interpretation, but are often noisier than spectral temporal metrics. Spectral temporal metrics are produced band wise, thus physical interpretability is limited. However, they provide rich information on temporal variability and data distribution and are thus ideal predictors for machine-learning techniques that require independent features. However, their quality is closely related to data availability as a sufficient number of clear sky observations (in dependence of the statistical moment) are required to produce reliable statistics.  FORCE employs a parametric weighting scheme [45] as implemented in [47]. For each pixel, the observation with the highest total score is selected for the best-available-pixel composite.
Only highest-quality pixels are considered, i.e., observations with very low cloud or haze score are discarded. Similarly, observations with very low seasonal score are discarded, which ensures that Level 3 products are representative of the season of interest (can be switched off to produce annual products). The best-available-pixel composite composites can either be parameterized using a static target date [45] or by inputting a land surface phenology dataset to dynamically adapt the target dates for each pixel [47] (example: Figure 9). Over persistent water, the compositing algorithm is switched to minimum shortwave-infrared (SWIR2 band) compositing, as the parametric weighting selection is often noisy due to the high temporal variability of water reflectance. Currently implemented spectral temporal metrics are the per-band average, standard deviation, minimum, maximum, range, skewness, kurtosis, median, 25/75% quantiles, and interquartile range of reflectance. Figure 9. Best-available-pixel composite (near-infrared, shortwave infrared, red in RGB) for Angola, Zambia, Zimbabwe, Botswana, and Namibia. The 250, 25, and 2.5 km subsets provide different zoom levels of the composited data. The composite is temporally centered at the end of season land surface phenology metric for 2018. The land surface phenology was derived from the Moderate Resolution Imaging Spectroradiometer (MODIS), and its spatial resolution was enhanced with the FORCE ImproPhe code (see Section 3.3.5).

Time Series Analysis/Level 4 Highly Analysis Ready Data+
FORCE time series analysis (FORCE TSA) provides time series preparation and analysis functionality (Figure 10), i.e., extraction of quality-controlled time series with a number of aggregation and interpolation techniques, deriving land surface phenology metrics, and computing change and trend metrics. Complex processing workflows (example: Figure 11) can be executed in a single process. Many outputs of FORCE TSA are referred to as highly analysis ready data plus Figure 9. Best-available-pixel composite (near-infrared, shortwave infrared, red in RGB) for Angola, Zambia, Zimbabwe, Botswana, and Namibia. The 250, 25, and 2.5 km subsets provide different zoom levels of the composited data. The composite is temporally centered at the end of season land surface phenology metric for 2018. The land surface phenology was derived from the Moderate Resolution Imaging Spectroradiometer (MODIS), and its spatial resolution was enhanced with the FORCE ImproPhe code (see Section 3.3.5).

Time Series Analysis/Level 4 Highly Analysis Ready Data+
FORCE time series analysis (FORCE TSA) provides time series preparation and analysis functionality (Figure 10), i.e., extraction of quality-controlled time series with a number of aggregation and interpolation techniques, deriving land surface phenology metrics, and computing change and trend metrics. Complex processing workflows (example: Figure 11) can be executed in a single process. Many outputs of FORCE TSA are referred to as highly analysis ready data plus (hARD+), meaning that generated products can be directly ingested, analyzed, and interpreted in a geographic information system (GIS) to fuel research questions without any further processing.
Remote Sens. 2019, 11, x FOR PEER REVIEW 14 of 21 Figure 10. Processing workflow of FORCE time series analysis (TSA). All products indicated by a USB plug can be output; all products indicated by * can be centered/standardized before output. Figure 11. Land surface phenology-based trend and change analysis for Crete, Greece. The change, aftereffect, trend (CAT) transformation shows both long-term (30+ years) gradual, and abrupt changes. The CAT transform was applied to the annual value of base-level phenometric time series, Figure 10. Processing workflow of FORCE time series analysis (TSA). All products indicated by a USB plug can be output; all products indicated by * can be centered/standardized before output. Figure 10. Processing workflow of FORCE time series analysis (TSA). All products indicated by a USB plug can be output; all products indicated by * can be centered/standardized before output.  Processing is based on a spectral band (Table 1), spectral index (e.g. NDVI, for a full list see [42]), or fractional cover (using linear spectral mixture analysis [48] with custom endmembers). The full time series (limited by temporal filters, see Section 3.3.1) is generated, quality-controlled, and potentially output. The time series may be centered and/or standardized each pixel's mean and/or standard deviation before output as indication for vegetation under-/over-performance. A basic summary of the full time series can be generated, which includes per-pixel mean, standard deviation, minimum, and maximum.
The time series may be interpolated/smoothed at equidistant time steps using linear interpolation, moving average filter, and radial basis function (RBF) filter ensembles. The RBF kernel strengths are adapted by weighting with actual data availability in each kernel [49].
The full time series may be folded (aggregated) by year, month, week, or day-using mean, minimum, or maximum statistics. Folding by year is most common and generates annual time series (e.g., as employed by [50]). If folded by month, week, or day, the observations are pooled into a single virtual year, which gives up to 12, 52, or 365 values per pixel, and can, for e.g., be used to derive the long-term mean seasonality [51].
The interpolated time series may be folded by year with the land surface phenology method, i.e., annual phenometrics are extracted using the Spline Analysis of Time Series (SPLITS) API [52]. Twenty six metrics are available, which describe the timing and value of specific temporal points of interest, amplitudes, integrals, and durations.
A time series analysis can be performed on any of the folded time series. In the case of land surface phenology, the analysis is performed for each phenometric. Currently implemented analyses are linear trend analysis to derive long-term changes [53,54] and an extended change, aftereffect, trend (CAT) transform [55] with full trend parameters for the three parts of the time series (example: Figure 11).

Data Fusion
FORCE ImproPhe (Improving the spatial resolution of land surface Phenology) increases the spatial resolution of coarse continuous fields (example: Figure 12). It was originally developed to increase the spatial resolution of coarse resolution MODIS phenometrics, using Landsat ARD as multi-temporal prediction targets [25]. The fusion intensively uses the information from the local pixel neighborhood at both resolutions, wherein sparser medium resolution data are used to disentangle the land surface phenology by employing textural and spectral homogeneity metrics. ImproPhe is useful (i) in areas or times when Landsat/Sentinel-2 data are not dense enough to derive land surface phenology directly, and (ii) in areas where inter-annual climate variation prevents the strategy of pooling multiple years to increase data density. In general, ImproPhe can be applied to any coarse continuous field, provided a link to spectral-temporal land surface processes exists. spatial resolution of coarse continuous fields (example: Figure 12). It was originally developed to increase the spatial resolution of coarse resolution MODIS phenometrics, using Landsat ARD as multi-temporal prediction targets [25]. The fusion intensively uses the information from the local pixel neighborhood at both resolutions, wherein sparser medium resolution data are used to disentangle the land surface phenology by employing textural and spectral homogeneity metrics. ImproPhe is useful (i) in areas or times when Landsat/Sentinel-2 data are not dense enough to derive land surface phenology directly, and (ii) in areas where inter-annual climate variation prevents the strategy of pooling multiple years to increase data density. In general, ImproPhe can be applied to any coarse continuous field, provided a link to spectral-temporal land surface processes exists. Figure 12. Land surface phenology metrics at coarse resolution (MODIS-derived, 500 m) and with improved spatial resolution at 30 m for an image subset in Brandenburg, Germany. Depicted are (rate of maximum rise, integral of green season, and value of early minimum in RGB). Using the FORCE ImproPhe module, the spatial resolution was enhanced using multi-temporal Landsat and Sentinel-2 A/B prediction targets. FORCE Level 2 ImproPhe (L2IMP) is capable of improving the spatial resolution of lowerresolution ARD using higher-resolution ARD, e.g., refining Landsat images with Sentinel-2 targets (example: Figure 13). Although this function produces Level 2 data (Figure 2), the general higherlevel concept (3.3.1) also applies. The higher-resolution ARD are condensed to seasonal windows, and the ImproPhe code is applied to each lower-resolution ARD dataset. The refined dataset is appended to the original dataset as a separate product; thus two surface reflectance versions are available for each date. The higher-level FORCE modules can digest this data structure, and the user can choose to use the original BOA or the refined product. FORCE Level 2 ImproPhe (L2IMP) is capable of improving the spatial resolution of lower-resolution ARD using higher-resolution ARD, e.g., refining Landsat images with Sentinel-2 targets (example: Figure 13). Although this function produces Level 2 data (Figure 2), the general higher-level concept (3.3.1) also applies. The higher-resolution ARD are condensed to seasonal windows, and the ImproPhe code is applied to each lower-resolution ARD dataset. The refined dataset is appended to the original dataset as a separate product; thus two surface reflectance versions are available for each date. The higher-level FORCE modules can digest this data structure, and the user can choose to use the original BOA or the refined product.

Implementation
FORCE is open software under the terms of the GNU General Public License v. >= 3. The software and user guide can be freely downloaded from http://force.feut.de [56]. The software was developed and tested under Ubuntu Linux operating systems. The software is mostly written in C/C++, with some auxiliary functionality implemented in bash. FORCE builds on several open source tools and libraries such as GDAL [57], the GNU Scientific Library (GSL) [58], OpenMP [59], curl [60], and GNU parallel [19]. Optionally, FORCE can be linked with the SPLITS API [61] to enable deriving phenometrics.

Application
FORCE is increasingly used to support a wide range of scientific to operational applications. Landsat ARD and hARD, as well as Landsat-improved MODIS phenometrics were generated to serve as baseline products for environmental monitoring purposes in Southern Africa [62]. Landsat ARD and higher-level products have been extensively used in the Miombo forest ecosystem in central Angola (i) to evaluate the trade-off between food and timber resulting from forest to agriculture conversion [63], (ii) to assess spatio-temporal changes of smallholder cultivation patterns [64], and (iii) to detect forest areas that are being degraded [50]. Landsat ARD were used to support illuminating the discrepancy between deforestation and its social perception in Zambia [65]. Landsat hARD products were used to map cropping practices on a national scale in Turkey [66]. FORCE has been used in a number of conference contributions, e.g., to characterize Mediterranean land degradation due to overgrazing [67], to highlight the benefit of topographically corrected ARD for improved land cover classification [68], or as an essential building block in prototypic operational forestry applications [69].

Outlook
Several improvements and new features are being developed or are planned to be implemented. FORCE is open source software, and as such, external contributions are welcome. The Level 2 Processing System is currently undergoing a major overhaul to run more efficiently on weak RAM machines (e.g., common High Performance Computing (HPC) setups). Thus, memory requirements are reduced, and multithreading is implemented. Both will allow hybrid parallelization and thus Figure 13. Landsat ARD at original 30 m resolution (top), and Landsat ARD with improved spatial resolution at 10 m (bottom) for image subsets from North Rhine Westphalia, Germany. Using the FORCE L2IMP module, the spatial resolution was enhanced using multi-temporal Sentinel-2 A/B prediction targets.

Implementation
FORCE is open software under the terms of the GNU General Public License v. >= 3. The software and user guide can be freely downloaded from http://force.feut.de [56]. The software was developed and tested under Ubuntu Linux operating systems. The software is mostly written in C/C++, with some auxiliary functionality implemented in bash. FORCE builds on several open source tools and libraries such as GDAL [57], the GNU Scientific Library (GSL) [58], OpenMP [59], curl [60], and GNU parallel [19]. Optionally, FORCE can be linked with the SPLITS API [61] to enable deriving phenometrics.

Application
FORCE is increasingly used to support a wide range of scientific to operational applications. Landsat ARD and hARD, as well as Landsat-improved MODIS phenometrics were generated to serve as baseline products for environmental monitoring purposes in Southern Africa [62]. Landsat ARD and higher-level products have been extensively used in the Miombo forest ecosystem in central Angola (i) to evaluate the trade-off between food and timber resulting from forest to agriculture conversion [63], (ii) to assess spatio-temporal changes of smallholder cultivation patterns [64], and (iii) to detect forest areas that are being degraded [50]. Landsat ARD were used to support illuminating the discrepancy between deforestation and its social perception in Zambia [65]. Landsat hARD products were used to map cropping practices on a national scale in Turkey [66]. FORCE has been used in a number of conference contributions, e.g., to characterize Mediterranean land degradation due to overgrazing [67], to highlight the benefit of topographically corrected ARD for improved land cover classification [68], or as an essential building block in prototypic operational forestry applications [69].

Outlook
Several improvements and new features are being developed or are planned to be implemented. FORCE is open source software, and as such, external contributions are welcome. The Level 2 Processing System is currently undergoing a major overhaul to run more efficiently on weak RAM machines (e.g., common High Performance Computing (HPC) setups). Thus, memory requirements are reduced, and multithreading is implemented. Both will allow hybrid parallelization and thus enable improved flexibility with regards to different hardware architectures. As the Sentinel-2 Global Reference Image for improving geolocation accuracy [70] is still not available, and as ESA has not committed on reprocessing the available archive upon its completion, co-registration functionality is currently being implemented [71]. After having participated in the Atmospheric Correction Inter-comparison Exercise (ACIX) [38], FORCE will undergo further validation and testing in ACIX II, and the accompanying Cloud Masking Inter-comparison Exercise (CMIX) [72]. In order to support coastal aquatic applications [73], the option to output the coastal aerosol band of Landsat 8 and Sentinel-2 will be included. It is planned to implement support for Sentinel-1 data in the higher-level FORCE modules, which will need to be pre-processed similarly to the optical FORCE ARD; a fully integrated Level 2-like preprocessing tool is currently not planned by the developer, but could be contributed by interested third parties. The higher-level FORCE modules are often I/O-bound, thus measures are currently implemented to continuously pre-load data, which will reduce idle CPU times due to sequential reading-processing-writing cycles. Several software utilities are currently developed at the Earth Observation Lab, Humboldt-Universität zu Berlin: The QGIS plugins 'EO Time Series Viewer' [74], 'Raster Time Series Manager' [75], and 'Raster Data Plotting' [76] are being developed for visualizing mass remote sensing data at spatial, temporal, and spectral scales, and thus facilitate exploring data generated by FORCE.