Earth’s biosphere is exposed to increasing pressure from environmental changes, many of which are anthropogenic in origin (e.g., global change [1
]). Various associated manifestations can be observed in the land surface domain of the planet. To effectively address related issues, knowledge of environmental indicators at high spatial and temporal resolution is required to perform detailed analyses in different fields [2
]. It is crucial that the acquired information is reliable in order to minimize error propagation in further applications, thus creating the need for substantial validation efforts. For the quantification and prediction of planet-wide processes, it is furthermore necessary to conduct highly frequent and accurate investigations at the continental to global scale over extended periods of time [6
]. The challenge of creating datasets that meet these requirements can be approached efficiently by satellite remote sensing. Large spatial coverage (“wall to wall”) and high revisit frequencies of specific spaceborne instruments promote the derivation of time-series data containing extensive geospatial information. The scientific demand to engage in global-scale environmental questions has consequently facilitated the growth of this field [7
]. This development was possible due to the release of well-characterized and calibrated satellite data and the pressing demand from the user community to have access to consistent and ready-to-use products [9].
Although only a limited number of earth observation (EO) sensors are suitable for the generation of long-term and temporally dense time-series, the increased quantity of easily accessible data sources provides the foundation for numerous satellite-derived geophysical datasets, which can be assigned to the optical (i.e., multispectral, thermal) and radar (radio detection and ranging) domains [10
]. The progressive use of combinations of sensor-specific data additionally enables extended investigation periods along with advanced research capabilities (e.g., [6]).
This review addresses the key issue of the reliability of obtained results. EO time-series are here regarded as temporally dense observations of land surface parameters in a certain time period [10
], which either consist of directly measured electromagnetic radiation (digital numbers, reflectance) or derived variables, such as geophysical (e.g., fraction of absorbed photosynthetically active radiation, FAPAR; land surface temperature, LST), index (e.g., normalized difference vegetation index, NDVI; leaf area index, LAI), thematic (e.g., forest versus non forest), topographic (e.g., slope, roughness), or texture (e.g., compactness, fragmentation) variables [12
]. These variables estimate the prevailing properties of the surface via binary, categorical, or continuous values. In addition to the current state, time-series offer temporal trajectories, revealing the dynamic behavior of a measured instance. Moreover, long-term remote sensing time-series provide access to directional trends, seasonal/systematic movements, and irregular/unsystematic short-term fluctuations [10
]. The main contributors to the respective data repositories are the MODIS (Moderate Resolution Imaging Spectroradiometer) science team, providing a range of extensive products in the fields of atmosphere, ocean, land, and calibration disciplines [13
] as well as the systematically produced global land surface time-series datasets of the Copernicus Global Land Service (e.g., Figure 1
). Besides the direct interpretation of land surface time-series, a wide range of applications rely on consistent large-area datasets as model input [15
]. Such applications can offer a more complete understanding of earth system dynamics [19
]. However, the main criteria for the utilization of remote sensing time-series products in process models and their assimilation in more sophisticated forecast schemes are spatiotemporal continuity and auxiliary uncertainty information [21].
Despite considerable progress and extensive use of recognized EO products in a variety of applications, significant improvements are still required concerning quality, accuracy, and spatiotemporal coverage [6
]. In addition to shortcomings of single satellite-based retrievals, the quality of EO time-series is influenced by spatial patterns and temporal dynamics determined by sensor properties, observational distortions, and processing algorithms [21
]. The potential impacts can be critical, as shown by a spurious global vegetation browning trend caused by sensor degradation that had not yet been addressed in the MODIS Collection 5 (C5) geoinformation product collection and was corrected for in the Collection 6 (C6) data archives [29
]. If error sources are not properly handled or remain undetected, consequences on downstream operations can be severe [30
]. To forestall spurious assertions, efforts have been made to introduce certain standards by which datasets of geophysical variables are generated and validated. The World Meteorological Organization [31
] states different levels of user requirements for observations according to variables and application. The Land Product Validation Subgroup [32
] of the Committee on Earth Observation Satellites (CEOS) provides a validation hierarchy covering five validation stages classified depending on the set of locations, covered time periods, spatiotemporal consistency, publication, and systematic updates. Furthermore, a web-based service for the validation of medium-resolution land products (On Line Validation Exercise, OLIVE) has also been made available [9
]. Distinct monitoring principles and observation requirements have also been specified by the Global Climate Observing System [2
], as it relies on the operation of satellite systems for climate monitoring.
This underlines the importance of validation and error quantification for results obtained from EO data, as these provide a quantitative assessment of reliability and convey critical information to end users. For geospatial data, this information is especially relevant at the pixel level [33].
A suite of techniques for determining the quality of geospatial information has been established for remote sensing products. However, knowledge about the strengths and weaknesses in terms of assessed accuracy, quality, and overall agreement of EO-based time-series is often scarce [36
]. With continued use of unvalidated or insufficiently validated data, error propagation into downstream analyses bears the risk of substantial error amplification [37
]. Furthermore, researchers conducting validation efforts need to be aware of the implications and constraints of the indicator values they use [38
]. As processing methods become more complex, informative validation metrics are key for providing the right details about the characteristics and overall quality of products, thus serving the interests of a broad user community [39
]. It is only with confidence in the generated information that inferences can be substantiated and eventually reported to scientific, management, or policy support entities [40].
The goal of this review was to provide an overview of the validation methods in Science Citation Index (SCI)-listed studies since the year 2000 that assess the quality of large-area EO time-series of land surface variables (Figure 2
). Time-series were considered that show a continental to global coverage (at least a 10 million km² region of interest) and feature a temporal resolution of at least one month. This excluded multi-annual land use/land cover (LULC) datasets, which typically feature an annual or coarser temporal resolution according to the definition of the Food and Agriculture Organization (FAO) [41
]. However, comprehensive overview papers on the topic of LULC validation are available for the interested reader [38
]. Furthermore, studies showing an adequate spatiotemporal extent and resolution but concentrating only on a small number of local sites and/or specific points in time for validation were not considered in this review, due to the lack of coherence between product and validation. The criteria applied in this review purposely targeted datasets for which classical good-practice validation methods for spatial products [45
] are unfavorable. In addition to reviewing validation methods, we also addressed quantitative questions regarding the involved data as well as thematic and spatial foci, to present the bigger picture associated with validation.
The applied search criteria (Table A2 in Appendix A
) identified 89 papers and, subsequently, 91 data products for evaluation. Some studies presented more than one product with associated validation (e.g., [6
]) while others covered the same product in different publications (e.g., [34]).
2. Theoretical Background
Validation is defined by the Committee on Earth Observation Satellites Working Group on Calibration and Validation [46
] as the process of assessing the quality of data products derived from system outputs by independent means, and must be distinguished from calibration, which provides quantitative information on system responses to a known and controlled input. To properly communicate validation results, coherent terminology is necessary.
Definitions of basic terms associated with validation (accuracy, trueness, bias, error, and precision) are given by the International Organization for Standardization [47
] and are illustrated in Figure 3
. Here, accuracy is defined as the closeness of agreement between a test result and an accepted reference value. When the accuracy of a set of test results is determined, a combination of random components (errors) and a common systematic error component (bias) is involved. The term accuracy is often stated when trueness would be more appropriate, since the general term accuracy refers to both trueness and precision. Trueness is the closeness of agreement between the average value obtained from a large series of test results and an accepted reference value. Trueness can be expressed in terms of bias, referring to the total systematic error. The concept of (random) error, on the other hand, refers to the difference between a single measurement and the true value. Precision builds on random errors; it depends only on their distribution and can be expressed as the standard deviation of test results. As an applied example, Verger et al. [48
] evaluated the overall performance of LAI products by decomposing RMSE (root mean square error) into accuracy and precision components, where accuracy is presented as the mean value of the differences between products and ground measurements and precision as the standard deviation of estimates around the best linear fit.
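As a minimal illustration of this decomposition (the data and variable names here are synthetic and not taken from the cited study), the bias term captures accuracy and the scatter around a linear fit captures precision:

```python
import numpy as np

def decompose_rmse(product, reference):
    """Split the product-reference RMSE into an accuracy (bias) component
    and a precision component (scatter around the best linear fit)."""
    diff = product - reference
    rmse = np.sqrt(np.mean(diff ** 2))
    bias = np.mean(diff)                                  # accuracy component
    slope, intercept = np.polyfit(reference, product, 1)  # best linear fit
    precision = np.std(product - (slope * reference + intercept))
    return rmse, bias, precision

# toy LAI-like values: a slightly biased, noisy product against ground data
rng = np.random.default_rng(42)
reference = rng.uniform(0.5, 6.0, 200)
product = reference + 0.3 + rng.normal(0.0, 0.4, reference.size)
print(decompose_rmse(product, reference))                 # rmse ~ 0.5, bias ~ 0.3
```

Note that for the simpler bias/scatter split (without the linear fit), RMSE² = bias² + σ² holds exactly, so both components are directly comparable to the overall error.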
Uncertainty is referred to as the potential deficiency in any phase or activity of the modeling process that is due to a lack of knowledge [49
]. It is reported alongside result values and specifies the range in which the true value is asserted to be found. Uncertainty addresses both systematic and random errors (Figure 3
). Within the domain of spatial data, the correct use of uncertainty concepts is furthermore subject to the concreteness of the object definition (e.g., category definition in classification). For well-defined objects, uncertainty originates in errors and has a probabilistic nature, whereas poorly defined objects introduce vagueness or ambiguity, which can be approached by fuzzy set theory or lead to discord or non-specificity, respectively [50].
2.1. Main Error Sources of Remote Sensing Time-Series Products
Measurements of physical quantities are subject to uncertainty. Factors that affect measurements in the field of remote sensing cannot be held constant, resulting in variability of repeated measurements. Contributions to uncertainty in satellite estimations can stem from manifold factors (e.g., retrieval errors, sampling errors, and inadequate calibration/validation data [51
], Figure 4).
Satellite sensors need to convert the electromagnetic signal from earth into voltages and digital numbers in order to obtain reflectance or radiance values. For a reliable conversion, constant calibration of the sensors is essential [29
] to counteract sensor-related error sources (e.g., sensor degradation, differing sensor sensitivities for specific bandwidths). For this purpose, modern instruments, such as MODIS, employ onboard calibration systems. Furthermore, deviations due to variable sun–sensor geometry and geolocation errors have to be attended to. Geolocation uncertainties can be an issue due to varying projection systems, target shift, or different point spread functions [53
]. Additionally, orbital drift has to be taken into account, as, for example, the orbital drift of each National Oceanic and Atmospheric Administration (NOAA) satellite during its lifetime induces an artificial cooling effect in Advanced Very High Resolution Radiometer (AVHRR) LST products [54
]. The next step is the correction of signal alterations induced on the path from the target to the sensor. Outgoing radiance consists of components affected by environmental influences and atmospheric absorption and scattering effects. To obtain surface reflectance information, atmospheric correction is commonly applied, along with topographic correction and bidirectional reflectance distribution function (BRDF) corrections. For passive microwave observations, man-made radio frequency interference can also contaminate signals [55
]. Once the corrected data are accessible, further error sources are introduced in subsequent processing. Model uncertainties mainly relate to simplified approximations of natural variability [56
] and to data gap handling. Temporal inconsistency in optical remote sensing data is mainly caused by data gaps due to cloud cover. Cloud and cloud shadow contamination degrades data quality through amplified variance and abnormal error distributions in the data [57
]. Depending on the object of investigation, snow cover and coarse temporal resolution can also cause data inconsistency. To interpolate data gaps, several reconstruction models (e.g., harmonic analysis, double logistic, asymmetric Gaussian or Whittaker smoother, Savitzky–Golay filter) have been established [58].
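As an illustrative sketch of one of the reconstruction models named above (the parameter choices are arbitrary and would need tuning for real data), cloud gaps can be interpolated and the filled series smoothed with a Savitzky–Golay filter:

```python
import numpy as np
from scipy.signal import savgol_filter

# synthetic one-year NDVI trajectory at 8-day steps; NaN marks cloud gaps
t = np.arange(46)
rng = np.random.default_rng(0)
ndvi = 0.3 + 0.35 * np.sin((t - 10) / 46.0 * 2.0 * np.pi)
ndvi += rng.normal(0.0, 0.02, t.size)
ndvi[[4, 5, 12, 20, 21, 22, 33]] = np.nan

# step 1: fill the gaps by linear interpolation over valid observations
valid = ~np.isnan(ndvi)
filled = np.interp(t, t[valid], ndvi[valid])
# step 2: smooth the gap-filled series (window and order are tuning choices)
smoothed = savgol_filter(filled, window_length=9, polyorder=2)
```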
Gap-corrected data, however, can have implications for downstream applications, as the incorporated uncertainty is carried on to subsequent models [28
]. A different approach to obtaining consistent time-series is value compositing (e.g., maximum value), where multiple images within a preset period are processed to create representative, cloud-free datasets with the least atmospheric attenuation and viewing geometry effects [59
]. This can reduce the impact of missing data and unexpected day-to-day variations. However, the application of filters and compositing comes with the disadvantage of information loss and a reduced effective temporal resolution [48
]. For regions with a high frequency of gaps (e.g., tropical areas), even composite products often contain extensive gaps, which has become increasingly problematic in areas where no alternative source of geospatial datasets is available [28].
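A maximum value composite, as mentioned above, can be sketched in a few lines; the ten-day period and the NaN convention for cloudy observations are assumptions of this example:

```python
import numpy as np

def max_value_composite(series, period):
    """Per-pixel maximum of all observations within each compositing period.
    series: (time, rows, cols) array with NaN marking cloudy/missing data."""
    n = (series.shape[0] // period) * period          # drop an incomplete tail
    blocks = series[:n].reshape(-1, period, *series.shape[1:])
    return np.nanmax(blocks, axis=1)                  # all-NaN blocks stay NaN

rng = np.random.default_rng(1)
daily = rng.uniform(0.0, 0.9, (30, 4, 4))             # 30 daily NDVI-like scenes
daily[rng.random(daily.shape) < 0.3] = np.nan         # simulated cloud cover
decadal = max_value_composite(daily, 10)              # three 10-day composites
```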
To handle ambiguous observations, quality control (QC) is used to provide information on the retrieval quality by flagging data acquisitions. This procedure adds quality indicators to the original observation without modifying or removing it [60
]. As a result, the handling of systematic errors related to shortcomings of retrieval algorithms and inaccurate prior knowledge [57
] can be improved. Further accuracy-limiting factors are of a technological nature, concerning the sensor’s spatial (geometric), radiometric, spectral, and temporal resolution [61
], which can be summarized as sampling errors. These errors become effective if information is assessed at a level of detail that cannot be properly sustained by the capabilities of the data acquisition mechanism.
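As a sketch of the QC flagging described above: quality layers are typically bit-encoded, so usable observations are selected with bitwise masks. The two-bit layout below is a hypothetical MODIS-style convention; the actual bit assignment must be taken from the respective product user guide:

```python
import numpy as np

# hypothetical layout: bits 0-1 encode overall retrieval quality
# (0 = good, 1 = marginal, 2 = snow/ice, 3 = cloudy)
qc = np.array([[0b00, 0b01],
               [0b11, 0b10]], dtype=np.uint16)
value = np.array([[0.61, 0.55],
                  [0.12, 0.40]])

overall = qc & 0b11                                # extract bits 0-1
usable = np.where(overall == 0, value, np.nan)     # mask, rather than alter, the data
```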
Although an emphasis lies on the correction of known error sources, complete compensation cannot be achieved. Myneni et al. [62
] have shown, for instance, that in LAI modeling even properly calibrated surface reflectance information obtained under clear-sky conditions is still subject to independent variations due to measurement geometry, surface characteristics (e.g., snow, soil characteristics), and canopy cover. Eventually, results have to be accompanied by validation information to ensure proper interpretation. Furthermore, time-series need to show temporal stability in their accuracy, since changing data quality over time would give users justified concern that temporal trends observed in the data are ambiguous [63].
2.2. General Considerations in Time-Series Validation
Validation provides the means to assess implied uncertainties by analytical comparison to reference data. Uncertainty can be differentiated into physical and theoretical uncertainties [64
]. Physical uncertainties represent the actual departure of product values from reference values, which are obtained through independent validation studies [32
]. Theoretical uncertainties emerge from the inverse procedure, as they relate mainly to uncertainties in the input data along with model simplifications, and are usually estimated by individual product science teams to accompany the product with quality information (e.g., QC information) [21].
Primarily, methods can be categorized into direct and indirect validation. Direct validation is based on the comparison with independent data sources, which are representative of the target or true value, enabling the absolute quantification of uncertainties (physical uncertainties). For geospatial data, direct validation is commonly accomplished at the pixel scale through the comparison of a product with independently acquired data, observing the same ground parameter coherent in time and location [66
]. The required truthfulness of reference data, however, is not always met for the validation of geophysical measurement systems, as the data considered often include inherent measurement errors, biasing calibration and validation information [67
]. Next to the quality of reference data, the amount of obtainable reference data is the main limiting factor for extensive spatial coverage of validation efforts [68
]. Raw in situ data is typically available as point measurements, leading to biased assessments when compared to raster data [69
]. As a consequence of the coarser spatial resolution of satellite sensors, representative areas of ground observations have to be upscaled to approximate the target cell size. This principle is frequently applied to higher resolution satellite images to represent an estimation of the in situ state on a spatial grid and as an intermediate step for comparisons to data of even coarser resolution [65
]. This two-stage process is subject to uncertainty itself, as the accuracy of ground-based reference maps depends on errors in field measurements but also on uncertainties of fine-resolution satellite data, and on sampling and spatial scaling errors [72].
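The aggregation step of the two-stage upscaling described above can be sketched as a simple block average, assuming aligned grids and a resolution ratio that divides the map dimensions evenly:

```python
import numpy as np

def block_average(fine, factor):
    """Aggregate a fine-resolution reference map to a coarser grid by
    averaging non-overlapping factor x factor blocks."""
    rows, cols = fine.shape
    blocks = fine.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

fine_map = np.random.default_rng(3).uniform(0.0, 6.0, (300, 300))  # e.g., 30 m LAI
coarse_map = block_average(fine_map, 10)                           # ~300 m cells
```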
On the other hand, for small-scale investigations based on higher resolution data, a direct comparison to interpolated in situ data time-series can be more viable [73
]. Indirect validation is less affected by these constraints, using data with similar characteristics for an intercomparison of products. This procedure allows the evaluation of gross differences and possible insights into the reasons for variations [8
] by considering the consistency of a given product relative to related data at a comparable spatial scale. The ease of application promotes indirect validation in the field of remote sensing, especially with regard to the temporal aspect of time-series, although absolute uncertainty is not achievable. Temporal stability of validation outcomes is, moreover, a major aspect required by the Committee on Earth Observation Satellites’ Land Product Validation Subgroup [32
]. To provide spatially and temporally consistent reference data, a vast network of calibration and validation sites has been established. Through the provision of independent measurements of relatable values over time, requirements for long-term direct validation can be met [74].
6. Discussion on the Challenges and Implications of Validation
In this review, studies were investigated that cover large-area and temporally dense remote sensing time-series and that feature validation efforts for their outcomes. In terms of the applicability of validation approaches, the scope of this review represents challenging circumstances regarding the spatiotemporal availability of adequate validation data. The validation approach favored under these circumstances is the intercomparison of products (indirect validation). A main strength of this procedure is the similar spatiotemporal extent that related remote sensing products usually share, making it comparatively easy to evaluate consistencies/differences coherently in space and time over large areas representative of various global conditions [35
]. On the other hand, it lacks a link to quantitative reference data [68
], implying the inability to investigate absolute errors with respect to the true state. The same limitation applies to triple collocation (TC), which can be used as a complementary method to estimate theoretical uncertainties. A further challenge for the application of TC is the incorporation of different kinds of datasets (in situ measurements, satellite observations, and model fields), as they refer to different scales. When using satellite remote sensing triplets, the requirement of uncorrelated errors becomes problematic [21
]. However, the inclusion of a fourth independent dataset can help meet this requirement [148
]. Implementations such as extended triple collocation [67
] are able to yield the correlation coefficient of the measurement system with respect to the unknown truth, thus not requiring a reference dataset.
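A compact sketch of extended triple collocation under its standard assumptions (a linear error model and mutually uncorrelated, signal-independent errors) is given below; the toy triplet mimics three soil-moisture-like systems observing the same truth:

```python
import numpy as np

def extended_triple_collocation(x, y, z):
    """For three collocated series, estimate the error standard deviation
    of each system and its correlation with the unknown truth."""
    q = np.cov(np.vstack([x, y, z]))
    err_var = np.array([q[0, 0] - q[0, 1] * q[0, 2] / q[1, 2],
                        q[1, 1] - q[0, 1] * q[1, 2] / q[0, 2],
                        q[2, 2] - q[0, 2] * q[1, 2] / q[0, 1]])
    rho = np.sqrt([q[0, 1] * q[0, 2] / (q[0, 0] * q[1, 2]),
                   q[0, 1] * q[1, 2] / (q[1, 1] * q[0, 2]),
                   q[0, 2] * q[1, 2] / (q[2, 2] * q[0, 1])])
    return np.sqrt(err_var), rho

rng = np.random.default_rng(7)
truth = rng.normal(0.25, 0.05, 5000)
x = truth + rng.normal(0.0, 0.02, truth.size)
y = truth + rng.normal(0.0, 0.04, truth.size)
z = truth + rng.normal(0.0, 0.03, truth.size)
print(extended_triple_collocation(x, y, z))   # recovers ~0.02/0.04/0.03
```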
To address the requirements of global change communities (e.g., [2
]), physical uncertainties obtained from direct validation are needed in order to derive total uncertainty [21
]. To conduct validation of remote sensing data according to recognized criteria [46
], independent data (e.g., in situ data) are essential. Validation approaches that follow this guideline can be found in the comparison and accuracy assessment categories in this review. For the categorization of these validation methods, the main defining criterion is the use of reference data, which is expected to approximate the true state of a measured variable with considerably smaller uncertainty than the assessed product. The main data sources that meet this requirement are ground observations. Yet, in situ data contains measurement errors as well. In certain cases, uncertainties account for 10% to 30% (evapotranspiration estimates by eddy covariance flux towers [20
]). However, they can be considered insignificant compared to those of most major remote sensing retrievals (e.g., soil moisture [138
]). Furthermore, the implementation of confusion matrices in accuracy assessments is limited by the preciseness of the class definition [50].
Another common source of reference data is higher resolution remote sensing imagery; this approach is recommended, for instance, by the CEOS CalVal protocol for burned area products [76
]. Such data are subject to the established errors in the field of remote sensing as well. Nevertheless, with adequate means for the derivation of reference information, they can represent independent, detailed, and accurate validation data [59].
Alternatively, networks of validation sites could cover the need for extensive reference datasets. Several studies refer to validation sites as the origin of their validation data, with common examples like BEnchmark Land Multisite ANalysis and Intercomparison of Products (BELMANIP), VAlidation of Land European Remote sensing Instruments (VALERI), and FLUXNET sites. Due to several well-distributed networks, it is already possible to match the spatiotemporal requirements to validate time-series of certain variables by the use of reference data (e.g., LAI, FAPAR [150
]). However, the capabilities of major validation networks for covering the variety of geophysical variables are limited. This consequently constrains the investigation spectrum to products that target identical or explicitly correlated variables. The combined use of multiple networks in global studies (e.g., soil moisture [60
]) is further complicated by a lack of standardization and common protocols. A major aspect regarding the use of reference data is the spatial and/or temporal scale difference relative to time-series products, as it is challenging for field observations to be representative of coarse raster data cells (e.g., a MODIS pixel [94
]). Additionally, the location of coarse pixels and ground data may not match because of geolocation uncertainties [152
]. Methods to overcome scaling issues have been demonstrated by several studies but, ultimately, sampling errors are not completely avoidable.
Other studies are focused on the temporal aspect of validation, shifting the emphasis from quantitative investigations to distinctive temporal behaviors of time-series. This approach is particularly vulnerable to continuity distortions (persistent cloud cover, high levels of atmospheric aerosols), weak seasonality of investigated variables, and insufficient temporal sampling of remote sensing sensors to capture rapid transitions (e.g., phenology studies [85
]). This kind of assessment merely reveals validity in terms of timing, as no absolute uncertainty in the form of measured quantities is targeted. On the other hand, valuable information can be identified when validation using ground measurements fails to capture potential issues [121].
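Such a timing-oriented evaluation can be illustrated with a simple start-of-season estimate based on a relative-threshold crossing (one common, but not universal, definition); the ground-observed date here is purely illustrative:

```python
import numpy as np

def start_of_season(doy, ndvi, threshold=0.5):
    """First day of year at which the min-max scaled NDVI rises above
    a relative threshold (a simple green-up timing definition)."""
    scaled = (ndvi - ndvi.min()) / (ndvi.max() - ndvi.min())
    above = np.where(scaled >= threshold)[0]
    return int(doy[above[0]]) if above.size else None

doy = np.arange(1, 366, 8)                              # 8-day composite dates
ndvi = 0.25 + 0.4 * np.exp(-((doy - 190) / 60.0) ** 2)  # idealized season
sos_product = start_of_season(doy, ndvi)                # -> day 145
sos_ground = 140                                        # e.g., observed green-up
timing_error = sos_product - sos_ground                 # validity in terms of timing
```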
Although advances in the use of extensive calibration and validation networks for specific geophysical variables have been made, the central issue of validation remains the acquisition of appropriate reference data. There is still a lack of ground measurements representative of sufficiently long time periods to fully utilize quantitative assessments [6
]. Also, the scarcity of in situ measurements in specific regions makes the evaluation of products challenging (e.g., areas north of 60° [25
]). Although more demanding to implement, only direct validation with reference data enables the assessment of physical uncertainties, and it may be argued that only such methods constitute actual validation in the field of remote sensing. In this review, however, a variety of means to assess the validity of remote sensing time-series are considered relevant methods.
Internal validation methods present an opportunity here: their validation data are inherent to the product or can be generated by simulations based on available information. Simulated time-series provide an additional approach, particularly for testing temporal analysis methods in a controlled environment [154
]. As with intercomparison, internal methods are not able to provide a total measure of uncertainty but enable an internal estimation of model/algorithm performance. Nevertheless, a generic virtue of product-internal validation data is the complete coverage of the product extent, both in space and time.
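A controlled, simulation-based check of this kind can be sketched as follows (trend magnitude and noise level are arbitrary choices): impose a known trend, add observation noise, and verify how well the analysis method recovers it:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
years = np.arange(2000, 2020)
true_slope = 0.004                                   # known, imposed trend
recovered = []
for _ in range(500):                                 # Monte Carlo replicates
    series = 0.3 + true_slope * (years - 2000)
    series = series + rng.normal(0.0, 0.03, years.size)
    slope, *_ = stats.linregress(years, series)
    recovered.append(slope)

bias = np.mean(recovered) - true_slope               # estimator bias
spread = np.std(recovered)                           # estimator precision
```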
Rather than using a single validation method, a strong inclination towards employing more than one approach is observed. Combining methods improves the information content of the validation results. However, diverging outcomes have to be expected when executing several validation methods, as shown by D’Odorico et al. [21
], where three diverse validation methods (comparison, intercomparison, internal validation) resulted in different validation outcomes. Additional discrepancies in validation results can be introduced by different sampling strategies for reference data. Various studies placed an emphasis on distributing their validation efforts across distinct locations according to scale, land cover, biome type, or climatic setting in order to account for expected variability over the globe [54
]. As pointed out by Cescatti et al. [81
] in their investigation of surface albedo, a weakness of in situ sites can be spatial inhomogeneity. This further highlights possible discrepancies between validation outcomes depending on the applied approach and demonstrates that, in practice, a validation approach might not produce universally best results but should address product features relevant to the final product user [130
]. An overview of general considerations regarding the validation method categories in this review is given in Table 1.
Although validation is an essential component of any earth observation product, some studies still forgo validation efforts. In the reviewing process for this paper, several publications failed to meet the requirement of a distinct description of validation efforts and corresponding results. This draws attention to the controversial question of the need for validation, especially for products that are generated from already validated time-series and where error propagation is marginal due to simple processing of the input.
7. Conclusions and Outlook
This review presented an overview of the validation approaches of studies that utilize large-area, temporally dense earth observation time-series of land surface variables since the year 2000. Along with validation strategies, quantitative information on the data basis, thematic foci, and spatial distribution was also investigated. Encountered validation methods were categorized according to the utilized type of validation data. The resulting classes incorporate the comparison to reference data, including accuracy assessments, the intercomparison of related data products, product-internal validation, and the temporal evaluation of time-series. The scope of this review led to the following key findings:
The main data sources of studies are optical sensors (78.5%), with MODIS, AVHRR, and SPOTVGT as major contributors.
The dominant thematic focus is on vegetation-orientated variables (71.8%), mainly represented by time-series of NDVI, LAI, and FAPAR.
An emphasis on a global coverage of studies is prevalent (73.6%).
The main sources of validation data are related remote sensing products (33.7%) and in situ data (28.1%).
For the expression of validation outcomes, conventional error metrics or correlation-based metrics (RMSE, R²) are mostly calculated, along with frequent graphical presentation of temporal profiles, correlation plots, and map comparisons.
The most commonly used validation method is the intercomparison of products (indirect validation, 38.5%), followed by the comparison to reference data (direct validation, 37.3%). The majority of studies used more than one validation method (65.9%).
A general increase in relevant studies published per year, along with a minor diversification of the corresponding validation methods, could be observed.
Challenges comprise a lack of adequate reference data, which consequently promotes other methods.
The issue of matching product and validation data in a reasonable spatiotemporal fashion is seen throughout studies that consider external sources for validation.
The use of validation methods that are not bound to external validation data is limited (15.4%), as is validation by time-series-derived points in time in temporal evaluations (8.9%).
Validation by accuracy assessment is rarely applied (4.7%) within the scope of this review, which excludes land use/land cover products.
Although the assessment of physical uncertainty referring to the true state of a measured variable (direct validation) is demanded by major EO-related organizations, indirect validation is frequently implemented.
For long-term and large-area investigations, remote sensing offers unique advantages. A major constraining issue, however, is the need for reliable validation incorporating adequate validation data. Challenges and potential arise for the integration of remote sensing and ground observations [83
], facilitating the application of traditional validation methods. To resolve the central issue of the lack of adequate reference data, new sources of validation data (e.g., camera-derived vegetation indices from near-surface remote sensing [153
], citizen science data [155
]) could be considered. So far, one strategy has been the standardization of validation through the introduction of validation level hierarchies and observation requirements for measurement uncertainty for atmosphere-, land-, and ocean-related variables [2
]. The implied guidelines include the consistent use of reference data for validation. This development is sustained by extensive calibration and validation networks, providing reliable reference data at high spatiotemporal resolution. With progressive capabilities in the acquisition of EO data (e.g., CubeSats [157
]), such networks could become even more relevant.
On the other hand, a growing number of applications of remote sensing data in sophisticated modelling efforts require auxiliary uncertainty estimates alongside the product at high spatial and temporal resolution. Conventional validation methods that are based on reference data, and even product intercomparisons, are not able to properly fulfill this requirement. Also, differences in absolute values and inconsistencies in uncertainty representation still render some datasets inadequate for modelling or data fusion (e.g., FAPAR products [21
]). A greater focus might be given to innovative internal methods, which are able to provide uncertainty information on the spot. With advances in novel techniques, this could be achieved in an automated fashion at the product generation stage, similar to the production of QC information. Some studies in the scope of this review have demonstrated approaches in line with this concept. However, publications that exhibit novel validation techniques are more typical of investigations of smaller regions of interest, carried out in the development stage of a product (e.g., [158
]). Nevertheless, internal methods have the potential to remedy the issue of complete coverage of results with uncertainty information in space and time, but they are not able to provide uncertainty measures referenced to the true state of a geophysical variable. Consequently, this concept lacks the physical uncertainties of a direct validation approach. A compromise could be a strategy of rigorous validation against reference data in the development phase of a product and subsequent internal validation when the product is used in downstream applications that require more detailed spatiotemporal quality information. Further research is needed to evaluate the capabilities and potential of novel methods, along with more extensive validation work in general, to improve the understanding of remote sensing results, clearing the path for continuous high-quality data products.