- Increase the amount of data available (EO and non-EO, including fiducial reference data), both to science and downstream services;
- Ensure shorter delivery times;
- Increase collaboration;
- Guarantee quality/reliability of the data; and
- Improve flexibility to respond to changing demands and expectations.
2. Objectives and Benefits
- The data, and specifically the time series of daily and ten-daily data from 1998 until the present, are too large to be downloaded to, and processed on, the users’ premises, at least for the majority of users. Here, the data from the former SPOT-VEGETATION 1 and 2 missions and from PROBA-V at 1 km are treated in the MEP as one time series. The differences between the two datasets will be described in the documentation provided to users on the platform.
- Users benefit from a unique point of access to the mission/sensor data (core, derived, prototyped, test), but also to data from similar missions/sensors (i.e., in support of joint calibration and validation (cal/val) or synergetic products also issued from programmatic collaboration), to meteorological data, and to fiducial reference and campaign data. The MEP becomes the hub of the mission for the broad communities of users/scientists and service providers for searching, accessing, processing, manipulating, and visualizing the data with suitable tools.
- The platform can stimulate collaboration between the users, as we bring together services from various stakeholders on the same platform with a number of tools to support the publishing of these services and to allow others to provide online feedback. A further focus on documentation, knowledge-sharing, and user support complements this and also makes the MEP an information centre for the mission with documentation on all supported products, similar missions, and fiducial reference data, specifications, standards, reports, etc.
- Fast and efficient information exchange is ensured by social networking tools: forums, wikis, blogs, a calendar of events, and new social network capabilities. The MEP becomes the information hub for the mission, where ESA, in the role of facilitator, is reinforced as a mediator and coordinator, setting a stepping stone for future growth.
- The platform goes beyond offering standard products that can only be downloaded as they are produced by the processing centre: in the MEP, applications are offered to visualize and analyse large time series of data. Furthermore, on-demand processing services, including a web-based UI, are developed to deliver user-tailored products. In a next step, we gradually deploy a virtual research environment, i.e., a platform that allows users to develop, debug, and test an application on an infrastructure at VITO with access to the complete data archive. Successful applications from third parties can then be offered as operational on-demand processing services to the user community on the same scalable platform. The platform is open to any processing workflow using any processing algorithm, including open source ‘processing’ code and/or public/collaborative software.
- The MEP strengthens the veracity pillar of the Big Data concept by holding all available information for all the supported products (core, derived, prototyped/test) of the mission and similar ones (synergetic products, joint cal/val, programmatic collaboration between agencies/services), fiducial reference and campaign data. The cal/val need is strong in a MEP context, both when the mission is still in its initial phase of operations and later on for algorithm validation.
- As an exploitation platform (EP) with a focus on open interfaces, we position the PROBA-V mission in an ecosystem of TEPs (Thematic EPs), REPs (Regional EPs), and other MEPs. In the future the PROBA-V MEP can be integrated gradually in a federation of different platforms, including Sentinel Collaborative Ground Segments, in line with the current ESA strategy on the ‘EO Ground Segment Evolution’.
3. Technical Solution
3.1. PROBA-V MEP Platform
3.1.1. Product Distribution Facility
3.1.2. Scalable Processing and Data Analytics Platform–Hadoop Cluster
- The easiest way is to consider the existing application as a black box. The integration is then quite simple: (1) compile the code for CentOS 7; (2) wrap its invocation in a shell script; (3) upload auxiliary data and executables to the platform; (4) create a workflow using the available Graphical User Interface (GUI); and finally (5) run and test the application on the Hadoop cluster. The drawback of this approach is that no parallelisation is done, hence there is no performance gain from running the application on the Hadoop cluster. The benefit, however, is access to the full data archive.
- Because the PROBA-V synthesis products are delivered in tiles of 10° × 10°, the processing can be done per tile while keeping the application as it is. With this approach, throughput increases because tiles can be processed in parallel. In addition to the steps described in the previous point, a simple Spark (Python or Java) script needs to be written to parallelise the processing. This requires only minimal programming knowledge of Python or Java, and the application/algorithm must of course support tile-based processing.
- The ideal, but most complex, approach is to write a new Spark program in which the existing code is reused as much as possible, if it is sufficiently modular. In this case the use of Spark can be maximised and all Hadoop features can be exploited, e.g., distributed machine learning, streaming, in-memory distributed processing, etc. In our experience, this can significantly increase performance, but a good knowledge of Hadoop is a prerequisite and the approach requires considerably more development time. VITO supports interested users in the project and guides them through this learning process.
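The tile-based approach in the second point can be illustrated with a minimal sketch. It is not the platform's actual code: the wrapper script name `run_app.sh`, its `--lon`/`--lat` flags, and the helper names are hypothetical, and a local thread pool stands in for the Hadoop cluster so that the example runs anywhere. On a Spark installation, the same per-tile mapping would typically be expressed as `sc.parallelize(tiles).map(process_tile).collect()`.

```python
import shlex
from concurrent.futures import ThreadPoolExecutor

TILE_SIZE_DEG = 10  # synthesis products are delivered in 10 x 10 degree tiles


def tiles_for_bbox(lon_min, lat_min, lon_max, lat_max, size=TILE_SIZE_DEG):
    """Return the lower-left (lon, lat) corners of all tiles touching a bounding box."""
    lon0 = (lon_min // size) * size  # floor to the tile grid, also for negative values
    lat0 = (lat_min // size) * size
    corners = []
    lat = lat0
    while lat < lat_max:
        lon = lon0
        while lon < lon_max:
            corners.append((int(lon), int(lat)))
            lon += size
        lat += size
    return corners


def build_command(tile, wrapper="./run_app.sh"):
    """Build the command line for one tile (wrapper name and flags are hypothetical)."""
    lon, lat = tile
    return [wrapper, "--lon", str(lon), "--lat", str(lat)]


def process_tile(tile):
    """Stand-in for running the unchanged black-box application on one tile."""
    # A real job would run: subprocess.run(build_command(tile), check=True)
    return tile, shlex.join(build_command(tile))


def run_all(tiles, workers=4):
    """Process the tiles independently and in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_tile, tiles))


if __name__ == "__main__":
    for tile, command in run_all(tiles_for_bbox(2, 49, 12, 52)):
        print(tile, command)
```

Because each tile is handled by an independent call, the application itself does not need to change; only the list of tiles and the mapping over it are new.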
3.1.3. Data Manager
3.1.4. Cloud Middleware–OpenStack Cluster
3.1.5. Web Portal, Including Dashboards
3.1.6. Web Services
3.2. Portfolio of Tools
3.2.1. Geo Viewer
3.2.2. Time Series Viewer
3.2.3. N-Daily Compositor
3.2.4. Cloud Toolbox
3.2.5. Interactive Notebooks
- The Geo Viewer and Time Series Viewer are easy to use and, therefore, targeted to the widest range of users, including those without any programming knowledge or remote sensing expertise. Both applications offer a Web client and a Web service interface.
- The pre-defined on-demand processing applications, of which the N-daily compositor is one example, allow users to process customised EO-products, i.e., tailored to their needs. Since a user needs to be able to use the results, the target user here is the remote sensing expert. If this expert is capable of working with Python or R, the Interactive Notebooks are suited to perform interactive data analytics.
- For researchers and application developers, the Cloud Toolbox is offered as a Virtual Research Environment that provides complex tools and libraries on a Virtual Machine. In this application the complexity is high, but the flexibility is maximal, i.e., the user can even install additional tools or download additional data to the platform. Using the Cloud Toolbox, users can reproduce their own working environment within the MEP, as if they were working on their own premises, while having access to all available data archives.
Calibration and Validation Applications
- The possibility to extract an area of interest (AOI) from the archive, at the polygon level or at the pixel level, and to have easy access to time series of calibrated measurements over this AOI. This is typically the case for geographical areas around cal/val in situ validation sites or corresponding to the location of fiducial reference measurements (FRM). Note that FRM are defined as the suite of independent ground measurements that provide the maximum return on investment (ROI) by delivering to users the required confidence in data products, in the form of independent validation results over the entire end-to-end duration of a satellite mission.
- The possibility to systematically inter-compare Level 2 geophysical products retrieved from satellite measurements with collocated in situ measurements or FRM. This includes the capability to define the settings for match-up collocation, the application of basic quality filtering to the satellite data, the post-processing of the results using statistical tools, and the systematic generation of standardized scatterplots and time series plots with associated statistical metrics (e.g., mean, standard deviation). Note that for some of the geophysical parameters of interest, validation protocols and best practices already exist, which should be followed when comparing satellite data with ground truth; see, for instance, the Leaf Area Index (LAI) best practice protocol for validation, defined in the frame of the CEOS Working Group on cal/val.
- The possibility to plug in new user modules into the core processor, namely new Level 2 (or higher level) processing algorithms or part of the algorithm, or to tune the processing parameters and to systematically compare the new products’ outputs against the “operational” ones and with respect to the in situ data or FRM. This feature is essential during the algorithm improvement cycle, since it provides the required evidence of validity of the new algorithm and it allows measuring the improvements with respect to the current operational processor.
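The match-up step described in the second point can be sketched as follows. This is only an illustration under stated assumptions: the tuple layout, the one-hour default time window, and the function names are hypothetical and do not describe the MEP implementation. Each quality-screened satellite sample is paired with the in situ sample closest in time, and summary metrics are computed on the differences.

```python
import math
from statistics import mean, stdev


def match_up(satellite, in_situ, max_dt_hours=1.0):
    """Pair satellite samples with the in situ sample closest in time.

    satellite: list of (time_hours, value, quality_ok) tuples
    in_situ:   list of (time_hours, value) tuples
    Samples flagged as bad, or without an in situ partner within
    max_dt_hours, are dropped (basic quality filtering).
    """
    pairs = []
    for t_sat, v_sat, quality_ok in satellite:
        if not quality_ok or not in_situ:
            continue
        t_ref, v_ref = min(in_situ, key=lambda sample: abs(sample[0] - t_sat))
        if abs(t_ref - t_sat) <= max_dt_hours:
            pairs.append((v_sat, v_ref))
    return pairs


def summary_statistics(pairs):
    """Return count, mean difference (bias), its standard deviation, and RMSE."""
    diffs = [sat - ref for sat, ref in pairs]
    return {
        "n": len(diffs),
        "bias": mean(diffs),
        "std": stdev(diffs) if len(diffs) > 1 else 0.0,
        "rmse": math.sqrt(mean([d * d for d in diffs])),
    }


if __name__ == "__main__":
    sat = [(10.0, 0.30, True), (12.0, 0.50, True), (14.0, 0.90, False), (20.0, 0.40, True)]
    ref = [(10.5, 0.20), (12.2, 0.40), (14.1, 0.80)]
    print(summary_statistics(match_up(sat, ref)))
```

The resulting pairs are also what a scatterplot of satellite against in situ values would display, so the same list can feed both the plots and the statistics.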
- Validation of surface reflectances. The PROBA-V higher-level products consist of atmospherically corrected TOC reflectances, provided as daily, 5-day (only for 100 m), and 10-day synthesis products. The validation of surface reflectances is a rather complex topic, since no real “ground truth” data are available at the medium spatial resolution scale. Note that, for high-resolution sensors such as Sentinel-2 and Landsat-8, the ESA-CNES RadCalNET prototype network is now defining a set of FRM for the systematic validation of surface reflectances at the decametric scale. One approach for validating surface reflectance at the hectometric scale is to use the AERONET ground-based network, which can be considered an FRM. The process consists of using a Radiative Transfer Model (RTM), such as LibRadtran, together with the atmospheric input data retrieved from the AERONET measurements, i.e., the aerosol optical thickness (AOT) and the column water vapour content (WV), to compute “ground-truth” surface reflectances from TOA satellite measurements. These “ground-truth” surface reflectances can then be compared against the collocated operational TOC products in order to estimate their precision and accuracy. This process should be systematically available within the PROBA-V MEP as a predefined workflow, so that a typical cal/val or expert user can estimate the accuracy of the current processor and can also implement new algorithms (e.g., for aerosol retrieval) and see how they improve the results with respect to the current approach.
- Validation of aerosol retrieval. Another potential cal/val scenario for the PROBA-V MEP is the validation of AOT retrieval, using the AERONET network as FRM. The AOT estimation is performed within the PROBA-V processing chain using a semi-empirical approach, inherited from SPOT-VEGETATION and described in . This semi-empirical approach works only for vegetated areas (NDVI > 0.2) and has not been intensively validated, despite the importance of aerosol correction for land applications. To address this need, one user scenario within the PROBA-V MEP will be to test and validate new methods for AOT estimation, using AERONET as a “ground-truth” reference. The goal is to stimulate advances in AOT retrieval algorithms and to improve the overall quality of PROBA-V operational TOC products. The typical workflow in such a case consists of extracting PROBA-V TOA data over the in situ locations, applying the new aerosol algorithm, and systematically comparing the AOT estimated from the satellite with the one retrieved from the ground-based AERONET measurements. Systematic production of standardized output, such as scatterplots and time series of AOT for the relevant locations, should be considered within the workflow.
- Validation of biophysical variables. The current baseline for the PROBA-V ESA products does not include biophysical variables, which are instead nominally generated within the Copernicus Global Land Service. On the other hand, scientific users within the PROBA-V MEP can propose, develop, test, and validate new algorithms for retrieving specific biophysical products, e.g., leaf area index (LAI) and phenology, from PROBA-V data. This is actually one of the major objectives of the MEP, i.e., to foster data exploitation by enlarging the current product portfolio. These research results could potentially be taken up and qualified into the Copernicus Global Land Service at a later stage. Within this context, the new algorithm needs to be validated against reference ground-truth data in order to assess precision and accuracy. A cal/val scenario is the automatic validation of the new products using a predefined workflow within the PROBA-V MEP and a set of relevant ground-truth input data. The BELMANIP network, which is also used in the frame of the Copernicus Global Land Service, could be used as a validation network for this purpose. The typical workflow scenario for such a use case consists of the automatic extraction of the satellite-derived biophysical variables over the in situ network locations, the comparison of the results with in situ data, and the provision of a standardized output in the form of scatterplots, temporal plots, and associated statistical indicators.
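The NDVI > 0.2 condition quoted above for the semi-empirical AOT approach can be made concrete with a small sketch. It assumes the standard NDVI definition, (NIR − RED) / (NIR + RED), computed from red and near-infrared reflectances; which reflectances the operational chain actually uses is not stated here, and the function names are hypothetical.

```python
NDVI_THRESHOLD = 0.2  # vegetated-area limit quoted in the text


def ndvi(red, nir):
    """Standard NDVI from red and near-infrared reflectance; None if undefined."""
    total = nir + red
    if total == 0:
        return None  # avoid division by zero for empty or masked pixels
    return (nir - red) / total


def aot_retrieval_applicable(red, nir, threshold=NDVI_THRESHOLD):
    """True when the pixel is vegetated enough for the semi-empirical approach."""
    value = ndvi(red, nir)
    return value is not None and value > threshold


if __name__ == "__main__":
    print(aot_retrieval_applicable(0.05, 0.45))  # densely vegetated pixel
    print(aot_retrieval_applicable(0.20, 0.22))  # sparse vegetation or bare soil
```

A validation workflow could apply such a mask before comparing against AERONET, so that pixels outside the method's stated validity range are reported separately.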
Conflicts of Interest
Abbreviations
|AERONET||Aerosol Robotic Network|
|AOI||Area Of Interest|
|AOT||Aerosol Optical Thickness|
|APEX||Airborne Prism Experiment|
|BELMANIP||Benchmark Land Multisite Analysis and Intercomparison of Products|
|BELSPO||Belgian Science Policy|
|Cal/Val||Calibration and Validation|
|CCI||Climate Change Initiative|
|CEOS||Committee on Earth Observation Satellites|
|CHIRPS||Climate Hazards Group InfraRed Precipitation with Station data|
|CNES||Centre National d’Etudes Spatiales (France)|
|ESA||European Space Agency|
|ESE||Enhanced Service Support Environment|
|FAPAR||Fraction of Absorbed Photosynthetically Active Radiation|
|FCOVER||Fraction of Vegetation Cover|
|FRM||Fiducial Reference Measurements|
|GIS||Geographic Information System|
|GSTP||General Support Technology Programme|
|GUI||Graphical User Interface|
|HDF||Hierarchical Data Format|
|HDFS||Hadoop Distributed File System|
|ICT||Information and Communication Technology|
|IDE||Integrated Development Environment|
|IPR||Intellectual Property Rights|
|LAI||Leaf Area Index|
|MEP||Mission Exploitation Platform|
|NASA||National Aeronautics and Space Administration (U.S.A.)|
|NDVI||Normalized Difference Vegetation Index|
|NFS||Network File System|
|OBEOS||Ontology Based EO Search|
|OGC||Open Geospatial Consortium|
|PDF||Product Distribution Facility|
|PROBA-V||PROBA Vegetation–Project for On-Board Autonomy-Vegetation|
|R&D||Research and Development|
|REST||Representational State Transfer|
|ROI||Return on Investment|
|RSS||Research and Service Support|
|RTM||Radiative Transfer Model|
|SNAP||Sentinel Application Platform|
|SOAP||Simple Object Access Protocol|
|TOA||Top of Atmosphere|
|TOC||Top of Canopy|
|VITO||Vlaamse Instelling voor Technologisch Onderzoek|
|WCS||Web Coverage Service|
|WMS||Web Map Service|
|WMTS||Web Map Tile Service|
|WPS||Web Processing Service|
References
- PROBA-V Mission Exploitation Platform. Available online: http://proba-v-mep.esa.int/ (accessed on 29 June 2016).
- PROBA-V Web Site. Available online: http://proba-v.vgt.vito.be (accessed on 29 June 2016).
- Dierckx, W.; Sterckx, S.; Benhadj, I.; Livens, S.; Duhoux, G.; Van Achteren, T.; Francois, M.; Mellab, K.; Saint, G. PROBA-V mission for global vegetation monitoring: Standard products and image quality. Int. J. Remote Sens. 2014, 35, 2589–2614.
- Maisongrande, P.; Duchemin, B.; Dedieu, G. VEGETATION/SPOT: An operational mission for the Earth monitoring; presentation of new standard products. Int. J. Remote Sens. 2004, 25, 9–14.
- OpenSearch. Available online: http://www.opensearch.org/Home (accessed on 29 June 2016).
- VITO Product Distribution Facility. Available online: http://www.vito-eodata.be (accessed on 29 June 2016).
- APEX. Available online: http://www.apex-esa.org/ (accessed on 29 June 2016).
- Hadoop. Available online: http://hadoop.apache.org/ (accessed on 29 June 2016).
- Hortonworks. Available online: http://hortonworks.com/ (accessed on 29 June 2016).
- Spark. Available online: http://spark.apache.org/ (accessed on 29 June 2016).
- CentOS 7. Available online: https://www.centos.org/ (accessed on 29 June 2016).
- NFS. Available online: https://en.wikipedia.org/wiki/Network_File_System (accessed on 29 June 2016).
- HDFS. Available online: https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html (accessed on 29 June 2016).
- OpenStack. Available online: https://www.openstack.org/ (accessed on 29 June 2016).
- ESA RSS Cloud Toolbox. Available online: http://eogrid.esrin.esa.int/cloudtoolbox (accessed on 29 June 2016).
- OGC. Available online: http://www.opengeospatial.org/ (accessed on 29 June 2016).
- WMTS. Available online: http://www.opengeospatial.org/standards/wmts (accessed on 29 June 2016).
- WCS. Available online: http://www.opengeospatial.org/standards/wcs (accessed on 29 June 2016).
- WPS. Available online: http://www.opengeospatial.org/standards/wps (accessed on 29 June 2016).
- Copernicus Global Land Service Web Site. Available online: http://land.copernicus.eu/global (accessed on 29 June 2016).
- QGIS. Available online: http://www.qgis.org (accessed on 29 June 2016).
- GeoTIFF. Available online: https://trac.osgeo.org/geotiff/ (accessed on 29 June 2016).
- HDF-5. Available online: https://www.hdfgroup.org/HDF5/ (accessed on 29 June 2016).
- SNAP Toolbox. Available online: http://step.esa.int/main/toolboxes/snap/ (accessed on 29 June 2016).
- GRASS GIS. Available online: https://grass.osgeo.org/ (accessed on 29 June 2016).
- GDAL. Available online: http://www.gdal.org/ (accessed on 29 June 2016).
- Jupyter Notebook. Available online: http://jupyter.org/ (accessed on 29 June 2016).
- ElasticSearch. Available online: https://www.elastic.co/ (accessed on 29 June 2016).
- Barker, K.; Mazeran, C.; Lerebourg, C.; Bouvet, M.; Antoine, D.; Ondrusek, M.; Zibordi, G.; Lavender, S. MERMAID: The MERis MAtchup in-situ Database. In Proceedings of the 2nd MERIS (A) ATSR Users Workshop, Frascati, Italy, 22–26 September 2008.
- Weiss, M.; Baret, F.; Block, T.; Koetz, B.; Burini, A.; Scholze, B.; Lecharpentier, P.; Brockmann, C.; Fernandes, R.; Plummer, S.; et al. On Line Validation Exercise (OLIVE): A web based service for the validation of medium resolution land products. Application to FAPAR products. Remote Sens. 2014, 6, 4190–4216.
- Blondeau-Patissier, D.; Schroeder, T.; Brando, V.E.; Maier, S.W.; Dekker, A.G.; Phinn, S. ESA-MERIS 10-year mission reveals contrasting phytoplankton bloom dynamics in two tropical regions of Northern Australia. Remote Sens. 2014, 6, 2963–2988.
- Theocharous, E.; Fox, N.P. Fiducial Reference Measurements for Validation of Surface Temperature from Satellites (FRM4STS)-Laboratory Calibration of Participants Radiometers and Blackbodies; Protocol for the FRM4STS LCE (LCE-IP), ESA Contract No. 4000113848_15I-LG, NPL report OFE-D-90A-V1-Iss-1-Ver-1; National Physical Laboratory: Teddington, Middlesex, UK, 2015.
- Morisette, J.T.; Baret, F.; Privette, J.L.; Myneni, R.B.; Nickeson, J.E.; Garrigues, S.; Shabanov, N.V.; Weiss, M.; Fernandes, R.A.; Leblanc, S.G.; et al. Validation of global moderate-resolution LAI products: A framework proposed within the CEOS land product validation subgroup. IEEE Trans. Geosci. Remote Sens. 2006, 44, 1804–1817.
- Main-Knorn, M.; Pflug, B.; Debaecker, V.; Louis, J. Calibration and Validation Plan for the L2A Processor and Products of the Sentinel-2 Mission. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2015, 40, 1249–1255.
- Holben, B.N.; Eck, T.F.; Slutsker, I.; Tanre, D.; Buis, J.P.; Setzer, A.; Vermote, E.; Reagan, J.A.; Kaufman, Y.J.; Nakajima, T.; et al. AERONET—A federated instrument network and data archive for aerosol characterization. Remote Sens. Environ. 1998, 66, 1–16.
- Baret, F.; Morissette, J.T.; Fernandes, R.A.; Champeaux, J.L.; Myneni, R.B.; Chen, J.; Plummer, S.; Weiss, M.; Bacour, C.; Garrigues, S.; et al. Evaluation of the representativeness of networks of sites for the global validation and intercomparison of land biophysical products: Proposition of the CEOS-BELMANIP. IEEE Trans. Geosci. Remote Sens. 2006, 44, 1794–1803.
© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).