Semantic Integration of Raster Data for Earth Observation on Territorial Units †
- A generic semantic model that allows the semantic and homogeneous description of spatio-temporal data to qualify predefined areas and to keep track of their provenance. This model is extendable to handle any kind of observed EO property and has been applied to several use cases.
- A configurable and reproducible semantic Extract, Transform and Load (ETL) process (the process is encapsulated as a docker image, accessible on https://hub.docker.com/r/h2020candela/triplification, accessed date: 23 December 2021), based on the proposed model. We have defined a set of transformation functions to populate the semantic model with data and obtain a homogeneous semantic data representation. One of the features of this process is to extract and aggregate data from rasters together with data from other sources. Aggregation takes place on territorial units’ areas.
- An EO Sentinel ecosystem for exploiting Sentinel images: it can represent and calculate different properties from Sentinel images (e.g., NDVI, change) and import raster datasets from external sources (e.g., land cover data).
2. Related Work
2.1. Semantic ETL for EO Data Integration
2.2. Processing of Raster Data in a Semantic Framework
3. Semantic Model
3.1. Territorial Observation Model (tom)
3.2. Sentinel Images Metadata (eom)
3.3. EO Analysis Model (eoam)
4. EO Data Analysis Process
- Satellite image processing and analysis: This task consists of processing and analyzing satellite images coming from a DIAS (Data and Information Access Services) platform; in our project, CreoDias (https://creodias.eu/, accessed date: 23 December 2021). Analyses range from simple ones such as NDVI calculation to more sophisticated ones such as change detection on time series or land cover annotation. The task generates raster files as output.
- Semantic data integration: The task extracts data from the raster files using vector files that contain the territorial units to be observed. Vector sources can come from open data repositories or from our semantic database through semantic search.
- Semantic search: The task aims to analyze the integrated data by querying the semantic database. The SPARQL query results can be used to perform once again the first two tasks for further analysis, either as parameters for guiding the process or as input data. The results can also be used by specialized GIS applications for detailed analyses of the whole integrated data.
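The semantic data integration task above can be pictured as a zonal aggregation: pixels of a raster are attributed to a territorial unit and summarized per unit. The following is a minimal sketch under stated assumptions, not the project's implementation: the toy grid georeferencing (pixel centres at half-integer coordinates), the function names `point_in_polygon` and `zonal_class_shares`, and the sample data are all hypothetical. The code 15 follows the CESBIO vineyard example given later in the text; 11 is an arbitrary second code.

```python
from collections import Counter


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon (list of (x, y) vertices)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def zonal_class_shares(raster, polygon):
    """Share of each pixel value among pixels whose centre falls inside the polygon.

    Assumption: pixel (row, col) has its centre at (col + 0.5, row + 0.5).
    """
    counts = Counter()
    for row, values in enumerate(raster):
        for col, value in enumerate(values):
            if point_in_polygon(col + 0.5, row + 0.5, polygon):
                counts[value] += 1
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()} if total else {}


# Toy categorical raster and one rectangular territorial unit (hypothetical data).
raster = [
    [15, 15, 11, 11],
    [15, 15, 11, 11],
    [15, 15, 15, 11],
]
parcel = [(0, 0), (3, 0), (3, 2), (0, 2)]  # covers columns 0-2 of rows 0-1
shares = zonal_class_shares(raster, parcel)
print(shares)
```

Attributing a pixel to a unit by its centre is only one possible rule; a production pipeline would more likely rely on a geospatial library and might weight pixels by their overlap with the unit.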
4.2. Semantic ETL Process
4.2.1. Data Extraction from Raster and Vector Files
- Categorical rasters (as for land cover): in this kind of raster, a pixel value is a code that represents a class. Thus, additional information is needed to decode it. For example, in a CESBIO raster, pixel values encode a type of landcover; the value 15 is decoded as a vineyard;
- Continuous rasters (as for NDVI or change indicators): a pixel value is processed to be automatically classified into a level (or class) such as Very low, Low, Middle, High, or Very high, for instance.
- Period or date of validity of the raster data: this information will be represented as a time:TemporalEntity associated with each tom:GeoFeatureObservationCollection (one per territorial unit) using the sosa:phenomenonTime property.
- Type of raster: this information, e.g., specific land cover, NDVI, or change, represented as an instance of tom:GeoFeatureObservablePropertyType, will be linked to each tom:GeoFeatureObservationCollection.
- ID of the satellite images used to generate the raster. This ID is represented as an instance of eom:Product.
- Agent who generated the raster, in other words, who performed the EO analysis process. This information is converted to an instance of prov-o:Agent.
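The metadata items listed above could be attached to an observation collection roughly as follows. This is an illustrative sketch only: the `tom`, `eom`, and `ex` namespace IRIs are placeholders (the text does not give the real ones), and apart from sosa:phenomenonTime, which the text names, the linking properties (sosa:observedProperty, prov:wasDerivedFrom, prov:wasAttributedTo) are assumptions chosen from the standard SOSA and PROV-O vocabularies. The helper names and identifiers are hypothetical.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "sosa": "http://www.w3.org/ns/sosa/",
    "time": "http://www.w3.org/2006/time#",
    "prov": "http://www.w3.org/ns/prov#",
    # Placeholder namespaces: the real tom/eom IRIs are not given in the text.
    "tom": "http://example.org/tom#",
    "eom": "http://example.org/eom#",
    "ex": "http://example.org/data/",
}


def expand(curie):
    """Turn a prefixed name such as 'sosa:phenomenonTime' into a full <IRI>."""
    prefix, local = curie.split(":", 1)
    return f"<{PREFIXES[prefix]}{local}>"


def collection_triples(unit_id, raster_type, period_id, product_id, agent_id):
    """Describe one observation collection (one per territorial unit) and its metadata."""
    coll = f"ex:collection-{unit_id}-{raster_type}-{period_id}"
    period = f"ex:period-{period_id}"
    prop = f"ex:property-{raster_type}"
    product = f"ex:product-{product_id}"
    agent = f"ex:agent-{agent_id}"
    return [
        (coll, "rdf:type", "tom:GeoFeatureObservationCollection"),
        (coll, "sosa:phenomenonTime", period),      # property named in the text
        (period, "rdf:type", "time:TemporalEntity"),
        (coll, "sosa:observedProperty", prop),      # assumed linking property
        (prop, "rdf:type", "tom:GeoFeatureObservablePropertyType"),
        (coll, "prov:wasDerivedFrom", product),     # assumed linking property
        (product, "rdf:type", "eom:Product"),
        (coll, "prov:wasAttributedTo", agent),      # assumed linking property
        (agent, "rdf:type", "prov:Agent"),
    ]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) prefixed names as N-Triples lines."""
    return "\n".join(f"{expand(s)} {expand(p)} {expand(o)} ." for s, p, o in triples)


# Hypothetical identifiers; 33394 is the INSEE code used in the vineyard use case.
triples = collection_triples("33394", "NDVI", "2017-04-19", "demo-product", "demo-agent")
ntriples = to_ntriples(triples)
print(ntriples)
```

Plain tuples keep the sketch dependency-free; an RDF library would normally handle IRI escaping and serialization.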
4.2.2. Data Transformation
4.2.3. Data Load
- It is not easy to perform on-demand mapping, since the data sources we considered (presented below) are available in different formats (JSON, GeoTIFF, shapefile, or even remote compressed files; see Section 5.1), which requires a prior pre-processing step.
- A geospatial triplestore can be used as a warehouse to store semantic data to perform data enrichment and linking.
- Different datasets may be offered by several endpoints, which requires a federation mechanism. However, there is currently no query engine mature enough to answer GeoSPARQL queries over such a federation. Consequently, if a single GeoSPARQL query is sent to examine territorial units stored in different triplestores, on-the-fly spatial comparisons are not possible.
5. Experimental Evaluation
5.1. Use Cases
- Vineyard use case: The objective of this use case is to detect changes in vineyards that have been damaged by natural hazards such as frost or hail. These climatic events can cause significant losses in vineyards and reduce wine production. The dates of these events are known to the user, who selects the corresponding images. The study area is located in the Aquitaine region of France. The vineyards in this region were reported as heavily damaged by frost on 20 April 2017. We chose the village of Saint-Emilion (INSEE code: 33394) to conduct the study. Saint-Emilion is a small, well-preserved medieval village that is famous for its prestigious red wines, the Grands Crus Classés.
- Urban expansion use case: The use case aims at studying the changes associated with urban expansion into agricultural areas. This type of study can help land managers with their planning. We studied changes in villages between 2017 and 2020 around the city of Bordeaux, one of the largest cities in France, which is surrounded by agricultural areas.
- Change indicator: Two partners of the project, Thales Alenia Space France and Thales Alenia Space Italy, have each developed a change detection application to identify various types of changes between two Sentinel images. These tools produce change indicator rasters representing the probability of change as a value between 0 and 1.
- NDVI: NDVI information is obtained by processing the near-infrared and red bands of Sentinel images. The output of the calculation is a matrix of values between −1 and 1 characterizing the NDVI of each pixel. Since the values between −1 and 0 represent elements composed of water, these values are set to 0 so that the rasters contain only values between 0 and 1.
- Land cover: The datasets provide information about the land cover of an area on Earth. The CESBIO land cover datasets (http://osr-CESBIO.ups-tlse.fr/~oso/, accessed date: 23 December 2021) are used for our use cases. They cover the French territory with a spatial resolution of 10 m.
- Land register: Land register data are available from the French government data website (https://cadastre.data.gouv.fr/datasets/cadastre-etalab, accessed date: 23 December 2021) in GeoJSON format or shapefiles.
- Administrative unit data: Information of villages inside an area of interest can be obtained from OpenStreetMap-based datasets that are published on the French government website (https://www.data.gouv.fr/en/datasets/decoupage-administratif-communal-francais-issu-d-openstreetmap, accessed date: 23 December 2021). The datasets are available in shapefiles and are updated yearly.
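The NDVI processing described in the list above can be sketched as follows, using the standard definition NDVI = (NIR − Red) / (NIR + Red) and setting negative values to 0 as the text describes. The mapping to five levels uses equal-width bins over [0, 1], which is an assumption (the text names the levels but not their thresholds); the function names, the handling of a zero denominator, and the band values are hypothetical.

```python
LEVELS = ["Very low", "Low", "Middle", "High", "Very high"]


def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red), with negative values set to 0."""
    total = nir + red
    if total == 0:
        return 0.0  # no signal in either band; treated as 0 here (assumption)
    return max(0.0, (nir - red) / total)


def level(value, levels=LEVELS):
    """Map a value in [0, 1] to one of five equal-width levels (bin edges are assumed)."""
    index = min(int(value * len(levels)), len(levels) - 1)
    return levels[index]


# Toy reflectance values for a 2 x 2 image (hypothetical data).
nir_band = [[0.60, 0.50], [0.10, 0.30]]
red_band = [[0.10, 0.30], [0.30, 0.30]]

ndvi_raster = [
    [ndvi(n, r) for n, r in zip(nir_row, red_row)]
    for nir_row, red_row in zip(nir_band, red_band)
]
level_raster = [[level(v) for v in row] for row in ndvi_raster]
print(level_raster)
```

The same `level` function could be applied to change indicator rasters, since they also take values between 0 and 1.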
5.2. Adaptability of the Model
- The model treats all the raster datasets, and their versions, in the same way, as long as they exist in the correct format. They can contain change, NDVI, or land cover information.
- Different classifications can be used to observe the same type of EO property. For example, the land cover observations can come from Corine Land Cover, Global Land Cover Share, CESBIO, or other open sources.
- The system can consume any vector source that describes territorial divisions, such as agricultural parcels, land register parcels, administrative units, Sentinel tiles, or forest units.
5.3. Adaptability of the Pipeline
- Vineyard use case: (i) We first obtain all the parcels of the Saint-Emilion village from the cadastre data and the CESBIO land cover raster for 2017. Then, semantic data integration is used to integrate the land cover and parcel information. (ii) Vineyard parcels within villages are retrieved via semantic search. (iii) Appropriate Sentinel-2 images are used for NDVI calculation. (iv) These images are also used for change detection. (v) The rasters generated in the 3rd and 4th steps, along with the vector from the 2nd step, are integrated into the semantic database. (vi) Finally, semantic search can be used to analyze all integrated information related to the vineyard of interest.
- Urban expansion use case: (i) We first select suitable Sentinel images and execute NDVI calculation. (ii) These images are also used for change detection. (iii) We obtain vector data of all the villages of the Gironde department (INSEE code: 33) from the open administrative unit datasets. Semantic data integration is then launched using the rasters generated in the previous steps and the obtained vector files. (iv) Finally, we can perform semantic search to analyze the integrated data related to the villages of interest.
5.4. Added Value of the Generated Datasets in Helping Decision Making
5.4.1. Improving Land Monitoring Thanks to Data Integration
5.4.2. Data Cross-Verification Thanks to Data Integration
- The parcel was not correctly annotated by the user of the DLR tools, who identified it as a ‘Mixed urban area’ (property LC_DM).
- CESBIO (property LC_CESBIO17) and Corine (property LC_Corine) correctly identify land cover at the parcel level. In fact, the differences observed between the two land covers come from the spatial resolution of the rasters (which has an impact on the precision). While the best resolution for the French area is given by CESBIO, the DLR data mining labels are given by a user with limited domain knowledge.
- The parcel was detected as having low change by TAS (property Change_Opt) using their deep learning algorithm. In fact, there is about a 5% change in NDVI levels (property NDVI) during this period.
5.4.3. Use Cases Analysis
- Vineyard use case: Two Sentinel-2 images collected on the T30TYQ tile are used for change detection and NDVI computation; they are dated 19 April 2017 and 29 April 2017, respectively. We chose these images because they have very low cloud cover (0% and 15%) and the interval between these observations includes the period of study. Figure 7 represents an overview of the change levels detected and the degradation of NDVI between the two dates (the NDVI after the phenomenon is lower than before, i.e., there is less vegetation than before). The very low change level is eliminated, since it is not very relevant. The NDVI degradation indicator represents the total percent degradation of the five NDVI levels. We also eliminated parcels with less than 20% NDVI degradation. Finally, there are 858 parcels detected as having changed, 756 parcels detected as having NDVI degradation above 20%, and 510 parcels detected in both cases.
- Urban expansion use case: For NDVI calculation and change detection, it is recommended to collect images from the same period of the year, in summer, to limit cloud cover and the influence of vegetation growth. Thus, two Sentinel-2 images with 0% cloud cover were collected on 2 August 2017 and 6 August 2020. Figure 8 (right) represents an overview of the detected change levels and degraded NDVI levels between these two dates, along with (left) the source Sentinel images. We can observe that (i) the detected levels of change correspond quite well with the degraded NDVI due to urbanization; (ii) the closer a village is to the city, the more it has changed. The next analysis could compare the information on change, NDVI, and land cover at the parcel level for specific villages.
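The parcel selection described for the vineyard use case (discard the very low change level, keep parcels with at least 20% NDVI degradation, then count the parcels found by both criteria) amounts to two filters and a set intersection. The sketch below is illustrative only: the parcel identifiers, the field names `change_level` and `ndvi_degradation`, the sample values, and the assumption that each parcel carries a single aggregated change level are all hypothetical.

```python
# Hypothetical per-parcel results after semantic integration.
parcels = {
    "P1": {"change_level": "High", "ndvi_degradation": 35.0},
    "P2": {"change_level": "Very low", "ndvi_degradation": 40.0},
    "P3": {"change_level": "Low", "ndvi_degradation": 10.0},
    "P4": {"change_level": "Middle", "ndvi_degradation": 20.0},
}

NDVI_DEGRADATION_THRESHOLD = 20.0  # percent, as stated in the text

# Parcels considered changed: every level except "Very low".
changed = {pid for pid, p in parcels.items() if p["change_level"] != "Very low"}

# Parcels with NDVI degradation of at least the threshold.
degraded = {
    pid
    for pid, p in parcels.items()
    if p["ndvi_degradation"] >= NDVI_DEGRADATION_THRESHOLD
}

# Parcels flagged by both indicators.
both = changed & degraded
print(sorted(changed), sorted(degraded), sorted(both))
```

With the figures reported above, the three sets would contain 858, 756, and 510 parcels respectively.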
5.5. Approach Scalability
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Tran, B.-H.; Aussenac-Gilles, N.; Comparot, C.; Trojahn, C. Semantic Integration of Raster Data for Earth Observation on Territorial Units. ISPRS Int. J. Geo-Inf. 2022, 11, 149. https://doi.org/10.3390/ijgi11020149