Provenance Information Representation and Tracking for Remote Sensing Observations in a Sensor Web Enabled Environment

The provenance of observations from a Sensor Web enabled remote sensing application represents a great challenge, because there are currently no representation or tracking methods for it. We propose a provenance method that represents and tracks remote sensing observations in the Sensor Web enabled environment. The representation can be divided into a description model, an encoding method, and a service implementation. The description model uses a tuple to define four objects (sensor, data, processing, and service) and their relationships at a time point or interval. The encoding method incorporates the description into the Observations & Measurements specification of the Sensor Web. The service implementation addresses the effects of the encoding method on the implementation of Sensor Web services. The tracking method abstracts a common provenance algorithm and four algorithms that track the four objects (sensor, data, processing, and service) in a remote sensing observation application based on the representation. We conducted an experiment on the representation and tracking of provenance information for vegetation condition products, such as the Normalized Difference Vegetation Index (NDVI) and the Vegetation Condition Index (VCI). Our experiments used raw Moderate Resolution Imaging Spectroradiometer (MODIS) data to produce daily NDVI, weekly NDVI, and weekly VCI for the 48 contiguous states of the United States, for May from 2000 to 2012. We also implemented inverse tracking. We evaluated the time and space requirements of the proposed method in this scenario. Our results show that this technique provides a solution for determining provenance information in remote sensing observations.


Introduction
The Sensor Web is an infrastructure that establishes a bridge between sensor resources and applications through a series of information models and interface specifications defined by the Open Geospatial Consortium (OGC) [1]. The Sensor Web has been extensively used in remote sensing applications, where observations often undergo complex processing between their origin and the final data product [2][3][4][5][6][7]. The provenance of remote sensing products is important when tracing their histories, validating their trustworthiness, and analyzing their qualities. Moreover, it is particularly important to track the four aspects (sensor, processing, data, and service) [8][9][10] that are associated with the following frequently asked questions: Which sensor was used to obtain these remote sensing observations? What method or algorithm was applied to these remote sensing observations to generate the products? What remote sensing observations were processed to create these products? What were the services and processing related to these observations? Here, an observation is an act of measuring or otherwise determining the value of a property [1], a sensor is a mechanical device used to obtain a remote sensing observation, data are similar to products, a processing is a method or algorithm that produces remote sensing data, and a service is a Web service. We have focused on the provenance of these four objects. However, it can be hard to capture provenance information in the Sensor Web enabled environment for remote sensing observations, because there is a lack of representation and tracking methods in loosely-coupled and standards-based service environments [8].
With respect to provenance, researchers have investigated database, workflow/Web, international specification, and sensor network systems. Database methods were the first to be considered. Researchers have focused on transformations performed on a view, a table, or an item in a database, and on inverse provenance methods [11][12][13][14]. Data provenance within a workflow/Web environment records the workflow process and describes the Web data [15][16][17][18][19]. Some international specifications have been proposed for building open provenance models (OPMs). An OPM defines an open data model from an inter-operability viewpoint, with respect to the community of contributors, reviewers, and users [20]. The W3C Provenance Incubator Group maps a core set of provenance vocabularies and models, including the OPM. W3C developed a general provenance data model called PROV-DM, to offer a basis for interoperability across diverse provenance management systems [21]. The ISO 19115 and ISO 19115-2 Lineage Models were proposed to break down data provenance heterogeneity and integrate resources, such as the Web Coverage Service [22,23]. A remotely sensed data processing provenance system called Karma was based on the OPM. Provenance-aware applications for sensor networks were studied in terms of naming, indexing, and mapping using the OPM method [24].
Great progress has been made with respect to provenance, but existing methods cannot be directly applied to the Sensor Web. The Sensor Web environment is a specification framework, and it must properly integrate provenance representation and tracking. Database techniques focused on what the provenance information was and how it was stored and queried in a database. However, in the Sensor Web enabled environment, it is more important to consider the standards themselves; there are issues that must be addressed by the developer who concretely implements the specifications. Workflow/Web methods provided some approaches for organizing provenance information, but have not effectively built provenance models that can be seamlessly integrated with current Sensor Web specifications, because they cannot fuse the provenance model and the specifications. International provenance models are open and abstract models that are used to exchange provenance information between systems. However, they do not provide a domain-specific provenance model; in the Sensor Web domain, for example, OPMs and PROV-DM need concrete expressions. This requires a Sensor Web domain-specific provenance method. Although the Sensor Web has some descriptive specification information for sensors, processes, and observations (the Sensor Model Language (SensorML) [25] and Observations & Measurements (O&M) [26,27] specifications), it has no explicit representation models or tracking methods for provenance; for example, it cannot represent the provenance relationships or the associated tracking method.
The objective of this paper was to propose a provenance information representation and tracking method for remote sensing observations in the Sensor Web enabled environment. We considered the provenance issue in the Sensor Web enabled environment (and not the common open remote sensing production service, the Web Coverage Service) for the following reasons. (1) Sensor Web technologies have been extensively applied to remote sensing applications, and there are existing provenance issues [8]; (2) the provenance issue for the Web Coverage Service has been solved to some degree by the ISO 19115 and ISO 19115-2 Lineage Models [22]; (3) the Web Coverage Service has been applied to the inputs and outputs of the Sensor Web [4,5], so provenance in the Sensor Web enabled environment is also associated with Web Coverage Service provenance. In this paper, we focused on four aspects (sensors, data, processing, and services) and investigated two procedures: (1) modelling the direct and implicit relationships between the sensors, data, processing, and services as a tuple descriptor, providing a representation and linkage solution for the entire remote sensing observation processing; and (2) developing an indicator that helps users trace historic information, validate trustworthiness, and analyze the quality of remotely sensed observations in their applications. The contributions of this paper are as follows.
(1) We propose a method for representing and tracking provenance information in the Sensor Web enabled environment for remote sensing applications. (2) We tested this method by applying it to vegetation condition applications.
The remainder of this paper is organized into four sections. We demonstrate a provenance method in Section 2, and describe our method, experiments, and results in Section 3. Finally, Section 4 contains a discussion of our results and conclusions.

Provenance Method
In remote sensing applications, final products are created from original observations through a series of complex processing steps. Therefore, the representation of provenance information for remote sensing observations at each stage of the processing is a prerequisite for tracking the provenance of the final products. The representation refers to modeling (the provenance description model), encoding (the encoding method used for the provenance description model), and implementation (the service implementation considering the provenance description model) in the Sensor Web enabled environment. The provenance description model formalizes provenance objects and their relationships, making it easier to represent and track provenance information. The encoding method incorporates the provenance model into Sensor Web specifications. Accordingly, the service implementation is a response to the changes made by the encoding method. The ability to track provenance information depends on the content of the representation. Thus, we must: (1) define the provenance description model for remote sensing observations; (2) encode the provenance description model and incorporate it into Sensor Web specifications; (3) implement provenance-compatible Sensor Web specifications in the services; and (4) track provenance information in the Sensor Web enabled environment based on the developed representation. These four problems are referred to as the description model, encoding method, service implementation, and tracking algorithm, and are discussed further in Sections 2.1-2.4.

Description Model
The description model is the formal representation of the relationships between provenance objects for remote sensing observations. The description model is responsible for building relationships between the four objects under the following conventions: all data in a Sensor Web are encoded with O&M; the data are accessed/stored using the Sensor Observation Service (SOS) [28]; and all data flow, discovery, accessing, management, and processing stages use standard Web services. These conventions guarantee that the description model was made in a "pure" Sensor Web enabled environment. As mentioned in Section 1, we focused on the provenances of data, sensor, processing, and service in a Sensor Web enabled environment. The "process" in SensorML can be a sensor or an algorithm, whereas the processing in this paper (Definition 2.1) refers to an algorithm. O&M states that an "observation" has a process (e.g., a sensor or algorithm) and a result, whereas data in this paper (Definition 2.1) refers to the result. This helps to distinguish between a sensor, an algorithm, and a result, and also their provenance. Therefore, we can consider that observations build the relationships between the four objects by either direct or implicit links (Definition 2.1). Provenance is the utilization of available information to track historical information. Thus, it is meaningful to distinguish between available and historical information, to derive an explicit provenance scope. This distinction is defined in Definitions 2.2 and 2.3 from the observation perspective. Finally, the observation provenance is defined in Definition 2.4. Figure 1 shows an example of the observation provenance and Definitions 2.1-2.4.
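The four-object tuple and its time-indexed links can be sketched as a simple data structure. This is only an illustration of the description model; the class and field names below are our own and are not part of the O&M or SensorML schemas:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class StatusObservation:
    """A Status Observation (SO): the four provenance objects at a time point."""
    time: str                      # time point or interval label, e.g. "t0"
    sensor: Optional[str]          # sensor that produced the observation (None if derived)
    data: str                      # the observation result (product identifier)
    processing: Optional[str]      # algorithm that derived this SO (None for raw data)
    service: Optional[str]         # Web service that executed the processing
    parents: List["StatusObservation"] = field(default_factory=list)  # SOs it derives from

# Raw MODIS observation: produced by a sensor, with no processing or service yet
raw = StatusObservation("t0", sensor="MODIS", data="raw_tile",
                        processing=None, service=None)
# Derived NDVI observation: linked to its parent through a processing and a service
ndvi = StatusObservation("t1", sensor=None, data="daily_ndvi",
                         processing="NDVI", service="WPS", parents=[raw])
```

The `parents` list is what makes the implicit links between the four objects explicit: following it backwards reproduces the "tree" of Figure 1.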
Definition 2.1: A Status Observation (SO) is a tuple that links the four provenance objects (sensor, data, processing, and service) at a time point or interval by direct or implicit links. For the initial observation at time t0,

    R1: GP → SO_t0    (1)

where R1 is a restriction which indicates that a Geographical Phenomenon (GP), under the effect of sensor_t0, results in SO_t0. The symbol "→" is used here to denote the phrase "results in". When time tn is later than t0 (tn−1 ≥ t0), Equation (1) can be written as

    R2: SO_tn−1 → SO_tn    (2)

where R2 is a restriction defined as {processing_tn, service_tn}, which indicates that SO_tn is derived from SO_tn−1 with the processing processing_tn and the service service_tn. For example, in Figure 1, observation observation_n for time period [0, 4] and observation_lm for time period [5, 6] are SOs.
Definition 2.2: A Current Status Observation (CSO) is an SO at a designated ST. For example, in Figure 1, if the designated ST is time period [4, 5], observation_nk and observation_l are both CSOs.

Definition 2.3: A Historic Status Observation (HSO) is an SO relative to a CSO. Its ST is before that of the CSO. Furthermore, the CSO should be directly or indirectly derived from the HSO.

Definition 2.4: Observation Provenance (OP) is the process of determining all of the HSOs of a CSO, that is,

    OP(CSO) = {HSO_1, HSO_2, …, HSO_n}

where OP(CSO) denotes the data provenance for the CSO. The result is the set of the CSO's HSOs. These definitions can be summarized as follows. An SO contains the provenance information for a time point or interval. Provenance is a recursive issue. The starting point of the recursion is a CSO, which loops directly to determine its HSOs. If the CSO is a root node, then the HSOs are element nodes, and their relationships link them to the root node and to other element nodes to form a "tree", as shown in Figure 1. Provenance, in this case, refers to a backwards trace over the whole "tree".
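Under Definition 2.4, OP(CSO) is a recursive walk over the parent links of the "tree". A minimal sketch, with observations represented as plain dictionaries whose field names are illustrative:

```python
def observation_provenance(cso):
    """OP(CSO): recursively collect all HSOs reachable through parent links.

    Each observation is a dict with a 'data' label and a 'parents' list;
    the structure mirrors the 'tree' in Figure 1 but is purely illustrative.
    """
    hsos = []
    for parent in cso.get("parents", []):
        hsos.append(parent)                          # a direct HSO of the CSO
        hsos.extend(observation_provenance(parent))  # the HSOs of that HSO
    return hsos

raw = {"data": "raw_MODIS", "parents": []}
daily = {"data": "daily_NDVI", "parents": [raw]}
weekly = {"data": "weekly_NDVI", "parents": [daily]}

labels = [o["data"] for o in observation_provenance(weekly)]
# The CSO 'weekly_NDVI' traces back through 'daily_NDVI' to 'raw_MODIS'
```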
As previously mentioned, we consider the provenance issue in a "pure" Sensor Web enabled environment. However, the processed data may not come directly from a sensor, and may instead come from an impure data source (e.g., a Web Coverage Service or a data system). In this case, the provenance of the data source is not supported by the proposed method, but it can be determined by that source's own provenance method. For example, if the Web Coverage Service is the data source, the provenance problem can be dealt with by the ISO 19115 and ISO 19115-2 Lineage Models [22]. Moreover, we can strategically map the description model to a standard provenance model; if the impure data source can follow or map to a standard provenance model, the problem may be solved. We extended the formalized description of a provenance model from PROV-DM [21], as shown in Figure 2, so that the provenance model is machine-readable and capable of sharing and interoperation. The W3C provenance (PROV) family of specifications is based on the PROV-DM conceptual data model. PROV-DM distinguishes core structures, derives provenance information from extended structures, and caters to more specific uses. PROV-DM defines the objects and relationships of Entity, Activity, and Agent. Data and sensors are extensions of Entity, processing is an extension of Activity, and services are extensions of Agent.

Encoding Method
The encoding of the model is used to implement a provenance-aware system. The Sensor Web is a Web-based and standard specification-based environment. It is therefore better to incorporate the provenance representation method into existing specifications (e.g., SensorML and O&M), to reduce manpower and material resources, and to maintain consistent sharing and interoperation environments. Considering this, we encode the description model by defining the XML schema for sensors, processing, data, and services. The Sensor Web has some provenance-related information (SensorML and O&M), which describes sensors, processes (algorithms), and observations. SensorML has already modeled sensor and process information, and O&M has modelled all four objects. Thus, in this paper, we mainly base the provenance encoding on O&M 2.0 and SensorML 2.0. In O&M 2.0, the root element is OM_Observation, and the main elements branching from it are: type, metadata, relatedObservation, phenomenonTime, resultTime, validTime, procedure, observedProperty, featureOfInterest, resultQuality, and result. SensorML 2.0 classifies processes into atomic non-physical, atomic physical, composite non-physical, and composite physical. They are defined by four elements: SimpleProcess, AggregateProcess, PhysicalComponent, and PhysicalSystem.
Based on these existing specifications and encoding methods, the provenance encoding strategies are as follows. The observation object is encoded by the OM_Observation element in the O&M schema, the sensor object is encoded by the PhysicalComponent element or the PhysicalSystem element in SensorML, the data object is encoded by the result element in SensorML, the processing object is encoded by the SimpleProcess element or the AggregateProcess element in SensorML, and the service object is encoded by newly defined elements such as ParentOM, ProcessService, and CurrentOMService, as shown in Figure 3. ParentOM is an element that defines the information track of the parent O&M, which was processed to create the current O&M. The ParentOM XML schema is shown in Figure 4. The input element of the SimpleProcess or AggregateProcess elements in SensorML records the Sensor Web Enablement common data type AnyData. Therefore, the parent O&M, as the input, is not expressed by the input element; ParentOM is used to record it instead. SimpleProcess and AggregateProcess simply describe the processing, and do not indicate the provided services. A ProcessService records the processing service, as shown in Figure 5a. It displays processing service capability information and links. The CurrentOMService is similar to the ProcessService and displays current O&M service capability information, as shown in Figure 5b. The result element in O&M does not have a specified type. Given these considerations, the elements ParentOM, ProcessService, and CurrentOMService are set as output elements of the result, which maintains the structure and saves the provenance information in O&M, as shown in the red rectangle in Figure 6.
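The shape of the extended result element can be sketched with Python's ElementTree. This is a simplified illustration only: the omp namespace URI and the attribute names below are placeholders, not the schema defined in Figures 4-6:

```python
import xml.etree.ElementTree as ET

OM = "http://www.opengis.net/om/2.0"     # O&M 2.0 namespace
OMP = "urn:example:omp"                  # placeholder for the provenance extension

result = ET.Element(f"{{{OM}}}result")

# ParentOM: records how to re-request the parent observation(s)
parent = ET.SubElement(result, f"{{{OMP}}}ParentOM")
obj = ET.SubElement(parent, f"{{{OMP}}}ParentObject",
                    {"method": "GET", "href": "http://sos.example/om/parent"})

# ProcessService / CurrentOMService: links to the services involved
ET.SubElement(result, f"{{{OMP}}}ProcessService",
              {"href": "http://wps.example/ndvi"})
ET.SubElement(result, f"{{{OMP}}}CurrentOMService",
              {"href": "http://sos.example/current"})

xml_text = ET.tostring(result, encoding="unicode")
```

The three child elements sit inside om:result, so an O&M document carrying them remains valid against consumers that only read the standard elements.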

Service Implementation
The implementation of the remote sensing observation provenance model in Sensor Web services is important to applications. The proposed provenance model is embedded in O&M and adds some child elements to the O&M result element. All Sensor Web services should respond to this new information when implemented. This response requires an understanding of the relationships between services and the O&M, and of the exact effect that the O&M has on the services. The interactions and internal implementations of the O&M and Sensor Web services (SOS, the Web Processing Service (WPS) [29], and the Sensor Planning Service (SPS) [30]) are shown in Table 1. Table 1 shows that the services and their operations, input, and output parameters have relationships with O&M.
SOS is a standard service interface in SWE. SOS provides access to observations from sensors and sensor systems in a standard manner that is consistent for all sensor systems, including remote, in-situ, fixed, and mobile sensors. GetObservation and InsertObservation operate on the O&M in SOS. GetObservation is invoked to obtain O&M, and InsertObservation is invoked to insert O&M into SOS. SOS stores and manages O&M. The three elements (omp:ParentOM, omp:ProcessService, and omp:CurrentOMService) in the om:result should be carefully considered in the storage strategy when storing the O&M in an SOS system.
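A GetObservation call is typically issued as a key-value-pair request. A sketch against a hypothetical endpoint, with parameter names following the SOS 2.0 KVP binding:

```python
from urllib.parse import urlencode

def get_observation_url(endpoint, offering, observed_property, procedure):
    """Build a KVP GetObservation request for an SOS 2.0 endpoint (illustrative)."""
    params = {
        "service": "SOS",
        "version": "2.0.0",
        "request": "GetObservation",
        "offering": offering,
        "observedProperty": observed_property,
        "procedure": procedure,
    }
    return endpoint + "?" + urlencode(params)

# Hypothetical endpoint, offering, and identifiers for illustration only
url = get_observation_url("http://sos.example/sos", "MODIS_daily",
                          "surface_reflectance", "MODIS")
```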
SPS is also a standard service interface in SWE. SPS was designed and developed to allow clients to determine the feasibility of a desired set of collection requests for one or more sensors/platforms in an interoperable service. A client directly submits collection requests to these sensors/platforms. GetCapabilities, DescribeTasking, Submit, and DescribeResultAccess are the main operations in SPS. Submit is the core operation in SPS that allows a planning sensor or sensor system to submit a task. It contains two processes: submitting a task, and encapsulating the processing into SimpleProcess or AggregateProcess, or the SPS service into the ProcessService, when planning the O&M result. The result of the Submit operation is exposed by the DescribeResultAccess operation.
WPS is an OGC specification. WPS defines a standardized interface that facilitates the publishing of geospatial processes (including any algorithm, calculation, or model that operates on spatially referenced data), the discovery of these processes, and the binding of these processes by clients. GetCapabilities, DescribeProcess, and Execute are the three operations in WPS. After the Execute operation, the O&M contains the provenance information if the output is O&M, and WPS handles provenance information in a similar way to SPS.
The above analysis highlights factors of particular importance to O&M provenance, but methods for developing the services are unrestricted.

Tracking Algorithm
This section introduces the algorithms that track provenance information based on the description model and encoding method. A CSO may be produced by complex processes. To obtain the provenance information, the common tracking algorithm executes the following steps.

Step 1: If the child element of //result/omp:ParentOM (the ParentOM element in O&M, located with an XML XPath (http://www.w3.org/TR/xpath/)) is om:OM_Observation, we return om:OM_Observation. If not, we proceed to the next step.

Step 2: If the child element of //result/omp:ParentOM is not omp:ParentObject, we return an error message. Otherwise, we obtain the omp:ParentObject xlink:href, method, mimeType, encoding, and schema values. If the method is neither HTTP GET nor HTTP POST, we return an error message. If the method is HTTP GET, we send a GET request to xlink:href to obtain the result (O&M). If the method is HTTP POST, we proceed to the next step.

Step 3: If a Header is required, we obtain the Header message and the POST message from the body or bodyReference, and then send a POST request to xlink:href, integrating the POST message with the Header message, to obtain the result. If a Header is not needed, we directly obtain the POST message from the body or bodyReference, and then send a POST request to xlink:href to obtain the result.
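The steps above can be sketched as follows. The XML layout and attribute names are simplified from the paper's schema (no namespaces, href instead of xlink:href), and no network call is actually made; the function only returns the request it would send:

```python
import xml.etree.ElementTree as ET

def plan_parent_request(parent_om_xml):
    """Steps 1-3 of the common tracking algorithm, returning the planned request.

    Expects a ParentOM fragment containing a ParentObject with href/method
    attributes and an optional body/Header (namespace-free sketch).
    """
    root = ET.fromstring(parent_om_xml)
    obj = root.find("ParentObject")
    if obj is None:                                  # step 2: no ParentObject child
        raise ValueError("ParentOM has no ParentObject: cannot track further")
    href, method = obj.get("href"), obj.get("method")
    if method not in ("GET", "POST"):                # step 2: unsupported method
        raise ValueError(f"unsupported method: {method}")
    if method == "GET":                              # step 2: plain GET to href
        return {"method": "GET", "url": href}
    body = obj.findtext("body", default="")          # step 3: POST body (or bodyReference)
    header = obj.findtext("Header")                  # step 3: optional Header message
    request = {"method": "POST", "url": href, "body": body}
    if header is not None:
        request["headers"] = header
    return request

fragment = """<ParentOM>
  <ParentObject href="http://sos.example/om" method="POST">
    <body>GetObservation...</body>
  </ParentObject>
</ParentOM>"""
req = plan_parent_request(fragment)
```

Executing the returned request against the referenced SOS would yield the parent O&M, which is then fed back into the same function to recurse up the provenance tree.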

Experimental Section
To test the proposed provenance method, we ran an experiment that generated and tracked vegetation conditions in a Sensor Web enabled environment. Vegetation conditions are critical to decisions in agriculture and ecology, in the public and private sectors. Numerous methods and observations can be used to evaluate the vegetation condition [31,32]. It can be derived using statistical indices such as the normalized difference vegetation index (NDVI) and the vegetation condition index (VCI) [32]. These are defined as

    NDVI = (NIR − IR) / (NIR + IR)    (7)

    VCI = (NDVI − NDVImin) / (NDVImax − NDVImin) × 100    (8)

In Equation (7), NIR is the near-infrared band data for a pixel, and IR is the infrared band data. In Equation (8), NDVImin and NDVImax are the minimum and maximum NDVIs for a pixel during a specific time period. We used Moderate Resolution Imaging Spectroradiometer (MODIS) observations to calculate the NDVI and VCI. The MODIS observations were obtained from the Land Processes Distributed Active Archive Center (https://lpdaac.usgs.gov/). We used the "MODIS/Terra Surface Reflectance Daily L2G Global 250 m SIN Grid v005" dataset. MODIS provides 250-m resolution daily images for the red and NIR bands (0.6 μm-0.9 μm). The experimental observation period was May of each year from 2000 to 2012, and the area under investigation covered the 48 contiguous states of the United States.
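Both indices are per-pixel arithmetic; a minimal sketch, assuming the conventional percent scaling of VCI:

```python
def ndvi(nir, ir):
    """Normalized difference vegetation index for one pixel."""
    return (nir - ir) / (nir + ir)

def vci(ndvi_now, ndvi_min, ndvi_max):
    """Vegetation condition index (percent) for one pixel."""
    return (ndvi_now - ndvi_min) / (ndvi_max - ndvi_min) * 100.0

# Example pixel: NIR reflectance 0.5, red/IR reflectance 0.1
n = ndvi(0.5, 0.1)                        # (0.5 - 0.1) / (0.5 + 0.1)
v = vci(n, ndvi_min=0.2, ndvi_max=0.8)    # position within the historic NDVI range
```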
In this experiment, the MODIS data were planned using SPS, encoded with O&M, published with SOS, and processed with WPS.After this, we set up the provenance instance of the MODIS data in a Sensor Web enabled environment.We tracked the provenance of four objects using this provenance instance to evaluate the performance of our method.

Experiment Design
The six main processing steps of this experiment are shown in Figure 8.
Processing 1: SPS planned the raw MODIS observations rather than the real MODIS sensor on board Terra, because it cannot control the sensor [5]. The MODIS observations were planned based on spatio-temporal information by invoking the SPS Submit operation against the MODIS data center [5].
Processing 2: We assumed that MODIS scanned the earth's surface after the SPS planning, and that all the required MODIS observations were collected. The MODIS observations, encoded in the O&M format, were inserted into SOS by invoking the InsertObservation operation from the FTP server. At this point, the MODIS sensor information had already been registered in SOS by the client using the RegisterSensor operation. An SPS instance simulated the planning of the satellite that carries the MODIS sensor. The parameters of the planning task include time and location information. The MODIS FTP server provided 250-m resolution daily images for the red and NIR bands (0.6-0.9 μm). The time parameter was the date. The location was transformed into tiles [33] that cover the 48 contiguous states of the United States.

Processing 3: The MODIS files were pre-processed by the WPS to generate a daily MODIS file. This processing adjusted for noise, cloud, and snow cover, corrected the angles, and implemented mosaic and clip procedures. The files were then inserted into SOS in the O&M format using the standard method.
Processing 4: The daily NDVI was calculated by the WPS. The inputs for this WPS processing were the daily MODIS O&M products from SOS. The output was the daily NDVI product with O&M formatting. The result was inserted into SOS by the InsertObservation operation.

Processing 5: The weekly NDVI was calculated by the WPS. The input for this WPS processing contained seven daily NDVI products. The output was a weekly NDVI product. After processing, the result was inserted into SOS.
Processing 6: The weekly VCI was calculated. The inputs were the current weekly NDVI product from the current year and the 7-day daily NDVI products from the previous years, which have the same date order. The output was a weekly VCI product, which was managed by SOS.
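Processing 6 can be sketched for a single pixel as follows. The data layout and function names are illustrative, and the percent scaling of VCI is assumed:

```python
def weekly_vci(current_weekly_ndvi, daily_ndvi_by_year, doy_start, doy_end):
    """Weekly VCI using the same ordered days (e.g. 129-135) from previous years.

    daily_ndvi_by_year maps year -> {day_of_year: ndvi} for one pixel;
    the historic min and max are taken over all listed years and days.
    """
    history = [daily_ndvi_by_year[y][d]
               for y in daily_ndvi_by_year
               for d in range(doy_start, doy_end + 1)]
    ndvi_min, ndvi_max = min(history), max(history)
    return (current_weekly_ndvi - ndvi_min) / (ndvi_max - ndvi_min) * 100.0

# Two previous years, days 129-131 only, one pixel (made-up values)
hist = {2000: {129: 0.2, 130: 0.25, 131: 0.3},
        2001: {129: 0.4, 130: 0.5, 131: 0.6}}
v = weekly_vci(0.4, hist, 129, 131)   # min = 0.2, max = 0.6
```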
All the O&Ms included the provenance information created by the method in Section 2.2.

Representation of the Provenance Model in O&M
We ran the six processing steps in Figure 8. The observations from a single month (May) over 13 years (from 2000 to 2012) were downloaded with SPS using spatio-temporal information. This resulted in 10,051 HDF files, over 700 gigabytes in size, covering the 48 contiguous states of the United States. An HDF file in the O&M format is shown in Figure 9. The entire O&M document was composed of the current observations. The sensor information was encoded using sml:PhysicalComponent. The observation entity was linked with the xlink:href attribute in the om:result element. All observation entity links have the string pattern "ftp://e4ftl01.cr.usgs.gov/MODIS_Dailies_B/MOLT/MOD09GQ.005/" + date + "/" + tile observation name, similar to the observation link in Figure 9, "ftp://e4ftl01.cr.usgs.gov/MODIS_Dailies_B/MOLT/MOD09GQ.005/2012.05.31/MOD09GQ.A2012152.h13v04.005.2012154062915.hdf". This linked the observation entity into the O&M document without encoding the observation entity itself, which was between tens and hundreds of megabytes in size. A large XML document is difficult to process and deliver over the Web through general programming methods [34], which hinders the use of XML [35]. The processing service and current O&M service were linked with the omp:ProcessService and omp:CurrentOMService elements, respectively.
We generated daily MODIS files that covered the 48 contiguous states of the United States. In the WPS, the pre-processing algorithm simply performs mosaic and clip operations, and does not implement the other pre-processing steps discussed in Section 3.1. However, we assumed that the pre-processing algorithm eliminated noise, cloud, and snow cover, corrected the angles, and generated a usable daily MODIS file. This daily MODIS file recorded all the O&M processing. The lack of noise, cloud, and snow cover removal and angle correction may explain why it was hard to generate a perfect daily NDVI product; it had no effect on the provenance model, because the model simply records the algorithm and ignores the precision. The daily NDVI O&M is shown in Figure 10. The inputs of this pre-processing algorithm were the daily MODIS tile files. The method was linked using a pre-processing method .xml file that describes the calculations in detail. The omp:ParentOM was used to record the tracking request for the parent's provenance information. In this document, we only used one omp:ParentObject to request the 25 (except 23 May 2001 and 11 May 2012) input "parent" provenance data. The request is an O&M GetObservation request, and the procedure, offering, observedProperty, and time information point to the 25 "parent" observations. The generated daily and weekly NDVI (a week starts on a Tuesday) have a provenance information structure similar to that of the daily NDVI. We then calculated the weekly VCI using Equation (8). The NDVI is the current week's NDVI. For example, for 8 May 2012 to 14 May 2012, the NDVI is the weekly NDVI for the 19th week in 2012. NDVImin and NDVImax are the minimum and maximum values for the 7-day NDVI of each previous year (2000 to 2011). The 7-day NDVI of each year has the same start and end times (the same ordered days in a year) as 2012, because 8 May 2012 is the 129th day and 14 May 2012 is the 135th day in 2012; the 7-day NDVI of each year therefore represents the 129th to 135th days. The 7-day NDVI for each year may not be the weekly NDVI, so the weekly NDVI should be calculated with the daily NDVI. These results show that the provenance object information is represented in the O&M documents, from the raw MODIS observations to the daily MODIS, daily NDVI, weekly NDVI, and weekly VCI.

(1) Tracking the sensor. The current observed data (SO_tn) is the weekly VCI. We determined the raw MODIS observation using Algorithm 5. The sensor information was then retrieved, as shown in Figure 12. The sensor information was described by PhysicalComponent and its XPath om:OM_Observation/om:procedure/sml:PhysicalComponent. The sensor information includes the position and output information for the sensor. In this example, we found 2256 raw HDF observations observed by the same MODIS sensor, so we tracked a single sensor 2256 times. Therefore, from a theoretical perspective, the provenance processes of the single sensor reflect the provenance processes of multiple sensors.
(2) Tracking the algorithms. An algorithm is represented by the SimpleProcess element. The SimpleProcess can contain qualitative and quantitative descriptions of the algorithm. The quantitative description is executable code and a mathematical expression that defines the algorithm using the Mathematical Markup Language, an XML application that describes mathematical notations and captures the structure and content of the equation. In this experiment, the processing algorithms were the VCI, NDVI, and pre-processing algorithms, as shown in Figure 12.
(3) Tracking the service. Algorithm 4 identified the processing service. The algorithms used to calculate the daily MODIS, daily NDVI, weekly NDVI, and weekly VCI were encapsulated in the WPS, as shown in Figure 12.
These results show that the sensor, processing, and service information were appropriately tracked by the algorithms.

Performance Analysis
Time complexity is an important aspect of the performance of this provenance model. The encoding method is similar to an inverted "tree": the current observation is the root, and the tracking depth of an object is the minimum number of tracking steps from the object to the root. Denote the maximum depth of the current observation as n, and the time cost of the provenance at depth i as t_i. Then, the total time is Σ_{i=1..n} t_i, which does not exceed n × t_max, where t_max is the maximum time cost at any single depth. This shows that the maximum time cost at a certain depth provides a reasonable representation of the time complexity of the model. The traversing provenance time of the current observation was not shorter than the queries to some sensors, data, processing, or services. The traversing times of the weekly VCI, weekly NDVI, daily NDVI, and daily MODIS were tested as much as possible, instead of simply querying some objects at certain depths to obtain the maximum time cost. The results are shown in Figure 13a-d. Figure 13a shows the traversing time of the weekly VCI. The x-axis represents the 19th week of each year. The y-axis is the time cost (in milliseconds, ms). Figure 13a shows that the time increased linearly from 2000 to 2012. The maximum did not exceed 140 s. In 2012, there were 2367 tracking documents.
Figure 13b shows the 19th weeks of 2000-2012. This figure reveals that the traversal time for the weekly NDVI was approximately 10 s (an average of 9948.615 ms); there were 170 tracking documents. Figure 13c,d show the time cost for the daily NDVI and MODIS for 12 May between 2000 and 2012. No more than 2 s and 1 s were needed to traverse the daily NDVI and MODIS, respectively; there were 27 and 26 tracking documents. Comparing Figure 13a-d, the time cost per tracking document was 57.186 ms (135,359/2367), 58.521 ms (9948.615/170), 50.915 ms (1374.692/27), and 36.980 ms (961.4615/26). The time costs were very similar, except for the daily MODIS, which may be because of random error (there are inherently unpredictable fluctuations when performing POST requests to Web services). With more tests, we would expect fewer random errors, given that testing the time cost of each of the 2367 documents 170 times yielded very similar results. The above time analysis reveals the following. In this experiment, when the maximum provenance depth was large (six) and there were hundreds to thousands of tracking documents (2367), the tracking time was short (under 140 s), and the time cost per document became stable as the number of tracking documents increased (50-60 ms). Additionally, the execution time grew linearly with the number of tracked documents, as shown in Figure 13a.
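The per-document figures above are simple ratios of total traversal time to document count; a quick check of the arithmetic:

```python
# Per-document time cost = total traversal time (ms) / number of tracking
# documents, for the four product levels reported in Figure 13a-d.
cases = {
    "weekly VCI":  (135359.0, 2367),
    "weekly NDVI": (9948.615, 170),
    "daily NDVI":  (1374.692, 27),
    "daily MODIS": (961.4615, 26),
}
for name, (total_ms, docs) in cases.items():
    print(f"{name}: {total_ms / docs:.3f} ms/document")
```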
Space complexity is another aspect of the performance of this provenance model. The memory or storage required by the provenance information is determined by the size of the O&M document. Data entities are linked by a URI to reduce the size of the O&M document, as mentioned in Section 3.1; other information is recorded in the O&M document itself. The O&M document sizes for the tile MODIS, daily MODIS, daily NDVI, and weekly NDVI were approximately 3, 7, 5, and 5 KB, respectively. The maximum size, for the weekly VCI, was approximately 29 KB. Therefore, the O&M documents were not large, and the space requirement of this experiment was on the order of kilobytes.
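The document-size savings come from referencing data entities by URI instead of embedding them; a minimal sketch of such a reference element (element names and the data URI are illustrative, not the paper's encoding):

```python
import xml.etree.ElementTree as ET

# Illustrative sketch: an O&M result that points at a data entity by URI
# (via xlink:href) instead of embedding the payload, keeping the document
# on the order of kilobytes. Element names and the URI are placeholders.
XLINK = "http://www.w3.org/1999/xlink"

def result_with_uri(data_uri: str) -> str:
    result = ET.Element("result")
    ref = ET.SubElement(result, "dataRef")
    ref.set(f"{{{XLINK}}}href", data_uri)  # link, not an inline payload
    return ET.tostring(result, encoding="unicode")

xml = result_with_uri("http://example.org/data/weekly-vci-2012-w19.tif")
print(len(xml), "bytes for the reference element")
```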

Discussion and Conclusions
Tracking provenance information is an important and challenging issue for remote sensing observations in the Sensor Web. This paper proposed a method for representing and tracking provenance in the Sensor Web, focusing on four objects that are generally considered in remote sensing applications. We conducted an experiment to test the representation and tracking of the provenance objects (sensors, processing, data, and services) for vegetation conditions represented by the NDVI and VCI. Our conclusions can be summarized as follows.
 The proposed provenance representation model is a Sensor Web, domain-specific model. Compared with provenance studies of database, workflow, Web, international specification, and distributed system methods, the work of this paper mainly focused on a provenance representation that can be integrated with the Sensor Web specifications. The provenance model was integrated into the Sensor Web specifications without affecting their structures, semantic relationships, or framework. We proposed a provenance model and a tracking approach, but did not consider the implementation, which is left to the developer.
 The designed provenance method can represent and track provenance information for remote sensing observations in a Sensor Web enabled environment. We conducted an experiment to test the representation and tracking in terms of the sensor, processing, data, and service objects, considering vegetation conditions represented by the NDVI and VCI for May from 2000 to 2012.
 Although the performance of the provenance method is associated with its implementation, we can consider the time and space complexities in this experiment. The time cost of tracking several depths and hundreds or thousands of documents was on the order of tens to hundreds of seconds. We analyzed the performance of our experiment based on six provenance depths. The average tracking time per document ranged from 50 to 60 ms, and the size of the O&M documents was on the order of 10 KB. Numerous remotely sensed observations may be processed up to a dozen times in this application. The execution time grew linearly with the number of tracked documents (Figure 13a), and the average time cost per tracking document was not significantly affected (Figure 13a-d). Therefore, we can deduce that the execution time and required storage increase linearly.
 The proposed framework can be applied in other environments. Although the provenance model is based on the Sensor Web framework, it may be extended to record provenance information for Web services. If the processed observations are described with O&M, and the parent O&M is tracked with an xlink:href in omp:ParentOM without sending a GET/POST request, the provenance information can still be tracked. If an O&M document is used to describe the observation's metadata, the provenance information can also be recorded; the service information should then be set to null. However, the procedure information should be described in more detail to explain how the observations were handled, which may increase the scope of this model.
Future work will focus on optimizing the performance of this framework, testing it with hundreds of tracking depths and thousands of tracking documents, applying it on a larger scale, and considering a security strategy.

Figure 2 .
Figure 2. Formalized description of the provenance model extended from PROV-DM.
information, we need to know: what are the HSOs for a CSO; what are the HSOs for a CSO at a specified time point or time interval; to which sensors does a CSO refer; and what processing and services were used to produce a certain CSO? To answer these questions, we designed an abstract algorithm and four implementations for tracking Pdata, Pprocessing, Psensor, and Pservice. The abstract steps of Algorithm 1 are: input the provenance data and the required provenance object, track the provenance, obtain the nearest historical provenance information, validate the provenance information, and return the provenance result, as shown in Figure 6a. To better explain Algorithm 1, we used a pseudofunction to describe the abstract process, as shown in Figure 7b. Algorithms 2-5 are implementations of Algorithm 1, so they have the same steps. The specific implementations of Algorithms 2-5 are as follows.
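The abstract process above can be sketched in Python. The function and field names here (track, get4, parent) are illustrative stand-ins, not the paper's implementation; observations are modeled as simple dictionaries rather than O&M documents:

```python
# Illustrative sketch of the abstract tracking algorithm (Algorithm 1):
# walk back through parent observations until the requested provenance
# object (sensor, data, processing, or service) is found, validating the
# observation time against the sensor's initial time t0 at each step.

def get4(obs):
    """Extract the (sensor, data, processing, service) tuple from an
    observation; absent objects are simply omitted."""
    keys = ("sensor", "data", "processing", "service")
    return {k: obs[k] for k in keys if k in obs}

def track(observation, indicator, t0):
    """Return the nearest historical provenance object named by
    `indicator`, or raise if it cannot exist."""
    current = observation
    while current is not None:
        if current["time"] < t0:
            # Earlier than the sensor's initial observation: not found.
            raise LookupError("provenance object not found")
        objects = get4(current)
        if indicator in objects:
            return objects[indicator]   # nearest historical provenance
        current = current.get("parent") # follow the parent link upward
    raise LookupError("provenance object not found")

chain = {"time": 5, "processing": "weekly VCI",
         "parent": {"time": 3, "sensor": "MODIS", "parent": None}}
print(track(chain, "sensor", t0=0))  # MODIS
```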

Algorithm 3: Tracking processing objects
Input: current status observation SO_tn; required provenance information indicator PIndicator for P_processing at tm
Output: processing object P_processing at tm
Annotation: Algorithm 3 is an implementation of Algorithm 1. The steps are the same as in Algorithm 1, and we define two specific functions as follows. The time-matching function returns true if the times match; otherwise, it returns false. If the child element of om:OM_Observation/om:procedure is SimpleProcess, P_processing is assigned the SimpleProcess information. If the child element of om:OM_Observation/om:procedure is AggregateProcess, P_processing is assigned the AggregateProcess information. Otherwise, we return an error message. The process service is obtained using the XPath om:OM_Observation/om:result/omp:ProcessService, and the omp:ProcessService information is added to P_processing. Return P_processing.

Algorithm 4: Tracking service objects
Input: current status observation SO_tn; required provenance information indicator PIndicator for P_service at tm
Output: service object P_service at tm
Annotation: Algorithm 4 is an implementation of Algorithm 1. The steps of Algorithm 4 are the same as in Algorithm 1, and we define two specific functions. The time-matching function returns true if the times match; otherwise, it returns false. The om:result information of SO_tm is obtained using the XPath om:OM_Observation/om:result. If the child elements include omp:ParentOM, P_service includes the omp:ParentOM information. If the child elements include omp:ProcessService, P_service includes the omp:ProcessService information. If the child elements include omp:CurrentOMService, P_service includes the omp:CurrentOMService information. Return P_service.

Algorithm 5: Tracking sensor objects
Input: current status observation SO_tn; required provenance information indicator PIndicator for P_sensor at tm
Annotation: Algorithm 5 is an implementation of Algorithm 1. The steps of Algorithm 5 are the same as in Algorithm 1, and we define two specific functions.
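The SimpleProcess/AggregateProcess branch of Algorithm 3 can be illustrated with a minimal namespace-aware lookup on the om:procedure element. The function name is an assumption, and while the om and sml namespace URIs below follow the OGC 2.0 conventions, they should be treated as placeholders rather than the paper's exact encoding:

```python
import xml.etree.ElementTree as ET

# Illustrative sketch of Algorithm 3's procedure-type branch: inspect the
# child of om:OM_Observation/om:procedure and dispatch on SimpleProcess
# versus AggregateProcess. Namespace URIs are assumptions.
NS = {"om": "http://www.opengis.net/om/2.0",
      "sml": "http://www.opengis.net/sensorml/2.0"}

def processing_object(om_xml: str):
    root = ET.fromstring(om_xml)          # the om:OM_Observation element
    proc = root.find("om:procedure", NS)
    if proc is None:
        raise ValueError("no om:procedure element")
    child = proc.find("sml:SimpleProcess", NS)
    if child is not None:
        return ("SimpleProcess", child)   # assign SimpleProcess information
    child = proc.find("sml:AggregateProcess", NS)
    if child is not None:
        return ("AggregateProcess", child)
    raise ValueError("unsupported procedure type")

doc = """<om:OM_Observation xmlns:om="http://www.opengis.net/om/2.0"
              xmlns:sml="http://www.opengis.net/sensorml/2.0">
  <om:procedure><sml:SimpleProcess/></om:procedure>
</om:OM_Observation>"""
print(processing_object(doc)[0])  # SimpleProcess
```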
There were 25 MODIS tile Hierarchical Data File (HDF) files for each item in the dataset.

Figure 8 .
Figure 8. The six processing steps of the experiment.

Figure 11 .
Figure 11. Part of the O&M document for the weekly VCI.
The inputs for the 19th weekly VCI in 2012 are the 19th weekly NDVI from 2012 and the 7-day daily NDVI for each year from 2000 to 2012. The result is shown in Figure 11 (one omp:ParentOM element recorded the weekly NDVI, and 12 omp:ParentOM elements recorded the 7-day daily NDVI from 2000 to 2011).
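The omp:ParentOM links in Figure 11 can be followed programmatically by collecting their xlink:href targets; a minimal sketch, in which the omp namespace URI and the href values are placeholders rather than the specification's actual identifiers:

```python
import xml.etree.ElementTree as ET

# Illustrative sketch: collect the xlink:href targets of all omp:ParentOM
# elements in an O&M result, as recorded for the weekly VCI in Figure 11.
# The omp namespace URI and the href values are placeholders.
NS = {"omp": "http://example.org/omp",
      "xlink": "http://www.w3.org/1999/xlink"}

def parent_refs(om_xml: str):
    root = ET.fromstring(om_xml)
    return [e.get(f"{{{NS['xlink']}}}href")
            for e in root.iter(f"{{{NS['omp']}}}ParentOM")]

doc = """<result xmlns:omp="http://example.org/omp"
                 xmlns:xlink="http://www.w3.org/1999/xlink">
  <omp:ParentOM xlink:href="urn:weekly-ndvi-2012-w19"/>
  <omp:ParentOM xlink:href="urn:daily-ndvi-2000-w19"/>
</result>"""
print(parent_refs(doc))
```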

3.2.2. Tracking Objects with the Tracking Algorithm from O&M
Consider the 19th (8 May 2012 to 14 May 2012) weekly VCI for 2012 as an example. We will use it to demonstrate how Algorithms 2-5 track the provenance objects. The statistics of the provenance information are shown in Figure 12, and the three provenance objects are as follows (the data are described in Section 3.2.1).

Figure 12 .
Figure 12. Statistics of the provenance information.

Table 1. Relationships between Sensor Web services and O&M.

Algorithm 1: Abstract tracking algorithm
Input: current status observation SO_tn; required provenance information indicator PIndicator
Output: provenance result POutput indicated by PIndicator
Use: the provenance information of SO_tn at tn; get4(SO_tn) obtains the set of four objects, P_objectset at tn, from SO_tn.
Validate whether the current observation time tn occurs before the initial time of the sensor t0 (tn < t0). If this is true, the correct result cannot be found, even at the initial observation, and we return a "not found" exception. If it is false, we proceed to the next step. Compare the provenance objects from PIndicator with the objects in the tracks.
Annotation: Algorithm 2 is an implementation of Algorithm 1. The steps are the same as in Algorithm 1, and we define the necessary functions as follows. For get4(SO_tn), SO_tn is an O&M document; the O&M embeds the four objects of the provenance model, and get4 thus returns them for SO_tn.