Open Access Article
Informatics 2018, 5(1), 11; https://doi.org/10.3390/informatics5010011

LabelFlow Framework for Annotating Workflow Provenance

1 Luxembourg Centre for Systems Biomedicine, University of Luxembourg, L 4365 Esch-sur-Alzette, Luxembourg
2 LAMSADE Research Lab, Université Paris Dauphine, UMR CNRS 7243 Paris, France
3 Department of Population Health Sciences, King’s College London, London SE1 1UL, UK
4 School of Computer Science, University of Manchester, Manchester M13 9PL, UK
* Author to whom correspondence should be addressed.
Received: 28 November 2017 / Revised: 4 February 2018 / Accepted: 21 February 2018 / Published: 23 February 2018
(This article belongs to the Special Issue Using Computational Provenance)
Abstract

Scientists routinely analyse and share data for others to use. Successful data (re)use relies on having metadata that describes the context in which the data were analysed. In many disciplines the creation of contextual metadata is referred to as reporting. One method of implementing analyses is with workflows. A stand-out feature of workflows is their ability to record provenance from executions. Provenance is useful when analyses are executed with changing parameters (changing contexts) and results need to be traced to their respective parameters. In this paper we investigate whether provenance can be exploited to support reporting. Specifically, we outline a case study based on a real-world workflow and a set of reporting queries. We observe that provenance, as collected from workflow executions, is of limited use for reporting, as it supports the queries only partially. We identify that this is due to the generic nature of provenance, namely its lack of domain-specific contextual metadata. We observe that the required information is available in implicit form, embedded in data. We describe LabelFlow, a framework comprising four Labelling Operators for decorating provenance with domain-specific Labels. LabelFlow can be instantiated for a domain by plugging in domain-specific metadata extractors. We provide a tool that takes a workflow as input and produces as output a Labelling Pipeline for that workflow, composed of Labelling Operators. We revisit the case study and show how Labels provide a more complete implementation of reporting queries.
Keywords: workflow; provenance; domain-specific annotation
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
MDPI and ACS Style

Alper, P.; Belhajjame, K.; Curcin, V.; Goble, C.A. LabelFlow Framework for Annotating Workflow Provenance. Informatics 2018, 5, 11.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Informatics EISSN 2227-9709 Published by MDPI AG, Basel, Switzerland