cwlprov

Profile for provenance research object of a CWL workflow run.

Cite as

Peer-reviewed paper about CWLProv:

Farah Zaib Khan, Stian Soiland-Reyes, Richard O Sinnott, Andrew Lonie, Carole Goble, Michael R Crusoe (2019):
Sharing interoperable workflow provenance: A review of best practices and their practical application in CWLProv.
GigaScience 8(11):giz095 https://doi.org/10.1093/gigascience/giz095

Citing the profile itself: Profile DOI

Quicklinks

Overview

CWLProv is an informal profile that defines how to record the provenance of a workflow run (typically CWL), captured as a research object using Linked Data standards.

There are three parts to this profile:

This repository may later also include formal profiles for computational validation, e.g. BagIt profile of included resources, ShEx for manifest content, and PROV Template to document PROV structures.

The CWLProv white paper describes the background and motivation for this profile. For the avoidance of doubt, from CWLProv 0.3.0 this GitHub repository is the authoritative source of the CWLProv specifications.

Known implementations

License

This repository is distributed under Apache License, version 2.0.

See the file LICENSE.txt for details, and NOTICE for required notices.

Contributing

CWLProv is maintained at https://github.com/common-workflow-language/cwlprov/

Feel free to raise an issue or a pull request to contribute to CWLProv. Contributions are assumed to be covered by section 5 of the Apache License.

You may also want to contribute a corresponding issue or pull request in the cwltool reference implementation, in particular cwltool/provenance.py and documentation on cwltool --provenance support.

For an informal CWLProv discussion with other developers, join the (relatively quiet) Gitter room common-workflow-language/cwlprov, or the (busier) common-workflow-language/common-workflow-language.

Code of Conduct

The CWL Project is dedicated to providing a harassment-free experience for everyone, regardless of gender, gender identity and expression, sexual orientation, disability, physical appearance, body size, age, race, or religion. We do not tolerate harassment of participants in any form. This code of conduct applies to all CWL Project spaces, including the Google Group, the Gitter chat room, the Google Hangouts chats, both online and off. Anyone who violates this code of conduct may be sanctioned or expelled from these spaces at the discretion of the leadership team.

For more details, see our Code of Conduct.

Requirements Language

The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in documents of this repository are to be interpreted as described in RFC 2119.

Versions

CWLProv is versioned using Semantic Versioning, following the pattern MAJOR.MINOR.PATCH (e.g. 1.2.0).

To determine version compatibility we consider the packaging of a CWLProv RO as a kind of "API". Examples of changes to CWLProv:

  • Major version change: Removal of resource type, change of format of PROV, removing annotations, changing namespaces, removing PROV statement patterns
  • Minor version change: Adding other resources, adding annotations, additional properties, changing entity identifier scheme, change of file paths in RO, minor change of underlying syntax and package version, adding/augmenting PROV statement patterns, conformance to PROV constraints
  • Patch version change: Fixing syntactical typos (e.g. invalid or inefficient JSON-LD), inconsistencies in textual language, adding inferred PROV statements

This means that consumers of CWLProv can make the following assumptions about backward and forward compatibility:

  • Major: Unsupported major versions can't be safely parsed
  • Minor: Can safely parse (but not reproduce) newer versions. Parsing older versions is safe if later CWLProv additions are handled as optional.
  • Patch: Differences can usually be safely ignored
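
The rules above can be sketched as a small compatibility check. This is a minimal illustration and not part of the profile: the names `parse_version` and `compatibility`, and the returned labels, are hypothetical and chosen only for this example. It compares the version a consumer supports against the version an RO declares.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse a MAJOR.MINOR.PATCH string such as "0.6.0"."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {text!r}")
    major, minor, patch = (int(p) for p in parts)
    return Version(major, minor, patch)


def compatibility(supported: str, declared: str) -> str:
    """Classify a declared CWLProv version against the supported one.

    The labels are hypothetical names for the cases in the list above.
    """
    s, d = parse_version(supported), parse_version(declared)
    if s.major != d.major:
        # Major: unsupported major versions can't be safely parsed.
        return "unsupported"
    if d.minor > s.minor:
        # Minor, newer: safe to parse, but not to reproduce.
        return "parse-only"
    if d.minor < s.minor:
        # Minor, older: safe if later additions are handled as optional.
        return "parse-with-optional"
    # Same major and minor: patch differences can usually be ignored.
    return "compatible"
```

For example, a consumer written against 0.6.0 would classify an RO declaring 0.7.0 as `parse-only` and one declaring 1.0.0 as `unsupported`. The sketch does not treat major version 0 specially, so a real consumer may want to be stricter while the profile is still at 0.x.y.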

Unless a patch version affects the output, the declared profile SHOULD have patch version 0 even if the code was implemented against a later CWLProv patch version.
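
As a rough sketch of this rule, an implementation could derive the version it declares from the version it was written against. The function name `declared_version` and the `patch_affects_output` flag are hypothetical; whether a patch affects the output is a judgment the implementer has to make.

```python
def declared_version(implemented: str, patch_affects_output: bool = False) -> str:
    """Return the CWLProv version an RO should declare.

    The patch component is reset to 0 unless the patch changes the output.
    """
    parts = implemented.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {implemented!r}")
    # int() validates that each component is numeric.
    major, minor, patch = (int(p) for p in parts)
    if not patch_affects_output:
        patch = 0
    return f"{major}.{minor}.{patch}"
```

With this sketch, code implemented against 0.6.3 would declare 0.6.0 by default, and 0.6.3 only when the flag is set.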

Tip: You may spot that a change of file paths is classified as minor. That is because paths can be found dynamically by following links from the manifest, its annotations and the PROV traces. This is similar to REST principles, where URI templates should not be assumed, but followed from links.

The current version of CWLProv has major version 0, indicating that disruptive changes may occur before the profile stabilizes at 1.x.y.

Each CWLProv version has a w3id.org permalink that SHOULD be declared inside the RO to indicate its conformance.