Permalinks

Stian Soiland-Reyes edited this page Jul 12, 2018 · 20 revisions

A permalink is a permanent, static hyperlink to a particular web resource. The permalink URL might differ from the page URL you see in the browser address bar. When provided, you should use the permalink for:

  • Bookmarking
  • Citing, e.g. from an academic paper
  • Linking, e.g. from documentation

In CWL Viewer, permalinks identify a workflow at one particular version (git commit). The normal page URLs in the CWL Viewer can be volatile, as they reference external git repositories and branches that change over time. A page URL that works today may therefore not work next week (if the CWL file is deleted in git), or may show quite a different workflow (if the CWL file has changed).

At the bottom of a workflow page are buttons ([turtle] [png] etc.) for retrieving the workflow metadata or diagram. These use format-specific permalinks to indicate the particular version the metadata or diagram came from. This means these permalinks can be used as "retrievedFrom" references.

There are some cases in the CWL Viewer where you would rather link to the page URL instead of the permalink:

  • From a GitHub README (e.g. to embed the most current diagram)
  • If you are communicating with someone about rapid changes to a workflow (e.g. GitHub pull request, Gitter chat)
  • Linking from your lab homepage, as you want the latest info shown. You probably want to hold off on that until your workflow has stabilized (and thus also its filename within the git repository).

How does it work?

Our permalinks use HTTP redirects from https://w3id.org/, so that they keep working if the CWL Viewer at some point moves from https://view.commonwl.org/, or if permalinks are minted in other instances of the viewer.

The CWL Viewer loops through all its known workflows to find the matching commit (and thus its git repo). Note that the implementation does not currently inspect the history of older git commits, so you have to click the permalink once to force that particular commit to be added to the list of known workflows, in case the git repository later moves on.

When maintaining the CWL Viewer we do our best to ensure that the workflows stay registered. If this for some reason fails, you can also navigate to the CWL file in that particular commit in GitHub/GitLab and re-register the historic workflow with CWL Viewer.

Permalink URI scheme

The CWL permalink URI scheme is composed like this:

https://w3id.org/cwl/view/{scheme}/{commit}/{path}{?part=fragment}
  • https://w3id.org/cwl/view/ - fixed prefix at the permalink service https://w3id.org/ (/cwl is our namespace)
  • {scheme} - source code management protocol; currently only git is supported:
    • {commit} - full git commit sha1 id (no branches or short commits allowed)
    • {path} - relative path to .cwl file within a checkout of that git commit
    • {?part=fragment} - an optional fragment within packed workflows, e.g. id="main" within the file is ?part=main in the permalink

Any git permalinks are resolved using https://view.commonwl.org/git, which (if it knows about that particular git commit) will content-negotiate to provide various representations. Anyone can mint these permalinks for .cwl files for a given commit in any public or private git repository, provided no uncommitted files or git submodules are involved.
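
The scheme above can be sketched as a small shell function. This is only an illustration of how the pieces fit together, not an official tool: the function name cwl_permalink is hypothetical, and the 40-hex-digit check is an assumption derived from the "full git commit sha1 id" requirement.

```shell
# Sketch: compose a CWL Viewer permalink from a full commit id and a path.
# cwl_permalink is a hypothetical helper name, not part of any CWL tool.
cwl_permalink() {
    commit=$1    # full 40-character git commit SHA-1
    path=$2      # path to the .cwl file, relative to the repository root
    part=${3:-}  # optional fragment id inside a packed workflow

    # Reject branch names and short commits: require exactly 40 hex digits.
    if ! printf '%s\n' "$commit" | grep -Eq '^[0-9a-f]{40}$'; then
        echo "error: need a full 40-character commit SHA-1" >&2
        return 1
    fi

    # Fixed prefix, then scheme (git), commit and path (leading "/" dropped).
    url="https://w3id.org/cwl/view/git/${commit}/${path#/}"
    if [ -n "$part" ]; then
        url="${url}?part=${part}"
    fi
    printf '%s\n' "$url"
}

cwl_permalink 08308b3e425e952445d669d6d8e429018d02037b workflows/hello/hello.cwl
```

Inside a git checkout, the commit argument could be taken from git rev-parse HEAD, which prints the full commit id of the current checkout. A permalink built this way only resolves once the CWL Viewer knows about that commit.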

Example

Example:

..which correspond to:

Content negotiation

If the git commit behind the permalink is recognized by the public CWL Viewer, then resolving it can perform content negotiation to various media types, either by using the HTTP Accept header, or by adding a ?format= parameter to the URL (e.g. to force a particular representation for browsers).

Note that some of these may perform HTTP redirects, so use curl -L or equivalent.

| Accept: mediatype | ?format= | Description |
|---|---|---|
| text/html | html | Redirects to browser view of the workflow |
| application/json | json (but see #166) | JSON according to the CWL Viewer API |
| text/turtle | turtle | RDF Turtle format as from cwltool --print-rdf |
| application/ld+json | jsonld | RDF JSON-LD format |
| application/rdf+xml | rdfxml | RDF/XML (not recommended) |
| image/svg+xml | svg | Workflow diagram in vector-based SVG format (recommended) |
| image/png | png | Workflow diagram in bitmap PNG format |
| application/vnd.wf4ever.robundle+zip | ro | Research Object Bundle containing workflow, visualizations, manifest and annotations |
| application/zip | zip | (same as above) |
| text/x-yaml | yaml | Redirects to YAML view of the underlying CWL file |
| application/octet-stream | raw | (same as above) |

For example, using curl to request RDF in Turtle format:

curl -L -H "Accept: text/turtle" https://w3id.org/cwl/view/git/08308b3e425e952445d669d6d8e429018d02037b/workflows/hello/hello.cwl

Or to download the image representation to the file workflow.svg:

curl -o workflow.svg -L -H "Accept: image/svg+xml" https://w3id.org/cwl/view/git/08308b3e425e952445d669d6d8e429018d02037b/workflows/hello/hello.cwl
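
The ?format= alternative to the Accept header can be sketched in the same way. The helper name with_format is hypothetical, and joining with & when the URL already carries a query such as ?part= is an assumption about how the two parameters combine.

```shell
# Sketch: request a representation via ?format= instead of an Accept header.
# with_format is a hypothetical helper name used only for illustration.
with_format() {
    url=$1
    format=$2   # one of the ?format= values, e.g. svg, turtle, raw
    case $url in
        *\?*) printf '%s&format=%s\n' "$url" "$format" ;;  # query already present (assumed to join with &)
        *)    printf '%s?format=%s\n' "$url" "$format" ;;  # no query yet
    esac
}

permalink=https://w3id.org/cwl/view/git/08308b3e425e952445d669d6d8e429018d02037b/workflows/hello/hello.cwl
with_format "$permalink" svg

# To download the diagram (needs network access), follow redirects with -L:
#   curl -L -o workflow.svg "$(with_format "$permalink" svg)"
```

This form is mainly useful where setting an Accept header is awkward, such as links opened in a browser.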

Raw

If you use curl or wget without any content negotiation, the server will default to the application/octet-stream redirect to the raw YAML file in CWL format. If you are accessing the permalink with a browser it should redirect to the text/html view unless you force it with ?format=, e.g. https://w3id.org/cwl/view/git/08308b3e425e952445d669d6d8e429018d02037b/workflows/hello/hello.cwl?format=raw

In theory, ?format=raw should work for any file in that git repository (but see #167).

Note that the raw redirects only work for repositories hosted at https://github.com/ or https://gitlab.com/

Packed workflows

Packed workflows (CWL files with more than one Workflow) must be disambiguated with the parameter ?part=fragment, where fragment corresponds to the #fragment identifier as traditionally exposed by --print-rdf and when running with cwltool.

Note that this means there is a mismatch between the URIs returned by the RDF representations (which use #fragment without ?part) and the access URIs.

For packed workflows the CWL Viewer is unable to resolve permalinks with ?part except for ?format=raw.
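
To bridge the mismatch between RDF URIs and access URIs, a #fragment URI can be rewritten into its ?part= form. The sketch below is plain string manipulation; the helper name fragment_to_part and the example file name packed.cwl are hypothetical.

```shell
# Sketch: turn an RDF-style URI ending in #fragment into an access URI with ?part=.
# fragment_to_part is a hypothetical helper name.
fragment_to_part() {
    uri=$1
    case $uri in
        *'#'*)
            base=${uri%%\#*}      # everything before the first '#'
            fragment=${uri#*\#}   # everything after the first '#'
            printf '%s?part=%s\n' "$base" "$fragment"
            ;;
        *)
            printf '%s\n' "$uri"  # no fragment: print the URI unchanged
            ;;
    esac
}

# packed.cwl is a made-up file name for illustration.
fragment_to_part 'https://w3id.org/cwl/view/git/08308b3e425e952445d669d6d8e429018d02037b/packed.cwl#main'
```

Given the limitation noted above, the resulting link may only resolve when ?format=raw is also added.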

References