5s-crate

TRE-FX Five Safes RO-Crate profile

Five Safes RO-Crate (https://w3id.org/5s-crate/) specifies a profile of RO-Crate for the purpose of workflow execution in a distributed trusted research environment (TRE).

The profile is being developed by the TRE-FX project.

Cite as

Stian Soiland-Reyes, Stuart Wheater, Thomas Giles, Carole Goble, Philip Quinlan (2023):
TRE-FX Technical Documentation - Five Safes RO-crate.
Zenodo
https://doi.org/10.5281/zenodo.10376350

Profile releases

Current release:

Next draft in progress:

Archived drafts:

About Five Safes RO-Crate

A Five Safes RO-Crate represents a unit of computational, workflow-based access to sensitive information that is managed in accordance with the Five Safes framework, a well-established model for managing access to confidential or sensitive data. The aim is to enable trusted workflow execution in a Trusted Research Environment (TRE), from an authenticated workflow run request, through approval and review processes, to a completed workflow execution. The profile has been developed for the TRE-FX implementation of workflow execution in a distributed TRE. The Five Safes RO-Crate is a specialised profile of RO-Crate, in which encapsulated elements and metadata provide the context needed to evaluate the safety and appropriateness of both data access and analysis.

Note: A crate that is compliant with the Five Safes RO-Crate profile is not inherently safe - its role is to streamline the flow of information by standardising the metadata it collects and carries. That metadata is used to support the Five Safes processes of the TREs and their issuing/receiving clients. A Five Safes RO-Crate operates in a pre-determined and controlled context: (i) the workflow that is to be executed within the TRE to answer a request has already been pre-approved by the TRE and will be executed in a secure deployment, and (ii) the services that implement the crate phases are secure and adhere to the Five Safes.

RO-Crate is a community-based specification for packaging and describing research outputs, based on FAIR linked data standards. The approach has been adopted by a variety of research domains with specialisation in different profiles to combine generic and domain-specific metadata. Recently, the Workflow Run Crate profiles have been developed and are being implemented by more than 6 workflow engines including CWL and Galaxy.

The Five Safes model provides a structured approach to managing confidential or sensitive data through five dimensions: Safe Data, Safe Projects, Safe People, Safe Settings, and Safe Outputs. For data controllers operating Trusted Research Environments (TREs), ensuring compliance with data governance and legal frameworks is critical, especially in the context of federated analytics.

The Five Safes RO-Crate aims to provide a mechanism to encapsulate data, workflows, and provenance with extensible metadata in a standardised, compliant package, and hence improve the flow of the metadata, queries and results necessary to streamline TRE operations, enable rigorous compliance, and enhance data integrity and security.

The initial crate, which carries a workflow run request, references a pre-approved workflow and project details for manual and automated assessment according to the TRE’s agreement policy for the sensitive dataset. The crate then goes through multiple phases internal to the TRE, including validation, sign-off, workflow execution and disclosure control. After these phases the crate also conforms to the Workflow Run Crate profile for return to the user, and a derived public version (possibly redacted) can be published in data use registers to document the analysis.
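As a rough illustration of what such an initial request crate might look like, the following Python sketch builds a minimal `ro-crate-metadata.json` structure. The outer layout (the `@context`, the metadata descriptor and the root dataset) follows RO-Crate 1.1. Everything specific to the request is an assumption made for this example: the use of `mainEntity` and `sourceOrganization`, the placeholder identifiers, and the unversioned profile URI taken from this README. Consult the profile releases for the actual requirements.

```python
import json

# Unversioned profile URI as given in this README; a released
# profile version may require a versioned URI instead (assumption).
PROFILE_URI = "https://w3id.org/5s-crate/"


def build_request_crate(workflow_id: str, project_id: str) -> dict:
    """Return a minimal, illustrative metadata document for a run request."""
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                # RO-Crate 1.1 metadata descriptor
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {
                # Root dataset: declares the profile and points at the
                # pre-approved workflow and the requesting project.
                # Property choices here are illustrative assumptions.
                "@id": "./",
                "@type": "Dataset",
                "conformsTo": {"@id": PROFILE_URI},
                "mainEntity": {"@id": workflow_id},
                "sourceOrganization": {"@id": project_id},
            },
            {
                "@id": workflow_id,
                "@type": "Dataset",
                "name": "Pre-approved workflow (placeholder)",
            },
            {
                "@id": project_id,
                "@type": "Project",
                "name": "Requesting project (placeholder)",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    crate = build_request_crate("https://example.org/workflows/1", "#project-1")
    print(json.dumps(crate, indent=2))
```

A real request would carry considerably more detail (for example the requesting person and the workflow inputs); this sketch only shows how the crate references the workflow and the project.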

Phases of Five Safes RO-Crate

The phases a Five Safes RO-Crate is expected to go through include:

Phases of RO-Crate: Check outside TRE, Validation, Workflow Retrieval, Sign-off, Workflow Execution, Disclosure, Publishing outside TRE, Receiving outside TRE

Each phase is recorded within the crate with timestamps and responsible agents, even within the TRE. Phases can fail: for example, a failed Disclosure Phase will return a crate with no output data. Some phases (e.g. the Validation Phase) may be performed either inside or outside the TRE, depending on local requirements.
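The recording of phases described above can be sketched in Python as follows. This is a minimal illustration, not the profile's normative structure: it assumes each phase is stored as a schema.org-style action with `startTime`, `endTime`, `agent` and `actionStatus`, and the helper names `record_phase` and `releasable_outputs` are hypothetical.

```python
from datetime import datetime, timezone

# schema.org action status values
COMPLETED = "http://schema.org/CompletedActionStatus"
FAILED = "http://schema.org/FailedActionStatus"


def record_phase(name, agent_id, succeeded, started, ended):
    """Build an illustrative phase record with timestamps and an agent."""
    return {
        "@id": "#" + name.lower().replace(" ", "-"),
        "@type": "AssessAction",  # assumed type, chosen for illustration
        "name": name,
        "startTime": started.isoformat(),
        "endTime": ended.isoformat(),
        "agent": {"@id": agent_id},
        "actionStatus": {"@id": COMPLETED if succeeded else FAILED},
    }


def releasable_outputs(phases, outputs):
    """Return the outputs, or an empty list if the Disclosure phase failed."""
    for phase in phases:
        if phase["name"] == "Disclosure" and phase["actionStatus"]["@id"] == FAILED:
            return []
    return list(outputs)


if __name__ == "__main__":
    # Hypothetical agents and times, for illustration only.
    start = datetime(2023, 12, 1, 10, 0, tzinfo=timezone.utc)
    end = datetime(2023, 12, 1, 10, 5, tzinfo=timezone.utc)
    phases = [
        record_phase("Validation", "#validator", True, start, end),
        record_phase("Disclosure", "#disclosure-officer", False, start, end),
    ]
    print(releasable_outputs(phases, ["results.csv"]))
```

In this sketch the failed phase still appears in the crate with its agent and timestamps, so the returned crate documents why no output data was released.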

Licence

Feedback

For feedback or suggested changes, feel free to raise a GitHub Issue or Pull Request: https://github.com/trefx/5s-crate/