ShapeUML: RDF Constraints Visualization based on UML

Unofficial Draft

Latest editor's draft:
https://w3id.org/imec/unshacled/spec/shape-uml/
Editors:
(Ghent University — IDLab — imec)
Anastasia Dimou (Ghent University — IDLab — imec)
Author:
(Ghent University — IDLab — imec)
This Version:
https://w3id.org/imec/unshacled/spec/shape-uml/20210118/
Previous Version:
https://w3id.org/imec/unshacled/spec/shape-uml/20200815/

Abstract

This document defines ShapeUML, a visual notation for RDF constraints. It specifies the visualization primitives to represent constraints to users.

This document is the result of ongoing research.

1. Introduction

Knowledge Graphs defined with the Resource Description Framework [RDF] consist of vocabulary terms and instance data, both described using the graph model of RDF. Several constraint languages exist to define conditions on such RDF graphs. One such language is the W3C-recommended Shapes Constraint Language [SHACL], which is itself represented using RDF. Users rely on the different types of constraints offered by constraint languages to define conditions on RDF graphs.

With respect to visual notations, the Unified Modeling Language [UML] is widely used in many disciplines. A specific UML profile for RDF [ODM] states how RDF classes and properties are mapped to UML classes. ShapeUML follows this profile but also adapts graphical symbols to the needs of RDF constraints.

2. Terminology

Throughout the document, the following terminology is used.

2.1 General

(anonymous) entity
An entity is something that exists as a distinct, independent, or self-contained unit. Examples are people, companies, buildings and so on. To address an entity uniquely, it needs a unique name. Therefore, entities use HTTP URIs. However, when an entity does not have a URI, we call it an anonymous entity [MapVOWL].
attribute
More information is provided about an entity by adding attributes to it. Examples are the name of a person, the VAT number of a company and so on [MapVOWL].
relationship
An entity and its attributes are connected with each other using relationships. A relationship describes how an attribute relates to its entity [MapVOWL].
constraints
A constraint is a condition on data which should be satisfied. Several types of constraints exist, e.g. a datatype constraint on a property restricts the values of that property to a specific datatype, and a cardinality constraint on a property restricts the number of values of that property [SHACL-core-constraints].
data shape
A data shape is a set of conditions on RDF data [SHACL-abstract]. It provides context for constraints, e.g. the same datatype constraint can be used in different data shapes, which correspond to different contexts.
node shape
A node shape is a specific data shape which represents conditions on nodes, i.e. subjects and objects of triples [SHACL].
property shape
A property shape is a specific data shape which represents conditions on properties and their values, i.e. predicates and related objects of triples [SHACL].
closed data shape
A closed data shape allows nodes to have values only for the properties explicitly enumerated via its property shapes [SHACL].
severity
Each data shape has a severity which qualifies the violation of the stated conditions, i.e. it categorizes validation results [SHACL]. By default, a data shape has the severity "violation", but other severities can be defined. The SHACL core specifies the three severities "violation", "warning" and "information".
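
The terms above can be illustrated with a minimal Python sketch. It is not part of ShapeUML or SHACL: the class names `PropertyShape` and `NodeShape`, the `validate` function, the use of Python types as stand-ins for RDF datatypes, and the representation of an entity as a dictionary from property names to value lists are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PropertyShape:
    path: str                        # property the conditions apply to
    datatype: Optional[type] = None  # datatype constraint (Python type as stand-in)
    min_count: int = 0               # cardinality constraint, lower bound
    max_count: Optional[int] = None  # cardinality constraint, upper bound (None = unbounded)
    severity: str = "violation"      # default severity


@dataclass
class NodeShape:
    name: str
    properties: list = field(default_factory=list)
    closed: bool = False             # closed data shape: only enumerated properties allowed


def validate(entity: dict, shape: NodeShape) -> list:
    """Return (severity, message) pairs for every condition that is not met."""
    results = []
    for ps in shape.properties:
        values = entity.get(ps.path, [])
        if len(values) < ps.min_count:
            results.append((ps.severity, f"{ps.path}: fewer than {ps.min_count} values"))
        if ps.max_count is not None and len(values) > ps.max_count:
            results.append((ps.severity, f"{ps.path}: more than {ps.max_count} values"))
        if ps.datatype is not None:
            for v in values:
                if not isinstance(v, ps.datatype):
                    results.append(
                        (ps.severity, f"{ps.path}: {v!r} is not of type {ps.datatype.__name__}")
                    )
    if shape.closed:
        allowed = {ps.path for ps in shape.properties}
        for path in entity:
            if path not in allowed:
                results.append(("violation", f"{path}: not allowed by closed shape"))
    return results
```

For instance, a closed node shape with one property shape for `ex:fullName` (datatype `str`, exactly one value) reports nothing for `{"ex:fullName": ["Ada Lovelace"]}`, and reports a cardinality, a datatype and a closedness result for `{"ex:fullName": [42, "x"], "ex:age": [3]}`.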

2.2 Visualization

node
A node is a visual representation of either an entity or an attribute [MapVOWL].
edge
An edge is a line connecting two nodes. It represents the relationship between two entities or between one entity and one attribute [MapVOWL].
graph
A graph is a collection of nodes and edges [MapVOWL].

3. Graphical Primitives

Name Primitive Description
rectangle
  • data shape
solid line
  • relationship between a node and a property shape
dashed line
  • relationship between nodes and data shapes or property shapes and node shapes (not between node shapes and property shapes)
  • vertically subsuming individual relationships: one-to-many relationships
arrowhead
  • relationship direction
text
  • text labels, constraints and other textual information

4. Colors

4.1 General

canvas
  • Concrete color recommendation: #ffffff (white)
  • Description: Bright color with a good contrast to all other colors.
  • Application: canvas on which all other graph elements are shown
foreground
  • Concrete color recommendation: #000000 (black)
  • Description: Very dark color with a good contrast to all other colors.
  • Application: border of elements and edges
base color
  • Description: Color is not material to UML; we recommend no specific color, i.e. the same as the canvas color.
  • Application: standard fill color of elements

5. Constraints

First, we provide a mapping from semantic constructs of RDF constraint languages to visual elements of ShapeUML (see figures). Next, we explain splitting rules with respect to the visualization of property shapes. Finally, we provide a complete visual example.

5.1 Semantic constructs

The SHACL specification [SHACL] contains core constraints and other concepts relevant for data validation. We provide a mapping of these semantic constructs to visual representations in ShapeUML: Figure 1 gives the mapping for SHACL core constraints, and Figure 2 gives the mapping for other relevant SHACL concepts.

Figure 1 Semantic constructs of SHACL core constraints and their visual representation in ShapeUML.
Figure 2 Semantic constructs of validation-related SHACL concepts and their visual representation in ShapeUML.

5.2 Splitting rules

Node shapes can be reused via a directed dashed connection, and property shapes can be reused via directed solid connections. A property shape is visualized only once; its incoming solid connections, each with a property path as label and cardinalities on the arrowhead, are split. That is, a property shape which is reused n times is visualized only once but has n incoming solid edges, each with its own property path as label and cardinalities on the arrowhead.
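
The splitting rule can be sketched in Python as follows. This is an illustrative sketch only: the record types `Use` and `Edge`, the function `split`, and the `min..max` rendering of cardinalities (borrowed from UML multiplicity notation, with `*` for unbounded) are assumptions and are not defined by this document.

```python
from collections import namedtuple

# Hypothetical record: one use of a property shape by a node shape.
Use = namedtuple("Use", "node_shape path min_count max_count property_shape")
# Hypothetical record: one edge to draw.
Edge = namedtuple("Edge", "source target style label cardinality")


def split(uses):
    """Draw each property shape once and emit one solid edge per use."""
    boxes = []  # property shapes to draw, in first-seen order, without duplicates
    edges = []  # one incoming solid edge per use
    for u in uses:
        if u.property_shape not in boxes:
            boxes.append(u.property_shape)
        upper = "*" if u.max_count is None else str(u.max_count)
        edges.append(
            Edge(
                source=u.node_shape,
                target=u.property_shape,
                style="solid",
                label=u.path,                            # property path as label
                cardinality=f"{u.min_count}..{upper}",   # shown at the arrowhead
            )
        )
    return boxes, edges
```

A property shape used by two node shapes thus yields one entry in `boxes` and two entries in `edges`, each edge keeping its own property path and cardinalities.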

5.3 Example

Below you can find an example of constraints expressed with ShapeUML.

Figure 3 Constraints visualized using ShapeUML: A subject that conforms to the Person data shape should have an IRI (1) and at least one but at most two ex:address properties (2) of class schema:PostalAddress (3), and the object of at least one ex:address property should comply with the existing data shape ex:validAddress (4). Additionally, a subject that conforms to the Person data shape should have either exactly one ex:fullName or at least one schema:givenName (5) and at least one schema:familyName, all of datatype xsd:string. The value of ex:fullName must not comply with the data shape ex:organizationShape (6). Addresses must only have values for the property postalAddress, with an exception for rdf:type (7). Constraints of the ex:organizationShape are not considered for validation (8).
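
To make part of the caption concrete, the following Python sketch checks the cardinality constraint (2) and the name alternative (5) against an entity. It rests on several assumptions: the function name `check_person` is hypothetical, an entity is represented as a dictionary from property names to value lists, `str` stands in for xsd:string, and the "either ... or" in the caption is read as an inclusive alternative between one ex:fullName and the pair schema:givenName plus schema:familyName. The remaining constraints of the figure are not covered.

```python
def count(entity, path):
    """Number of values the entity has for the given property."""
    return len(entity.get(path, []))


def check_person(entity):
    """Return a list of messages for the conditions that are not met."""
    problems = []

    # (2) at least one and at most two ex:address values
    if not 1 <= count(entity, "ex:address") <= 2:
        problems.append("ex:address: expected 1 to 2 values")

    # (5) exactly one ex:fullName, or both a given name and a family name
    has_full = count(entity, "ex:fullName") == 1
    has_parts = (
        count(entity, "schema:givenName") >= 1
        and count(entity, "schema:familyName") >= 1
    )
    if not (has_full or has_parts):
        problems.append("name: expected ex:fullName or givenName and familyName")

    # datatype constraint on all name properties (str as stand-in for xsd:string)
    for path in ("ex:fullName", "schema:givenName", "schema:familyName"):
        for value in entity.get(path, []):
            if not isinstance(value, str):
                problems.append(f"{path}: {value!r} is not a string")

    return problems
```

An entity with one address and one full name passes, while an entity with three addresses and only a given name produces two messages.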

A. References

A.1 Informative references

[MapVOWL]
MapVOWL. rml.io. URL: https://rml.io/specs/mapvowl/
[ODM]
Ontology Definition Metamodel (ODM). Object Management Group (OMG). URL: https://www.omg.org/spec/ODM/1.1/
[RDF]
RDF 1.1 Concepts and Abstract Syntax. W3C. URL: https://www.w3.org/TR/rdf11-concepts/
[SHACL]
Shapes Constraint Language (SHACL). W3C. URL: https://www.w3.org/TR/shacl/
[SHACL-abstract]
Shapes Constraint Language (SHACL) - Abstract. W3C. URL: https://www.w3.org/TR/shacl/#abstract
[SHACL-core-constraints]
Shapes Constraint Language (SHACL) - Constraint Types. W3C. URL: https://www.w3.org/TR/shacl/#core-components
[UML]
Unified Modeling Language (UML). Object Management Group (OMG). URL: https://www.omg.org/spec/UML/2.5.1