Declarative Construction and Validation of Knowledge Graphs

Co-located with the Twelfth International Conference on Knowledge Capture (K-CAP 2023)

The wide adoption of knowledge graphs has boosted the development of techniques and tools to support their use throughout their life cycle. Among them, we focus on declarative approaches to knowledge graph construction, which rely on mapping languages (e.g. R2RML, RML, SPARQL-Anything) to describe the transformation process. The community has progressively addressed the early limitations of these technologies, which in turn encourages their adoption. Our objective with this tutorial is to explain the progress that declarative mapping technologies have made in tackling more complex use cases, and to show from a practical perspective the tools and methods that ease mapping creation and its integration into KG construction pipelines. Furthermore, we present how declarative approaches can be exploited not only for constructing, but also for validating knowledge graphs. We aim to show the benefits that declarative approaches bring to the production of high-quality knowledge graphs, and how they support these graphs throughout their life cycle.



Program

Declarative Knowledge Graph Construction (13:00 - 15:00)

Presenter: Ana Iglesias-Molina

Part I: Getting started with mapping languages

Declarative mappings are commonly used for specifying how a dataset can be transformed into an RDF graph according to the schema provided usually by an ontology or vocabulary. We present one of the most popular languages used to this end, RML, and the novel features of its most recent release.
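To make the idea concrete, here is a minimal Python sketch of what a declarative mapping does. It is not RML syntax and not the API of any engine: the `MAPPING` dictionary, the `materialize` function, the sample CSV, and the `http://example.org/` IRIs are all hypothetical names chosen for illustration. The point is only that the mapping is plain data (a subject template plus predicate-to-column pairs), while a generic function applies it to every row of the source to produce N-Triples.

```python
import csv
import io
import re

# Hypothetical input data; any tabular source with a header row would do.
CSV_DATA = """id,name
1,Alice
2,Bob
"""

# Hypothetical mapping, loosely inspired by the ideas behind RML:
# a subject IRI template, an optional class, and predicate -> column pairs.
MAPPING = {
    "subject": "http://example.org/person/{id}",
    "class": "http://xmlns.com/foaf/0.1/Person",
    "predicate_objects": {
        "http://xmlns.com/foaf/0.1/name": "name",
    },
}

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def materialize(mapping, rows):
    """Apply the mapping to each row and return a list of N-Triples lines."""
    triples = []
    for row in rows:
        # Fill the {column} placeholders of the subject template with row values.
        # Simplification: values are inserted as-is, without IRI-safe encoding.
        subject = re.sub(r"\{(\w+)\}", lambda m: row[m.group(1)], mapping["subject"])
        if "class" in mapping:
            triples.append(f"<{subject}> <{RDF_TYPE}> <{mapping['class']}> .")
        for predicate, column in mapping["predicate_objects"].items():
            literal = escape_literal(row[column])
            triples.append(f'<{subject}> <{predicate}> "{literal}" .')
    return triples


rows = list(csv.DictReader(io.StringIO(CSV_DATA)))
triples = materialize(MAPPING, rows)
for triple in triples:
    print(triple)
```

Changing the output graph only requires editing `MAPPING`, not the `materialize` function. This is the separation between mapping rules and execution that declarative mapping languages aim for.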

Part II: Creating declarative mappings

Different approaches for mapping creation will be presented. Then, a guided hands-on session will be conducted with attendees to create mappings using YARRRML, a user-friendly serialization for RML.

Part III: Constructing RDF graphs using RML mappings

There are different approaches and implementations to construct RDF graphs from heterogeneous data sources. We provide an overview of these approaches focusing on the ones that can be easily integrated into automated pipelines.

Declarative Knowledge Graph Validation (15:30 - 17:30)

Presenter: Xuemin Duan

Part IV: Introducing SHACL shapes

SHACL was proposed for validating RDF graphs against a set of constraints. We will first go through the core constraint components and introduce the syntactic constraints related to SHACL Core. Then, we will introduce shacl-shacl, a set of SHACL shapes used to verify whether shapes are themselves well-formed. Finally, we will explore how to use shacl-shacl to validate newly created SHACL shapes.
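As a rough intuition for constraint-based validation, the following Python sketch checks data against a shape expressed as plain Python data. It is not SHACL and does not use any SHACL library: `PERSON_SHAPE`, `DATA`, and `check_shape` are hypothetical names, and the three constraint kinds only mirror the spirit of the SHACL Core components `sh:minCount`, `sh:maxCount`, and `sh:pattern`.

```python
import re

# Hypothetical shape: per-property constraints, echoing sh:minCount,
# sh:maxCount and sh:pattern from SHACL Core.
PERSON_SHAPE = {
    "name": {"min_count": 1, "max_count": 1},
    "email": {"min_count": 0, "pattern": r"^[^@\s]+@[^@\s]+$"},
}

# Hypothetical data: each focus node maps property names to lists of values.
DATA = {
    "ex:alice": {"name": ["Alice"], "email": ["alice@example.org"]},
    "ex:bob": {"name": [], "email": ["not-an-email"]},
}


def check_shape(data, shape):
    """Return a list of (focus_node, property, message) violations."""
    violations = []
    for node, properties in data.items():
        for prop, constraints in shape.items():
            values = properties.get(prop, [])
            # Cardinality constraints apply to the number of values.
            if len(values) < constraints.get("min_count", 0):
                violations.append((node, prop, "fewer values than min_count"))
            if "max_count" in constraints and len(values) > constraints["max_count"]:
                violations.append((node, prop, "more values than max_count"))
            # The pattern constraint applies to each value individually.
            pattern = constraints.get("pattern")
            if pattern is not None:
                for value in values:
                    if re.search(pattern, value) is None:
                        violations.append(
                            (node, prop, f"value {value!r} does not match pattern")
                        )
    return violations


violations = check_shape(DATA, PERSON_SHAPE)
conforms = not violations
print("conforms:", conforms)
for violation in violations:
    print(violation)
```

As in SHACL, the constraints live in data rather than in the checking code, and the result is a conformance flag plus a list of violations that name the focus node and the property involved.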

Part V: Translating RML to SHACL and validating RDF graphs

RML2SHACL is a tool that translates RML mappings to SHACL shapes based on a set of correspondences. We will first elucidate the process of automatically generating SHACL shapes from RML using this tool. Subsequently, we will employ PySHACL, a Python library designed for the validation of RDF graphs against SHACL shapes, to validate the graphs constructed in previous sessions.
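The idea of deriving shapes from mappings can be sketched in a few lines of Python. This is not RML2SHACL's implementation, and the correspondences below are assumptions made for illustration rather than the tool's actual rules: the class of the mapped subjects becomes `sh:targetClass`, each predicate becomes an `sh:path`, and each declared datatype becomes `sh:datatype`. The `MAPPING` structure, the `mapping_to_shape` function, and the `ex:` namespace are hypothetical.

```python
# Hypothetical, simplified mapping description (not RML syntax).
MAPPING = {
    "name": "PersonShape",
    "class": "foaf:Person",
    "predicate_objects": [
        {"predicate": "foaf:name", "datatype": "xsd:string"},
        {"predicate": "foaf:age", "datatype": "xsd:integer"},
    ],
}

PREFIXES = """@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex: <http://example.org/> .
"""


def mapping_to_shape(mapping):
    """Build a SHACL node shape, as Turtle text, from the mapping description."""
    lines = [f"ex:{mapping['name']} a sh:NodeShape ;"]
    # Assumed correspondence: class of the mapped subjects -> sh:targetClass.
    lines.append(f"    sh:targetClass {mapping['class']} ;")
    property_shapes = []
    for po in mapping["predicate_objects"]:
        # Assumed correspondence: predicate -> sh:path, datatype -> sh:datatype.
        property_shapes.append(
            f"    sh:property [ sh:path {po['predicate']} ; sh:datatype {po['datatype']} ]"
        )
    # Property shapes are separated by ';' and the node shape is closed with '.'.
    lines.append(" ;\n".join(property_shapes) + " .")
    return PREFIXES + "\n" + "\n".join(lines) + "\n"


shape_ttl = mapping_to_shape(MAPPING)
print(shape_ttl)
```

The generated Turtle could then be saved to a file and handed to a SHACL validator such as PySHACL together with the constructed graph.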


Presenters

Ana Iglesias-Molina

Universidad Politécnica de Madrid

ana.iglesiasm (at) upm.es

Ana Iglesias-Molina is a PhD student at the Ontology Engineering Group - UPM under the supervision of Prof. Oscar Corcho. Her research is focused on knowledge graph construction and management with declarative mapping languages, ontology engineering, and knowledge representation. She helped organize and presented two tutorials on Knowledge Graph Construction (ISWC2020, ESWC2022), and since 2023 she has also been part of the organization of the Knowledge Graph Construction Workshop (ESWC2023). In recent years she has taught Data Science in a Business School Master’s programme in Madrid, and Semantic Web courses at Bachelor’s and Master’s level at Universidad Politécnica de Madrid. She has also been part of the Knowledge Graph Construction Community Group since its foundation.

Xuemin Duan

KU Leuven

xuemin.duan (at) kuleuven.be

Xuemin Duan is a PhD student at KU Leuven under the supervision of Prof. Anastasia Dimou. Her work is mainly focused on the automatic construction and validation of knowledge graphs, and she is currently working on the extraction and integration of SHACL shapes. She has been teaching the lab sessions of various courses at KU Leuven, including Computer Network in the Fall of 2022, Data Engineering in the Fall of 2022, Data and Knowledge Graphs in the Spring of 2023, and Web AI in the Fall of 2023.


Materials

All the necessary materials for the tutorial are available at: https://github.com/kg-construct/tutorials/tree/main/kcap2023/resources

Software
  • Morph-KGC: KG generation engine optimized for tabular data
  • YARRRML Matey: User-friendly web services for writing YARRRML mappings
  • Yatter: Mapping translator for YARRRML, RML and R2RML
  • RML2SHACL: SHACL shape generator from RML mappings
  • PySHACL: Python validator for SHACL

Sponsors