SHACTOR: Improving the Quality of Large-Scale Knowledge Graphs with Validating Shapes

Kashif Rabbani, Matteo Lissandrini, Katja Hose

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

4 Citations (Scopus)

Abstract

We demonstrate SHACTOR, a system for extracting and analyzing validating shapes from very large Knowledge Graphs (KGs). Shapes represent a specific form of data pattern, akin to schemas for entities. Standard shape extraction approaches are likely to produce thousands of shapes, some of which represent spurious constraints extracted because of erroneous data in the KG. Given a KG with tens of millions of triples and thousands of classes, SHACTOR parses the KG using our efficient and scalable shape extraction algorithm and outputs SHACL shape constraints. The extracted shapes are further annotated with statistical information about their support in the graph, which makes it possible to identify both erroneous and missing triples in the KG. Hence, SHACTOR can be used to extract, analyze, and clean shape constraints from very large KGs. Furthermore, it enables the user to find and correct errors: it automatically generates SPARQL queries over the graph that retrieve the nodes and facts behind the spurious shapes, so that the user can intervene by amending the data.
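
The idea of annotating candidate shapes with their support, and then querying for the nodes behind weakly supported ones, can be illustrated with a small sketch. This is not the SHACTOR algorithm: the functions `property_support`, `low_support_shapes`, and `sparql_for`, the `min_ratio` threshold, the in-memory list of triples, and the `ex:` example data are all assumptions made for illustration. Support is taken here to mean the number of instances of a class that use a given property.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"  # prefixed names are used as plain strings in this sketch


def property_support(triples):
    """Return (class_size, support): the number of instances per class, and
    per (class, property) pair the number of instances using that property."""
    types = defaultdict(set)  # entity -> classes it is typed with
    props = defaultdict(set)  # entity -> properties it uses
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s].add(o)
        else:
            props[s].add(p)

    class_size = defaultdict(int)
    support = defaultdict(int)
    for entity, classes in types.items():
        for cls in classes:
            class_size[cls] += 1
            for p in props[entity]:
                support[(cls, p)] += 1
    return class_size, support


def low_support_shapes(triples, min_ratio=0.5):
    """List (class, property, support, class_size) for candidate property
    shapes used by fewer than min_ratio of the class instances (hypothetical
    pruning rule)."""
    class_size, support = property_support(triples)
    return sorted(
        (cls, p, n, class_size[cls])
        for (cls, p), n in support.items()
        if n / class_size[cls] < min_ratio
    )


def sparql_for(cls, prop):
    """Build a SPARQL query retrieving the nodes and values behind a candidate
    shape. PREFIX declarations are omitted and must be added by the caller."""
    return f"SELECT ?node ?value WHERE {{ ?node a {cls} ; {prop} ?value . }}"


triples = [
    ("ex:a", RDF_TYPE, "ex:Person"),
    ("ex:b", RDF_TYPE, "ex:Person"),
    ("ex:c", RDF_TYPE, "ex:Person"),
    ("ex:a", "ex:name", "A"),
    ("ex:b", "ex:name", "B"),
    ("ex:c", "ex:name", "C"),
    ("ex:c", "ex:population", "12"),  # likely erroneous fact for a Person
]

suspicious = low_support_shapes(triples, min_ratio=0.5)
queries = [sparql_for(cls, prop) for cls, prop, _, _ in suspicious]
```

In this toy graph, `ex:population` is used by only one of three `ex:Person` instances, so it is the only candidate flagged, and the generated query retrieves the node and value that a user would inspect before amending the data.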

Original language: English
Title of host publication: Companion of the 2023 International Conference on Management of Data (SIGMOD '23)
Number of pages: 4
Publisher: Association for Computing Machinery (ACM)
Publication date: 4 Jun 2023
Pages: 151-154
ISBN (Electronic): 978-1-4503-9507-6
Publication status: Published - 4 Jun 2023
Event: 2023 ACM/SIGMOD International Conference on Management of Data, SIGMOD 2023 - Seattle, United States
Duration: 18 Jun 2023 to 23 Jun 2023

Conference

Conference: 2023 ACM/SIGMOD International Conference on Management of Data, SIGMOD 2023
Country/Territory: United States
City: Seattle
Period: 18/06/2023 to 23/06/2023
Sponsor: ACM SIGMOD
Series: Proceedings of the ACM SIGMOD International Conference on Management of Data
ISSN: 0730-8078

Bibliographical note

Funding Information:
This research was partially funded by the Danish Council for Independent Research (DFF) under grant agreement no. DFF-8048-00051B, the EU’s H2020 research and innovation programme under grant agreement No 838216, and the Poul Due Jensen Foundation.

Publisher Copyright:
© 2023 ACM.

Keywords

  • knowledge graphs
  • quality assessment
  • SHACL
  • shapes extraction
