Automatically Extracted SHACL Shapes for WikiData, DBpedia, YAGO-4, and LUBM & Associated Coverage Statistics



The uploaded files contain automatically extracted SHACL shapes for the following datasets:

WikiData (the truthy dump from September 2021, filtered by removing non-English strings) [1]; DBpedia [2]; YAGO-4 [3]; LUBM (scale factor 500) [4].

The validating shapes for these datasets are generated by a program that parses the corresponding RDF files (in `.nt` format). The extracted shapes encode various SHACL constraints, e.g., sh:minCount, sh:path, sh:class, and sh:datatype. For each shape, we also record its coverage, i.e., the number of entities that satisfy the shape, using the void:entities predicate.
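
To make the idea of a shape annotated with coverage concrete, the following is a minimal Python sketch. It is not the provided Jar and does not reproduce its algorithm or output format. It assumes a naive line-based reading of N-Triples, a hypothetical shape naming scheme (the class IRI with `Shape` appended), and the simplification that sh:minCount 1 is emitted only when every instance of a class uses the property. The names `parse_nt`, `extract_shapes`, `to_turtle`, and the `http://ex.org/` sample data are illustrative only.

```python
import re
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Naive N-Triples pattern: IRI subject, IRI predicate, then an IRI or literal object.
TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(.+?)\s*\.\s*$')


def parse_nt(lines):
    """Yield (subject, predicate, object) tuples from simple N-Triples lines."""
    for line in lines:
        match = TRIPLE.match(line.strip())
        if match:  # skips blank lines, comments, and blank-node subjects
            yield match.groups()


def extract_shapes(lines):
    """Return {class: (entity_count, {property: entities_using_property})}."""
    types = defaultdict(set)  # entity -> classes it is an instance of
    props = defaultdict(set)  # entity -> properties it uses
    for s, p, o in parse_nt(lines):
        if p == RDF_TYPE and o.startswith("<"):
            types[s].add(o[1:-1])  # strip the angle brackets around the class IRI
        else:
            props[s].add(p)

    entity_count = defaultdict(int)
    prop_count = defaultdict(lambda: defaultdict(int))
    for entity, classes in types.items():
        for cls in classes:
            entity_count[cls] += 1
            for p in props.get(entity, ()):
                prop_count[cls][p] += 1
    return {cls: (entity_count[cls], dict(prop_count[cls])) for cls in entity_count}


def to_turtle(stats):
    """Serialize the statistics as SHACL node shapes with void:entities counts."""
    out = ["@prefix sh: <http://www.w3.org/ns/shacl#> .",
           "@prefix void: <http://rdfs.org/ns/void#> .", ""]
    for cls, (n, props) in sorted(stats.items()):
        out.append(f"<{cls}Shape> a sh:NodeShape ;")  # hypothetical shape IRI
        out.append(f"    sh:targetClass <{cls}> ;")
        for p, k in sorted(props.items()):
            # Assumption: require the property only if all n instances use it.
            min_count = " sh:minCount 1 ;" if k == n else ""
            out.append(f"    sh:property [ sh:path <{p}> ;{min_count} void:entities {k} ] ;")
        out.append(f"    void:entities {n} .")
        out.append("")
    return "\n".join(out)


# Illustrative data, not taken from any of the datasets above.
SAMPLE = [
    f'<http://ex.org/alice> <{RDF_TYPE}> <http://ex.org/Person> .',
    '<http://ex.org/alice> <http://ex.org/name> "Alice" .',
    f'<http://ex.org/bob> <{RDF_TYPE}> <http://ex.org/Person> .',
    '<http://ex.org/bob> <http://ex.org/name> "Bob" .',
    '<http://ex.org/bob> <http://ex.org/age> "42"^^<http://www.w3.org/2001/XMLSchema#integer> .',
]

if __name__ == "__main__":
    print(to_turtle(extract_shapes(SAMPLE)))
```

On the sample data, the shape for `http://ex.org/Person` records two entities; the `name` property is used by both and therefore receives sh:minCount 1, while `age` is used by only one and is left optional. A real extractor would also need to infer sh:class and sh:datatype from the objects and handle the full N-Triples grammar, which this sketch omits.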

We provide the program we developed to extract these SHACL shapes as an executable Jar file.
More details about the datasets used to extract these shapes and how to run the Jar are available on our GitHub repository.

