Automatically Extracted SHACL Shapes for WikiData, DBpedia, YAGO-4, and LUBM & Associated Coverage Statistics

Dataset

Description

The uploaded files contain automatically extracted SHACL shapes for the following datasets: WikiData (the truthy dump from September 2021, filtered by removing non-English strings) [1], DBpedia [2], YAGO-4 [3], and LUBM (scale factor 500) [4].

The validating shapes for these datasets are generated by a program that parses the corresponding RDF files (in `.nt` format). The extracted shapes encode various SHACL constraints, e.g., sh:minCount, sh:path, sh:class, and sh:datatype. For each shape, we record its coverage, i.e., the number of entities that satisfy the shape, using the void:entities predicate.

The program we developed to extract these SHACL shapes is provided as an executable Jar file. More details about the datasets used to extract these shapes, and instructions on how to run the Jar, are available in our GitHub repository: https://github.com/Kashif-Rabbani/validatingshapes.

[1] Vrandečić, Denny, and Markus Krötzsch. "Wikidata: a free collaborative knowledgebase." Communications of the ACM 57.10 (2014): 78-85.
[2] Auer, Sören, et al. "DBpedia: A nucleus for a web of open data." The semantic web. Springer, Berlin, Heidelberg, 2007. 722-735.
[3] Pellissier Tanon, Thomas, Gerhard Weikum, and Fabian Suchanek. "Yago 4: A reason-able knowledge base." European Semantic Web Conference. Springer, Cham, 2020.
[4] Guo, Yuanbo, Zhengxiang Pan, and Jeff Heflin. "LUBM: A benchmark for OWL knowledge base systems." Journal of Web Semantics 3.2-3 (2005): 158-182.
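To make the notion of coverage concrete, the following is a minimal sketch of how per-class entity counts and a sh:minCount-style hint could be derived from N-Triples lines. It is not the authors' Jar program or its algorithm: the function name `shape_coverage`, the simplified triple pattern, the `http://ex.org/` sample data, and the rule "a property gets min_count 1 only if every entity of the class uses it" are all assumptions made for illustration.

```python
import re
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Simplified N-Triples pattern (assumption): IRI subject, IRI predicate,
# and an IRI or literal object. Blank nodes are ignored in this sketch.
TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(<[^>]*>|".*")\S*\s*\.\s*$')


def shape_coverage(nt_lines):
    """Return, per class, the entity count and per-property coverage."""
    entities_by_class = defaultdict(set)   # class IRI -> typed entities
    props_by_entity = defaultdict(set)     # entity IRI -> predicates used
    for line in nt_lines:
        match = TRIPLE.match(line.strip())
        if match is None:
            continue  # skip comments, blank lines, unsupported triples
        subj, pred, obj = match.groups()
        if pred == RDF_TYPE and obj.startswith("<"):
            entities_by_class[obj[1:-1]].add(subj)
        else:
            props_by_entity[subj].add(pred)

    report = {}
    for cls, entities in entities_by_class.items():
        counts = defaultdict(int)
        for entity in entities:
            for pred in props_by_entity.get(entity, ()):
                counts[pred] += 1
        report[cls] = {
            # Analogous to the void:entities value attached to a shape.
            "entities": len(entities),
            "properties": {
                pred: {
                    "entities": n,
                    # Hypothetical rule: required only if all entities have it.
                    "min_count": 1 if n == len(entities) else 0,
                }
                for pred, n in counts.items()
            },
        }
    return report


# Hypothetical sample data, not taken from any of the published datasets.
SAMPLE = [
    f"<http://ex.org/alice> <{RDF_TYPE}> <http://ex.org/Person> .",
    f"<http://ex.org/bob> <{RDF_TYPE}> <http://ex.org/Person> .",
    '<http://ex.org/alice> <http://ex.org/name> "Alice" .',
    '<http://ex.org/bob> <http://ex.org/name> "Bob"@en .',
    '<http://ex.org/alice> <http://ex.org/age> '
    '"30"^^<http://www.w3.org/2001/XMLSchema#integer> .',
]

report = shape_coverage(SAMPLE)
```

In this sketch, both sample entities carry `http://ex.org/name`, so that property would receive a minimum count of 1, while `http://ex.org/age` appears on only one of the two and would not. The published shapes may use different rules for deciding which constraints to emit; the GitHub repository is the authoritative description.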
Date made available: 2022
Publisher: Zenodo
