Abstract
With the growing popularity of storing data in native RDF, we witness more and more diverse use cases with complex SPARQL queries. As a consequence, query optimization, and in particular cardinality estimation and join ordering, becomes even more crucial. Classical methods exploit global statistics covering the entire RDF graph as a whole. Such statistics naturally fail to capture the correlations that are very common in RDF datasets, which leads to erroneous cardinality estimations and suboptimal query execution plans. The alternative of capturing correlations in a fine-grained manner, on the other hand, requires very costly preprocessing steps to create these statistics. Hence, in this paper we propose shapes statistics, which extend the recent SHACL standard with statistical information to capture the correlation between classes and properties. Our extensive experiments on synthetic and real data show that shapes statistics can be generated and managed with little overhead and without disadvantages in query runtime, while leading to noticeable improvements in cardinality estimation.
Original language | English |
---|---|
Title of host publication | Advances in Database Technology : 24th International Conference on Extending Database Technology, EDBT 2021 |
Editors | Yannis Velegrakis, Demetris Zeinalipour, Panos K. Chrysanthis, Francesco Guerra |
Number of pages | 6 |
Publisher | OpenProceedings.org |
Publication date | 2021 |
Pages | 505-510 |
ISBN (Electronic) | 978-3-89318-084-4 |
Publication status | Published - 2021 |
Event | Advances in Database Technology - 24th International Conference on Extending Database Technology, EDBT 2021 - Virtual, Nicosia, Cyprus. Duration: 23 Mar 2021 → 26 Mar 2021 |
Conference
Conference | Advances in Database Technology - 24th International Conference on Extending Database Technology, EDBT 2021 |
---|---|
Country/Territory | Cyprus |
City | Virtual, Nicosia |
Period | 23/03/2021 → 26/03/2021 |
Sponsor | Oracle, Snowflake, ZOOM, Zoom Video Communications, Inc. |
Series | Advances in Database Technology |
ISSN | 2367-2005 |
Bibliographical note
Funding Information: This research was partially funded by the Danish Council for Independent Research (DFF) under grant agreement no. DFF-8048-00051B, the EU’s H2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 838216, and the Poul Due Jensen Foundation.
Publisher Copyright:
© 2021 Copyright held by the owner/author(s).
Fingerprint
Dive into the research topics of 'Optimizing SPARQL queries using shape statistics'. Together they form a unique fingerprint.
Poul Due Jensen Professorate in Big Data and Artificial Intelligence
Hose, K., Jendal, T. E. & Hansen, E. R.
01/11/2019 → 31/10/2024
Project: Research
EDAO: Example Driven Analytics for Open Knowledge Graphs
Lissandrini, M., Pedersen, T. B. & Hose, K.
15/09/2019 → 14/09/2021
Project: Research