Optimizing SPARQL queries using shape statistics

Publication: Contribution to book/anthology/report/conference proceeding; Conference article in proceeding; Research; peer-reviewed

11 Citations (Scopus)
226 Downloads (Pure)

Abstract

With the growing popularity of storing data in native RDF, we witness more and more diverse use cases with complex SPARQL queries. As a consequence, query optimization - and in particular cardinality estimation and join ordering - becomes even more crucial. Classical methods exploit global statistics covering the entire RDF graph as a whole, which naturally fail to capture the correlations that are very common in RDF datasets; this leads to erroneous cardinality estimates and suboptimal query execution plans. The alternative of capturing correlations in a fine-grained manner, on the other hand, requires very costly preprocessing steps to create these statistics. Hence, in this paper we propose shape statistics, which extend the recent SHACL standard with statistical information to capture the correlation between classes and properties. Our extensive experiments on synthetic and real data show that shape statistics can be generated and managed with little overhead and without disadvantages in query runtime, while leading to noticeable improvements in cardinality estimation.

Original language: English
Title: Advances in Database Technology: 24th International Conference on Extending Database Technology, EDBT 2021
Editors: Yannis Velegrakis, Demetris Zeinalipour, Panos K. Chrysanthis, Francesco Guerra
Number of pages: 6
Publisher: OpenProceedings.org
Publication date: 2021
Pages: 505-510
ISBN (Electronic): 978-3-89318-084-4
DOI
Status: Published - 2021
Event: Advances in Database Technology - 24th International Conference on Extending Database Technology, EDBT 2021 - Virtual, Nicosia, Cyprus
Duration: 23 Mar 2021 to 26 Mar 2021

Conference

Conference: Advances in Database Technology - 24th International Conference on Extending Database Technology, EDBT 2021
Country/Territory: Cyprus
City: Virtual, Nicosia
Period: 23/03/2021 to 26/03/2021
Sponsor: Oracle, Snowflake, ZOOM, Zoom Video Communications, Inc.
Name: Advances in Database Technology
ISSN: 2367-2005

Bibliographical note

Publisher Copyright:
© 2021 Copyright held by the owner/author(s).
