The Odyssey Approach for Optimizing Federated SPARQL Queries

Gabriela Montoya, Hala Skaf-Molli, Katja Hose

Publication: Contribution to book/anthology/report/conference proceeding; Article in proceeding; Research; Peer-reviewed

11 Citations (Scopus)

Abstract

Answering queries over a federation of SPARQL endpoints requires combining data from more than one data source. Optimizing queries in such scenarios is particularly challenging not only because of (i) the large variety of possible query execution plans that correctly answer the query but also because (ii) there is only limited access to statistics about schema and instance data of remote sources. To overcome these challenges, most federated query engines rely on heuristics to reduce the space of possible query execution plans or on dynamic programming strategies to produce optimal plans. Nevertheless, these plans may still exhibit a high number of intermediate results or high execution times because of heuristics and inaccurate cost estimations. In this paper, we present Odyssey, an approach that uses statistics that allow for a more accurate cost estimation for federated queries and therefore enables Odyssey to produce better query execution plans. Our experimental results show that Odyssey produces query execution plans that are better in terms of data transfer and execution time than state-of-the-art optimizers. Our experiments using the FedBench benchmark show execution time gains of at least 25 times on average.
Original language: English
Title: The Semantic Web - ISWC 2017: 16th International Semantic Web Conference, Vienna, Austria, October 21-25, 2017, Proceedings, Part I
Volume: 10587
Publisher: Springer
Publication date: 2017
Pages: 471-489
ISBN (Print): 978-3-319-68287-7
ISBN (Electronic): 978-3-319-68288-4
Status: Published - 2017
Event: The 16th International Semantic Web Conference - Vienna, Austria
Duration: 21 Oct 2017 to 31 Oct 2017
Conference number: 16th
https://iswc2017.semanticweb.org/

Conference

Conference: The 16th International Semantic Web Conference
Number: 16th
Country: Austria
City: Vienna
Period: 21/10/2017 to 31/10/2017
Internet address: https://iswc2017.semanticweb.org/
Name: Lecture Notes in Computer Science
ISSN: 0302-9743

Fingerprint

Statistics
Data transfer
Dynamic programming
Costs
Engines
Experiments

Cite this

Montoya, G., Skaf-Molli, H., & Hose, K. (2017). The Odyssey Approach for Optimizing Federated SPARQL Queries. In The Semantic Web - ISWC 2017: 16th International Semantic Web Conference, Vienna, Austria, October 21-25, 2017, Proceedings, Part I (Vol. 10587, pp. 471-489). Springer. Lecture Notes in Computer Science
Montoya, Gabriela ; Skaf-Molli, Hala ; Hose, Katja. / The Odyssey Approach for Optimizing Federated SPARQL Queries. The Semantic Web - ISWC 2017: 16th International Semantic Web Conference, Vienna, Austria, October 21-25, 2017, Proceedings, Part I. Vol. 10587 Springer, 2017. pp. 471-489 (Lecture Notes in Computer Science).
@inproceedings{964fa89cf57149d89970305304652c17,
title = "The Odyssey Approach for Optimizing Federated SPARQL Queries",
abstract = "Answering queries over a federation of SPARQL endpoints requires combining data from more than one data source. Optimizing queries in such scenarios is particularly challenging not only because of (i) the large variety of possible query execution plans that correctly answer the query but also because (ii) there is only limited access to statistics about schema and instance data of remote sources. To overcome these challenges, most federated query engines rely on heuristics to reduce the space of possible query execution plans or on dynamic programming strategies to produce optimal plans. Nevertheless, these plans may still exhibit a high number of intermediate results or high execution times because of heuristics and inaccurate cost estimations. In this paper, we present Odyssey, an approach that uses statistics that allow for a more accurate cost estimation for federated queries and therefore enables Odyssey to produce better query execution plans. Our experimental results show that Odyssey produces query execution plans that are better in terms of data transfer and execution time than state-of-the-art optimizers. Our experiments using the FedBench benchmark show execution time gains of at least 25 times on average.",
keywords = "Federated Queries, Query Optimization, Join Ordering, Source Selection",
author = "Gabriela Montoya and Hala Skaf-Molli and Katja Hose",
year = "2017",
language = "English",
isbn = "978-3-319-68287-7",
volume = "10587",
series = "Lecture Notes in Computer Science",
publisher = "Springer",
pages = "471--489",
booktitle = "The Semantic Web - ISWC 2017",
address = "Germany",

}

Montoya, G, Skaf-Molli, H & Hose, K 2017, The Odyssey Approach for Optimizing Federated SPARQL Queries. in The Semantic Web - ISWC 2017: 16th International Semantic Web Conference, Vienna, Austria, October 21-25, 2017, Proceedings, Part I. vol. 10587, Springer, Lecture Notes in Computer Science, pp. 471-489, The 16th International Semantic Web Conference, Vienna, Austria, 21/10/2017.

The Odyssey Approach for Optimizing Federated SPARQL Queries. / Montoya, Gabriela; Skaf-Molli, Hala; Hose, Katja.

The Semantic Web - ISWC 2017: 16th International Semantic Web Conference, Vienna, Austria, October 21-25, 2017, Proceedings, Part I. Vol. 10587 Springer, 2017. pp. 471-489 (Lecture Notes in Computer Science).


TY - GEN

T1 - The Odyssey Approach for Optimizing Federated SPARQL Queries

AU - Montoya, Gabriela

AU - Skaf-Molli, Hala

AU - Hose, Katja

PY - 2017

Y1 - 2017

N2 - Answering queries over a federation of SPARQL endpoints requires combining data from more than one data source. Optimizing queries in such scenarios is particularly challenging not only because of (i) the large variety of possible query execution plans that correctly answer the query but also because (ii) there is only limited access to statistics about schema and instance data of remote sources. To overcome these challenges, most federated query engines rely on heuristics to reduce the space of possible query execution plans or on dynamic programming strategies to produce optimal plans. Nevertheless, these plans may still exhibit a high number of intermediate results or high execution times because of heuristics and inaccurate cost estimations. In this paper, we present Odyssey, an approach that uses statistics that allow for a more accurate cost estimation for federated queries and therefore enables Odyssey to produce better query execution plans. Our experimental results show that Odyssey produces query execution plans that are better in terms of data transfer and execution time than state-of-the-art optimizers. Our experiments using the FedBench benchmark show execution time gains of at least 25 times on average.

AB - Answering queries over a federation of SPARQL endpoints requires combining data from more than one data source. Optimizing queries in such scenarios is particularly challenging not only because of (i) the large variety of possible query execution plans that correctly answer the query but also because (ii) there is only limited access to statistics about schema and instance data of remote sources. To overcome these challenges, most federated query engines rely on heuristics to reduce the space of possible query execution plans or on dynamic programming strategies to produce optimal plans. Nevertheless, these plans may still exhibit a high number of intermediate results or high execution times because of heuristics and inaccurate cost estimations. In this paper, we present Odyssey, an approach that uses statistics that allow for a more accurate cost estimation for federated queries and therefore enables Odyssey to produce better query execution plans. Our experimental results show that Odyssey produces query execution plans that are better in terms of data transfer and execution time than state-of-the-art optimizers. Our experiments using the FedBench benchmark show execution time gains of at least 25 times on average.

KW - Federated Queries

KW - Query Optimization

KW - Join Ordering

KW - Source Selection

M3 - Article in proceeding

SN - 978-3-319-68287-7

VL - 10587

T3 - Lecture Notes in Computer Science

SP - 471

EP - 489

BT - The Semantic Web - ISWC 2017

PB - Springer

ER -

Montoya G, Skaf-Molli H, Hose K. The Odyssey Approach for Optimizing Federated SPARQL Queries. In The Semantic Web - ISWC 2017: 16th International Semantic Web Conference, Vienna, Austria, October 21-25, 2017, Proceedings, Part I. Vol. 10587. Springer. 2017. pp. 471-489. (Lecture Notes in Computer Science).