Parallelizing Federated SPARQL Queries in Presence of Replicated Data

Thomas Minier; Gabriela Montoya; Hala Skaf-Molli; Pascal Molli

doi:10.1007/978-3-319-70407-4_33


Publication: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

2 Citations (Scopus)

Abstract

Federated query engines have been enhanced to exploit new data localities created by replicated data, e.g., Fedra. However, existing replication-aware federated query engines mainly focus on pruning sources during source selection and query decomposition in order to reduce intermediate results thanks to data locality. In this paper, we implement a replication-aware parallel join operator: Pen. This operator can be used to exploit replicated data during query execution. For existing replication-aware federated query engines, this operator exploits replicated data to parallelize the execution of joins and reduce execution time. For Triple Pattern Fragment (TPF) clients, this operator exploits the availability of several TPF servers exposing the same dataset to share the load among the servers. We implemented Pen in the federated query engine FedX with the replication-aware source selection Fedra and in the reference TPF client. We empirically evaluated the performance of engines extended with the Pen operator and the experimental results suggest that our extensions outperform the existing approaches in terms of execution time and balance of load among the servers, respectively.

Original language	English
Title	The Semantic Web: ESWC 2017 Satellite Events: ESWC 2017 Satellite Events, Portorož, Slovenia, May 28 – June 1, 2017, Revised Selected Papers
Publisher	Springer
Publication date	2017
Pages	181-196
ISBN (Print)	978-3-319-70406-7
ISBN (Electronic)	978-3-319-70407-4
DOI	https://doi.org/10.1007/978-3-319-70407-4_33
Status	Published - 2017
Event	14th Extended Semantic Web Conference, ESWC 2017 - Portoroz, Slovenia. Duration: 28 May 2017 → 1 Jun 2017

Conference

Conference	14th Extended Semantic Web Conference, ESWC 2017
Country/Territory	Slovenia
City	Portoroz
Period	28/05/2017 → 01/06/2017
Sponsor	Elsevier, IOS Press

Name	Lecture Notes in Computer Science
Volume	10577
ISSN	0302-9743

Access to document

10.1007/978-3-319-70407-4_33

https://hal.archives-ouvertes.fr/hal-01591791v2/document


Citation formats

@inproceedings{cbc02c2ebaa84181817ef65a2114c4de,
title = "Parallelizing Federated SPARQL Queries in Presence of Replicated Data",
abstract = "Federated query engines have been enhanced to exploit new data localities created by replicated data, e.g., Fedra. However, existing replication-aware federated query engines mainly focus on pruning sources during source selection and query decomposition in order to reduce intermediate results thanks to data locality. In this paper, we implement a replication-aware parallel join operator: Pen. This operator can be used to exploit replicated data during query execution. For existing replication-aware federated query engines, this operator exploits replicated data to parallelize the execution of joins and reduce execution time. For Triple Pattern Fragment (TPF) clients, this operator exploits the availability of several TPF servers exposing the same dataset to share the load among the servers. We implemented Pen in the federated query engine FedX with the replication-aware source selection Fedra and in the reference TPF client. We empirically evaluated the performance of engines extended with the Pen operator and the experimental results suggest that our extensions outperform the existing approaches in terms of execution time and balance of load among the servers, respectively.",
keywords = "Linked Data, Parallel query processing, Fragment replication, federated, Triple Pattern Fragment, Load balancing",
author = "Thomas Minier and Gabriela Montoya and Hala Skaf-Molli and Pascal Molli",
year = "2017",
doi = "10.1007/978-3-319-70407-4_33",
language = "English",
isbn = "978-3-319-70406-7",
series = "Lecture Notes in Computer Science",
publisher = "Springer",
pages = "181--196",
booktitle = "The Semantic Web: ESWC 2017 Satellite Events",
address = "Germany",
note = "14th Extended Semantic Web Conference, ESWC 2017 ; Conference date: 28-05-2017 Through 01-06-2017",
}

Minier, T, Montoya, G, Skaf-Molli, H & Molli, P 2017, Parallelizing Federated SPARQL Queries in Presence of Replicated Data. in The Semantic Web: ESWC 2017 Satellite Events: ESWC 2017 Satellite Events, Portorož, Slovenia, May 28 – June 1, 2017, Revised Selected Papers. Springer, Lecture Notes in Computer Science, vol. 10577, pp. 181-196, 14th Extended Semantic Web Conference, ESWC 2017, Portoroz, Slovenia, 28/05/2017. https://doi.org/10.1007/978-3-319-70407-4_33

Parallelizing Federated SPARQL Queries in Presence of Replicated Data. / Minier, Thomas; Montoya, Gabriela; Skaf-Molli, Hala et al.
The Semantic Web: ESWC 2017 Satellite Events: ESWC 2017 Satellite Events, Portorož, Slovenia, May 28 – June 1, 2017, Revised Selected Papers. Springer, 2017. pp. 181-196 (Lecture Notes in Computer Science, Vol. 10577).


TY - GEN
T1 - Parallelizing Federated SPARQL Queries in Presence of Replicated Data
AU - Minier, Thomas
AU - Montoya, Gabriela
AU - Skaf-Molli, Hala
AU - Molli, Pascal
PY - 2017
Y1 - 2017
AB - Federated query engines have been enhanced to exploit new data localities created by replicated data, e.g., Fedra. However, existing replication-aware federated query engines mainly focus on pruning sources during source selection and query decomposition in order to reduce intermediate results thanks to data locality. In this paper, we implement a replication-aware parallel join operator: Pen. This operator can be used to exploit replicated data during query execution. For existing replication-aware federated query engines, this operator exploits replicated data to parallelize the execution of joins and reduce execution time. For Triple Pattern Fragment (TPF) clients, this operator exploits the availability of several TPF servers exposing the same dataset to share the load among the servers. We implemented Pen in the federated query engine FedX with the replication-aware source selection Fedra and in the reference TPF client. We empirically evaluated the performance of engines extended with the Pen operator and the experimental results suggest that our extensions outperform the existing approaches in terms of execution time and balance of load among the servers, respectively.
KW - Linked Data
KW - Parallel query processing
KW - Fragment replication
KW - federated
KW - Triple Pattern Fragment
KW - Load balancing
U2 - 10.1007/978-3-319-70407-4_33
DO - 10.1007/978-3-319-70407-4_33
M3 - Article in proceeding
SN - 978-3-319-70406-7
T3 - Lecture Notes in Computer Science
SP - 181
EP - 196
BT - The Semantic Web: ESWC 2017 Satellite Events
PB - Springer
T2 - 14th Extended Semantic Web Conference, ESWC 2017
Y2 - 28 May 2017 through 1 June 2017
ER -

Minier T, Montoya G, Skaf-Molli H, Molli P. Parallelizing Federated SPARQL Queries in Presence of Replicated Data. In The Semantic Web: ESWC 2017 Satellite Events: ESWC 2017 Satellite Events, Portorož, Slovenia, May 28 – June 1, 2017, Revised Selected Papers. Springer. 2017. pp. 181-196. (Lecture Notes in Computer Science, Vol. 10577). doi: 10.1007/978-3-319-70407-4_33
