PeNeLoop: Parallelizing Federated SPARQL Queries in Presence of Replicated Fragments

Thomas Minier, Gabriela Montoya, Hala Skaf-Molli, Pascal Molli

Research output: Contribution to book/anthology/report/conference proceeding, Article in proceeding, Research, peer-review

Abstract

Replicating data fragments in Linked Data improves data availability and the performance of federated query engines. Existing replication-aware federated query engines mainly focus on source selection and query decomposition in order to prune redundant sources and reduce intermediate results thanks to data locality. In this paper, we extend replication-aware federated query engines with a replication-aware parallel join operator: PeNeLoop. PeNeLoop exploits redundant sources to parallelize the join operator and reduce execution time. We implemented PeNeLoop in the federated query engine FedX with the replication-aware source selection Fedra, and we empirically evaluated the performance of FedX + Fedra + PeNeLoop. Experimental results suggest that FedX + Fedra + PeNeLoop outperforms FedX + Fedra in terms of execution time while preserving answer completeness.
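
The general idea of parallelizing a join over replicated sources can be illustrated with a minimal Python sketch. This is not the PeNeLoop operator as implemented in FedX; the abstract does not specify how work is distributed, so the round-robin assignment, the in-memory `Source` class, and the function `parallel_nested_loop_join` are all hypothetical names and choices made for illustration. The sketch assumes every replica holds the same fragment, so probing any one of them yields the same answers.

```python
from concurrent.futures import ThreadPoolExecutor


class Source:
    """Hypothetical stand-in for a SPARQL endpoint: an in-memory list of triples."""

    def __init__(self, name, triples):
        self.name = name
        self.triples = triples

    def probe(self, subject, predicate):
        """Return every object o such that (subject, predicate, o) is stored."""
        return [o for (s, p, o) in self.triples if s == subject and p == predicate]


def parallel_nested_loop_join(outer, var, predicate, replicas, out_var):
    """Join outer bindings with the pattern (?var predicate ?out_var).

    Assumption: each outer binding is probed against exactly one replica,
    chosen round-robin, and the probes run concurrently in a thread pool.
    """
    if not replicas:
        raise ValueError("at least one replica is required")

    # Pair every outer binding with the replica that will answer it.
    pairs = [(b, replicas[i % len(replicas)]) for i, b in enumerate(outer)]

    def probe_one(pair):
        binding, source = pair
        return [{**binding, out_var: o} for o in source.probe(binding[var], predicate)]

    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        # Executor.map returns results in the order of `pairs`.
        chunks = list(pool.map(probe_one, pairs))
    return [row for chunk in chunks for row in chunk]


# Two replicas of the same fragment (illustrative data only).
triples = [("a", "knows", "b"), ("a", "knows", "c"), ("b", "knows", "c")]
replicas = [Source("r1", triples), Source("r2", list(triples))]
outer = [{"x": "a"}, {"x": "b"}, {"x": "z"}]
result = parallel_nested_loop_join(outer, "x", "knows", replicas, "y")
```

Because each binding is sent to a single replica, the result matches a sequential join against one source, while the probes are spread over both replicas.
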
Original language: English
Title of host publication: Joint Proceedings of the 2nd RDF Stream Processing (RSP 2017) and the Querying the Web of Data (QuWeDa 2017) Workshops co-located with 14th ESWC 2017 (ESWC 2017), Portoroz, Slovenia, May 28th - to - 29th, 2017
Number of pages: 14
Volume: 1870
Publisher: CEUR Workshop Proceedings
Publication date: 2017
Pages: 37-50
Publication status: Published - 2017
Event: 14th Extended Semantic Web Conference, ESWC 2017 - Portoroz, Slovenia
Duration: 28 May 2017 to 1 Jun 2017

Conference

Conference: 14th Extended Semantic Web Conference, ESWC 2017
Country: Slovenia
City: Portoroz
Period: 28/05/2017 to 01/06/2017
Sponsor: Elsevier, IOS Press
Series: CEUR Workshop Proceedings
Volume: 1870
ISSN: 1613-0073

Fingerprint

Engines
Availability
Decomposition

Keywords

  • Federated SPARQL Query Processing
  • Fragment Replication
  • Linked Data
  • Parallel Query Processing

Cite this

Minier, T., Montoya, G., Skaf-Molli, H., & Molli, P. (2017). PeNeLoop: Parallelizing Federated SPARQL Queries in Presence of Replicated Fragments. In Joint Proceedings of the 2nd RDF Stream Processing (RSP 2017) and the Querying the Web of Data (QuWeDa 2017) Workshops co-located with 14th ESWC 2017 (ESWC 2017), Portoroz, Slovenia, May 28th - to - 29th, 2017 (Vol. 1870, pp. 37-50). CEUR Workshop Proceedings. CEUR Workshop Proceedings, Vol. 1870.
Minier, Thomas; Montoya, Gabriela; Skaf-Molli, Hala; Molli, Pascal. / PeNeLoop: Parallelizing Federated SPARQL Queries in Presence of Replicated Fragments. Joint Proceedings of the 2nd RDF Stream Processing (RSP 2017) and the Querying the Web of Data (QuWeDa 2017) Workshops co-located with 14th ESWC 2017 (ESWC 2017), Portoroz, Slovenia, May 28th - to - 29th, 2017. Vol. 1870 CEUR Workshop Proceedings, 2017. pp. 37-50 (CEUR Workshop Proceedings, Vol. 1870).
@inproceedings{585d47beb04f4a279f9029628903136e,
title = "PeNeLoop: Parallelizing Federated SPARQL Queries in Presence of Replicated Fragments",
abstract = "Replicating data fragments in Linked Data improves data availability and the performance of federated query engines. Existing replication-aware federated query engines mainly focus on source selection and query decomposition in order to prune redundant sources and reduce intermediate results thanks to data locality. In this paper, we extend replication-aware federated query engines with a replication-aware parallel join operator: PeNeLoop. PeNeLoop exploits redundant sources to parallelize the join operator and reduce execution time. We implemented PeNeLoop in the federated query engine FedX with the replication-aware source selection Fedra, and we empirically evaluated the performance of FedX + Fedra + PeNeLoop. Experimental results suggest that FedX + Fedra + PeNeLoop outperforms FedX + Fedra in terms of execution time while preserving answer completeness.",
keywords = "Federated SPARQL Query Processing, Fragment Replication, Linked Data, Parallel Query Processing",
author = "Thomas Minier and Gabriela Montoya and Hala Skaf-Molli and Pascal Molli",
year = "2017",
language = "English",
volume = "1870",
series = "CEUR Workshop Proceedings",
publisher = "CEUR Workshop Proceedings",
pages = "37--50",
booktitle = "Joint Proceedings of the 2nd RDF Stream Processing (RSP 2017) and the Querying the Web of Data (QuWeDa 2017) Workshops co-located with 14th ESWC 2017 (ESWC 2017), Portoroz, Slovenia, May 28th - to - 29th, 2017",

}

Minier, T, Montoya, G, Skaf-Molli, H & Molli, P 2017, PeNeLoop: Parallelizing Federated SPARQL Queries in Presence of Replicated Fragments. in Joint Proceedings of the 2nd RDF Stream Processing (RSP 2017) and the Querying the Web of Data (QuWeDa 2017) Workshops co-located with 14th ESWC 2017 (ESWC 2017), Portoroz, Slovenia, May 28th - to - 29th, 2017. vol. 1870, CEUR Workshop Proceedings, CEUR Workshop Proceedings, vol. 1870, pp. 37-50, 14th Extended Semantic Web Conference, ESWC 2017, Portoroz, Slovenia, 28/05/2017.

PeNeLoop: Parallelizing Federated SPARQL Queries in Presence of Replicated Fragments. / Minier, Thomas; Montoya, Gabriela; Skaf-Molli, Hala; Molli, Pascal.

Joint Proceedings of the 2nd RDF Stream Processing (RSP 2017) and the Querying the Web of Data (QuWeDa 2017) Workshops co-located with 14th ESWC 2017 (ESWC 2017), Portoroz, Slovenia, May 28th - to - 29th, 2017. Vol. 1870 CEUR Workshop Proceedings, 2017. p. 37-50 (CEUR Workshop Proceedings, Vol. 1870).

TY - GEN
T1 - PeNeLoop
T2 - Parallelizing Federated SPARQL Queries in Presence of Replicated Fragments
AU - Minier, Thomas
AU - Montoya, Gabriela
AU - Skaf-Molli, Hala
AU - Molli, Pascal
PY - 2017
Y1 - 2017
AB - Replicating data fragments in Linked Data improves data availability and the performance of federated query engines. Existing replication-aware federated query engines mainly focus on source selection and query decomposition in order to prune redundant sources and reduce intermediate results thanks to data locality. In this paper, we extend replication-aware federated query engines with a replication-aware parallel join operator: PeNeLoop. PeNeLoop exploits redundant sources to parallelize the join operator and reduce execution time. We implemented PeNeLoop in the federated query engine FedX with the replication-aware source selection Fedra, and we empirically evaluated the performance of FedX + Fedra + PeNeLoop. Experimental results suggest that FedX + Fedra + PeNeLoop outperforms FedX + Fedra in terms of execution time while preserving answer completeness.
KW - Federated SPARQL Query Processing
KW - Fragment Replication
KW - Linked Data
KW - Parallel Query Processing
UR - http://www.scopus.com/inward/record.url?scp=85025150220&partnerID=8YFLogxK
M3 - Article in proceeding
VL - 1870
T3 - CEUR Workshop Proceedings
SP - 37
EP - 50
BT - Joint Proceedings of the 2nd RDF Stream Processing (RSP 2017) and the Querying the Web of Data (QuWeDa 2017) Workshops co-located with 14th ESWC 2017 (ESWC 2017), Portoroz, Slovenia, May 28th - to - 29th, 2017
PB - CEUR Workshop Proceedings
ER -

Minier T, Montoya G, Skaf-Molli H, Molli P. PeNeLoop: Parallelizing Federated SPARQL Queries in Presence of Replicated Fragments. In Joint Proceedings of the 2nd RDF Stream Processing (RSP 2017) and the Querying the Web of Data (QuWeDa 2017) Workshops co-located with 14th ESWC 2017 (ESWC 2017), Portoroz, Slovenia, May 28th - to - 29th, 2017. Vol. 1870. CEUR Workshop Proceedings. 2017. p. 37-50. (CEUR Workshop Proceedings, Vol. 1870).