Parallel SPARQL query optimization

Buwen Wu, Yongluan Zhou, Hai Jin, Amol Deshpande

3 Citations (Scopus)
64 Downloads (Pure)

Abstract

Existing parallel SPARQL query optimizers assume hash-based data partitioning and adopt plan enumeration algorithms with unnecessarily high complexity. Therefore, they cannot easily accommodate other partitioning methods and only consider an unnecessarily limited plan space. To address these problems, we first define a generic RDF data partitioning model to capture the common structure of various state-of-the-art RDF data partitioning methods. Then we propose a query plan enumeration algorithm that not only has optimal efficiency, but also accommodates different data partitioning methods. Furthermore, based on a solid analysis of the complexity of the plan enumeration algorithm, we propose two new heuristic methods that can consider a much larger plan space than the existing methods while still confining the search space of the algorithm. An autonomous approach is proposed to choose one of the two methods by considering the structure and the size of a complex SPARQL query. We conduct extensive experiments using both synthetic and real-world data, which show the superiority of our algorithms over existing ones.

Original language: English
Title: Proceedings of the 33rd IEEE International Conference on Data Engineering (ICDE)
Number of pages: 12
Publisher: IEEE Press
Publication date: 16 May 2017
Pages: 547-558
ISBN (Print): 978-1-5090-6544-8
ISBN (Electronic): 978-1-5090-6543-1
Status: Published - 16 May 2017
Published externally: Yes
Event: 33rd IEEE International Conference on Data Engineering - San Diego, USA
Duration: 19 Apr 2017 - 22 Apr 2017
Conference number: 33

Conference

Conference: 33rd IEEE International Conference on Data Engineering
Number: 33
Country/Territory: USA
City: San Diego
Period: 19/04/2017 - 22/04/2017
