SemStore: a semantic-preserving distributed RDF triple store

Buwen Wu, Yongluan Zhou, Pingpeng Yuan, Hai Jin, Ling Liu


Abstract

The flexibility of the RDF data model has attracted an increasing number of organizations to store their data in RDF format. With the rapid growth of RDF datasets, we expect that deploying a cluster of computing nodes will become necessary to process large-scale RDF data and deliver desirable query performance. In this paper, we address the challenging problems of data partitioning and query optimization in a scale-out RDF engine. We observe that existing approaches use only fine-grained structural information for data partitioning, and hence fail to localize many types of complex queries. We then propose a radically different approach, in which a coarse-grained structure, namely the Rooted Sub-Graph (RSG), is used as the partition unit. By doing so, we can capture structural information at a much greater scale and hence are able to localize many complex queries. We also propose a k-means partitioning algorithm for allocating the RSGs onto the computing nodes, as well as a query optimization strategy that minimizes inter-node communication during query processing. An extensive experimental study using benchmark datasets and a real dataset shows that our engine, SemStore, outperforms existing systems by orders of magnitude in terms of query response time.

Original language: English
Title of host publication: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management
Number of pages: 10
Publisher: Association for Computing Machinery
Publication date: 3 Nov 2014
Pages: 509-518
ISBN (Electronic): 978-1-4503-2598-1
Publication status: Published - 3 Nov 2014
Externally published: Yes
Event: 23rd ACM International Conference on Information and Knowledge Management - Shanghai, China
Duration: 3 Nov 2014 to 7 Nov 2014

Conference

Conference: 23rd ACM International Conference on Information and Knowledge Management
Country/Territory: China
City: Shanghai
Period: 03/11/2014 to 07/11/2014
