Efficient skyline computation in MapReduce

Kasper Mullesgaard, Jens Laurits Pederseny, Hua Lu, Yongluan Zhou

52 Citationer (Scopus)

Abstract

Skyline queries are useful for finding interesting tuples from a large data set according to multiple criteria. The sizes of data sets are constantly increasing and the architecture of back-ends are switching from single-node environments to non-conventional paradigms like MapReduce. Despite the usefulness of skyline queries, existing works on skyline computation in MapReduce do not take full advantage of parallelism but still run significant parts serially. In this paper, we propose a novel approach to compute skylines efficiently in MapReduce. We design a grid partitioning scheme to divide the data space into partitions, and employ a bitstring to represent the partitions. The bitstring is efficiently obtained in MapReduce, and it clearly helps prune partitions (and tuples) that cannot have skyline tuples. Based on the grid partitioning, we propose two MapReduce algorithms to compute skylines. Both algorithms utilize the bitstring and distribute the original tuples to multiple mappers and make use of them to compute local skylines in parallel. In particular, MapReduce Grid Partitioning based Single-Reducer Skyline Computation (MR-GPSRS) employs a single reducer to assemble the local skylines appropriately to compute the global skyline. In contrast, MapReduce Grid Partitioning based Multiple Reducer Skyline Computation (MR-GPMRS) further divides local skylines and distributes them to multiple reducers that compute the global skyline in an independent and parallel manner. The proposed algorithms are evaluated through extensive experiments, and the results show that MR-GPMRS significantly outperforms the alternatives in various settings.

OriginalsprogEngelsk
TitelProceedings of the 17th International Conference on Extending Database Technology (EDBT), Athens, Greece, March 24-28, 2014
Antal sider12
ForlagOpenProceedings
Publikationsdato2014
Sider37-48
ISBN (Elektronisk)978-3-89318065-3
DOI
StatusUdgivet - 2014
Udgivet eksterntJa
Begivenhed17th International Conference on Extending Database Technology - Athen, Grækenland
Varighed: 24 mar. 201428 mar. 2014
Konferencens nummer: 17

Konference

Konference17th International Conference on Extending Database Technology
Nummer17
Land/OmrådeGrækenland
ByAthen
Periode24/03/201428/03/2014

Fingeraftryk

Dyk ned i forskningsemnerne om 'Efficient skyline computation in MapReduce'. Sammen danner de et unikt fingeraftryk.

Citationsformater