Dynamic resource management in a massively parallel stream processing engine

Kasper Grud Skat Madsen, Yongluan Zhou

13 Citations (Scopus)

Abstract

The emerging interest in Massively Parallel Stream Processing Engines (MPSPEs), which can process long-standing computations over data streams of ever-growing velocity on a large-scale cluster, calls for efficient dynamic resource management techniques that avoid wasted resources and/or excessive processing latency. In this paper, we propose an approach that integrates dynamic resource management with passive fault-tolerance mechanisms in an MPSPE, so that the checkpoints prepared for failure recovery can also be harvested to make dynamic load migration more efficient. To maximize the opportunity to reuse checkpoints for fast load migration, we formally define a checkpoint allocation problem and provide a pragmatic algorithm to solve it. We implement all the proposed techniques on top of Apache Storm, an open-source MPSPE, and conduct extensive experiments using a real dataset to examine various aspects of our techniques. The results show that our techniques can greatly improve the efficiency of dynamic resource reconfiguration without imposing significant overhead or latency on normal job execution.

Original language: English
Title of host publication: Proceedings of the 24th ACM International Conference on Information and Knowledge Management
Number of pages: 10
Publisher: Association for Computing Machinery
Publication date: 17 Oct 2015
Pages: 13-22
ISBN (Electronic): 978-1-4503-3794-6
DOIs
Publication status: Published - 17 Oct 2015
Externally published: Yes
Event: 24th ACM International Conference on Information and Knowledge Management - Melbourne, Australia
Duration: 18 Oct 2015 to 23 Oct 2015
Conference number: 24

Conference

Conference: 24th ACM International Conference on Information and Knowledge Management
Number: 24
Country/Territory: Australia
City: Melbourne
Period: 18/10/2015 to 23/10/2015
