Enorm: efficient window-based computation in large-scale distributed stream processing systems

Kasper Grud Skat Madsen, Yongluan Zhou, Li Su

5 Citationer (Scopus)

Abstract

Modern distributed stream processing systems (DSPS), such as Storm, typically provide a flexible programming model, where computation is specified as complicated UDFs and data is opaque to the system. While such a programming framework provides very high flexibility to the developers, it does not provide much semantic information to the system and hence it is hard to perform optimizations that has already been proved very effective in conventional stream systems. Examples include sharing computation among overlapping windows, co-partitioning operators to save communication overhead and efficient state migration during load balancing. In lieu of these challenges, we propose a new framework, which is designed to expose sufficient semantic information of the applications to enable the aforementioned effective optimizations, while on the other hand, maintaining the flexibility of Storm's original programming framework. Furthermore, we present new optimization algorithms to minimize the communication cost and state migration overhead for dynamic load balancing. We implement our framework on top of Storm and run an extensive experimental study to verify its effectiveness.

OriginalsprogEngelsk
TitelProceedings of the 10th ACM International Conference on Distributed and Event-based Systems
Antal sider12
UdgivelsesstedNew York, NY, USA
ForlagAssociation for Computing Machinery
Publikationsdato13 jun. 2016
Sider37-48
ISBN (Trykt)978-1-4503-4021-2
DOI
StatusUdgivet - 13 jun. 2016
Udgivet eksterntJa
Begivenhed10th ACM International Conference on Distributed and Event-Based Systems - Irvine, CA, USA
Varighed: 20 jun. 201624 jun. 2016

Konference

Konference10th ACM International Conference on Distributed and Event-Based Systems
Land/OmrådeUSA
ByIrvine, CA
Periode20/06/201624/06/2016

Fingeraftryk

Dyk ned i forskningsemnerne om 'Enorm: efficient window-based computation in large-scale distributed stream processing systems'. Sammen danner de et unikt fingeraftryk.

Citationsformater