Abstract
Modern distributed stream processing systems (DSPS), such as Storm, typically provide a flexible programming model, where computation is specified as complicated UDFs and data is opaque to the system. While such a programming framework provides very high flexibility to the developers, it does not provide much semantic information to the system and hence it is hard to perform optimizations that has already been proved very effective in conventional stream systems. Examples include sharing computation among overlapping windows, co-partitioning operators to save communication overhead and efficient state migration during load balancing. In lieu of these challenges, we propose a new framework, which is designed to expose sufficient semantic information of the applications to enable the aforementioned effective optimizations, while on the other hand, maintaining the flexibility of Storm's original programming framework. Furthermore, we present new optimization algorithms to minimize the communication cost and state migration overhead for dynamic load balancing. We implement our framework on top of Storm and run an extensive experimental study to verify its effectiveness.
Original language | English |
---|---|
Title of host publication | Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems |
Number of pages | 12 |
Place of Publication | New York, NY, USA |
Publisher | Association for Computing Machinery |
Publication date | 13 Jun 2016 |
Pages | 37-48 |
ISBN (Print) | 978-1-4503-4021-2 |
DOIs | |
Publication status | Published - 13 Jun 2016 |
Externally published | Yes |
Event | 10th ACM International Conference on Distributed and Event-Based Systems - Irvine, CA, United States Duration: 20 Jun 2016 → 24 Jun 2016 |
Conference
Conference | 10th ACM International Conference on Distributed and Event-Based Systems |
---|---|
Country/Territory | United States |
City | Irvine, CA |
Period | 20/06/2016 → 24/06/2016 |