TY - GEN
T1 - Power of d choices with simple tabulation
AU - Aamand, Anders
AU - Knudsen, Mathias Bæk Tejs
AU - Thorup, Mikkel
PY - 2018/7/1
Y1 - 2018/7/1
N2 - We consider the classic d-choice paradigm of Azar et al. [STOC'94] in which m balls are put into n bins sequentially as follows: For each ball we are given a choice of d bins chosen according to d hash functions and the ball is placed in the least loaded of these bins, breaking ties arbitrarily. The interest is in the number of balls in the fullest bin after all balls have been placed. In this paper we suppose that the d hash functions are simple tabulation hash functions which are easy to implement and can be evaluated in constant time. Generalising a result by Dahlgaard et al. [SODA'16] we show that for an arbitrary constant d ≥ 2 the expected maximum load is at mostlg
lg
lg
d
n + O(1). We further show that by using a simple tie-breaking algorithm introduced by Vöcking [J.ACM'03] the expected maximum load is reduced tod
lg
lg
lg
ϕ
n
d + O(1) where ϕd is the rate of growth of the d-ary Fibonacci numbers. Both of these expected bounds match those known from the fully random setting. The analysis by Dahlgaard et al. relies on a proof by P tra cu and Thorup [J.ACM'11] concerning the use of simple tabulation for cuckoo hashing. We require a generalisation to d > 2 hash functions, but the original proof is an 8-page tour de force of ad-hoc arguments that do not appear to generalise. Our main technical contribution is a shorter, simpler and more accessible proof of the result by P tra cu and Thorup, where the relevant parts generalise nicely to the analysis of d choices.
AB - We consider the classic d-choice paradigm of Azar et al. [STOC'94] in which m balls are put into n bins sequentially as follows: For each ball we are given a choice of d bins chosen according to d hash functions and the ball is placed in the least loaded of these bins, breaking ties arbitrarily. The interest is in the number of balls in the fullest bin after all balls have been placed. In this paper we suppose that the d hash functions are simple tabulation hash functions which are easy to implement and can be evaluated in constant time. Generalising a result by Dahlgaard et al. [SODA'16] we show that for an arbitrary constant d ≥ 2 the expected maximum load is at mostlg
lg
lg
d
n + O(1). We further show that by using a simple tie-breaking algorithm introduced by Vöcking [J.ACM'03] the expected maximum load is reduced tod
lg
lg
lg
ϕ
n
d + O(1) where ϕd is the rate of growth of the d-ary Fibonacci numbers. Both of these expected bounds match those known from the fully random setting. The analysis by Dahlgaard et al. relies on a proof by P tra cu and Thorup [J.ACM'11] concerning the use of simple tabulation for cuckoo hashing. We require a generalisation to d > 2 hash functions, but the original proof is an 8-page tour de force of ad-hoc arguments that do not appear to generalise. Our main technical contribution is a shorter, simpler and more accessible proof of the result by P tra cu and Thorup, where the relevant parts generalise nicely to the analysis of d choices.
KW - Balls
KW - Bins
KW - Load Balancing
KW - Phrases Hashing
KW - Simple Tabulation
UR - http://www.scopus.com/inward/record.url?scp=85049793862&partnerID=8YFLogxK
U2 - 10.4230/LIPIcs.ICALP.2018.5
DO - 10.4230/LIPIcs.ICALP.2018.5
M3 - Article in proceedings
AN - SCOPUS:85049793862
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018
A2 - Kaklamanis, Christos
A2 - Marx, Daniel
A2 - Chatzigiannakis, Ioannis
A2 - Sannella, Donald
PB - Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing
T2 - 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018
Y2 - 9 July 2018 through 13 July 2018
ER -