Twisted tabulation hashing

Mikkel Thorup; Mihai Patrascu

doi:10.1137/1.9781611973105.16

Department of Computer Science

Abstract

We introduce a new tabulation-based hashing scheme called "twisted tabulation". It is essentially as simple and fast as simple tabulation, but has some powerful distributional properties illustrating its promise: (1) If we sample keys with arbitrary probabilities, then with high probability, the number of samples inside any subset is concentrated exponentially. With bounded independence we only get polynomial concentration, and with simple tabulation, we have no good bound even in the basic case of tossing an (unbiased) coin for each key. (2) With classic hash tables such as linear probing and collision-chaining, a window of B operations takes O(B) time with high probability, for B = Ω(lg n). Good amortized performance over any window of size B is equivalent to guaranteed throughput for an on-line system processing a stream via a buffer of size B (e.g., Internet routers).
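The abstract does not spell out the construction, so the following is only a minimal sketch of how twisted tabulation is usually described, not code taken from this record. It assumes a key split into c characters: each tail character is looked up in a table that returns hash bits plus extra "twister" bits, and the XOR of the twister bits perturbs the head character before its own lookup. The names `make_tables` and `twisted_hash` and the parameters `c`, `char_bits`, `out_bits` and `seed` are hypothetical, and Python's `random` module stands in for a source of truly random table entries.

```python
import random


def make_tables(c=4, char_bits=8, out_bits=32, seed=0):
    """Build lookup tables for a key of c characters (illustrative only)."""
    rng = random.Random(seed)  # stand-in for truly random table entries
    size = 1 << char_bits
    # Head table: one out_bits-bit value per possible head character.
    head = [rng.getrandbits(out_bits) for _ in range(size)]
    # Tail tables: each entry packs char_bits twister bits above out_bits hash bits.
    tails = [
        [rng.getrandbits(out_bits + char_bits) for _ in range(size)]
        for _ in range(c - 1)
    ]
    return head, tails


def twisted_hash(key, head, tails, char_bits=8, out_bits=32):
    """Hash an integer key of len(tails) + 1 characters of char_bits bits each."""
    char_mask = (1 << char_bits) - 1
    head_char = key & char_mask  # lowest character is treated as the head
    acc = 0
    x = key
    for table in tails:
        x >>= char_bits
        acc ^= table[x & char_mask]  # simple tabulation over the tail characters
    twister = acc >> out_bits  # top char_bits bits of the accumulated value
    tail_hash = acc & ((1 << out_bits) - 1)
    # Twist the head character, then finish with one more table lookup.
    return tail_hash ^ head[head_char ^ twister]


if __name__ == "__main__":
    head, tails = make_tables()
    print(hex(twisted_hash(0x12345678, head, tails)))
```

Compared with simple tabulation, the only extra work in this sketch is the wider tail entries and one XOR on the head character, which fits the abstract's description of the scheme as essentially as simple and fast as simple tabulation.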

Original language	English
Title of host publication	Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms
Editors	Sanjeev Khanna
Number of pages	20
Publisher	Association for Computing Machinery
Publication date	2013
Pages	209-228
ISBN (Print)	978-1-61197-251-1
ISBN (Electronic)	978-1-61197-310-5
DOIs	https://doi.org/10.1137/1.9781611973105.16
Publication status	Published - 2013
Event	Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms - Astor Crowne Plaza Hotel, New Orleans, United States; Duration: 6 Jan 2013 → 8 Jan 2013; Conference number: 24

Conference

Conference	Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms
Number	24
Location	Astor Crowne Plaza Hotel
Country/Territory	United States
City	New Orleans
Period	06/01/2013 → 08/01/2013

Access to Document

10.1137/1.9781611973105.16

Cite this

@inproceedings{d0adfb1ce9c34749b83095dc7cac9436,

title = "Twisted tabulation hashing",

abstract = "We introduce a new tabulation-based hashing scheme called {"}twisted tabulation{"}. It is essentially as simple and fast as simple tabulation, but has some powerful distributional properties illustrating its promise: (1) If we sample keys with arbitrary probabilities, then with high probability, the number of samples inside any subset is concentrated exponentially. With bounded independence we only get polynomial concentration, and with simple tabulation, we have no good bound even in the basic case of tossing an (unbiased) coin for each key. (2) With classic hash tables such as linear probing and collision-chaining, a window of B operations takes O(B) time with high probability, for B = Ω(lg n). Good amortized performance over any window of size B is equivalent to guaranteed throughput for an on-line system processing a stream via a buffer of size B (e.g., Internet routers).",

author = "Mikkel Thorup and Mihai Patrascu",

year = "2013",

doi = "10.1137/1.9781611973105.16",

language = "English",

isbn = "978-1-61197-251-1",

pages = "209--228",

editor = "Sanjeev Khanna",

booktitle = "Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms",

publisher = "Association for Computing Machinery",

note = "Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms ; Conference date: 06-01-2013 Through 08-01-2013",

}

TY - GEN

T1 - Twisted tabulation hashing

AU - Thorup, Mikkel

AU - Patrascu, Mihai

N1 - Conference code: 24

PY - 2013

Y1 - 2013

AB - We introduce a new tabulation-based hashing scheme called "twisted tabulation". It is essentially as simple and fast as simple tabulation, but has some powerful distributional properties illustrating its promise: (1) If we sample keys with arbitrary probabilities, then with high probability, the number of samples inside any subset is concentrated exponentially. With bounded independence we only get polynomial concentration, and with simple tabulation, we have no good bound even in the basic case of tossing an (unbiased) coin for each key. (2) With classic hash tables such as linear probing and collision-chaining, a window of B operations takes O(B) time with high probability, for B = Ω(lg n). Good amortized performance over any window of size B is equivalent to guaranteed throughput for an on-line system processing a stream via a buffer of size B (e.g., Internet routers).

U2 - 10.1137/1.9781611973105.16

DO - 10.1137/1.9781611973105.16

M3 - Article in proceedings

SN - 978-1-61197-251-1

SP - 209

EP - 228

BT - Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms

A2 - Khanna, Sanjeev

PB - Association for Computing Machinery

T2 - Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms

Y2 - 6 January 2013 through 8 January 2013

ER -
