Multi-dueling bandits and their application to online ranker evaluation

11 Citations (Scopus)

Abstract

Online ranker evaluation focuses on the challenge of efficiently determining, from implicit user feedback, which ranker out of a finite set of rankers is the best. It can be modeled by dueling bandits, a mathematical model for online learning under limited feedback from pairwise comparisons. Comparisons of pairs of rankers are performed by interleaving their result sets and examining which documents users click on. The dueling bandits model addresses the key issue of which pair of rankers to compare at each iteration. Methods for simultaneously comparing more than two rankers have recently been developed. However, the question of which rankers to compare at each iteration was left open. We address this question by proposing a generalization of the dueling bandits model that uses simultaneous comparisons of an unrestricted number of rankers. We evaluate our algorithm on standard large-scale online ranker evaluation datasets. Our experiments show that the algorithm yields orders-of-magnitude gains in performance compared to state-of-the-art dueling bandit algorithms.
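To make the setting concrete, the following is a minimal, illustrative sketch of a multi-dueling bandit interaction loop in Python. It is not the paper's exact algorithm: the subset-selection rule (upper confidence bounds on pairwise win rates), the parameters K, T, and alpha, and the simulated preference matrix P_true standing in for real multileaved click feedback are all assumptions made for illustration.

```python
import numpy as np

# Illustrative multi-dueling bandit loop (a sketch, not the paper's method).
# Each round: pick a subset of rankers that might still be best according to
# optimistic pairwise estimates, compare them simultaneously, update statistics.

rng = np.random.default_rng(0)

K = 6        # number of rankers (hypothetical)
T = 5000     # number of interaction rounds
alpha = 0.51 # exploration strength in the confidence bounds

# Hidden ground-truth preferences: P_true[i, j] = Pr(ranker i beats ranker j).
theta = np.sort(rng.uniform(0.1, 0.9, size=K))[::-1]
P_true = 0.5 + 0.5 * (theta[:, None] - theta[None, :])

wins = np.ones((K, K))       # pairwise win counts (optimistic prior)
plays = 2 * np.ones((K, K))  # pairwise comparison counts

for t in range(1, T + 1):
    # Empirical win rates and their upper confidence bounds.
    p_hat = wins / plays
    ucb = p_hat + np.sqrt(alpha * np.log(t) / plays)
    np.fill_diagonal(ucb, 0.5)

    # Keep every ranker that could still beat all others under its UCBs.
    candidates = [i for i in range(K) if np.all(np.delete(ucb[i], i) >= 0.5)]
    if len(candidates) < 2:
        candidates = list(range(K))  # fall back to comparing everything

    # Simulated multileaved comparison: draw each pairwise outcome within the
    # chosen subset from P_true (stands in for click feedback on a
    # multileaved result list).
    for i in candidates:
        for j in candidates:
            if i < j:
                i_beats_j = rng.random() < P_true[i, j]
                wins[i, j] += i_beats_j
                wins[j, i] += 1 - i_beats_j
                plays[i, j] += 1
                plays[j, i] += 1

# Estimate the best ranker as the one beating the most opponents empirically.
best = np.argmax((wins / plays > 0.5).sum(axis=1))
print(f"estimated best ranker: {best} (true best: 0)")
```

The point of the sketch is the subset-selection step: unlike a dueling bandit, which must commit to a single pair per round, a multi-dueling bandit can compare all still-plausible rankers at once, which is what makes the simultaneous-comparison feedback useful.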

Original language: Undefined/Unknown
Title: Proceedings of the 25th ACM International Conference on Information and Knowledge Management
Number of pages: 6
Publisher: Association for Computing Machinery
Publication date: 24 Oct 2016
Pages: 2161-2166
ISBN (electronic): 978-1-4503-4073-1
DOI
Status: Published - 24 Oct 2016
Event: 25th ACM International Conference on Information and Knowledge Management - Indianapolis, USA
Duration: 24 Oct 2016 - 28 Oct 2016
Conference number: 25

Conference

Conference: 25th ACM International Conference on Information and Knowledge Management
Number: 25
Country/Territory: USA
City: Indianapolis
Period: 24/10/2016 - 28/10/2016
Name: ACM International Conference on Information and Knowledge Management

Keywords

  • cs.IR
  • cs.LG
  • stat.ML
