Abstract
We propose a family of new evaluation measures, called Markov Precision (MP), which exploits continuous-time and discrete-time Markov chains in order to inject user models into precision. Continuous-time MP behaves like time-calibrated measures, bringing the time spent by the user into the evaluation of a system; discrete-time MP behaves like traditional evaluation measures. Being part of the same Markovian framework, the time-based and rank-based versions of MP produce values that are directly comparable. We show that it is possible to re-create Average Precision (AP) using specific user models, which helps explain AP in terms of user models that are more realistic than the ones currently used to justify it. We also propose several alternative models that take into account different possible behaviors in scanning a ranked result list. Finally, we conduct a thorough experimental evaluation of MP on standard TREC collections to show that MP is as reliable as other measures, and we provide an example of calibrating its time parameters based on click logs from Yandex.
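The abstract does not state the MP formula, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a discrete-time chain whose states are the ranks of the relevant retrieved documents, and a measure that weights precision at each of those ranks by the chain's stationary distribution. The names `stationary_distribution`, `markov_precision`, and the transition matrices are hypothetical, chosen for illustration. Under these assumptions, a user model with a uniform stationary distribution reproduces AP when every relevant document is retrieved.

```python
import numpy as np


def precision_at(rels, k):
    """Precision at 1-based rank k for a binary relevance list."""
    return sum(rels[:k]) / k


def average_precision(rels):
    """AP, assuming every relevant document appears in the ranked list."""
    ranks = [k for k, r in enumerate(rels, start=1) if r]
    return sum(precision_at(rels, k) for k in ranks) / len(ranks)


def stationary_distribution(P):
    """Solve pi @ P = pi with sum(pi) = 1 for a row-stochastic matrix P."""
    n = P.shape[0]
    # Stack the balance equations with the normalisation constraint.
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi


def markov_precision(rels, P):
    """Hypothetical sketch: stationary-weighted precision at relevant ranks.

    P is an assumed transition matrix over the relevant ranks only; it is
    not taken from the paper.
    """
    ranks = [k for k, r in enumerate(rels, start=1) if r]
    pi = stationary_distribution(P)
    return float(sum(p * precision_at(rels, k) for p, k in zip(pi, ranks)))


rels = [1, 0, 1, 0, 0, 1]  # relevant documents at ranks 1, 3, and 6

# Assumed user model 1: jumps uniformly among the relevant documents.
P_uniform = np.full((3, 3), 1.0 / 3.0)

# Assumed user model 2: strongly favours the top-ranked relevant document.
P_top = np.tile(np.array([0.8, 0.1, 0.1]), (3, 1))

mp_uniform = markov_precision(rels, P_uniform)
mp_top = markov_precision(rels, P_top)
ap = average_precision(rels)
```

Here `mp_uniform` coincides with `ap`, while `mp_top` is larger because that user model concentrates its weight on rank 1, where precision is highest. The continuous-time variant and the calibration on click logs are not covered by this sketch.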
Original language | English |
---|---|
Title of host publication | SIGIR 2014 - Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval |
Number of pages | 10 |
Publisher | ASSOCIATION FOR COMPUTING MACHINERY. JOU |
Publication date | 1 Jan 2014 |
Pages | 597-606 |
ISBN (Print) | 9781450322591 |
Publication status | Published - 1 Jan 2014 |
Event | 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2014 - Gold Coast, QLD, Australia. Duration: 6 Jul 2014 → 11 Jul 2014 |
Conference
Conference | 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2014 |
---|---|
Country/Territory | Australia |
City | Gold Coast, QLD |
Period | 06/07/2014 → 11/07/2014 |
Sponsor | Baidu, Google, Microsoft Research, Special Interest Group on Information Retrieval (ACM SIGIR), Tourism and Events Queensland, et al. |
Keywords
- Evaluation
- Markov precision
- Time
- User model