Abstract
A typical evaluation of a retrieval system involves computing an effectiveness metric, e.g., average precision, for each topic of a test collection and then using the average of that metric, e.g., mean average precision, to express overall effectiveness. However, an average does not capture all the important aspects of effectiveness and, used alone, may not be an informative measure of a system's effectiveness. In addition to the average, we need to consider how effectiveness varies across topics; we refer to this variation as the variability in effectiveness. In this paper, we explore how the variance of a metric can be used as a measure of variability. We define a variability metric and illustrate how it can be used in practice.
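To make the contrast between an average and the variation around it concrete, the following is a minimal sketch, not the variability metric defined in the paper. It assumes the plain population variance of per-topic scores as a stand-in; the function name `summarize` and the per-topic values are hypothetical and invented purely for illustration.

```python
from statistics import mean, pvariance


def summarize(per_topic_scores):
    """Return (mean, population variance) of a metric over topics.

    Assumption: population variance is used as a simple proxy for
    variability; the paper's own metric may be defined differently.
    """
    scores = list(per_topic_scores)
    if not scores:
        raise ValueError("need at least one per-topic score")
    return mean(scores), pvariance(scores)


# Hypothetical per-topic average precision values (invented, not real results).
system_a = [0.5, 0.5, 0.5, 0.5]  # identical effectiveness on every topic
system_b = [0.9, 0.1, 0.8, 0.2]  # strong on some topics, weak on others

mean_a, var_a = summarize(system_a)
mean_b, var_b = summarize(system_b)

# The two means are essentially equal, while the variances differ.
print(f"A: mean={mean_a:.3f} variance={var_a:.3f}")
print(f"B: mean={mean_b:.3f} variance={var_b:.3f}")
```

In this toy case the two systems are indistinguishable by their mean score alone, yet the second is far less consistent across topics, which is the kind of difference a variability measure is meant to expose.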
Field | Value |
---|---|
Original language | English |
Title of host publication | Advances in Multidisciplinary Retrieval |
Number of pages | 14 |
Publisher | Springer Science+Business Media |
Publication date | 2010 |
Pages | 70-83 |
Publication status | Published - 2010 |
Externally published | Yes |