Abstract
Ranking search results is an ongoing research topic in information retrieval. The traditional models are the vector space, probabilistic and language models, and more recently machine learning has been deployed in an effort to learn how to rank search results. Categorization of search results has also been studied as a means to organize the results, and hence to improve users search experience. However there is little research to-date on ranking categories of results in comparison to ranking the results themselves. In this paper, we propose a probabilistic ranking model that includes categories in addition to a ranked results list, and derive six ranking methods from the model. These ranking methods utilize the following features: the class probability distribution based on query classification, the lowest ranked document within each class and the class size. An empirical study was carried out to compare these methods with the traditional ranked-list approach in terms of rank positions of click-through documents and experimental results show that there is no simpler winner in all cases. Better performance is attained by class size or a combination of the class probability distribution of the queries and the rank of the document with the lowest list rank within the class.
Original language | English |
---|---|
Title of host publication | KDIR |
Number of pages | 8 |
Publication date | 2010 |
Pages | 294-301 |
Publication status | Published - 2010 |
Externally published | Yes |