Random walk term weighting for information retrieval

Christina Lioma; Roi Blanco

14 Citations (Scopus)

Abstract

We present a way of estimating term weights for Information Retrieval (IR), using term co-occurrence as a measure of dependency between terms. We use the random walk graph-based ranking algorithm on a graph that encodes terms and co-occurrence dependencies in text, from which we derive term weights that represent a quantification of how a term contributes to its context. Evaluation on two TREC collections and 350 topics shows that the random walk-based term weights perform at least comparably to the traditional tf-idf term weighting, while they outperform it when the distance between co-occurring terms is between 6 and 30 terms.
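The abstract does not spell out how the graph is built or how the walk is run, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes an undirected co-occurrence graph, a PageRank-style random walk with a damping factor of 0.85, and a fixed maximum distance between co-occurring terms. The function names `cooccurrence_graph` and `random_walk_weights` and the parameters `max_distance`, `damping`, and `iterations` are hypothetical.

```python
def cooccurrence_graph(tokens, max_distance=6):
    """Link two distinct terms if they occur within max_distance positions.

    Assumption: the graph is undirected and unweighted.
    """
    graph = {term: set() for term in tokens}
    for i, term in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + max_distance]:
            if other != term:
                graph[term].add(other)
                graph[other].add(term)
    return graph


def random_walk_weights(graph, damping=0.85, iterations=50):
    """PageRank-style power iteration over the term graph.

    Assumption: damping=0.85 and a fixed iteration count; neither value
    is taken from the abstract.
    """
    n = len(graph)
    if n == 0:
        return {}
    scores = {term: 1.0 / n for term in graph}
    for _ in range(iterations):
        # Terms without neighbours spread their score uniformly.
        dangling = sum(scores[t] for t, nbrs in graph.items() if not nbrs)
        updated = {}
        for term, neighbours in graph.items():
            incoming = sum(scores[u] / len(graph[u]) for u in neighbours)
            updated[term] = (1 - damping) / n + damping * (incoming + dangling / n)
        scores = updated
    return scores


tokens = "a b a c a d".split()
weights = random_walk_weights(cooccurrence_graph(tokens, max_distance=1))
```

In this toy input the term `a` co-occurs with every other term, so it ends up with the largest weight. Raising `max_distance` adds more edges, which is the knob the abstract's "between 6 and 30 terms" result refers to.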

Original language	English
Title	SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Publisher	Association for Computing Machinery
Publication date	2007
Pages	829-830
Status	Published - 2007
Published externally	Yes

Access to document

http://64.238.147.53/citation.cfm?id=1277741.1277930&coll=DL&dl=GUIDE&CFID=87655016&CFTOKEN=30826131

Citation formats

Random walk term weighting for information retrieval. / Lioma, Christina; Blanco, Roi.
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval. Association for Computing Machinery, 2007. pp. 829-830.

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

@inproceedings{42e5f973131d41b1bf7a8a90cd5f55ca,
title = "Random walk term weighting for information retrieval",
abstract = "We present a way of estimating term weights for Information Retrieval (IR), using term co-occurrence as a measure of dependency between terms. We use the random walk graph-based ranking algorithm on a graph that encodes terms and co-occurrence dependencies in text, from which we derive term weights that represent a quantification of how a term contributes to its context. Evaluation on two TREC collections and 350 topics shows that the random walk-based term weights perform at least comparably to the traditional tf-idf term weighting, while they outperform it when the distance between co-occurring terms is between 6 and 30 terms.",
author = "Christina Lioma and Roi Blanco",
note = "Copyright is held by the author/owner(s). SIGIR{\textquoteright}07, July 23–27, 2007, Amsterdam, The Netherlands. ACM 978-1-59593-597-7/07/0007. ",
year = "2007",
language = "English",
pages = "829--830",
booktitle = "SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval",
publisher = "Association for Computing Machinery",
}

TY - GEN
T1 - Random walk term weighting for information retrieval
AU - Lioma, Christina
AU - Blanco, Roi
N1 - Copyright is held by the author/owner(s). SIGIR’07, July 23–27, 2007, Amsterdam, The Netherlands. ACM 978-1-59593-597-7/07/0007.
PY - 2007
Y1 - 2007
N2 - We present a way of estimating term weights for Information Retrieval (IR), using term co-occurrence as a measure of dependency between terms. We use the random walk graph-based ranking algorithm on a graph that encodes terms and co-occurrence dependencies in text, from which we derive term weights that represent a quantification of how a term contributes to its context. Evaluation on two TREC collections and 350 topics shows that the random walk-based term weights perform at least comparably to the traditional tf-idf term weighting, while they outperform it when the distance between co-occurring terms is between 6 and 30 terms.
AB - We present a way of estimating term weights for Information Retrieval (IR), using term co-occurrence as a measure of dependency between terms. We use the random walk graph-based ranking algorithm on a graph that encodes terms and co-occurrence dependencies in text, from which we derive term weights that represent a quantification of how a term contributes to its context. Evaluation on two TREC collections and 350 topics shows that the random walk-based term weights perform at least comparably to the traditional tf-idf term weighting, while they outperform it when the distance between co-occurring terms is between 6 and 30 terms.
M3 - Article in proceedings
SP - 829
EP - 830
BT - SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
PB - Association for Computing Machinery
ER -
