Increasing ranked accuracy for unstructured distributed search with dynamic replication

Sami Richardson, Ingemar J Cox

1 Citation (Scopus)

Abstract

In unstructured peer-to-peer networks it is necessary to query every node to guarantee finding a particular document. This is usually not feasible, so search is probabilistic. In previous work the metric of rank-accuracy was proposed to measure the performance of a query. It compares the set of documents retrieved from a search of a random subset of nodes to the set that would have been retrieved from an exhaustive search of all nodes, assigning a higher weight to higher ranked documents. It was shown that rank-accuracy can be significantly increased by replicating documents across nodes in proportion to their weighted retrieval rate, a function of query frequency and rank. However, this work assumed that the query distribution is known and unchanging, both of which are seldom true for practical systems. In this paper we make no such assumptions. We investigate how the popularity- and rank-aware distribution of documents can be approximated by replicating documents as queries are made. Using a simulated network of 10,000 nodes and 10,000,000 queries drawn from a Yahoo! web search engine log, we show that such a scheme can achieve over a 450% increase in overall rank-accuracy when compared to a uniform random distribution of documents. We also show this increase can be improved up to about 650% when the dynamically replicated documents are pruned down to just the terms that appear in queries.

Original languageEnglish
Title of host publicationPeer-to-Peer Computing (P2P), 2013 IEEE Thirteenth International Conference on
Number of pages5
Publication date2013
Publication statusPublished - 2013
Externally publishedYes

Fingerprint

Dive into the research topics of 'Increasing ranked accuracy for unstructured distributed search with dynamic replication'. Together they form a unique fingerprint.

Cite this