Estimating global statistics for unstructured P2P search in the presence of adversarial peers

Sami Richardson, Ingemar J. Cox

6 Citations (Scopus)

Abstract

A common problem in unstructured peer-to-peer (P2P) information retrieval is the need to compute global statistics of the full collection, when only a small subset of the collection is visible to a peer. Without accurate estimates of these statistics, the effectiveness of modern retrieval models can be reduced. We show that for the case of a probably approximately correct P2P architecture, and using either the BM25 retrieval model or a language model with Dirichlet smoothing, very close approximations of the required global statistics can be estimated with very little overhead and a small extension to the protocol. However, through theoretical modeling and simulations we demonstrate this technique also greatly increases the ability for adversarial peers to manipulate search results. We show an adversary controlling fewer than 10% of peers can censor or increase the rank of documents, or disrupt overall search results. As a defense, we propose a simple modification to the extension, and show global statistics estimation is viable even when up to 40% of peers are adversarial.

Original languageUndefined/Unknown
Title of host publicationProceedings of the 37th International ACM SIGIR Conference on Research 38; Development in Information Retrieval
Number of pages10
PublisherIEEE/ACM
Publication date2014
Pages203-212
DOIs
Publication statusPublished - 2014
Externally publishedYes
EventInternational ACM SIGIR conference on Research & development in information retrieval (2014) - Gold Coast, Australia
Duration: 6 Jul 201411 Jul 2014

Conference

ConferenceInternational ACM SIGIR conference on Research & development in information retrieval (2014)
Country/TerritoryAustralia
CityGold Coast
Period06/07/201411/07/2014

Cite this