Estimating global statistics for unstructured P2P search in the presence of adversarial peers

Sami Richardson, Ingemar J. Cox

6 Citationer (Scopus)

Abstract

A common problem in unstructured peer-to-peer (P2P) information retrieval is the need to compute global statistics of the full collection, when only a small subset of the collection is visible to a peer. Without accurate estimates of these statistics, the effectiveness of modern retrieval models can be reduced. We show that for the case of a probably approximately correct P2P architecture, and using either the BM25 retrieval model or a language model with Dirichlet smoothing, very close approximations of the required global statistics can be estimated with very little overhead and a small extension to the protocol. However, through theoretical modeling and simulations we demonstrate this technique also greatly increases the ability for adversarial peers to manipulate search results. We show an adversary controlling fewer than 10% of peers can censor or increase the rank of documents, or disrupt overall search results. As a defense, we propose a simple modification to the extension, and show global statistics estimation is viable even when up to 40% of peers are adversarial.

OriginalsprogUdefineret/Ukendt
TitelProceedings of the 37th International ACM SIGIR Conference on Research 38; Development in Information Retrieval
Antal sider10
ForlagIEEE/ACM
Publikationsdato2014
Sider203-212
DOI
StatusUdgivet - 2014
Udgivet eksterntJa
BegivenhedInternational ACM SIGIR conference on Research & development in information retrieval (2014) - Gold Coast, Australien
Varighed: 6 jul. 201411 jul. 2014

Konference

KonferenceInternational ACM SIGIR conference on Research & development in information retrieval (2014)
Land/OmrådeAustralien
ByGold Coast
Periode06/07/201411/07/2014

Emneord

  • adversarial ir, p2p ir

Citationsformater