Abstract
A common problem in unstructured peer-to-peer (P2P) information retrieval is the need to compute global statistics of the full collection, when only a small subset of the collection is visible to a peer. Without accurate estimates of these statistics, the effectiveness of modern retrieval models can be reduced. We show that for the case of a probably approximately correct P2P architecture, and using either the BM25 retrieval model or a language model with Dirichlet smoothing, very close approximations of the required global statistics can be estimated with very little overhead and a small extension to the protocol. However, through theoretical modeling and simulations we demonstrate this technique also greatly increases the ability for adversarial peers to manipulate search results. We show an adversary controlling fewer than 10% of peers can censor or increase the rank of documents, or disrupt overall search results. As a defense, we propose a simple modification to the extension, and show global statistics estimation is viable even when up to 40% of peers are adversarial.
Originalsprog | Udefineret/Ukendt |
---|---|
Titel | Proceedings of the 37th International ACM SIGIR Conference on Research 38; Development in Information Retrieval |
Antal sider | 10 |
Forlag | IEEE/ACM |
Publikationsdato | 2014 |
Sider | 203-212 |
DOI | |
Status | Udgivet - 2014 |
Udgivet eksternt | Ja |
Begivenhed | International ACM SIGIR conference on Research & development in information retrieval (2014) - Gold Coast, Australien Varighed: 6 jul. 2014 → 11 jul. 2014 |
Konference
Konference | International ACM SIGIR conference on Research & development in information retrieval (2014) |
---|---|
Land/Område | Australien |
By | Gold Coast |
Periode | 06/07/2014 → 11/07/2014 |
Emneord
- adversarial ir, p2p ir