Improving query correctness using centralized probably approximately correct (pac) search

Ingemar Cox, Jianhan Zhu, Ruoxun Fu, Lars Kai Hansen

3 Citations (Scopus)

Abstract

A non-deterministic architecture for information retrieval, known as probably approximately correct (PAC) search, has recently been proposed. However, for equivalent storage and computational resources, the performance of PAC is only 63% of a deterministic system. We propose a modification to the PAC architecture, introducing a centralized query coordination node. To respond to a query, random sampling of computers is replaced with pseudo-random sampling using the query as a seed. Then, for queries that occur frequently, this pseudo-random sample is iteratively refined so that performance improves with each iteration. A theoretical analysis is presented that provides an upper bound on the performance of any iterative algorithm. Two heuristic algorithms are then proposed to iteratively improve the performance of PAC search. Experiments on the TREC-8 dataset demonstrate that performance can improve from 67% to 96% in just 10 iterations, and continues to improve with each iteration. Thus, for queries that occur 10 or more times, the performance of a non-deterministic PAC architecture can closely match that of a deterministic system.

Translated title of the contributionImproving query correctness using centralized probably approximately correct (pac) search
Original languageEnglish
Title of host publicationAdvances in Information Retrieval
Number of pages16
PublisherSpringer Science+Business Media
Publication date2010
Pages265-280
Publication statusPublished - 2010
Externally publishedYes

Fingerprint

Dive into the research topics of 'Improving query correctness using centralized probably approximately correct (pac) search'. Together they form a unique fingerprint.

Cite this