PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Tonellotto N., Macdonald C., Ounis I. Efficient dynamic pruning with proximity support. In: LSDS-IR - SIGIR 2010 - Workshop on Large Scale Distributed Search (Geneva, Switzerland, July 2010). Proceedings, pp. 31 - 35. CEUR Workshop Proceedings, 2010.
 
 
Abstract
(English)
Modern retrieval approaches do not rely on single-term weighting models alone when ranking documents. Proximity weighting models are also in common use; they assign high scores to pairs of query terms that co-occur close to each other in a document. Adopting these proximity weighting models can add computational overhead when documents are scored, negatively impacting the efficiency of the retrieval process. In this paper, we discuss the integration of proximity weighting models into efficient dynamic pruning strategies. In particular, we propose to modify document-at-a-time strategies to include proximity scoring without any modifications to pre-existing index structures. Our resulting two-stage dynamic pruning strategies consider only single query terms during first-stage pruning, but can terminate the proximity scoring of a document early if it can be shown that the document will never be retrieved. We empirically examine the efficiency benefits of our approach using a large Web test collection of 50 million documents and 10,000 queries from a real query log. Our results show that our proposed two-stage dynamic pruning strategies are considerably more efficient than the original strategies, particularly for queries of 3 or more terms.
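The early-termination idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the in-memory `docs` layout, the toy `proximity_score` function (one point per pair of query terms found within a `window` of tokens), and its upper bound `max_prox` are all assumptions made for illustration, and first-stage pruning of single-term scores is replaced here by exhaustive term scoring for brevity. The sketch only shows how proximity scoring of a document might be skipped when its term score plus the largest possible proximity score cannot beat the current top-k threshold.

```python
import heapq
from itertools import combinations


def proximity_score(positions, window=5):
    """Toy proximity score (an assumption, not the paper's model):
    one point per pair of terms occurring within `window` tokens."""
    score = 0.0
    for a, b in combinations(sorted(positions), 2):
        if any(abs(i - j) < window for i in positions[a] for j in positions[b]):
            score += 1.0
    return score


def two_stage_top_k(docs, query, k, window=5):
    """Document-at-a-time top-k scoring with early termination of proximity scoring.

    docs: hypothetical in-memory index, {doc_id: {term: (term_score, [positions])}}.
    Returns (top-k list of (score, doc_id), number of documents whose
    proximity scoring was skipped).
    """
    # Upper bound of the toy proximity score: one point per query term pair.
    max_prox = float(len(query) * (len(query) - 1) // 2)
    heap = []      # min-heap holding the current top-k as (score, doc_id)
    skipped = 0
    for doc_id in sorted(docs):
        postings = {t: docs[doc_id][t] for t in query if t in docs[doc_id]}
        if not postings:
            continue
        # Stage 1: single-term scores only.
        term_score = sum(score for score, _ in postings.values())
        threshold = heap[0][0] if len(heap) == k else float("-inf")
        # Stage 2: skip proximity scoring if the document cannot enter the top k.
        if term_score + max_prox <= threshold:
            skipped += 1
            continue
        total = term_score + proximity_score(
            {t: pos for t, (_, pos) in postings.items()}, window
        )
        if len(heap) < k:
            heapq.heappush(heap, (total, doc_id))
        elif total > heap[0][0]:
            heapq.heapreplace(heap, (total, doc_id))
    return sorted(heap, reverse=True), skipped
```

In this sketch the saving comes from never touching position lists for documents that fail the bound check, which is consistent with the abstract's remark that no changes to existing index structures are needed.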
Subject: Information Retrieval; Search Engines; H.3.3 Information Search and Retrieval





 


For further information, contact: Librarian http://puma.isti.cnr.it
