Istituto di Scienza e Tecnologie dell'Informazione     
Macdonald C., Ounis I., Tonellotto N. Upper bound approximations for dynamic pruning. Technical report, 2011.
Dynamic pruning strategies for information retrieval systems can increase querying efficiency without decreasing effectiveness, by using upper bounds to safely omit the scoring of documents that are unlikely to make the final retrieved set. Often, such upper bounds are pre-calculated for a given weighting model at indexing time. However, this precludes the changing, adaptation or training of the weighting model without re-calculating of the upper bounds. Instead, upper bounds should be approximated at querying time from various statistics of each term to allow on-the-fly adaptation of the applied retrieval strategy. This work formulates, by using a uniform notation, the problem of determining a term upper bound given a weighting model and discusses the limitations of existing approximations. Moreover, we propose an upper bound approximation using a constrained non-linear maximisation problem. We prove that our proposed upper bound approximation does not impact on the retrieval effectiveness of several modern weighting models from various different families. We also show the applicability of the approximation for the Markov Random Field proximity model. Finally, we empirically examine how the accuracy of the upper bound approximation impacts the number of postings scored and the resulting efficiency, in the context of several large Web test collections.
Subject Performance
Dynamic Pruning
Upper Bounds
H.3.3 Information Search & Retrieval

Icona documento 1) Download Document PDF

Icona documento Open access Icona documento Restricted Icona documento Private


Per ulteriori informazioni, contattare: Librarian http://puma.isti.cnr.it

Valid HTML 4.0 Transitional