PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Esuli A. PP-Index: using permutation prefixes for efficient and scalable similarity search (Extended Abstract). In: SEBD 2010 - 18th Italian Symposium on Advanced Database Systems (Rimini, Italy, 20-23 June 2010). Atti, pp. 318 - 325. Sonia Bergamaschi, Stefano Lodi, Riccardo Martoglia, Claudio Sartori (eds.). Editrice Esculapio, 2010.
 
 
Abstract
(English)
The Permutation Prefix Index (PP-Index) is a data structure for efficient approximate similarity search. It is a permutation-based index: each indexed object is represented by "its view of the surrounding world", i.e., the list of a set of reference objects sorted by their distance from the indexed object. In its basic formulation, the PP-Index is biased toward efficiency. We show how effectiveness can reach optimal levels by adopting two "boosting" strategies, multiple index search and multiple query search, both of which parallelize well. We study both the efficiency and the effectiveness of the PP-Index, experimenting with collections of up to one hundred million objects represented in a very high-dimensional similarity space.
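The permutation-prefix representation described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the function names (`permutation_prefix`, `build_index`, `search`), the Euclidean distance, the plain dictionary used as the index, and the exact-prefix-match lookup are all assumptions made for illustration. Each object is mapped to the ordered identifiers of its closest reference objects, objects sharing a prefix are grouped together, and a query is answered by ranking only the objects in its own group by true distance.

```python
import math
from collections import defaultdict


def permutation_prefix(obj, references, prefix_len):
    """Return the indices of the prefix_len reference objects closest to obj,
    nearest first. Ties are broken by reference index (an assumption)."""
    order = sorted(
        range(len(references)),
        key=lambda i: (math.dist(obj, references[i]), i),
    )
    return tuple(order[:prefix_len])


def build_index(objects, references, prefix_len):
    """Group objects by permutation prefix (hypothetical dict-based index)."""
    index = defaultdict(list)
    for obj in objects:
        index[permutation_prefix(obj, references, prefix_len)].append(obj)
    return index


def search(index, query, references, prefix_len, k):
    """Approximate k-nearest-neighbor search: only objects whose prefix equals
    the query's prefix are considered, then ranked by true distance."""
    prefix = permutation_prefix(query, references, prefix_len)
    candidates = index.get(prefix, [])
    return sorted(candidates, key=lambda o: math.dist(query, o))[:k]


# Toy data: three reference objects and four indexed objects in the plane.
references = [(0, 0), (10, 0), (0, 10)]
objects = [(1, 0), (2, 1), (9, 1), (1, 9)]
index = build_index(objects, references, prefix_len=2)
nearest = search(index, (1.5, 0), references, prefix_len=2, k=1)
```

Because only one group of candidates is inspected, a search like this is fast but can miss true neighbors that fall under a different prefix, which is the efficiency bias the abstract mentions.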
Subject: Approximate Similarity Search; Access Methods; H.3.3 Information Search and Retrieval



 


For further information, contact: Librarian http://puma.isti.cnr.it
