PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Lam H. T., Dinh Viet D., Perego R., Silvestri F. An incremental prefix filtering approach for the all pairs similarity search problem. In: APWeb 2010 - 12th International Asia-Pacific Web Conference (Busan, Korea, 6-8 April 2010). Proceedings, pp. 188-194. IEEE, 2010.
 
 
Abstract
(English)
Given a set of records, a threshold value t, and a similarity function, we investigate the problem of finding all pairs of records whose similarity is above t. We propose several optimizations to existing approaches to this problem. Our algorithm outperforms state-of-the-art algorithms on large, high-dimensional datasets. The speedup we achieve ranges from 30% to 4x, depending on the similarity threshold and the properties of the dataset.
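To make the problem statement concrete, here is a minimal sketch of the textbook prefix filtering idea that the paper's title refers to. It is not the incremental algorithm proposed in the paper. It assumes records are sets of string tokens and that the similarity function is Jaccard similarity. The function names are illustrative only.

```python
from collections import defaultdict
from itertools import combinations
from math import ceil


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed similarity function)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def brute_force_pairs(records, t):
    """Reference answer: compare every pair of records directly."""
    return {
        (i, j)
        for i, j in combinations(range(len(records)), 2)
        if jaccard(records[i], records[j]) > t
    }


def prefix_filter_pairs(records, t):
    """Collect candidates that share a prefix token, then verify each one."""
    # Count token frequencies so tokens can be ordered rarest first.
    freq = defaultdict(int)
    for rec in records:
        for tok in rec:
            freq[tok] += 1

    index = defaultdict(list)  # token -> ids of records with it in their prefix
    result = set()
    for i, rec in enumerate(records):
        ordered = sorted(rec, key=lambda tok: (freq[tok], tok))
        # Two records with Jaccard similarity above t must share at least
        # one token within their prefixes of this length.
        prefix_len = len(ordered) - ceil(t * len(ordered)) + 1
        candidates = set()
        for tok in ordered[:prefix_len]:
            candidates.update(index[tok])
            index[tok].append(i)
        # Verification step: compute the exact similarity for candidates only.
        for j in candidates:
            if jaccard(records[j], rec) > t:
                result.add((j, i))
    return result
```

Both functions return the same set of index pairs. The second one computes the exact similarity only for records that share a rare token, which is where filtering approaches of this family save work on large, high-dimensional data.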
URL: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5474136
DOI: 10.1109/APWeb.2010.30
Subject: All pairs similarity search
Optimization
Prefix filtering
H.2.8 Database Applications. Data mining



 


For further information, contact: Librarian http://puma.isti.cnr.it
