PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Dato D., Lucchese C., Nardini F. M., Orlando S., Perego R., Tonellotto N., Venturini R. Fast ranking with additive ensembles of oblivious and non-oblivious regression trees. In: ACM Transactions on Information Systems, vol. 35 (2) article n. 15. ACM, 2016.
 
 
Abstract (English)
Learning-to-Rank models based on additive ensembles of regression trees have proven very effective for scoring query results returned by large-scale Web search engines. Unfortunately, the computational cost of scoring thousands of candidate documents by traversing large ensembles of trees is high. Thus, several works have investigated solutions aimed at improving the efficiency of document scoring by exploiting advanced features of modern CPUs and memory hierarchies. In this paper, we present QS, a new algorithm that adopts a novel cache-efficient representation of a given tree ensemble, performs an interleaved traversal by means of fast bitwise operations, and also supports ensembles of oblivious trees. An extensive and detailed test assessment is conducted on two standard Learning-to-Rank datasets and on a novel, very large dataset that we made publicly available for conducting significant efficiency tests. The experiments show unprecedented speedups over the best state-of-the-art baselines, ranging from 1.9x to 6.6x. The analysis of low-level profiling traces shows that the efficiency of QS is due to its cache-aware approach, both in terms of data layout and access patterns, and to a control flow that entails very low branch misprediction rates.
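The interleaved, bitwise traversal mentioned in the abstract can be illustrated with a minimal Python sketch. It is one possible reading of the idea, not the authors' implementation. The abstract does not spell out several things that are assumed here: the nested-tuple tree encoding, the split convention (go left when `x[feature] <= threshold`), the use of plain Python integers as bitvectors, and the hypothetical names `index_ensemble`, `score`, `TREES`, `INDEX` and `LEAVES`. The sketch makes no attempt at the cache-aware memory layout the abstract credits for the measured speedups.

```python
from collections import defaultdict

# Assumed encoding: an internal node is (feature, threshold, left, right);
# a leaf is a bare float. A document goes left when x[feature] <= threshold.
TREES = [
    (0, 0.5, 1.0, (1, 2.0, 2.0, 3.0)),
    (1, 1.0, -1.0, 0.5),
]

def index_ensemble(trees):
    """Group all internal nodes by feature, sorted by ascending threshold."""
    by_feature = defaultdict(list)  # feature -> [(threshold, tree_id, zero_bits)]
    leaf_values = []                # per tree: leaf outputs, left to right
    for tree_id, root in enumerate(trees):
        leaves = []

        def visit(node):
            if not isinstance(node, tuple):
                leaves.append(node)
                return
            feature, threshold, left, right = node
            first = len(leaves)
            visit(left)
            last = len(leaves)
            # Bits of the leaves in the left subtree: they become
            # unreachable whenever this node's test is false.
            zero_bits = ((1 << (last - first)) - 1) << first
            by_feature[feature].append((threshold, tree_id, zero_bits))
            visit(right)

        visit(root)
        leaf_values.append(leaves)
    for nodes in by_feature.values():
        nodes.sort(key=lambda n: n[0])
    return by_feature, leaf_values

def score(x, by_feature, leaf_values):
    """Score one document by visiting nodes feature by feature, not tree by tree."""
    # One bitvector per tree; bit i set means leaf i is still a candidate.
    live = [(1 << len(values)) - 1 for values in leaf_values]
    for feature, nodes in by_feature.items():
        for threshold, tree_id, zero_bits in nodes:
            if x[feature] <= threshold:
                break  # thresholds are sorted, so the remaining tests are true
            live[tree_id] &= ~zero_bits
    total = 0.0
    for bits, values in zip(live, leaf_values):
        exit_leaf = (bits & -bits).bit_length() - 1  # leftmost surviving leaf
        total += values[exit_leaf]
    return total

INDEX, LEAVES = index_ensemble(TREES)
```

Calling `score([0.7, 1.5], INDEX, LEAVES)` adds the contributions of both trees without ever following a root-to-leaf path. The only data-dependent branch is the threshold comparison inside the per-feature scan.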
URL: http://dl.acm.org/citation.cfm?id=2987380
DOI: 10.1145/2987380
Subject: Learning to rank; Efficiency; H.3.3 INFORMATION STORAGE AND RETRIEVAL. Information Search and Retrieval



 


For further information, contact: Librarian http://puma.isti.cnr.it
