PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Puppin D., Silvestri F. The query-vector document model. In: CIKM 2006 - 15th ACM International Conference on Information and Knowledge Management (Arlington, Virginia, USA, November 6-11 2006). Abstract, pp. 880 - 881. ACM, 2006.
 
 
Abstract
(English)
Modern Web IR systems have to manage collections of billions of documents. The indexes used to represent them are very large data structures, whose form can strongly affect the quality and speed of IR algorithms. Traditionally, documents are modeled in one of two ways: the bag-of-words model or the vector-space model. In the query-vector document model, documents are instead modeled with the list of queries they match, along with the rank they get for each. The query-vector representation of a document is built from a query log. A reference search engine is used in the building phase: for every query in the training set, the system stores the first 100 results along with their rank. This creates a matrix, with documents on columns and queries on rows, where each entry is the rank of a document for a given query.
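
The building phase described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `build_query_vectors`, `document_vector`, and the `search` callable standing in for the reference search engine are hypothetical, and the matrix is stored sparsely as nested dictionaries rather than as a dense array.

```python
from typing import Callable, Dict, Iterable, List

TOP_K = 100  # the abstract stores the first 100 results per query

# Sparse query-by-document matrix: matrix[query][doc_id] = rank (1 = best).
QueryDocMatrix = Dict[str, Dict[str, int]]


def build_query_vectors(
    queries: Iterable[str],
    search: Callable[[str], List[str]],  # hypothetical reference search engine
    top_k: int = TOP_K,
) -> QueryDocMatrix:
    """Run every training query and record the rank of each returned document."""
    matrix: QueryDocMatrix = {}
    for query in queries:
        if query in matrix:
            continue  # repeated queries in the log add no new row
        row: Dict[str, int] = {}
        for rank, doc_id in enumerate(search(query)[:top_k], start=1):
            row.setdefault(doc_id, rank)  # keep the best rank if a doc repeats
        matrix[query] = row
    return matrix


def document_vector(matrix: QueryDocMatrix, doc_id: str) -> Dict[str, int]:
    """Read one column: the queries a document matches and its rank for each."""
    return {q: row[doc_id] for q, row in matrix.items() if doc_id in row}


# Toy stand-in for the reference engine (assumed data, for illustration only).
toy_results = {
    "web search": ["d1", "d2", "d3"],
    "query log": ["d2", "d4"],
}
matrix = build_query_vectors(
    ["web search", "query log", "web search"],
    lambda q: toy_results.get(q, []),
)
d2_vector = document_vector(matrix, "d2")
```

In this sketch a document that never appears among the stored results for any training query gets an empty vector; how such documents are handled is not stated in the abstract.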
Subject Document Partitioning
Collection Selection
Document Model
H.3.1 Content Analysis and Indexing



 


For further information, contact: Librarian http://puma.isti.cnr.it
