PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Perego R., Orlando S., Lucchese C., Silvestri F., Laforenza D., Puppin D. On the value of query logs for modern information retrieval. A. Soro, G. Paddeu and G. Armano (eds.). Monza (MI): Polimetrica International Scientific Publisher, 2006.
 
 
Abstract
(English)
Query logs collected by a Web Search Engine (WSE) constitute a valuable source of information that can be used in several ways to enhance the efficiency and efficacy of the complex process of searching. This paper surveys the results recently achieved by our group in the design of innovative solutions targeting parallel Information Retrieval (IR) systems. Our solutions exploit the knowledge derived from the patterns of common system usage extracted from query logs. Such knowledge has been used: (1) to devise an effective policy for caching WSE query results; (2) to drive the partitioning of the inverted index among the nodes of a term-partitioned, parallel IR system; (3) to perform document partitioning and effective collection selection in a document-partitioned, parallel IR system. The techniques and algorithms used range from simple statistical analysis to frequent pattern mining and document/query co-clustering. They have the common denominator of exploiting past usage information and of yielding remarkable improvements in efficiency or efficacy. The paper briefly describes the proposals and the framework of their application, and reports the results of experiments conducted on large query logs of real WSEs.
Subject Query log analysis
H.3 Information Storage and Retrieval
H.3.3 Information Search and Retrieval



 


For further information, contact: Librarian http://puma.isti.cnr.it
