Istituto di Scienza e Tecnologie dell'Informazione     
Giannotti F., Nanni M., Pedreschi D., Samaritani F. WebCat: Automatic Categorization of Web Search Results. In: Proceedings of the Eleventh Italian Symposium on Advanced Database Systems, SEBD 2003 (Cetraro (CS), Italy, June 24-27, 2003). Proceedings, p. 12. Sergio Flesca, Sergio Greco, Domenico Saccà and Ester Zumpano (eds.). Rubettino Editore, 2003.
Finding information with Web search engines is not always successful. When search results are presented as a ranked list, users often have to sift through a long list of snippets to find the information they are looking for. This paper presents a versatile system that reorganizes search results into a partition of homogeneous document clusters using data mining techniques. The purpose is to help users browse the set of retrieved documents easily, by focusing on clusters whose characterizing keywords are directly pertinent to the query. We obtained interesting results by applying our techniques to snippets only, i.e., by running our application on the client side, 'outside' of the search engine. For general queries, the obtained clusters are usually natural collections of homogeneous documents, and documents in the same cluster often occur in distant positions in the ranked list returned by the search engine.
Subject: Data mining
Web searching
Web data visualization
H. Information Systems


