Istituto di Scienza e Tecnologie dell'Informazione     
Esuli A., Fagni T., Sebastiani F. TreeBoost.MH: a boosting algorithm for multi-label hierarchical text categorization. In: 13th International Symposium on String Processing and Information Retrieval (SPIRE'06) (Glasgow, UK, 11-13 October 2006). Proceedings, pp. 13-24. F. Crestani, P. Ferragina, and M. Sanderson (eds.). (Lecture Notes in Computer Science, vol. 4209). Springer Verlag, 2006.
In this paper we propose TreeBoost.MH, an algorithm for multi-label Hierarchical Text Categorization (HTC) consisting of a hierarchical variant of AdaBoost.MH. TreeBoost.MH embodies several intuitions that had arisen before within HTC, e.g. the intuitions that both feature selection and the selection of negative training examples should be performed 'locally', i.e. by paying attention to the topology of the classification scheme. It also embodies the novel intuition that the weight distribution that boosting algorithms update at every boosting round should likewise be updated 'locally'. We present the results of experiments with TreeBoost.MH on two HTC benchmarks, and analytically discuss its computational cost.
URL: http://www.springerlink.com/content/b73715w22n546771/fulltext.pdf
Subject: Text categorization
Supervised learning
Hierarchical categorization
I.2.6 Learning



