PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Sebastiani F., Sperduti A., Valdambrini N. An improved boosting algorithm and its application to automated text categorization. Technical report, 2000.
 
 
Abstract
(English)
We describe an improved boosting algorithm, called AdaBoost.MHKR, and its application to text categorization. Boosting is a method for supervised learning which has been applied successfully to many different domains, and which has proven one of the best performers in text categorization exercises so far. Boosting is based on the idea of relying on the collective judgment of a committee of classifiers that are trained sequentially. In training the i-th classifier, special emphasis is placed on the correct categorization of the training documents which have proven harder for the previously trained classifiers. AdaBoost.MHKR is based on the idea of building, at every iteration of the learning phase, not a single classifier but a sub-committee of the K classifiers which, at that iteration, look the most promising. We report the results of systematic experimentation of this method performed on the standard Reuters-21578 benchmark. These experiments have shown that AdaBoost.MHKR is both more efficient to train and more effective than the original AdaBoost.MHR algorithm.
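The idea in the abstract (sequential training, re-weighting of hard documents, and a sub-committee of the K most promising classifiers per iteration) can be illustrated with a minimal sketch. This is not the report's algorithm: it assumes a single binary category, binary term-presence features, one-term decision stumps with real-valued outputs, and a plain average inside each sub-committee. All function names, parameters, and the smoothing constant `eps` are hypothetical.

```python
import math


def sub_score(sub, x):
    """Average the real-valued votes of one sub-committee (assumed rule)."""
    return sum(c[x[t]] for t, c in sub) / len(sub)


def train_boosted_subcommittees(X, y, rounds=5, k=2, eps=1e-6):
    """Toy boosting loop; X holds 0/1 feature lists, y holds labels in {-1, +1}.

    Each round keeps the k one-term stumps that look most promising under
    the current document weights, instead of a single stump.
    """
    n, m = len(X), len(X[0])
    weights = [1.0 / n] * n  # start with uniform emphasis on all documents
    committee = []
    for _ in range(rounds):
        candidates = []
        for t in range(m):
            # w[b][0] / w[b][1]: weight of negative / positive docs with x[t] == b
            w = [[0.0, 0.0], [0.0, 0.0]]
            for xi, yi, di in zip(X, y, weights):
                w[xi[t]][1 if yi > 0 else 0] += di
            # Lower z means the term separates the weighted classes better.
            z = 2.0 * sum(math.sqrt(w[b][0] * w[b][1]) for b in (0, 1))
            # Smoothed confidence-rated output for x[t] == 0 and x[t] == 1.
            c = tuple(
                0.5 * math.log((w[b][1] + eps) / (w[b][0] + eps)) for b in (0, 1)
            )
            candidates.append((z, t, c))
        candidates.sort()  # ties on z are broken by term index
        sub = [(t, c) for _, t, c in candidates[:k]]
        committee.append(sub)
        # Shift weight toward documents the new sub-committee handles poorly.
        weights = [
            di * math.exp(-yi * sub_score(sub, xi))
            for xi, yi, di in zip(X, y, weights)
        ]
        total = sum(weights)
        weights = [wi / total for wi in weights]
    return committee


def predict(committee, x):
    """Sign of the summed sub-committee scores."""
    return 1 if sum(sub_score(sub, x) for sub in committee) > 0 else -1
```

Calling `train_boosted_subcommittees(X, y, rounds=5, k=2)` on a small labelled set returns a list of five sub-committees of two stumps each, which `predict` then combines. Setting `k=1` reduces the sketch to ordinary one-stump-per-round boosting, which makes the effect of the sub-committee easy to compare.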
Subject: Text categorization
Text classification
Machine learning
Boosting
H.3 Information storage and retrieval
I.2 Artificial Intelligence





 


For further information, contact: Librarian http://puma.isti.cnr.it
