Istituto di Scienza e Tecnologie dell'Informazione     
Sebastiani F. A tutorial on automated text categorisation. In: ASAI'99 - Argentine Symposium on Artificial Intelligence (Buenos Aires, Argentina, 7-10 September 1999). Proceedings, pp. 1-25. Amadi Analia, Zunino Alejandro (eds.). Unicen, 1999.
The automated categorisation (or classification) of texts into topical categories has a long history, dating back at least to 1960. Until the late '80s, the dominant approach to the problem involved knowledge-engineering automatic categorisers, i.e. manually building a set of rules encoding expert knowledge on how to classify documents. In the '90s, with the booming production and availability of on-line documents, automated text categorisation has witnessed an increased and renewed interest. A newer paradigm based on machine learning has superseded the previous approach. Within this paradigm, a general inductive process automatically builds a classifier by "learning", from a set of previously classified documents, the characteristics of one or more categories; the advantages are very good effectiveness, considerable savings in terms of expert manpower, and domain independence. In this tutorial we look at the main approaches that have been taken towards automatic text categorisation within the general machine learning paradigm. Issues of document indexing, classifier construction, and classifier evaluation will be touched upon.



For further information, contact: Librarian http://puma.isti.cnr.it
