Istituto di Scienza e Tecnologie dell'Informazione     
Sebastiani F. Classification of text, automatic. 2nd ed. vol. 2 Keith Brown (ed.). Amsterdam, NL: Elsevier, 2006.
Automatic text classification (ATC) is a discipline at the crossroads of information retrieval (IR), machine learning (ML), and computational linguistics (CL). It is concerned with building text classifiers, i.e., software systems that assign texts to one or more categories (classes) from a predefined set. Applications range from the automated indexing of scientific articles to e-mail routing, spam filtering, authorship attribution, and automated survey coding. This article focuses on the ML approach to ATC, in which a software system (the learner) automatically builds a classifier for the categories of interest by generalizing from a training set of pre-classified texts.
URL: http://nmis.isti.cnr.it/sebastiani/Publications/ELL06.pdf
Subject: Text classification
Text categorization
Supervised learning
I.2.6 Learning



For further information, contact: Librarian http://puma.isti.cnr.it
