PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Avancini H., Rauber A., Sebastiani F. Organizing Digital Libraries by Automated Text Categorization. In: International Conference on Digital Libraries (New Delhi, IN, 24-27 February 2004). Proceedings, pp. 919 - 931. 2004.
 
 
Abstract
(English)
Text Categorization (TC) is the discipline concerned with the construction of automatic text classifiers, i.e. programs capable of assigning to a document one or more among a set of predefined categories based on the content of the document. Building these classifiers is itself done automatically, by means of a general inductive process that learns the characteristics of the categories from a set of preclassified documents. In this paper we discuss a class of applications, automatic indexing with controlled vocabularies, that is of direct concern to organizing digital libraries. We exemplify this class of applications by discussing an ongoing project aimed at classifying scientific papers about computer science with respect to the ACM Classification Scheme.
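The inductive process described in the abstract can be illustrated with a minimal sketch. It is not the method used in the paper: it assumes a simple nearest-centroid (Rocchio-style) learner, one binary decision per category, and a tiny invented training set. The category codes are borrowed from the subject list of this record; the training texts and the names `tokenize`, `centroid`, `cosine`, `train`, and `classify` are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercased whitespace tokens; a real system would also drop stop words and stem.
    return text.lower().split()

def centroid(docs):
    # Average term-frequency vector over a list of token lists.
    total = Counter()
    for tokens in docs:
        total.update(tokens)
    n = max(len(docs), 1)
    return {term: count / n for term, count in total.items()}

def cosine(a, b):
    # Cosine similarity between two sparse term-weight dictionaries.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def train(labelled_docs, categories):
    # Learn, for each category, one centroid from the documents that carry
    # the label and one from the documents that do not.
    model = {}
    for c in categories:
        pos = [tokenize(t) for t, labels in labelled_docs if c in labels]
        neg = [tokenize(t) for t, labels in labelled_docs if c not in labels]
        model[c] = (centroid(pos), centroid(neg))
    return model

def classify(model, text):
    # Assign every category whose positive centroid is closer than its
    # negative centroid, so a document may receive zero, one, or several.
    doc = Counter(tokenize(text))
    return {c for c, (pos, neg) in model.items()
            if cosine(doc, pos) > cosine(doc, neg)}

# Invented toy data: (text, set of preassigned categories).
TRAINING = [
    ("inverted index construction for document indexing", {"H.3.1"}),
    ("clustering documents with hierarchical clustering", {"H.3.3"}),
    ("classifier design and evaluation with training data", {"I.5.2"}),
    ("indexing and clustering documents for retrieval", {"H.3.1", "H.3.3"}),
]

model = train(TRAINING, ["H.3.1", "H.3.3", "I.5.2"])
print(classify(model, "hierarchical clustering of documents"))
print(classify(model, "indexing and clustering documents"))
```

Treating each category as an independent yes/no decision is what lets a single document receive more than one category, which matches the "one or more among a set of predefined categories" setting of the abstract.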
URL: http://faure.iei.pi.cnr.it/~fabrizio/Publications/ICDL-04.pdf
Subject: Hierarchical text classification
Hierarchical clustering
I.5.2 Classifier design and evaluation
H.3.1 Indexing methods
H.3.3 Clustering





 


For further information, contact: Librarian http://puma.isti.cnr.it
