Istituto di Scienza e Tecnologie dell'Informazione     
Avancini H., Lavelli A., Magnini B., Sebastiani F., Zanoli R. Expanding domain-specific lexicons by term categorization. In: ACM Symposium on Applied Computing. SAC 2003 (Melbourne, Florida, 9-12 March 2003). Proceedings, pp. 793-797. ACM, 2003.
We discuss an approach to the automatic expansion of domain-specific lexicons by means of term categorization, a novel task employing techniques from information retrieval (IR) and machine learning (ML). Specifically, we view the expansion of such lexicons as a process of learning previously unknown associations between terms and domains. The process generates, for each c_i in a set C = {c_1, ..., c_m} of domains, a lexicon L_1^i, bootstrapping from an initial lexicon L_0^i and a set of documents given as input. The method is inspired by text categorization (TC), the discipline concerned with labelling natural language texts with labels from a predefined set of domains, or categories. However, while TC deals with documents represented as vectors in a space of terms, we formulate the task of term categorization as one in which terms are (dually) represented as vectors in a space of documents, and in which terms (instead of documents) are labelled with domains.
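The dual representation described in the abstract can be illustrated with a minimal sketch. It assumes a toy corpus and uses a simple nearest-centroid rule with cosine similarity as a stand-in classifier; the abstract does not specify the learner, so this is not the authors' method. All names (`term_vectors`, `expand_lexicons`, `threshold`) and the sample data are hypothetical.

```python
import math
from collections import Counter


def term_vectors(docs):
    """Represent each term as a vector of counts over documents
    (the dual of the usual document-as-vector-of-terms view)."""
    vectors = {}
    for j, doc in enumerate(docs):
        for term, count in Counter(doc).items():
            vectors.setdefault(term, [0.0] * len(docs))[j] = float(count)
    return vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vecs):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def expand_lexicons(docs, seed_lexicons, threshold=0.5):
    """Bootstrap an expanded lexicon per domain from a seed lexicon.

    Assumption: a term joins a domain when its document-space vector is
    close enough to the centroid of that domain's seed terms. A term may
    be assigned to several domains.
    """
    vectors = term_vectors(docs)
    centroids = {}
    for domain, seeds in seed_lexicons.items():
        known = [vectors[t] for t in seeds if t in vectors]
        if known:
            centroids[domain] = centroid(known)
    expanded = {domain: set(seeds) for domain, seeds in seed_lexicons.items()}
    for term, vec in vectors.items():
        for domain, center in centroids.items():
            if cosine(vec, center) >= threshold:
                expanded[domain].add(term)
    return expanded


# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    ["goal", "striker", "match"],
    ["striker", "penalty", "goal"],
    ["bond", "market", "stock"],
    ["stock", "dividend", "market"],
]
# Initial (seed) lexicons, one per domain.
seeds = {"sport": {"goal"}, "finance": {"stock"}}

expanded = expand_lexicons(docs, seeds)
```

Here terms that co-occur in the same documents as a seed term end up with similar document-space vectors, so they are added to that seed's domain. A real system would replace the centroid rule with a trained classifier and weight the vector components rather than use raw counts.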
Subject: term categorization
I.5.2 Classifier Design and Evaluation
I.2.6 Learning
H.3.3 Information Search and Retrieval
H.3.1 Thesauruses


For further information, contact: Librarian http://puma.isti.cnr.it
