PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Esuli A., Moreo Fernández A. Distributional correspondence indexing for cross-language text categorization. In: ECIR 2015 - Advances in Information Retrieval. 37th European Conference on IR Research (Vienna, Austria, 29 March - 2 April 2015). Proceedings, pp. 104 - 109. Allan Hanbury, Gabriella Kazai, Andreas Rauber, Norbert Fuhr (eds.). (Lecture Notes in Computer Science, vol. 9022). Springer, 2015.
 
 
Abstract
(English)
Cross-Language Text Categorization (CLTC) aims at producing a classifier for a target language when the only available training examples belong to a different source language. Existing CLTC methods are usually affected by high computational costs, require external linguistic resources, or demand a considerable human annotation effort. This paper presents a simple, yet effective, CLTC method based on projecting features from both source and target languages into a common vector space, by using a computationally lightweight distributional correspondence profile with respect to a small set of pivot terms. Experiments on a popular sentiment classification dataset show that our method compares favorably with state-of-the-art methods, while requiring a significantly reduced computational cost and minimal human intervention.
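The projection described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy corpora, the pivot pairs (each source pivot is assumed to come with a known translation in the target language), and the use of cosine similarity between binary document-occurrence vectors as the correspondence function. Each term is mapped to a vector with one score per pivot, so terms from both languages end up in the same space.

```python
import math

def occurrence_vector(term, docs):
    """Binary vector: 1.0 if the term occurs in the document, else 0.0."""
    return [1.0 if term in doc else 0.0 for doc in docs]

def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def profile(term, pivots, docs):
    """Correspondence profile of `term`: one similarity score per pivot.
    Cosine is an assumed stand-in for the correspondence function."""
    t = occurrence_vector(term, docs)
    return [cosine(t, occurrence_vector(p, docs)) for p in pivots]

# Hypothetical toy corpora; each document is a set of tokens.
source_docs = [{"excellent", "great", "plot"},
               {"poor", "boring", "plot"},
               {"excellent", "great"}]
target_docs = [{"excelente", "genial", "trama"},
               {"malo", "aburrido", "trama"},
               {"excelente", "genial"}]

# Hypothetical pivot pairs: (source term, assumed translation in the target language).
pivot_pairs = [("excellent", "excelente"), ("poor", "malo")]
src_pivots = [s for s, _ in pivot_pairs]
tgt_pivots = [t for _, t in pivot_pairs]

# Profiles share the same dimensions (one per pivot pair) across languages.
great = profile("great", src_pivots, source_docs)
genial = profile("genial", tgt_pivots, target_docs)
boring = profile("boring", src_pivots, source_docs)

# Terms with similar usage should lie close together in the common space.
print(cosine(great, genial) > cosine(great, boring))
```

In this toy setting, "great" and "genial" co-occur with the positive pivot in their respective languages, so their profiles are close, while "boring" aligns with the negative pivot. A classifier trained on source-language documents represented in this space could then, in principle, be applied to target-language documents represented the same way.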
URL: http://link.springer.com/chapter/10.1007%2F978-3-319-16354-3_12
DOI: 10.1007/978-3-319-16354-3_12
Subject: Cross-Language Text Categorization; Distributional Semantics; Sentiment Analysis; I.2.7 Natural Language Processing; I.2.6 Learning





 


For further information, contact: Librarian http://puma.isti.cnr.it
