Istituto di Scienza e Tecnologie dell'Informazione     
Peters C., Picchi E. From parallel to comparable text corpora. In: Euralex - International Congress on Lexicography (Goteborg, Sweden). Proceedings, pp. 173 - 180. M. Gellerstam et al. (eds.). 1996.
We present a bilingual corpus management system under development in Pisa. The first component of this system was a set of procedures to create and query parallel text archives; we are now studying the implementation of a second set of procedures to interrogate comparable archives. The approach followed is quite different from that used for parallel data and considerably more complex; the results are also very different. In the paper, we describe the strategy we are adopting to retrieve significant data from comparable corpora, and discuss the preliminary results.



For further information, contact: Librarian http://puma.isti.cnr.it
