Istituto di Scienza e Tecnologie dell'Informazione     
Picchi E., Sassi M., Biagioni S., Giannini S. Extending the "Facets" concept by applying NLP tools to catalog records of scientific literature. In: GL12 - Twelfth International Conference on Grey Literature: Transparency in Grey Literature, Grey Tech Approaches to High Tech Issues (Prague, 6-7 December 2010). Abstract, pp. 82-87. D.J. Farace, J. Frantzen, GreyNet (eds.). (GL-conference series, vol. 12). TextRelease, 2010.
This paper presents the prototype of an "intelligent" navigation system implemented on the contents of PUMA (http://puma.isti.cnr.it), a digital library of scientific literature. The system integrates our core textual search engine (known as DBT) with the TextPower (TP) technology. TP is based on NLP techniques and linguistic resources and provides tools specialized for the evaluation, analysis, classification and browsing of scientific literature. TP extends the facet concept by extracting "field + content" pairs not only from structured fields but also from free text, e.g. abstracts, using a linguistic-statistical approach to annotate relevant terminology, named entities, etc. The enriched text can be queried, analysed, and classified using a new version of the DBT System known as "DBT&Facets". DBT&Facets has been implemented on the full bibliographic records of the documents archived in the PUMA digital library of the Italian National Research Council (CNR). PUMA is a user-focused, service-oriented infrastructure which manages 30 CNR institutional repositories containing about 25,000 published or open access documents in a wide variety of disciplines. In an open domain like scientific documentation, our approach based on the criterion of "semantic similarity" is useful - and perhaps more objective than one based on hierarchical elements - as it makes it possible to link different types of information, across domains if necessary. DBT&Facets is an advanced search tool that lets users query the collection, refine their results, and identify particular relations between them. The aim of the project has been to structure a knowledge system of domain-specific information which assists users by suggesting possible directions for their search.
Subject: NLP tools
Digital libraries
I.2.7 Natural Language Processing
H.3.7 Digital Libraries
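
The idea in the abstract of deriving "field + content" pairs from both structured fields and free text can be illustrated with a minimal Python sketch. It is not the TextPower or DBT&Facets implementation: the linguistic-statistical annotation is replaced here by a naive lookup in a small hypothetical lexicon, and the names `TERM_LEXICON`, `extract_facets`, `facet_counts`, `refine` and the sample records are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical lexicon standing in for TextPower's linguistic resources.
# Maps a term found in free text to the facet field it should populate.
TERM_LEXICON = {
    "digital library": "topic",
    "search engine": "topic",
    "named entities": "topic",
}

# Made-up catalog records, for illustration only.
RECORDS = [
    {"id": 1, "year": 2010, "type": "conference paper",
     "abstract": "A navigation system for a digital library built on a search engine."},
    {"id": 2, "year": 2010, "type": "technical report",
     "abstract": "Annotating named entities in a digital library."},
    {"id": 3, "year": 2009, "type": "conference paper",
     "abstract": "A statistical study of citation patterns."},
]


def extract_facets(record):
    """Return the set of (field, content) pairs for one catalog record."""
    pairs = set()
    # Structured fields contribute facets directly.
    for field in ("year", "type"):
        if field in record:
            pairs.add((field, str(record[field])))
    # Free text contributes facets through a naive substring match
    # against the lexicon (a stand-in for real terminology extraction).
    abstract = record.get("abstract", "").lower()
    for term, field in TERM_LEXICON.items():
        if term in abstract:
            pairs.add((field, term))
    return pairs


def facet_counts(records):
    """Count how many records carry each (field, content) pair."""
    counts = Counter()
    for record in records:
        counts.update(extract_facets(record))
    return counts


def refine(records, field, content):
    """Keep only the records that carry the selected facet value."""
    return [r for r in records if (field, content) in extract_facets(r)]


if __name__ == "__main__":
    for (field, content), n in sorted(facet_counts(RECORDS).items()):
        print(f"{field}: {content} ({n})")
    print([r["id"] for r in refine(RECORDS, "topic", "digital library")])
```

In this sketch, facets drawn from the abstract sit alongside those drawn from structured fields, so a user could narrow a result set by either kind in the same way.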



