PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Esuli A., Sebastiani F. SentiWordNet: a high-coverage lexical resource for opinion mining. Preprint ISTI-2007-PP-002. Preprint, 2007.
 
 
Abstract
(English)
Opinion mining (OM) is a recent subdiscipline at the crossroads of information retrieval and computational linguistics which is concerned not with the topic a document is about, but with the opinions it expresses. OM has a rich set of applications, ranging from tracking users' opinions about products or about political candidates as expressed in online forums, to customer relationship management. In order to aid the extraction of opinions from text, recent research has tried to automatically determine the "PN-polarity" of subjective terms, i.e. to identify whether a term that indicates the presence of an opinion has a positive or a negative connotation. Research on determining the "SO-polarity" of terms, i.e. whether a term indeed indicates the presence of an opinion (a subjective term) or not (an objective, or neutral, term), has instead been much scarcer. In this paper we describe SentiWordNet, a lexical resource produced by asking an automated classifier to associate with each synset s of WordNet (version 2.0) a triplet of scores, one for each p ∈ P = {Positive, Negative, Objective}, describing how strongly the terms contained in s enjoy each of the three properties. The method used to develop SentiWordNet is based on the quantitative analysis of the glosses associated with synsets, and on the use of the resulting vectorial term representations for semi-supervised synset classification. The score triplet is derived by combining the results produced by a committee of eight ternary classifiers, all characterized by similar accuracy levels but extremely different classification behaviour. We present the results of evaluating the accuracy of the automatically assigned triplets on a publicly available benchmark. SentiWordNet is freely available for research purposes, and is endowed with a Web-based graphical user interface.
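To make the idea of a score triplet concrete, here is a minimal sketch of one way the votes of a committee of ternary classifiers could be combined for a single synset. The abstract does not spell out the combination rule, so the vote-proportion rule used here is an assumption, and the function name `score_triplet` and the example votes are hypothetical.

```python
from collections import Counter

# The three categories named in the abstract.
LABELS = ("Positive", "Negative", "Objective")


def score_triplet(votes):
    """Turn a committee's ternary votes for one synset into a score triplet.

    Assumption: each score is the fraction of classifiers that assigned
    the corresponding label, so the three scores sum to 1.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    unknown = set(votes) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    counts = Counter(votes)
    return {label: counts[label] / len(votes) for label in LABELS}


# Hypothetical votes from a committee of eight classifiers for one synset.
votes = ["Positive"] * 5 + ["Objective"] * 2 + ["Negative"]
triplet = score_triplet(votes)
print(triplet)
# {'Positive': 0.625, 'Negative': 0.125, 'Objective': 0.25}
```

Under this assumed rule, a synset on which all eight classifiers agree gets a score of 1 for that category and 0 for the other two, while disagreement among the classifiers yields graded scores.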
Subject: Lexical resources
Opinion mining
Sentiment classification
Gloss analysis
Supervised learning
H.3.3 Information Search and Retrieval



 


For further information, contact: Librarian http://puma.isti.cnr.it
