Istituto di Scienza e Tecnologie dell'Informazione     
Sebastiani F. Machines that learn how to code open-ended survey data: underlying principles, experimental data, and methodological issues. In: Conference on Optimal Methods of Coding Open-Ended Survey Data (Ann Arbor, Michigan, US, December 4-5 2008).
In the last six years I have led projects aimed at developing software that learns how to code open-ended survey data from data manually coded by humans. These projects have produced software that is now in operation at the Customer Satisfaction division of a large international banking group and is now integrated into a major software platform for the management of open-ended survey data. This software, which can code data at a rate of tens of thousands of responses per hour, is the result of contributions from different fields of computer science, including Information Retrieval, Machine Learning, Computational Linguistics, and Opinion Mining. In this talk I will discuss the basic philosophy underlying this software, present the results of experiments we have run on several datasets of respondent data in which we compare the accuracy of the software against the accuracy of human coders, and argue for a notion of "accuracy" defined in terms of inter-coder agreement rates. Finally, I will discuss the characteristics that make a survey more or less amenable to automated coding by means of our system.
Subject: Survey coding
Customer satisfaction
Text classification
Sentiment analysis
Machine learning
Market research
I.5.2 Design Methodology. Classifier design and evaluation
I.2.6 Learning
H.3.1 Content Analysis and Indexing



For further information, contact: Librarian http://puma.isti.cnr.it
