Istituto di Scienza e Tecnologie dell'Informazione     
Kotti M., Paternò F. Speaker-independent emotion recognition exploiting a psychologically-inspired binary cascade classification schema. In: International Journal of Speech Technology, vol. 15 (2) pp. 131-150. Springer, 2012. [Online First 31 January 2012]
In this paper, a psychologically-inspired binary cascade classification schema is proposed for speech emotion recognition. Performance is enhanced because commonly confused pairs of emotions are distinguishable from one another. Extracted features are related to statistics of pitch, formants, and energy contours, as well as spectrum, cepstrum, perceptual and temporal features, autocorrelation, MPEG-7 descriptors, Fujisaki's model parameters, voice quality, jitter, and shimmer. Selected features are fed as input to a k-nearest neighbor classifier and to support vector machines. Two kernels are tested for the latter: linear and Gaussian radial basis function. The recently proposed speaker-independent experimental protocol is tested on the Berlin emotional speech database for each gender separately. The best emotion recognition accuracy, achieved by support vector machines with a linear kernel, equals 87.7%, outperforming state-of-the-art approaches. Statistical analysis is carried out first on the classifiers' error rates and then on the information expressed by the classifiers' confusion matrices.
URL: http://link.springer.com/article/10.1007/s10772-012-9127-7
DOI: 10.1007/s10772-012-9127-7
Subject: Emotion recognition
Large-scale feature extraction
Binary classification schema
Speaker-independent protocol
Classifier comparison

For further information, contact: Librarian http://puma.isti.cnr.it
