Istituto di Scienza e Tecnologie dell'Informazione     
Grossi V., Turini F. Stream mining: a novel architecture for ensemble-based classification. In: Knowledge and Information Systems, vol. 30 (2) pp. 247 - 281. Springer, 2012.
Mining data streams has become an important and challenging task for a wide range of applications. In these scenarios, data tend to arrive in multiple, rapid and time-varying streams, which constrains data mining algorithms to look at the data only once. Maintaining an accurate model, e.g. a classifier, while the stream goes by requires a smart way of keeping track of the data that have already passed. Such a synthetic structure has to serve two purposes: distilling as much information as possible out of past data, and allowing a fast reaction to concept drift, i.e. to a change in the data trend that necessarily affects the model. The paper outlines novel data structures and algorithms to tackle this problem when the model mined from the data is a classifier. The introduced model and the overall ensemble architecture are presented in detail, including how the approach can be extended to handle numerical attributes. A large part of the paper discusses the experiments and the comparisons with several existing systems. The comparisons show that the performance of our system, in general and in particular with respect to the reaction to concept drift, is at the top level.
URL: http://link.springer.com/article/10.1007%2Fs10115-011-0378-4
DOI: 10.1007/s10115-011-0378-4
Subject: Mining data streams
Concept drifting
Ensemble classifier
Data aggregation



For further information, contact: Librarian http://puma.isti.cnr.it
