PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Orlando S., Perego R., Silvestri C. CCSM: an Efficient Algorithm for Constrained Sequence Mining. In: Proceedings of the 6th International Workshop on High Performance Data Mining: Pervasive and Data Stream Mining, in conjunction with Third International SIAM Conference on Data Mining (San Francisco, CA., May 1-3, 2003).
 
 
Abstract
(English)
This paper proposes CCSM (Cache-based Constrained Sequence Miner), a new level-wise algorithm that mines temporal databases to find sequential patterns satisfying user-defined constraints. The main innovation of CCSM is the adoption of k-way intersections of idlists to compute the support of candidate sequences. Our k-way intersection method is enhanced by the use of an effective cache that stores intermediate idlists for future reuse. The exploitation of the cache entails a surprising reduction in the actual number of join operations performed on idlists. Moreover, CCSM is able to deal with very complex constraints, like the maximum temporal gap between events occurring in the input sequences. We experimentally evaluated the performance of CCSM on synthetically generated datasets, and compared it with that obtained by running the cSPADE algorithm on the same datasets.
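The caching idea in the abstract can be illustrated with a deliberately simplified sketch. It is not the CCSM procedure itself: idlists are assumed here to be plain sets of sequence identifiers, so temporal ordering and gap constraints are ignored, and the class name PrefixCache and its methods are hypothetical. The sketch only shows how storing intermediate intersections for shared prefixes can reduce the number of pairwise joins needed to compute a k-way intersection.

```python
from typing import Dict, FrozenSet, Tuple


class PrefixCache:
    """Hypothetical helper: caches intersections of idlist prefixes.

    Assumption: an idlist is just the set of sequence ids containing an item.
    Real idlists for sequence mining also carry timestamps.
    """

    def __init__(self, idlists: Dict[str, FrozenSet[int]]) -> None:
        self.idlists = idlists
        self.cache: Dict[Tuple[str, ...], FrozenSet[int]] = {}
        self.joins = 0  # pairwise intersections actually computed

    def idlist(self, items: Tuple[str, ...]) -> FrozenSet[int]:
        """Return the k-way intersection of the idlists of `items`."""
        if len(items) == 1:
            return self.idlists[items[0]]
        if items in self.cache:
            return self.cache[items]  # reuse an intermediate result
        prefix = self.idlist(items[:-1])  # may itself be a cache hit
        self.joins += 1
        result = prefix & self.idlists[items[-1]]
        self.cache[items] = result
        return result

    def support(self, items: Tuple[str, ...]) -> int:
        """Support = number of sequence ids left after the intersection."""
        return len(self.idlist(items))


# Toy data (invented for illustration only).
toy_idlists = {
    "A": frozenset({1, 2, 3, 4}),
    "B": frozenset({2, 3, 4}),
    "C": frozenset({3, 4, 5}),
    "D": frozenset({1, 4}),
}

miner = PrefixCache(toy_idlists)
support_abc = miner.support(("A", "B", "C"))  # computes A&B, then (A&B)&C
support_abd = miner.support(("A", "B", "D"))  # reuses the cached A&B
```

In this toy run the two candidates share the prefix (A, B), so three pairwise joins are performed instead of the four needed without a cache.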
Subject Sequential pattern mining
H.2.8 Database Applications



 


For further information, contact: Librarian http://puma.isti.cnr.it
