PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Silvestri F., Venturini R. VSEncoding: Efficient Coding and Fast Decoding of Integer Lists via Dynamic Programming. Technical report, 2010.
 
 
Abstract
(English)
Encoding lists of integers efficiently is a key task in many applications across different fields. Adjacency lists of large graphs are usually encoded to save space and to improve decoding speed. Inverted indexes of Information Retrieval systems usually keep their posting lists compressed to make the best use of the memory hierarchy. Secondary indexes of DBMSs are stored similarly to inverted indexes in IR systems. In this paper we propose a novel class of encoders (called VSEncoding, from Vector of Splits Encoding) that, roughly speaking, work by partitioning a list of integers into blocks, which are then efficiently compressed using simple encoders. Unlike previous work, where heuristics were applied during the partitioning step, we carry out this important step via dynamic programming, which yields the optimal solution. Experiments show that our class of encoders outperforms all existing methods in the literature by more than 10% (with the exception of Binary Interpolative Coding, with which they roughly tie) while still retaining very fast decompression.
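The partitioning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cost model (a fixed per-block header plus a uniform bit width for every value in the block), the parameters `header_bits` and `max_block`, and the function names are all assumptions made for illustration only.

```python
def bit_width(x: int) -> int:
    """Bits needed to store a non-negative integer in binary (at least 1)."""
    return max(1, x.bit_length())


def optimal_partition(values, header_bits=8, max_block=16):
    """Return (total_bits, blocks) minimizing an ASSUMED cost model.

    Assumed cost of one block: header_bits + len(block) * width, where
    width is the bit width of the largest value in the block. This is a
    hypothetical model, not the one used in the report.
    """
    n = len(values)
    best = [0] + [float("inf")] * n  # best[j]: min bits to encode values[:j]
    prev = [0] * (n + 1)             # prev[j]: start index of the last block
    for j in range(1, n + 1):
        width = 0
        # Grow the last block leftwards so that it covers values[i:j].
        for i in range(j - 1, max(-1, j - max_block - 1), -1):
            width = max(width, bit_width(values[i]))
            cost = best[i] + header_bits + (j - i) * width
            if cost < best[j]:
                best[j], prev[j] = cost, i
    # Recover the blocks by walking the back-pointers from the end.
    blocks, j = [], n
    while j > 0:
        blocks.append(values[prev[j]:j])
        j = prev[j]
    blocks.reverse()
    return best[n], blocks
```

Under this assumed model, `optimal_partition([1, 1, 1, 1, 200, 1, 1, 1, 1])` isolates the outlier in its own block and costs 40 bits, whereas a single block covering the whole list would cost 80 bits.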
Subject
d-gap encoding
Inverted index encoding
Adaptive encoding
Index compression
H.3.4 Systems and Software.
E.4 Coding and Information Theory. Data compaction and compression



 


For further information, contact: Librarian http://puma.isti.cnr.it
