Istituto di Scienza e Tecnologie dell'Informazione     
Lucchese C., Perego R., Bolettieri P., Esuli A., Falchi F., Rabitti F. Enabling content-based image retrieval in very large digital libraries. In: VLDL 2009 - Second Workshop on Very Large Digital Libraries (Corfu, Greece, 2 October 2009). Proceedings, pp. 43 - 50. Yannis Ioannidis, Paolo Manghi, Pasquale Pagano (eds.). DELOS, 2009.
Enabling effective and efficient Content-Based Image Retrieval (CBIR) on Very Large Digital Libraries (VLDLs) is today an important research issue. While there exist well-known approaches for information retrieval on textual content for VLDLs, research on an effective CBIR method that can also scale to very large collections is still open. A practical effect of this situation is that most of the image retrieval services currently available for VLDLs are based only on textual metadata. In this paper, we report on our experience in creating a collection of 106 million images, i.e., the CoPhIR collection, the largest currently available to the scientific community for research purposes. We discuss the various issues arising from working with such a large collection and dealing with a complex retrieval model on information-rich features. We present the non-trivial process of image crawling and descriptive feature extraction, using the European EGEE computer GRID. The feature extraction phase is often ignored when discussing the scalability issue while, as we show in this work, it could be one of the toughest issues to be solved in order to make CBIR feasible on VLDLs.
Subject: Image similarity search
Image crawling
Descriptive feature extraction
H.3.3 Information Search and Retrieval
H.3.7 Digital Libraries



For further information, contact: Librarian http://puma.isti.cnr.it
