Istituto di Scienza e Tecnologie dell'Informazione     
Salerno E., Tonazzini A. Low-level document image analysis by statistical processing. Mario Malcangi (ed.). Milano: CLUP, 2011.
Several methods for extracting information from digital images of documents are described. The appearance of a document can be seen as the superposition of a number of information layers (the "patterns"), and is represented by a vector image whose components (the "channels") are determined by the type of diversity used to capture the image. Our data model treats each channel as a function of all the patterns. Starting from the appearance data, the chosen mathematical model and some physical and statistical constraints on the patterns are used to develop a strategy for isolating the individual patterns. In many cases, this allows us to separate features that are superimposed on one another. Finally, examples are shown where these strategies are used either to clean the document appearance (mitigation of interferences) or to extract partially hidden or entangled patterns, such as stamps, watermarks, and erased strokes.
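The data model in the abstract can be illustrated with a minimal sketch. The abstract does not state the form of the function relating channels to patterns, so the code below assumes the simplest case, a linear instantaneous mixture in which each channel is a weighted sum of all the patterns. The names `mix_patterns` and `whiten`, the mixing weights, and the synthetic random patterns are all hypothetical and chosen only for illustration. The statistical constraint used here is plain decorrelation of the outputs (symmetric whitening), which is one possible choice and not necessarily the one adopted by the authors.

```python
import numpy as np


def mix_patterns(patterns, mixing):
    """Assumed linear model: every channel is a weighted sum of all patterns.

    patterns: array of shape (n_patterns, n_pixels)
    mixing:   array of shape (n_channels, n_patterns)
    """
    return mixing @ patterns


def whiten(channels):
    """Symmetric whitening: linearly transform the channels so that the
    outputs are zero-mean, mutually uncorrelated, and of unit variance."""
    centered = channels - channels.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / centered.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Inverse square root of the covariance matrix.
    w = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    return w @ centered


# Synthetic stand-ins for two flattened patterns (hypothetical data).
rng = np.random.default_rng(0)
patterns = rng.random((2, 10_000))

# Hypothetical mixing weights: each channel sees both patterns.
mixing = np.array([[1.0, 0.4],
                   [0.3, 1.0]])

channels = mix_patterns(patterns, mixing)
outputs = whiten(channels)
```

Decorrelation alone only guarantees that the outputs are uncorrelated. Whether they match the original patterns depends on how well the assumed model and constraints fit the document at hand.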
Subject Document image processing
Virtual restoration
Pattern extraction
I.7.5 Document analysis




For further information, contact: Librarian http://puma.isti.cnr.it
