PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Tonazzini A., Bedini L., Salerno E. Digital analysis of damaged documents by ICA techniques. In: First IAPR-TC3 workshop on Artificial Neural Networks in Pattern (Florence, Italy, 12-13 September 2003). Proceedings, pp. 33-38. M. Gori and S. Marinai (eds.), 2003.
 
 
Abstract
(English)
Many text documents show reduced legibility due to specific kinds of physical degradation. In these cases, recovering a clean text pattern may not be the only purpose of digital document restoration, since some of the degradation artifacts may contain significant information. This is the case, for instance, with underwritings in palimpsests. In this paper, we propose a novel approach to this problem: we reformulate it as a blind source separation problem and solve it with independent component analysis techniques. Under appropriate hypotheses, the spectral components of the document, taken at different bands in both the visible and the non-visible ranges, can be used to extract the individual contributions of, say, the text, the bleed-through, and the background patterns. Examples of bleed-through cancellation and recovery of underwriting from palimpsests are provided.
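The blind source separation formulation in the abstract can be illustrated with a small synthetic sketch. It is not the authors' implementation: it assumes each spectral channel is an instantaneous linear mixture of three independent patterns (text, bleed-through, background) and uses a generic symmetric FastICA iteration with a tanh nonlinearity. The mixing matrix, the random "patterns", and the names `whiten`, `sym_decorrelate` and `fast_ica` are all hypothetical choices made for illustration.

```python
import numpy as np


def whiten(X):
    """Center each channel (row) of X and transform to identity covariance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / Xc.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T @ Xc


def sym_decorrelate(W):
    """Symmetric orthogonalization: W <- (W W^T)^(-1/2) W."""
    eigvals, eigvecs = np.linalg.eigh(W @ W.T)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T @ W


def fast_ica(X, n_iter=500, tol=1e-10, seed=0):
    """Estimate independent components from X (channels x pixels)."""
    Z = whiten(X)
    n, n_samples = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((n, n)))
    for _ in range(n_iter):
        Y = W @ Z
        G = np.tanh(Y)                  # contrast nonlinearity g
        G_prime = 1.0 - G ** 2          # its derivative g'
        # Fixed-point update for every row w: E{z g(w.z)} - E{g'(w.z)} w
        W_new = G @ Z.T / n_samples - np.diag(G_prime.mean(axis=1)) @ W
        W_new = sym_decorrelate(W_new)
        # Rows are unit vectors; stop when each one stops rotating (up to sign).
        change = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if change < tol:
            break
    return W @ Z


# Synthetic stand-ins for flattened image layers (assumed, not real data).
rng = np.random.default_rng(42)
n_pixels = 20000
text = (rng.random(n_pixels) < 0.15).astype(float)    # sparse main text
bleed = (rng.random(n_pixels) < 0.30).astype(float)   # bleed-through pattern
background = rng.random(n_pixels)                     # smooth-ish background
sources = np.vstack([text, bleed, background])

# Hypothetical mixing of the three patterns into three spectral channels.
mixing = np.array([[1.0, 0.6, 0.3],
                   [0.5, 1.0, 0.4],
                   [0.3, 0.5, 1.0]])
observed = mixing @ sources

recovered = fast_ica(observed)

# ICA recovers sources only up to order, sign and scale, so compare by
# absolute correlation between each true source and its best estimate.
corr = np.abs(np.corrcoef(np.vstack([sources, recovered]))[:3, 3:])
best_match = corr.max(axis=1)
print(np.round(best_match, 3))
```

On a real document, each row of `observed` would be one flattened spectral band of the scanned page, and each row of `recovered` would be reshaped back to the image size to inspect the separated patterns.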
Subject
Degraded Documents
Blind Source Separation
Independent Component Analysis
I.4 Image Processing and Computer Vision
I.4.3 Enhancement
I.4.8 Scene Analysis. Color
I.5 Pattern Recognition
I.5.4 Applications. Text processing
I.7 Document and Text Processing
I.7.5 Document Capture. Document Analysis



 


For further information, contact: Librarian http://puma.isti.cnr.it
