PUMA
Istituto di Scienza e Tecnologie dell'Informazione     
Tonazzini A., Vezzosi S., Bedini L. Analysis and recognition of highly degraded printed characters. In: International Journal on Document Analysis and Recognition, vol. 6 pp. 236 - 247. Springer-Verlag, 2003.
 
 
Abstract
(English)
This paper proposes an integrated system for processing and analyzing highly degraded printed documents in order to recognize text characters. As a case study, ancient printed texts are considered. The system comprises several blocks operating sequentially. Starting with a single page of the document, the background noise is reduced by wavelet-based decomposition and filtering; the text lines are detected, extracted, and segmented by a simple and fast adaptive thresholding into blobs corresponding to characters; and the blobs are analyzed by a feedforward multilayer neural network trained with a back-propagation algorithm. For each character, the probability associated with the recognition is then used as a discriminating parameter that determines the automatic activation of a feedback process, leading the system back to a block for refining segmentation. This block acts only on the small portions of the text where the recognition cannot be relied on, and makes use of blind deconvolution and MRF-based (Markov random field) segmentation techniques, whose high complexity is greatly reduced when applied to a few subimages of small size. The experimental results highlight that the proposed system performs a very precise segmentation of the characters and then a highly effective recognition of even strongly degraded texts.
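The confidence-driven feedback step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `recognize_with_feedback`, the threshold value 0.9, and the toy `toy_classify` and `toy_refine` stand-ins are all assumptions made for illustration. In the paper, the classifier is a neural network and the refinement uses blind deconvolution and MRF-based segmentation; here they are replaced by trivial placeholders so that only the control flow is shown.

```python
from typing import Any, Callable, List, Tuple

Recognition = Tuple[str, float]  # (label, recognition probability)


def recognize_with_feedback(
    blobs: List[Any],
    classify: Callable[[Any], Recognition],
    refine: Callable[[Any], List[Any]],
    threshold: float = 0.9,  # assumed value, not taken from the paper
) -> List[Recognition]:
    """Classify each blob; re-segment only the blobs recognized unreliably."""
    results: List[Recognition] = []
    for blob in blobs:
        label, prob = classify(blob)
        if prob >= threshold:
            # Recognition is trusted: keep the fast first-pass result.
            results.append((label, prob))
        else:
            # Feedback: apply the costly refinement to this small region only,
            # then classify each refined piece again.
            for piece in refine(blob):
                results.append(classify(piece))
    return results


# Toy stand-ins (hypothetical): blobs are strings, and a blob holding more
# than one character models a segmentation error such as touching characters.
def toy_classify(blob: str) -> Recognition:
    return (blob, 0.99) if len(blob) == 1 else ("?", 0.30)


def toy_refine(blob: str) -> List[str]:
    return list(blob)


if __name__ == "__main__":
    print(recognize_with_feedback(["a", "bc", "d"], toy_classify, toy_refine))
```

Because the refinement is invoked only when the probability falls below the threshold, its cost scales with the number of unreliable blobs rather than with the size of the page, which is the point the abstract makes about complexity.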
URL: http://www.springerlink.com/media/m35eac0cwq4yyjeugxuq/Contributions/1/0/7/P/10
Subject: text processing
I.4 Image Processing and Computer Vision
I.4.3 Enhancement. Filtering
I.4.8 Scene Analysis. Object recognition
I.5 Pattern Recognition
I.5.1 Models. Neural nets
I.7 Document and Text Processing
I.7.7 Document Capture. Optical character recognition



 

