Robust binarization of degraded document images using heuristics
24 March 2014
Jon Parker, Ophir Frieder, Gideon Frieder
Proceedings Volume 9021, Document Recognition and Retrieval XXI; 90210U (2014) https://doi.org/10.1117/12.2042581
Event: IS&T/SPIE Electronic Imaging, 2014, San Francisco, California, United States
Abstract
Historically significant documents are often discovered with defects that make them difficult to read and analyze. This is particularly troublesome when the defects prevent software from performing an automated analysis. Image enhancement methods are used to remove or minimize document defects, improve software performance, and generally make images more legible. We describe an automated image enhancement method that is input-page independent and requires no training data. The approach applies to color or greyscale images containing handwritten script, typewritten text, images, and mixtures thereof. We evaluated the method against the test images provided by the 2011 Document Image Binarization Contest (DIBCO). Our method outperforms all 2011 DIBCO entrants in terms of average F1 measure, and does so with significantly lower variance than the top contest entrants. The capability of the proposed method is also illustrated using select images from a collection of historic documents stored at the Yad Vashem Holocaust Memorial in Israel.
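As an illustration of the evaluation criterion named in the abstract, the following is a minimal sketch of a pixel-level F1 measure between a binarized image and its ground truth, together with the mean and variance of a set of per-image scores. It is not the authors' code or the official DIBCO evaluation tool. The function names `f1_measure` and `summarize`, the nested-list image representation, and the convention that 1 marks a foreground (ink) pixel are all assumptions made for this example.

```python
from statistics import mean, pvariance


def f1_measure(predicted, truth):
    """Pixel-level F1 between two binary images of equal size.

    Both images are nested lists of 0/1 values; 1 is assumed to mean
    foreground (ink). This representation is hypothetical, chosen only
    to keep the sketch dependency-free.
    """
    tp = fp = fn = 0
    for row_p, row_t in zip(predicted, truth):
        for p, t in zip(row_p, row_t):
            if p and t:
                tp += 1          # foreground pixel correctly detected
            elif p and not t:
                fp += 1          # background pixel marked as foreground
            elif t and not p:
                fn += 1          # foreground pixel that was missed
    if tp == 0:
        return 0.0               # avoids division by zero when nothing matches
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def summarize(scores):
    """Return (mean, population variance) of a list of per-image F1 scores."""
    return mean(scores), pvariance(scores)


# Toy 2x3 images, invented for illustration only.
truth = [[1, 1, 0],
         [0, 1, 0]]
predicted = [[1, 0, 0],
             [0, 1, 1]]
score = f1_measure(predicted, truth)
```

Comparing methods by both the mean and the variance of such per-image scores reflects the abstract's point that a method can be judged on consistency across test images as well as on average accuracy.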
© (2014) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Jon Parker, Ophir Frieder, and Gideon Frieder "Robust binarization of degraded document images using heuristics", Proc. SPIE 9021, Document Recognition and Retrieval XXI, 90210U (24 March 2014); https://doi.org/10.1117/12.2042581
CITATIONS
Cited by 3 scholarly publications.
KEYWORDS
Image enhancement
Image processing
Image filtering
Edge detection
Detection and tracking algorithms
Optical character recognition
Principal component analysis
