Automated analysis of mixed documents consisting of printed Korean/alphanumeric texts and graphic images
Young Kug Ham, Hong Kyu Chung, In Kwon Kim, Rae-Hong Park
Abstract
An efficient algorithm is proposed for recognizing mixed documents that consist of printed Korean/alphanumeric text and graphic images. In the preprocessing step, the input document is skew-normalized, if necessary, by rotating it through an angle detected with the Hough transform. The graphic image parts are then separated from the text parts by considering the chain codes of connected components, and the individual characters are separated using vertical and horizontal projections. In the recognition step, mixed text consisting of two different character sets, i.e., Korean and alphanumeric characters, is recognized: the characters are first classified as Korean or alphanumeric, and each class is then recognized hierarchically using several effective features. The output is obtained by combining the recognized characters with the separated graphic parts. The performance of the proposed algorithm is demonstrated via computer simulation.
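The character-separation step can be illustrated with a short sketch. The abstract states only that vertical and horizontal projections are used, so everything below is an assumption for illustration rather than the authors' implementation: the function names `runs` and `segment_characters`, the convention that ink pixels are 1, and the rule that any empty row or column marks a boundary. In practice a Korean syllable block may contain internal gaps, so a real system would likely need a minimum-gap threshold or a merging step that this sketch omits.

```python
import numpy as np

def runs(profile):
    """Return (start, end) pairs of consecutive non-zero entries; end is exclusive."""
    mask = np.asarray(profile) > 0
    # Pad with False on both sides so runs touching the borders are detected.
    padded = np.concatenate(([False], mask, [False]))
    diff = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(diff == 1)
    ends = np.flatnonzero(diff == -1)
    return [(int(s), int(e)) for s, e in zip(starts, ends)]

def segment_characters(binary):
    """Split a binary page (1 = ink) into (top, bottom, left, right) boxes."""
    boxes = []
    # Horizontal projection (row sums) separates text lines.
    for top, bottom in runs(binary.sum(axis=1)):
        line = binary[top:bottom]
        # Vertical projection (column sums) separates characters within a line.
        for left, right in runs(line.sum(axis=0)):
            boxes.append((top, bottom, left, right))
    return boxes

# Toy page: two blobs on the first text line, one on the second.
page = np.zeros((7, 9), dtype=np.uint8)
page[1:3, 1:3] = 1
page[1:3, 5:8] = 1
page[4:6, 2:4] = 1

boxes = segment_characters(page)
print(boxes)
```

Each box is given in row/column coordinates with exclusive end indices, so `page[top:bottom, left:right]` crops the corresponding character candidate.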
Young Kug Ham, Hong Kyu Chung, In Kwon Kim, and Rae-Hong Park "Automated analysis of mixed documents consisting of printed Korean/alphanumeric texts and graphic images," Optical Engineering 33(6), (1 June 1994). https://doi.org/10.1117/12.171323
Published: 1 June 1994
CITATIONS
Cited by 11 scholarly publications.
KEYWORDS
Visualization
Optical character recognition
Detection and tracking algorithms
Binary data
Hough transforms
Feature extraction
Head