KEYWORDS: Distortion, Image processing, Detection and tracking algorithms, Image classification, Mobile devices, Cameras, Data modeling, Computing systems
In this paper we explore the impact of geometrical restrictions in RANSAC sampling on the accuracy of ID document type recognition in images, as well as on the accuracy of projective distortion parameter estimation. The studied method represents images as constellations of keypoints and their descriptors. The distortion parameters are estimated by applying RANSAC to the matched keypoints. We study cases where the base algorithm can yield an erroneous or insufficiently accurate solution. A RANSAC scheme with geometrical restrictors is presented, and several restrictions are proposed that limit the samples and the computed transform parameters. An experiment was conducted on the open MIDV-500 dataset, and data are presented on the dependence of classification and localization accuracy on the considered restrictors. It was shown that introducing the restrictors yields an accuracy improvement and a significant speedup.
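As a rough illustration of what a geometrical restrictor in RANSAC might look like, the sketch below rejects a four-point sample when any three of its points are nearly collinear, and rejects a candidate transform when the projected document quadrilateral is not convex. These two checks, the function names, and the `min_area2` threshold are assumptions chosen for illustration; the abstract does not list the specific restrictions used in the paper.

```python
from itertools import combinations


def triangle_area2(p, q, r):
    """Twice the signed area of triangle pqr (positive if counter-clockwise)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def sample_is_acceptable(points, min_area2=1.0):
    """Hypothetical sample restrictor: reject a sample if any three of its
    points are nearly collinear (triangle area below a threshold)."""
    return all(
        abs(triangle_area2(p, q, r)) >= min_area2
        for p, q, r in combinations(points, 3)
    )


def quad_is_convex(quad):
    """Hypothetical transform restrictor: the projected document corners,
    given in order, must form a strictly convex quadrilateral."""
    n = len(quad)
    turns = [
        triangle_area2(quad[i], quad[(i + 1) % n], quad[(i + 2) % n])
        for i in range(n)
    ]
    if any(t == 0 for t in turns):
        return False
    # Convex means every turn has the same orientation.
    return all(t > 0 for t in turns) or all(t < 0 for t in turns)
```

Inside a RANSAC loop, a failed `sample_is_acceptable` check would let the iteration skip the transform computation entirely, which is one plausible source of the kind of speedup the abstract reports.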
This paper presents a method for metric rectification of planar objects that preserves angles and length ratios. The inner structure of an object is assumed to follow the Manhattan World assumption, i.e. the majority of line segments are aligned with two orthogonal directions of the object. For that purpose we introduce a method that estimates the positions of the two vanishing points corresponding to the main object directions. It is based on an original optimization function over segments that estimates a vanishing point position. To calculate the rectification homography from two vanishing points, we propose a new method based on estimating the camera rotation that makes the camera axis perpendicular to the object plane. The proposed method can be applied to the rectification of various objects such as documents or building facades. In addition, since the camera rotation is estimated, the method can be employed to estimate object orientation (for example, during surgery with radiographs of osteosynthesis implants). The method was evaluated on the MIDV-500 dataset, which contains projectively distorted images of documents with complex backgrounds. According to the experimental results, the accuracy of the proposed method is better than or equal to the state of the art if the background occupies no more than half of the image. The runtime of the method is around 3 ms on a Core i7-3610QM CPU.
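For readers unfamiliar with vanishing points, the following minimal sketch shows only the textbook definition in homogeneous coordinates: the vanishing point of two image segments is the intersection of the lines through them, obtained with cross products. It is not the optimization function over many segments that the paper proposes, and the function names are illustrative.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p, q):
    """Homogeneous line through two image points (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(seg_a, seg_b, eps=1e-12):
    """Intersection of the lines through two segments.

    Returns None when the lines are parallel in the image, i.e. the
    vanishing point lies at infinity.
    """
    v = cross(line_through(*seg_a), line_through(*seg_b))
    if abs(v[2]) < eps:
        return None
    return (v[0] / v[2], v[1] / v[2])
```

With real, noisy segments, many such pairwise intersections disagree, which is why a robust estimate over all segments is needed in practice.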
An important part of a system for analyzing planar rectangular objects is localization: the estimation of the projective transform from a template image of an object to its photograph. The system also includes subsystems such as the selection and recognition of text fields, the use of context, etc. In this paper three localization algorithms are described. All of them use feature points, and two of them also analyze near-horizontal and near-vertical lines in the photograph. The algorithms and their combinations are tested on a dataset of real document photographs. A method for estimating localization quality is also proposed, which allows the localization subsystem to be configured independently of the quality of the other subsystems.
In this paper we consider the task of improving optical character recognition (OCR) results for document fields in low-quality and average-quality images using N-gram models. Cyrillic fields of the Russian Federation internal passport are analyzed as an example. Two approaches are presented: the first is based on the hypothesis that a symbol depends on its two adjacent symbols, and the second is based on the calculation of marginal distributions and Bayesian network computation. A comparison of the algorithms and experimental results within a real document OCR system are presented. It is shown that document field OCR accuracy can be improved by more than 6% for low-quality images.
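The general idea of N-gram post-processing can be sketched as follows: per-character OCR alternatives are combined with a trigram language model, and the string with the highest joint log-score is chosen. This is a toy illustration under assumptions, not either of the two algorithms from the paper; the exhaustive search, the `floor` smoothing value, and the `^` and `$` padding symbols are all illustrative choices.

```python
import math
from itertools import product


def best_string(candidates, trigram_prob, floor=1e-6):
    """Pick the most likely string given OCR alternatives and trigrams.

    candidates: list of dicts, one per position, {char: ocr_probability}.
    trigram_prob: dict mapping 3-character strings to probabilities.
    Unseen trigrams and zero OCR probabilities fall back to `floor`.
    The search is exhaustive, so it only suits short fields with few
    alternatives per position.
    """
    best, best_score = None, -math.inf
    for chars in product(*(c.keys() for c in candidates)):
        # OCR evidence for each chosen character.
        score = sum(
            math.log(max(candidates[i][ch], floor))
            for i, ch in enumerate(chars)
        )
        # Language-model evidence: each symbol with its two neighbours.
        padded = "^" + "".join(chars) + "$"
        for i in range(1, len(padded) - 1):
            score += math.log(trigram_prob.get(padded[i - 1:i + 2], floor))
        if score > best_score:
            best, best_score = "".join(chars), score
    return best
```

For example, if the OCR slightly prefers `o` over `a` in the middle of a three-letter field, a trigram model that has only seen `cat` is enough to flip the decision.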
KEYWORDS: Chromium, Detection and tracking algorithms, Chemical elements, Analytical research, Machine vision, Distortion, Target recognition, Denoising, Information science, Control systems
This work investigates the calculation of a projective transformation, a problem that arises in machine vision. The details of the calculation and the peculiarities found in the implementations of mathematical libraries are carefully analyzed. Different approaches are compared in terms of both performance and accuracy, using both artificially generated and real data.
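One common way to compute a projective transformation from four point correspondences is to fix the last matrix element to 1 and solve the resulting 8x8 linear system. The sketch below does this in pure Python with Gaussian elimination and partial pivoting. It is only one of several possible formulations and is not claimed to be one of the approaches compared in the paper; note that the normalization assumed here breaks down when the true last element is zero.

```python
def solve_linear(a, b, eps=1e-12):
    """Solve a square system a @ x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < eps:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_4_points(src, dst):
    """3x3 matrix mapping each src point (x, y) to its dst point (u, v),
    under the assumed normalization h[2][2] = 1."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, p):
    """Apply a 3x3 projective transform to a 2D point."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )
```

Numerical accuracy of such a solver depends on the conditioning of the system, so normalizing point coordinates before solving is a frequent precaution.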