This work considers a probabilistic approach to assessing the quality of a system that determines the authenticity of documents in images. The system aggregates the responses of various checks of the document's security elements. We propose a probabilistic model of the document authentication system and a functional for evaluating its quality. Based on these, we offer an approach for assessing the quality of both an individual part and the whole system. Finally, we obtain an approach to constructing an optimal function for making the final decision.
Implementations of the convolution operation in neural networks are usually based on a convolution-to-GeMM (General Matrix Multiplication) transformation. However, this transformation requires a large intermediate buffer (called im2col or im2row), and its initialization is both memory- and time-consuming. To overcome this problem, one may use the Indirect Convolution Algorithm, which replaces the im2row buffer with a much smaller buffer of pointers, called the indirection buffer. However, it limits flexibility in the choice of the multiplication micro-kernel, making matrix multiplication slightly less efficient than in the classical GeMM algorithm. To address this limitation, we propose the Almost Indirect Convolution Algorithm, which uses the indirection buffer to initialize a small, specifically ordered block of values for matrix multiplication, the same way GeMM algorithms initialize one block from the im2row buffer. Our approach combines the computational efficiency and flexibility in the shape of GeMM micro-kernels with the small memory footprint of the Indirect Convolution Algorithm. Experiments with convolutions of 8-bit matrices on ARM processors show that our convolution works 14-24% faster than Indirect for a small number of channels and 10-20% faster than classical GeMM-based convolution. These results indicate that it is well suited to inference of 8-bit quantized networks on mobile devices.
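To make the difference between the two baseline buffers concrete, here is a minimal pure-Python sketch of a single-output-channel convolution computed once through an im2row buffer and once through an indirection buffer. It illustrates only the classical im2row and Indirect ideas named in the abstract, not the proposed Almost Indirect algorithm. The function names, the data layout (a flat list of pixels, each a list of channel values), stride 1, and the absence of padding are all assumptions made for illustration.

```python
def conv_im2row(image, h, w, kernel, kh, kw):
    """Convolution via an explicit im2row buffer (hypothetical helper).

    image  : flat list of h*w pixels, each pixel a list of C channel values
    kernel : flat list of kh*kw*C weights for one output channel
    """
    out_h, out_w = h - kh + 1, w - kw + 1
    # im2row buffer: one row per output pixel holding COPIES of kh*kw*C values.
    rows = []
    for oy in range(out_h):
        for ox in range(out_w):
            row = []
            for ky in range(kh):
                for kx in range(kw):
                    row.extend(image[(oy + ky) * w + (ox + kx)])
            rows.append(row)
    # The "GeMM" step degenerates to a matrix-vector product here.
    return [sum(a * b for a, b in zip(row, kernel)) for row in rows]


def conv_indirect(image, h, w, kernel, kh, kw):
    """Same convolution, but the buffer stores pixel indices, not values."""
    c = len(image[0])
    out_h, out_w = h - kh + 1, w - kw + 1
    # Indirection buffer: kh*kw indices per output pixel, so it is C times
    # smaller than the im2row buffer in this layout.
    indirection = [
        [(oy + ky) * w + (ox + kx) for ky in range(kh) for kx in range(kw)]
        for oy in range(out_h)
        for ox in range(out_w)
    ]
    out = []
    for taps in indirection:
        acc = 0
        for t, pixel_index in enumerate(taps):
            pixel = image[pixel_index]  # follow the "pointer"
            for ch in range(c):
                acc += pixel[ch] * kernel[t * c + ch]
        out.append(acc)
    return out
```

Both functions produce the same output; the sketch only shows where the memory goes, and says nothing about the micro-kernel efficiency trade-off the abstract discusses.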
In this paper, we explore a set of modifications of the cascade structure of the Viola-Jones detector, using the stamp detection problem as an example. We present experiments on the public “SPODS” dataset for various document attribute extraction problems with an extremely limited training set. The positive training set is augmented by applying various image processing algorithms relevant to the stamp model to the single available image of each stamp type. We describe and analyze several structures of Viola-Jones classifiers (the original cascade, the tree, and the soft cascade) and perform training experiments. Experimental results show that each modification of the cascade structure has its own advantages and disadvantages, and that the choice of the Viola-Jones classifier design significantly affects the quality of object detection.
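The difference between the original cascade and the soft cascade can be sketched as follows. This is an illustrative sketch under assumptions, not the paper's implementation: weak classifiers are modeled as plain callables returning a score, and all names and thresholds are hypothetical.

```python
def classic_cascade(stages, window):
    """Original cascade: reject only at stage boundaries.

    stages: list of (weak_classifiers, stage_threshold) pairs, where each
    weak classifier is a callable mapping a window to a numeric score.
    """
    for weak_classifiers, stage_threshold in stages:
        stage_score = sum(wc(window) for wc in weak_classifiers)
        if stage_score < stage_threshold:
            return False  # rejected by this stage
    return True  # passed every stage


def soft_cascade(weak_classifiers, rejection_thresholds, window):
    """Soft cascade: one running score, checked after every weak classifier."""
    total = 0.0
    for wc, threshold in zip(weak_classifiers, rejection_thresholds):
        total += wc(window)
        if total < threshold:
            return False  # early rejection on the cumulative score
    return True
```

In the classic variant the score is reset at each stage, while the soft variant accumulates evidence across all weak classifiers, which is one reason the two designs trade speed and quality differently.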
In this paper, we present a single-sample augmentation framework. The key idea of the framework is to synthesize a positive training set from a single natural sample using relevant geometric and pixel intensity transforms. The efficiency of the proposed framework is demonstrated on the round seal stamp detection problem using the Viola-Jones approach on the public “SPODS” dataset. These image transformations make it possible to simulate different orientations of the stamps, color differences, and distortions caused by the stamping process and document aging. The proposed framework can be applied to training various machine learning algorithms for solving computer vision and computed tomography problems.
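As a rough illustration of the idea, the sketch below generates several synthetic positives from one grayscale sample by combining a random rotation (a geometric transform) with a random gain and bias (a pixel intensity transform). The specific transforms, parameter ranges, nearest-neighbour sampling, and function names are assumptions chosen for brevity; the abstract does not specify them.

```python
import math
import random


def rotate(image, angle_deg, fill=255):
    """Rotate a grayscale image (list of rows) about its centre."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: find the source pixel of each output pixel.
            sx = cos_a * (x - cx) + sin_a * (y - cy) + cx
            sy = -sin_a * (x - cx) + cos_a * (y - cy) + cy
            ix, iy = int(round(sx)), int(round(sy))
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = image[iy][ix]
    return out


def adjust_intensity(image, gain, bias):
    """Apply p -> gain * p + bias and clip the result to [0, 255]."""
    return [
        [min(255, max(0, int(round(gain * p + bias)))) for p in row]
        for row in image
    ]


def augment(sample, count, seed=0):
    """Synthesize `count` positives from a single sample (assumed ranges)."""
    rng = random.Random(seed)
    result = []
    for _ in range(count):
        img = rotate(sample, rng.uniform(0.0, 360.0))
        img = adjust_intensity(img, rng.uniform(0.7, 1.3), rng.uniform(-20.0, 20.0))
        result.append(img)
    return result
```

Full rotation is a natural choice for round stamps; distortions from the stamping process and document aging would need additional transforms not shown here.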
In this paper, we present a method for QR Code localization in images obtained in an uncontrolled environment. The proposed method is a modified Viola-Jones object detection method in which features are calculated over the directional edge image and a tree classifier is used instead of a cascade classifier. The experiments show that the QR Code localization method described in the paper can significantly improve the quality of existing decoding algorithms. The high performance of the developed method makes it possible to use it in various real-time recognition systems.
In this paper, we present a modification of the Viola-Jones approach for detecting the government seal stamp of the Russian Federation. The main contributions of the proposed modification are combining brightness and edge features and using the L1 norm of the image gradient to calculate edge features. This modification makes it possible to build classifiers that are more robust to noise and to the absence of a characteristic structure of contrasts and object boundaries. The modification is experimentally compared to the original Viola-Jones algorithm and shows better quality on different test sets.
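An edge map based on the L1 norm of the gradient, |gx| + |gy|, can be sketched as below. The use of central differences and the zeroed border are assumptions; the abstract does not say which gradient operator is used.

```python
def gradient_l1(image):
    """Edge map as the L1 norm of the gradient, |gx| + |gy|.

    image: grayscale image as a list of rows. Central differences are an
    assumed choice of gradient operator; border pixels are left at zero.
    """
    h, w = len(image), len(image[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            edges[y][x] = abs(gx) + abs(gy)
    return edges
```

Compared with the Euclidean (L2) magnitude, the L1 norm needs no square root and stays in integer arithmetic, which suits integral-image feature computation.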
This paper studies the problem of multi-class object detection in a video stream with Viola-Jones cascades. We propose an adaptive algorithm for selecting a Viola-Jones cascade based on a greedy choice strategy for the N-armed bandit problem. The efficiency of the algorithm is shown on the problem of detecting and recognizing bank card logos in a video stream. The proposed algorithm can be effectively used in document localization and identification, recognition of road scene elements, localization and tracking of lengthy objects, and other problems of rigid object detection in heterogeneous data flows. The computational efficiency of the algorithm makes it possible to use it both on personal computers and on mobile devices based on processors with low power consumption.
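A greedy bandit-style selector over cascades might look like the following sketch. Each cascade is treated as an arm, and the reward is assumed to be 1 when the chosen cascade detects its object in the current frame and 0 otherwise. The class name, the reward definition, the initial "try every arm once" phase, and the optional `epsilon` exploration parameter are all assumptions, not details taken from the paper.

```python
import random


class GreedyCascadeSelector:
    """Picks which cascade to run on the next frame (hypothetical sketch)."""

    def __init__(self, num_cascades, epsilon=0.0, seed=0):
        self.counts = [0] * num_cascades    # times each cascade was run
        self.values = [0.0] * num_cascades  # running mean reward per cascade
        self.epsilon = epsilon              # 0.0 gives a purely greedy policy
        self.rng = random.Random(seed)

    def select(self):
        # Run every cascade once before trusting the estimates.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))  # explore
        # Exploit: the cascade with the highest estimated reward.
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the mean reward for the chosen cascade.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

A typical loop would call `select()`, run only that cascade on the frame, and pass the detection outcome to `update()`, so that most frames cost one cascade evaluation instead of N.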
Object-to-features vectorisation is a hard problem for objects that are difficult to distinguish. Siamese and Triplet neural networks are among the more recent tools used for this task. However, most of the networks used are very deep and prove hard to compute in the Internet of Things setting. In this paper, a computationally efficient neural network is proposed for real-time object-to-features vectorisation into a Euclidean metric space. We use the L2 distance to reflect feature vector similarity during both training and testing. In this way, the resulting feature vectors can be easily classified using a K-Nearest Neighbours classifier. Such an approach can be used to train networks to vectorise “problematic” objects such as images of human faces and keypoint image patches, for example keypoints on maps of the Arctic and surrounding marine areas.
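The role of the L2 distance in both training and testing can be illustrated with a small sketch: a hinge-style triplet loss for training and a K-Nearest Neighbours vote for classification, both using the same distance. The exact loss form, the margin value, and the function names are assumptions; the network itself is omitted and feature vectors are plain lists.

```python
import math
from collections import Counter


def l2_distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (assumed form and margin).

    Zero once the negative is at least `margin` farther from the anchor
    than the positive is.
    """
    return max(
        0.0,
        l2_distance(anchor, positive) - l2_distance(anchor, negative) + margin,
    )


def knn_classify(query, gallery, k=3):
    """Majority vote among the k gallery vectors closest to `query`.

    gallery: list of (feature_vector, label) pairs.
    """
    nearest = sorted(gallery, key=lambda item: l2_distance(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Because the same metric is used at training and test time, a network trained this way needs no extra classification head: new classes can be added by inserting labelled vectors into the gallery.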
In this paper, we present a new modification of Viola-Jones complex classifiers. We describe a complex classifier in the form of a decision tree and provide a training method for such classifiers. The performance impact of the tree structure is analyzed. The precision and performance of the presented method are compared with those of the classical cascade. Various tree architectures are experimentally studied. The task of detecting vehicle wheels in images obtained from an automatic vehicle classification system is taken as an example.
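One way to picture a tree-shaped complex classifier is sketched below: each internal node holds a classifier and a threshold, and the window is routed to one of two children depending on the response, with boolean leaves giving the final decision. This node layout and the names used are assumptions for illustration; the paper's actual tree structure and training method are not reproduced here.

```python
class Node:
    """Internal node of a hypothetical tree classifier.

    classifier: callable mapping a window to a numeric score
    on_pass / on_fail: a child Node, or a bool leaf (True = accept)
    """

    def __init__(self, classifier, threshold, on_pass=True, on_fail=False):
        self.classifier = classifier
        self.threshold = threshold
        self.on_pass = on_pass
        self.on_fail = on_fail


def evaluate(node, window):
    """Walk from the root to a leaf and return the accept/reject decision."""
    while isinstance(node, Node):
        if node.classifier(window) >= node.threshold:
            node = node.on_pass
        else:
            node = node.on_fail
    return node
```

Under this layout the classical cascade is the degenerate tree in which every `on_fail` branch is the leaf `False`, so a failed node always rejects the window immediately.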