Quadratic Correlation Filters (QCFs) have been shown to be useful for separating target areas of interest from background clutter in two-class discrimination. However, extension to multi-class discrimination is not straightforward, as QCFs primarily maximize the distance of the response of a filtered target area from clutter. Several attempts have been made to extend QCFs to multi-class discrimination using support vector machines and convolutional neural networks. In addition, detection and recognition of targets that are considered “unresolved” remain elusive for neural network architectures like YOLO, which have minimum target size requirements. This work will show that the localization aspect of a QCF neural network layer, paired with the feature extraction layers of a purely convolutional neural network, provides a robust solution to this problem. We compare the recognition accuracy to YOLO outputs, since YOLO is the current state of the art for target localization and recognition for autonomous vehicles.
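To make the two-class behavior of a QCF concrete, the following is a minimal sketch of one common construction: the filter is built from the eigenvectors of the difference between the target and clutter correlation matrices, and the response is a difference of squared linear-filter outputs. This is an assumption chosen for illustration, not the neural network layer proposed in the abstract; the function names, patch size, and synthetic data are all hypothetical.

```python
import numpy as np

def train_qcf(targets, clutter):
    """Build a quadratic filter from vectorized patches of shape (n_samples, d).

    Assumed construction: eigenvectors of (R_target - R_clutter).
    Positive eigenvalues favor targets, negative ones favor clutter.
    """
    R_t = targets.T @ targets / len(targets)   # target correlation matrix
    R_c = clutter.T @ clutter / len(clutter)   # clutter correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R_t - R_c)
    F = eigvecs[:, eigvals > 0]                # target-like basis functions
    B = eigvecs[:, eigvals < 0]                # clutter-like basis functions
    return F, B

def qcf_response(x, F, B):
    """Quadratic response: large for targets, small (or negative) for clutter."""
    return np.sum((x @ F) ** 2, axis=-1) - np.sum((x @ B) ** 2, axis=-1)

# Synthetic, hypothetical data: 5x5 patches flattened to d = 25.
rng = np.random.default_rng(0)
d = 25
template = rng.normal(size=d)
targets = template + 0.5 * rng.normal(size=(200, d))
clutter = rng.normal(size=(200, d))

F, B = train_qcf(targets, clutter)
mean_target = qcf_response(targets, F, B).mean()
mean_clutter = qcf_response(clutter, F, B).mean()
```

On the training set, the gap between the mean target response and the mean clutter response equals the sum of the absolute eigenvalues, so it is never negative. The response carries no class label beyond target versus clutter, which is why a multi-class extension needs additional machinery.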
Automated target detection and recognition (ATDR) algorithms based solely on sensor data have improved greatly, especially with the onset of deep learning neural networks and of multi-sensor and multimodal fusion techniques. However, ATDR applied to imagery alone, with few pixels on target in highly cluttered environments, remains a tough problem. Rather than focusing on imagery as the only input to an ATDR process, in this paper we turn our attention to using contextual and heterogeneous information to improve ATDR accuracy. We treat scene context as a collection of random variables that can then be cast into a Bayesian framework. Specifically, target likelihoods given the context are estimated by an ensemble training process. Statistical inference is then applied to update the probability vector of the target estimates. For low-observability targets, this can dramatically improve the accuracy of the estimated target type. We identify some of these contexts, apply them to the output of an emulated image-only ATDR process, and report results.
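The probability-vector update described above can be sketched as a single application of Bayes' rule. This is a minimal illustration under stated assumptions: the class names, the ATDR scores, and the context likelihoods are invented numbers, and the function name `bayes_update` is hypothetical rather than taken from the paper.

```python
import numpy as np

def bayes_update(prior, likelihood):
    """Posterior over target classes after observing one context variable.

    prior      : P(target = k), shape (K,)
    likelihood : P(observed context | target = k), shape (K,)
    """
    prior = np.asarray(prior, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)
    unnormalized = prior * likelihood
    total = unnormalized.sum()
    if total == 0.0:
        # The context is impossible under every class; fall back to the prior.
        return prior / prior.sum()
    return unnormalized / total

# Hypothetical numbers for illustration only.
classes = ["truck", "tank", "car"]
atdr_scores = np.array([0.40, 0.35, 0.25])            # image-only ATDR output
p_offroad_given_class = np.array([0.20, 0.70, 0.10])  # P(context = "off-road" | class)

posterior = bayes_update(atdr_scores, p_offroad_given_class)
best = classes[int(np.argmax(posterior))]
```

Here the image-only scores slightly favor "truck", but the off-road context moves most of the probability mass to "tank". Several context variables can be applied one after another in the same way if they are assumed conditionally independent given the target class.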
It is well known that a translating mask can optically encode low-resolution measurements from which higher-resolution images can be computationally reconstructed. We experimentally demonstrate that this principle can be used to achieve a substantial increase in image resolution compared to the size of the focal plane array (FPA). Specifically, we describe a scalable architecture with a translating mask (also referred to as a coded aperture) that achieves an eightfold resolution improvement (or a 64∶1 increase in the number of pixels compared to the number of focal plane detector elements). The imaging architecture is described in terms of general design parameters (such as field of view and angular resolution, dimensions of the mask, and the detector and FPA sizes), and some of the underlying design trades are discussed. Experiments conducted with different mask patterns and reconstruction algorithms illustrate how these parameters affect the resolution of the reconstructed image. Initial experimental results also demonstrate that the architecture can directly support task-specific information sensing for detection and tracking, and that moving objects can be reconstructed separately from the stationary background using motion priors.
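The relationship between the eightfold gain and the 64∶1 pixel ratio can be illustrated with a toy forward model for a single detector element: each mask translation produces one scalar reading, and 8 × 8 translations give 64 readings for 64 unknown sub-pixels. This is a sketch under assumptions; the random binary mask, the noiseless measurement, and the least-squares solver are stand-ins and not the mask patterns or reconstruction algorithms used in the experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
factor = 8                        # resolution gain per axis (from the abstract)
n_unknowns = factor * factor      # 64 high-resolution pixels per detector element

# Hypothetical high-resolution patch seen by ONE detector element.
x_true = rng.random((factor, factor))

# Hypothetical binary mask, larger than the patch so it can translate across it.
mask = rng.integers(0, 2, size=(2 * factor - 1, 2 * factor - 1)).astype(float)

# Each mask translation (dy, dx) yields one scalar detector reading.
rows = []
for dy in range(factor):
    for dx in range(factor):
        window = mask[dy:dy + factor, dx:dx + factor]
        rows.append(window.ravel())
A = np.array(rows)                # (64, 64) measurement matrix
y = A @ x_true.ravel()            # 64 low-resolution readings

# Least-squares estimate of the 64 sub-pixels from the 64 readings.
x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
x_hat = x_hat.reshape(factor, factor)
```

The estimate always reproduces the readings, but it matches `x_true` only when the shifted-mask matrix `A` has full rank and is well conditioned, which is one reason the choice of mask pattern matters.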
A compressive imaging model is proposed that multiplexes segments of the field of view (FOV) onto an infrared focal plane array (FPA). Similar to compound imaging, our model is based on combining pixels from a surface composed of different parts of the FOV. We formalize this superposition of pixels as a global multiplexing process that reduces the number of detectors required for the FPA. We present an analysis of the signal-to-noise ratio for the full-rank and compressive collection paradigms in a target detection and tracking scenario. We then apply automated target detection algorithms directly to the measurement sequence for this multiplexing model, extending the target training and detection processes to operate on the encoded measurements. Optimal measurement codes for this application may imply abandoning the ability to reconstruct the actual scene in favor of reconstructing the locations of interesting objects. We present a simulated case study as well as real-data results from a visible FOV multiplexing camera.
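A toy version of FOV multiplexing, with detection performed directly on the encoded frames, might look like the following. It is a sketch under strong assumptions: an empty background, a single point target, and binary on/off codes formed from the binary expansions of 1 to 7. None of these choices come from the paper, and all variable names are hypothetical.

```python
import numpy as np

S, T, h, w = 7, 3, 16, 16   # segments, frames, FPA height and width (hypothetical)

# Binary code matrix: column s is the on/off pattern of segment s over T frames.
# Columns are the binary expansions of 1..7, so every segment has a unique signature.
C = np.array([[(s >> t) & 1 for s in range(1, S + 1)] for t in range(T)], dtype=float)

# Hypothetical scene: empty background with one point target.
segments = np.zeros((S, h, w))
true_seg, true_row, true_col = 4, 3, 9
segments[true_seg, true_row, true_col] = 2.5

# Multiplexed measurements: each frame superimposes the switched-on segments on the FPA.
Y = np.einsum("ts,shw->thw", C, segments)   # shape (T, h, w)

# Detect on the measurements themselves: take the strongest pixel, then decode
# its temporal on/off signature to find which FOV segment it came from.
energy = Y.sum(axis=0)
row, col = np.unravel_index(np.argmax(energy), energy.shape)
signature = (Y[:, row, col] > 0.5 * Y[:, row, col].max()).astype(float)
seg = int(np.argmin(np.abs(C - signature[:, None]).sum(axis=0)))
```

Three frames on a 16 × 16 array localize the target among seven segments without reconstructing the scene. With clutter or multiple targets the signatures overlap, and the simple thresholding used here would no longer be sufficient.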
A novel compressive imaging model is proposed that multiplexes segments of the field of view (FOV) onto an infrared focal plane array (FPA). Similar to the compound eyes of insects, our imaging model is based on combining pixels from a surface composed of different parts of the FOV. We formalize this superposition of pixels as a global multiplexing process that reduces the resolution requirements of the FPA. We then apply automated target detection algorithms directly to the measurements of this model in a typical missile seeker scene. Building on quadratic correlation filters, we extend the target training and detection processes to operate directly on these encoded measurements. Preliminary results are promising.
We describe the design and evaluate the performance of a compressive imaging system comprising a 256x320 mid-wave infrared detector array, a digital micromirror device (DMD), and objective and relay lenses. The irradiance of each detector element is characterized so that its system of measurements can be separated from those of other detectors. The FOV is divided into smaller areas based on the support of each detector, allowing for tractable high-throughput reconstructions. The sensor model accounts for cross-talk, which corrects for noise at the boundaries of the image patches. Building on our previous work, we apply optimal codes subject to device constraints and report favorable results.
We look at the design of projective measurements for compressive imaging based upon image priors and device
constraints. Assuming that image patches from natural imagery can be modeled as a low-rank manifold, we develop
an optimality criterion for a measurement matrix based upon separating the canonical elements of the manifold prior. We
then describe a stochastic search algorithm, based upon a subspace mismatch algorithm, for finding the optimal
measurements under device constraints. The algorithm is then tested on a prototype compressive imaging device
designed to collect an 8x4 array of projective measurements simultaneously.
This work is based upon work supported by DARPA and the SPAWAR System Center Pacific under Contract No.
N66001-11-C-4092. The views expressed are those of the author and do not reflect the official policy or position of the
Department of Defense or the U.S. Government.
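The idea of a stochastic search for device-constrained measurements, as described in the preceding abstract, can be sketched as a greedy random search over binary (on/off) measurement matrices. The objective used here, the smallest distance between projected prototype patches, is a hypothetical stand-in for the separation criterion; the subspace mismatch algorithm is not implemented, and the prototypes, sizes, and function names are assumptions.

```python
import numpy as np

def min_pairwise_distance(M, P):
    """Smallest Euclidean distance between any two projected prototypes."""
    Z = M @ P                                   # (m, K) projected prototypes
    diffs = Z[:, :, None] - Z[:, None, :]       # (m, K, K) pairwise differences
    dist = np.sqrt((diffs ** 2).sum(axis=0))    # (K, K) distance matrix
    iu = np.triu_indices(P.shape[1], k=1)       # distinct pairs only
    return dist[iu].min()

def stochastic_search(P, m, n_iter=2000, seed=0):
    """Greedy random search over binary m x d matrices (assumed device constraint)."""
    rng = np.random.default_rng(seed)
    d = P.shape[0]
    M = rng.integers(0, 2, size=(m, d)).astype(float)
    initial = best = min_pairwise_distance(M, P)
    for _ in range(n_iter):
        i, j = rng.integers(m), rng.integers(d)
        M[i, j] = 1.0 - M[i, j]                 # propose flipping one entry
        score = min_pairwise_distance(M, P)
        if score >= best:
            best = score                        # keep the flip
        else:
            M[i, j] = 1.0 - M[i, j]             # revert the flip
    return M, initial, best

# Hypothetical prototypes: K = 6 vectorized 8x8 patches (d = 64).
rng = np.random.default_rng(1)
P = rng.normal(size=(64, 6))
M, initial, best = stochastic_search(P, m=4, n_iter=500)
```

Because a flip is kept only when the score does not decrease, the final separation is never worse than the starting one. The search can stall in a local optimum, so a practical version would likely use restarts or an annealing schedule.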