Many computer vision applications require finding corresponding points between images and using those correspondences to estimate disparity. Today’s correspondence-finding algorithms primarily use image features or pixel intensities common to both images in a pair. Some 3-D computer vision applications, however, do not produce the desired results using correspondences derived from image features or pixel intensities. Two examples are the multimodal camera rig and the center region of a coaxial camera rig. We present an image correspondence finding technique that aligns pairs of image sequences using optical flow fields. The optical flow fields provide information about the structure and motion of the scene, which is not available in still images but can be used in image alignment. We apply the technique to a dual focal length stereo camera rig consisting of a visible-light/infrared camera pair and to a coaxial camera rig. We test our method on real image sequences and compare our results with state-of-the-art multimodal and structure from motion (SfM) algorithms. Our method produces more accurate depth and scene velocity estimates than these state-of-the-art algorithms.
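The key idea above is that the matching signal is the optical flow field itself rather than pixel intensities. As a minimal illustration of the kind of flow field such a method consumes (this is the classic Lucas-Kanade least-squares step on synthetic frames, an assumption for illustration, not the algorithm described here), the following sketch estimates a single translational flow vector between two frames:

```python
import numpy as np

def lucas_kanade_flow(f1, f2):
    """Estimate one global flow vector (vx, vy) between two frames by
    solving the Lucas-Kanade least-squares system (central differences)."""
    Ix = (np.roll(f1, -1, axis=1) - np.roll(f1, 1, axis=1)) / 2.0
    Iy = (np.roll(f1, -1, axis=0) - np.roll(f1, 1, axis=0)) / 2.0
    It = f2 - f1
    # Trim one-pixel borders where np.roll wraps around.
    Ix, Iy, It = (a[1:-1, 1:-1] for a in (Ix, Iy, It))
    A = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    b = -np.array([np.sum(Ix * It), np.sum(Iy * It)])
    return np.linalg.solve(A, b)

# Synthetic frames: a smooth pattern translated by a known subpixel shift
# (the pattern and the shift (0.2, 0.1) are arbitrary test values).
y, x = np.mgrid[0:64, 0:64].astype(float)
pattern = lambda xs, ys: np.sin(0.3 * xs) + np.sin(0.3 * ys)
frame1 = pattern(x, y)
frame2 = pattern(x - 0.2, y - 0.1)   # scene moved by (0.2, 0.1) pixels

vx, vy = lucas_kanade_flow(frame1, frame2)
```

In a real pipeline this per-window estimate would be computed densely over both image sequences, yielding the two flow fields that are then aligned.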
A coaxial camera rig consists of a pair of cameras which acquire images along the same optical axis but at different
distances from the scene using different focal length optics. The coaxial geometry permits the acquisition of image pairs
through a substantially smaller opening than would be required by a traditional binocular stereo camera rig. This is
advantageous in applications where physical space is limited, such as in an endoscope. 3D images acquired through an
endoscope are desirable, but the lack of physical space for a traditional stereo baseline is problematic. While image
acquisition along a common optical axis has been known for many years, 3D reconstruction from such image pairs has
not been possible in the center region due to the very small disparity between corresponding points. This characteristic of
coaxial image pairs has been called the unrecoverable point problem. We introduce a novel method to overcome the
unrecoverable point problem, using a variational optimization algorithm to map pairs of optical flow fields from the rig's different focal length cameras. Our method uses the ratio of the optical
flow fields for 3D reconstruction. This results in accurate image pair alignment and produces accurate dense depth maps.
We test our method on synthetic optical flow fields and on real images. We demonstrate our method's accuracy by
evaluating against ground truth. Accuracy is comparable to that of a traditional binocular stereo camera rig, but without the stereo baseline and with substantially smaller occlusions.
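A toy model of why a flow ratio can encode depth (an idealized pinhole sketch under a small common axial translation, not the variational algorithm described above; all numbers below are assumed for illustration): if the two coaxial cameras sit at axial distances Z and Z + b from a point, the radial flow each sees falls off with the square of its distance, so the ratio of the two flow magnitudes determines Z.

```python
import math

def radial_flow(f, Z, R):
    """Radial image-plane flow magnitude for a point at lateral radius R
    and axial distance Z, under unit axial camera translation (pinhole)."""
    return f * R / Z**2

def depth_from_flow_ratio(v_near, v_far, f_near, f_far, b):
    """Recover the near camera's depth Z from the flow ratio, inverting
    v_near / v_far = (f_near / f_far) * ((Z + b) / Z)**2."""
    k = math.sqrt((v_near / v_far) * (f_far / f_near))
    return b / (k - 1.0)

# Hypothetical rig: focal lengths 1.0 and 2.0, axial baseline b = 0.5.
f_near, f_far, b = 1.0, 2.0, 0.5
Z_true, R = 2.0, 1.0
v_near = radial_flow(f_near, Z_true, R)       # flow in the near camera
v_far = radial_flow(f_far, Z_true + b, R)     # flow in the far camera
Z_est = depth_from_flow_ratio(v_near, v_far, f_near, f_far, b)  # → 2.0
```

Note that the recovered depth near the optical axis (R → 0) relies on the flow ratio, not on a disparity between corresponding points, which is why this formulation sidesteps the unrecoverable point problem.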
In recent years, the use of multi-modal camera rigs consisting of an RGB sensor and an infrared (IR) sensor has become increasingly popular in surveillance and robotics applications. The advantages of multi-modal camera rigs include improved foreground/background segmentation, a wider range of lighting conditions under which the system works, and richer information (e.g., visible light and heat signature) for target identification. However, the traditional computer vision method of mapping pairs of images using pixel intensities or image features is often not possible with an RGB/IR image pair. We introduce a novel method to overcome the lack of common features in RGB/IR image pairs by using a variational optimization algorithm to map the optical flow fields computed from the different wavelength images. This results in the alignment of the flow fields, which in turn produces correspondences similar to those found in a stereo RGB/RGB camera rig using pixel intensities or image features. In addition to aligning the different wavelength images, these correspondences are used to generate dense disparity and depth maps. We obtain accuracies similar to other multi-modal image alignment methodologies as long as the scene contains sufficient depth variation, although a direct comparison is not possible because of the lack of standard image sets from moving multi-modal camera rigs. We test our method on synthetic optical flow fields and on real image sequences that we created with a multi-modal binocular stereo RGB/IR camera rig. We determine our method's accuracy by comparing against ground truth.
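As a rough, self-contained illustration of aligning two flow fields rather than two intensity images (a brute-force integer disparity search standing in for the variational optimizer, with synthetic 1-D flow profiles as an assumption): RGB and IR flows share motion structure even when the pixel intensities share nothing.

```python
import numpy as np

def align_flow_rows(flow_a, flow_b, max_disp):
    """Find the integer horizontal disparity d that best aligns two 1-D
    flow profiles, i.e. flow_a[i] ≈ flow_b[i + d], by exhaustive search
    over the normalized sum of squared differences. A simple stand-in
    for a variational alignment optimizer."""
    best_d, best_cost = 0, np.inf
    n = len(flow_a)
    for d in range(-max_disp, max_disp + 1):
        lo, hi = max(0, -d), min(n, n - d)   # overlap of the shifted rows
        cost = np.sum((flow_a[lo:hi] - flow_b[lo + d:hi + d]) ** 2)
        cost /= (hi - lo)                    # normalize by overlap length
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

# Synthetic flow profiles: the same motion bump, seen shifted by 3 pixels
# in the second camera (the flows share structure, not intensity).
x = np.arange(100, dtype=float)
flow_rgb = np.exp(-0.5 * ((x - 50) / 5.0) ** 2)
flow_ir = np.exp(-0.5 * ((x - 53) / 5.0) ** 2)

d = align_flow_rows(flow_rgb, flow_ir, max_disp=10)   # → 3
```

The recovered disparity per pixel is exactly the quantity a conventional RGB/RGB stereo matcher would produce from intensities, which is why the aligned flow fields can feed standard dense depth reconstruction.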