This PDF file contains the front matter associated with SPIE Proceedings Volume 8662, including the Title Page, Copyright Information, Table of Contents, and the Conference Committee listing.
Performing actuation in nanomanipulation at the necessary accuracy is largely possible thanks to many new piezoelectric actuation systems. Although piezoelectric actuators can provide the means to perform nearly infinitesimal displacements at extremely high resolution, the output motion of the actuator can be quite nonlinear, especially under voltage-based control modulation.
In this work, we cover some of the control issues related especially to piezoelectric actuation in nanomanipulation tasks. We also look at some of the recent improvements made possible by methods that utilize artificial neural networks to improve the generalization capability and accuracy of the piezoelectric hysteresis models used in inverse modelling and control of solid-state, voltage-controlled piezoelectric actuators.
We also briefly discuss the problem areas on which piezoelectric control research should be especially focused, as well as some of the weaknesses of the existing methods. In addition, some common issues related to testing and the presentation of results are discussed.
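As a rough illustration of the idea of neural-network inverse hysteresis modelling, the sketch below trains a small feed-forward network to map a desired displacement and a short history of past states to a drive voltage; the data, the network size and the choice of history features are all assumptions for illustration, not the models discussed in the paper.

```python
# Minimal sketch (not the paper's model): a feed-forward network that learns an
# inverse hysteresis map, i.e. predicts the drive voltage needed to reach a
# desired displacement given a short history of past displacement and voltage.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Hypothetical training data: (desired displacement, previous displacement,
# previous voltage) -> voltage.  In practice these triplets would come from
# measured actuator trajectories.
X = rng.uniform(0.0, 1.0, size=(5000, 3))
# Placeholder "true" inverse map with a mild history-dependent (hysteretic) term.
y = 10.0 * X[:, 0] + 2.0 * (X[:, 0] - X[:, 1]) + 0.5 * X[:, 2]

inverse_model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000,
                             random_state=0).fit(X, y)

# Feed-forward compensation: ask the network which voltage should reach the
# desired displacement given the last displacement/voltage pair.
print(inverse_model.predict([[0.7, 0.65, 6.4]]))
```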
The Intelligent Ground Vehicle Competition (IGVC) is one of four unmanned-systems student competitions founded by the Association for Unmanned Vehicle Systems International (AUVSI). The IGVC is a multidisciplinary
exercise in product realization that challenges college engineering student teams to integrate advanced control theory,
machine vision, vehicular electronics and mobile platform fundamentals to design and build an unmanned system.
Teams from around the world focus on developing a suite of dual-use technologies to equip ground vehicles of the future
with intelligent driving capabilities. Over the past 20 years, the competition has challenged undergraduate, graduate and
Ph.D. students with real world applications in intelligent transportation systems, the military and manufacturing
automation. To date, teams from over 80 universities and colleges have participated. This paper describes some of the
applications of the technologies required by this competition and discusses the educational benefits. The primary goal of
the IGVC is to advance engineering education in intelligent vehicles and related technologies. The employment and
professional networking opportunities created for students and industrial sponsors through a series of technical events
over the four-day competition are highlighted. Finally, an assessment of the competition based on participation is
presented.
Visual homing is a navigation method based on comparing a stored image of the goal location and the current image
(current view) to determine how to navigate to the goal location. It is theorized that insects, such as ants and bees,
employ visual homing methods to return to their nest. Visual homing has been applied to autonomous robot platforms
using two main approaches: holistic and feature-based. Both methods aim at determining distance and direction to the
goal location. Navigational algorithms using Scale Invariant Feature Transforms (SIFT) have gained great popularity in recent years due to the robustness of the feature operator. Churchill and Vardy have developed a visual homing method using scale change information (Homing in Scale Space, HiSS) from SIFT.
HiSS uses SIFT feature scale change information to determine distance between the robot and the goal location. Since
the scale component is discrete with a small range of values, the result is a rough measurement with limited accuracy.
We have developed a method that uses stereo data, resulting in better homing performance. Our approach utilizes a pan-tilt based stereo camera, which is used to build composite wide-field images. We use the wide-field images, combined with stereo data obtained from the stereo camera, to extend the keypoint vector to include a new parameter, depth (z). Using this information, our algorithm determines the distance and orientation from the robot to the goal location.
We compare our method with HiSS in a set of indoor trials using a Pioneer 3-AT robot equipped with a BumbleBee2
stereo camera. We evaluate the performance of both methods using a set of performance measures described in this
paper.
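The following sketch illustrates one way a keypoint vector could be augmented with depth, assuming a current image, the stored goal image and a depth map registered to the current image are available; it is an interpretation of the idea above, not the authors' implementation, and the ratio-test threshold is a placeholder.

```python
# Minimal sketch: SIFT matches between the current and goal images, with each
# surviving keypoint in the current image augmented by its depth value (z).
import cv2
import numpy as np

def depth_augmented_matches(img_cur, img_goal, depth_cur):
    sift = cv2.SIFT_create()
    kp_c, des_c = sift.detectAndCompute(img_cur, None)
    kp_g, des_g = sift.detectAndCompute(img_goal, None)

    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des_c, des_g, k=2)

    augmented = []
    for m, n in matches:
        if m.distance < 0.75 * n.distance:          # Lowe ratio test
            x, y = kp_c[m.queryIdx].pt
            z = float(depth_cur[int(y), int(x)])    # depth (z) for this keypoint
            if z > 0:
                augmented.append((x, y, z, kp_g[m.trainIdx].pt))
    return augmented
```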
Conventional stereo vision systems have a small field of view (FOV), which limits their usefulness for certain applications. While panoramic vision is able to “see” in all directions of the observation space, scene depth information is lost because of the mapping from 3D reference coordinates to the 2D panoramic image. In this paper, we present an innovative vision system, built from a specially combined fish-eye lens module, that is capable of simultaneously producing 3D coordinate information for the whole observation space and acquiring a 360°×360° panoramic image with no blind area, using a single vision device and a single static shot. It is called Panoramic Stereo Sphere Vision (PSSV). We propose the geometric model, mathematical model and parameter calibration method in this paper. Video surveillance, robotic autonomous navigation, virtual reality, driving assistance, multiple maneuvering target tracking, automatic mapping of environments and attitude estimation are some of the applications that will benefit from PSSV.
This paper introduces a novel image description technique that aims at appearance based loop closure detection
for mobile robotics applications. This technique relies on the local evaluation of the Zernike Moments. Binary
patterns, which are referred to as Local Zernike Moment (LZM) patterns, are extracted from images, and these
binary patterns are coded using histograms. Each image is represented with a set of histograms, and loop closure
is achieved by simply comparing the most recent image with the images in the past trajectory. The technique
has been tested on the New College dataset and, as far as we know, it outperforms the other methods in terms of computational efficiency and loop closure precision.
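A minimal sketch of the comparison step, assuming each image has already been reduced to a set of LZM-pattern histograms stacked into a single vector; the chi-square distance and the threshold are illustrative choices, not the paper's exact measure.

```python
# Sketch: score loop-closure candidates by comparing the newest image
# descriptor (a stacked histogram vector) with every past descriptor.
import numpy as np

def loop_closure_candidates(current_hists, past_hists, threshold=0.2):
    """current_hists: (k,) vector; past_hists: (n, k) matrix of past images."""
    cur = current_hists / (current_hists.sum() + 1e-9)
    past = past_hists / (past_hists.sum(axis=1, keepdims=True) + 1e-9)
    # Chi-square distance between normalized histograms.
    d = 0.5 * np.sum((past - cur) ** 2 / (past + cur + 1e-9), axis=1)
    return np.where(d < threshold)[0]   # indices of likely revisited places
```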
In recent years, autonomous, micro-unmanned aerial vehicles (micro-UAVs), or more specifically hovering micro-
UAVs, have proven suitable for many promising applications such as unknown environment exploration and search
and rescue operations. The early versions of UAVs had no on-board control capabilities and were difficult to control manually from a ground station. Many UAVs are now equipped with on-board control systems that reduce the
amount of control required from the ground-station operator. However, the limitations on payload, power
consumption and control without human interference remain the biggest challenges.
This paper proposes to use a smartphone as the sole computational device to stabilize and control a quad-rotor.
The goal is to use the readily available sensors in a smartphone such as the GPS, the accelerometer, the rate-gyros,
and the camera to support vision-related tasks such as flight stabilization, estimation of the height above ground,
target tracking, obstacle detection, and surveillance. We use a quad-rotor platform that has been built in the Robotic
Vision Lab at Brigham Young University for our development and experiments. An Android smartphone is
connected through the USB port to external hardware that has a microprocessor and circuitry to generate pulse-width modulation signals to control the brushless servomotors on the quad-rotor. The high-resolution camera on the
smartphone is used to detect and track features to maintain a desired altitude level. The vision algorithms
implemented include template matching, Harris feature detector, RANSAC similarity-constrained homography, and
color segmentation. Other sensors are used to control yaw, pitch, and roll of the quad-rotor. This smartphone-based
system is able to stabilize and control micro-UAVs and is ideal for micro-UAVs that have size, weight, and power
limitations.
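For illustration, here is a sketch of the kind of Harris-corner plus RANSAC-homography building block mentioned above, written with OpenCV; it is not the authors' code, and the detector and tracker parameters are placeholder values.

```python
# Sketch: Harris-style corners tracked between two consecutive frames and a
# RANSAC homography estimated from them, a common building block for
# vision-based stabilization and drift estimation.
import cv2
import numpy as np

def frame_to_frame_homography(prev_gray, cur_gray):
    pts_prev = cv2.goodFeaturesToTrack(prev_gray, maxCorners=300,
                                       qualityLevel=0.01, minDistance=7,
                                       useHarrisDetector=True)
    pts_cur, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray,
                                                  pts_prev, None)
    good_prev = pts_prev[status.ravel() == 1]
    good_cur = pts_cur[status.ravel() == 1]
    H, inliers = cv2.findHomography(good_prev, good_cur, cv2.RANSAC, 3.0)
    return H   # maps previous-frame pixels into the current frame
```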
We present the development of a multi-stage automatic target recognition (MS-ATR) system for computer vision in robotics. This paper discusses our work in optimizing the feature selection strategies of the MS-ATR system. Past implementations have utilized Optimum Trade-off Maximum Average Correlation Height (OT‐MACH) filtering as an initial feature selection method, and principal component analysis (PCA) as a feature extraction strategy before the classification stage.
Recent work has been done on the implementation of a modified saliency algorithm as a feature selection method. Saliency is typically implemented as a “bottom-up” search process using visual sensory information such as color, intensity, and orientation to detect salient points in the imagery. It is a general saliency mapping algorithm that receives no input from the user on what is considered salient. We discuss here a modified saliency algorithm that accepts the guidance of target features in locating regions of interest (ROIs). By introducing target-related input parameters, saliency becomes more focused and task oriented. It is used as an initial stage for fast ROI detection. The ROIs are passed to the later stages for feature extraction and target identification.
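A hedged sketch of the guidance idea: a bottom-up contrast map is modulated by similarity to a known target color, so the map becomes task oriented. The particular feature map and the Gaussian color weighting are assumptions made for illustration, not the MS-ATR implementation.

```python
# Sketch: a saliency map biased toward a known target color.  The target color
# acts as the "guidance" input that focuses the otherwise bottom-up map.
import cv2
import numpy as np

def guided_saliency(bgr_image, target_bgr, sigma=40.0):
    img = bgr_image.astype(np.float32)
    # Bottom-up component: local intensity contrast (difference from a blur).
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY).astype(np.float32)
    contrast = np.abs(gray - cv2.GaussianBlur(gray, (0, 0), 8))
    # Top-down component: similarity to the target color.
    dist = np.linalg.norm(img - np.array(target_bgr, np.float32), axis=2)
    color_sim = np.exp(-(dist ** 2) / (2 * sigma ** 2))
    saliency = cv2.normalize(contrast, None, 0, 1, cv2.NORM_MINMAX) * color_sim
    return cv2.normalize(saliency, None, 0, 1, cv2.NORM_MINMAX)
```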
Most stereovision applications are binocular, using information from a 2-camera array to perform stereo matching and compute the depth image. Trinocular stereovision with a 3-camera array has been shown to provide higher accuracy in stereo matching, which could benefit applications like distance finding, object recognition, and detection. This paper presents a real-time stereovision algorithm implemented on a GPGPU (general-purpose graphics processing unit) using a trinocular stereovision camera array. The algorithm employs a winner-take-all method to fuse disparities in different directions, followed by various image processing techniques to obtain the depth information. The goal of the algorithm is to achieve real-time processing speed
with the help of a GPGPU involving the use of Open Source Computer Vision Library (OpenCV) in C++ and
NVidia CUDA GPGPU Solution. The results are compared in accuracy and speed to verify the improvement.
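A minimal sketch of winner-take-all fusion, assuming per-pixel matching-cost volumes have already been computed for the two stereo directions of the trinocular array; the additive combination is an illustrative choice. On a GPGPU the same argmin-over-costs would be carried out per pixel in parallel.

```python
# Sketch: winner-take-all fusion of matching costs from the horizontal and
# vertical camera pairs; per pixel, the disparity with the lowest combined
# cost wins.
import numpy as np

def wta_fuse(cost_h, cost_v):
    """cost_h, cost_v: (H, W, D) matching-cost volumes for the two pairs."""
    combined = cost_h + cost_v            # simple additive fusion of directions
    disparity = np.argmin(combined, axis=2)
    return disparity.astype(np.float32)
```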
The main purpose of this paper is to use a machine learning method and the Kinect with its body-sensing technology to design a simple, convenient, yet effective robot remote control system. In this study, a Kinect sensor is used to capture the human body skeleton with depth information, and a gesture training and identification method is designed using a back-propagation neural network to remotely command a mobile robot to perform certain actions via Bluetooth. The experimental results show that the designed mobile robot remote control system can achieve, on average, more than 96% accurate identification of 7 types of gestures and can effectively control a real e-puck robot for the designed commands.
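As a sketch of the classification step, the code below trains a small back-propagation network on hypothetical skeleton-feature vectors and predicts one of 7 gesture classes; the feature layout, network size and random data are assumptions, not the paper's configuration.

```python
# Sketch: a back-propagation network mapping Kinect skeleton features to one
# of 7 gesture classes; the predicted class would be sent as a Bluetooth
# command to the mobile robot.
import numpy as np
from sklearn.neural_network import MLPClassifier

# Hypothetical training set: each row is a flattened set of joint coordinates
# (e.g. 20 joints x 3 = 60 values); labels are gesture ids 0..6.
X_train = np.random.rand(700, 60)
y_train = np.random.randint(0, 7, size=700)

clf = MLPClassifier(hidden_layer_sizes=(30,), max_iter=1000,
                    random_state=0).fit(X_train, y_train)
gesture_id = clf.predict(np.random.rand(1, 60))[0]   # -> robot command
```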
This paper presents the analysis and derivation of the geometric relation between vanishing points and camera
parameters of central catadioptric camera systems. These vanishing points correspond to the three mutually orthogonal
directions of the 3D real-world coordinate system (i.e. the X, Y and Z axes). Compared to vanishing points (VPs) under perspective projection, VPs under central catadioptric projection have the advantages that there are normally two vanishing points for each set of parallel lines, since lines are projected to conics in the catadioptric image plane, and that the vanishing points are usually located inside the image frame. We show that knowledge of the VPs corresponding to
XYZ axes from a single image can lead to simple derivation of both intrinsic and extrinsic parameters of the central
catadioptric system. This derived novel theory is demonstrated and tested on both synthetic and real data with respect to
noise sensitivity.
Natural image processing and understanding encompasses hundreds or even thousands of different algorithms. Each algorithm has a certain peak performance for a particular set of input features and configurations of the objects/regions in the input image (environment). To obtain the best possible processing result, we propose an algorithm selection approach that makes it possible to always use the most appropriate algorithm for the given input image. This is achieved by first selecting an algorithm based on low-level features such as color intensity, histograms, and spectral coefficients. The resulting high-level image description is then analyzed for logical inconsistencies (contradictions), which are then used to refine the selection of the processing elements. The feedback created from the contradiction information is executed by a Bayesian network that integrates both the features and a higher-level information selection process. The selection stops when the high-level inconsistencies are all resolved or no more different algorithms can be selected.
In this paper, we identify some of the existing problems in shape context matching. We first identify the need for reflection
invariance in shape context matching algorithms and propose a method to achieve the same. With the use of these reflection
invariance techniques, we bring all the objects, in a database, to their canonical form, which halves the time required to
match two shapes using their contexts. We then show how we can build better shape descriptors by the use of geodesic
information from the shapes and hence improve upon the well-known Inner Distance Shape Context (IDSC). The IDSC is
used by many pre- and post-processing algorithms as the baseline shape-matching algorithm. Our improvements to IDSC
will remain compatible for use with those algorithms. Finally, we introduce new comparison metrics that can be used for
the comparison of two or more algorithms. We have tested our proposals on the MPEG-7 database and show that our
methods significantly outperform the IDSC.
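The sketch below shows one simple way reflection ambiguity can be removed before matching, by expressing a contour in its PCA frame and flipping axes according to third-order moments so that a shape and its mirror image end up identical; this illustrates the general idea of a canonical form, not the paper's specific method.

```python
# Sketch: bring a 2D contour to a canonical reflection before shape-context
# matching.
import numpy as np

def canonical_reflection(points):
    """points: (N, 2) contour samples."""
    centered = points - points.mean(axis=0)
    # PCA frame: principal axes of the point set.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    local = centered @ Vt.T
    # Fix the reflection ambiguity using the sign of third-order moments.
    for axis in range(2):
        if np.sum(local[:, axis] ** 3) < 0:
            local[:, axis] *= -1
    return local
```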
When we apply the newly developed LPED (local polar edge detection) image processing method to a binary IR image containing the meteorite-like streak produced by an enemy SAM, the image processing speed can be enhanced even further if another novel preprocessing scheme is used. This preprocessing scheme takes advantage of the characteristic geometry of the meteorite-like target: we consider only the clustered high-temperature image points forming the shape of a slender cylinder ending in a broom-like exhaust fume. We can then spatially filter, or pre-extract, the cylinder by its geometric properties before applying the LPED method. This results in super-fast detection, super-fast tracking and super-fast targeting of the CM (center of mass) point of the cylinder, which is the “heart” of the flying missile. Incorporating this targeting system with a high-power laser gun through the use of a Wollaston prism, an airborne instant-detect, instant-kill SAM killer system may then be constructed.
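A rough sketch of the geometric pre-extraction and center-of-mass step, assuming a gray-scale IR frame and a fixed temperature threshold; the elongation heuristic is an illustrative stand-in for the slender-cylinder test described above.

```python
# Sketch: keep only hot pixels, select the most elongated connected component
# as the missile-body candidate, and return its centre of mass (CM).
import cv2
import numpy as np

def hot_cluster_cm(ir_frame, thresh=200):
    _, hot = cv2.threshold(ir_frame, thresh, 255, cv2.THRESH_BINARY)
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(hot.astype(np.uint8))
    best, best_elong = None, 0.0
    for i in range(1, n):                       # label 0 is the background
        w, h = stats[i, cv2.CC_STAT_WIDTH], stats[i, cv2.CC_STAT_HEIGHT]
        elong = max(w, h) / max(1, min(w, h))   # slender-cylinder heuristic
        if elong > best_elong:
            best, best_elong = i, elong
    return centroids[best] if best is not None else None
```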
Hundreds of millions of people use hand-held devices frequently and control them by touching the screen with their
fingers. If this method of operation is being used by people who are driving, the probability of deaths and accidents
occurring substantially increases. With a non-contact control interface, people do not need to touch the screen. As a
result, people will not need to pay as much attention to their phones and thus drive more safely than they would
otherwise. This interface can be achieved with real-time stereovision. A novel Intensity Profile Shape-Matching
Algorithm is able to obtain 3-D information from a pair of stereo images in real time. While this algorithm does have a
trade-off between accuracy and processing speed, its results show that the accuracy is sufficient for the practical use of recognizing human poses and tracking finger movement. By choosing an interval of disparity, an object
at a certain distance range can be segmented. In other words, we detect the object by its distance to the cameras. The
advantage of this profile shape-matching algorithm is that the detection of correspondences relies on the shape of the profile and not on intensity values, which are subject to lighting variations. Based on the resulting 3-D information, the
movement of fingers in space from a specific distance can be determined. Finger location and movement can then be
analyzed for non-contact control of hand-held devices.
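A minimal sketch of disparity-interval segmentation, assuming a disparity map is already available from the stereo matcher; the returned centroid is just a crude stand-in for the detected hand location.

```python
# Sketch: keep only pixels whose disparity falls in a chosen interval, which
# isolates objects (such as a hand) within a given distance band.
import numpy as np

def segment_by_disparity(disparity, d_min, d_max):
    mask = (disparity >= d_min) & (disparity <= d_max)
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return mask, None
    centroid = (xs.mean(), ys.mean())   # rough location of the segmented object
    return mask, centroid
```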
In this paper we propose to use gesture recognition approaches to track a human hand in 3D space and, without the use
of special clothing or markers, be able to accurately generate code for training an industrial robot to perform the same
motion. The proposed hand tracking component includes three methods to detect the human hand: a color-thresholding model, naïve Bayes analysis and a Support Vector Machine (SVM). Next, it performs stereo matching on the region
where the hand was detected to find relative 3D coordinates. The list of coordinates returned is expectedly noisy due to
the way the human hand can alter its apparent shape while moving, the inconsistencies in human motion and detection
failures in the cluttered environment. Therefore, the system analyzes the list of coordinates to determine a path for the
robot to move, by smoothing the data to reduce noise and looking for significant points used to determine the path the
robot will ultimately take. The proposed system was applied to pairs of videos recording the motion of a human hand in
a 'real' environment to move the end-effector of a SCARA robot along the same path as the hand of the person in the video. The correctness of the robot motion was determined by observers indicating that the motion of the robot appeared to match the motion in the video.
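The path post-processing could look roughly like the sketch below: a moving-average smoother followed by a simple deviation test that keeps only significant points; the window size and deviation threshold are placeholder values, not the system's tuned parameters.

```python
# Sketch: smooth noisy 3D hand positions and keep only points that bend the
# path, as waypoints for the robot trajectory.
import numpy as np

def smooth_path(points, window=5):
    """points: (N, 3) hand positions in time order."""
    kernel = np.ones(window) / window
    return np.column_stack([np.convolve(points[:, k], kernel, mode="valid")
                            for k in range(3)])

def significant_points(path, min_dev=0.01):
    keep = [0]
    for i in range(1, len(path) - 1):
        a, b, c = path[keep[-1]], path[i], path[-1]
        # Distance of b from the chord a-c; keep points that deviate enough.
        dev = np.linalg.norm(np.cross(c - a, a - b)) / (np.linalg.norm(c - a) + 1e-9)
        if dev > min_dev:
            keep.append(i)
    keep.append(len(path) - 1)
    return path[keep]
```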
When learning complicated movements by ourselves, we encounter problems such as self-rightness. Self-rightness results in a lack of detail and objectivity, and it may cause us to miss, or even distort, the essence of a motion. Thus, we sometimes fall into the habit of performing inappropriate motions. To solve these problems, or at least to alleviate them as much as possible, we have been developing mechanical man-machine interfaces to support learning of such motions as cultural gestures and sports forms. One of the promising interfaces is a wearable exoskeleton mechanical system. As a first attempt, we have made a prototype of a 2-link, 1-DOF rotational elbow joint interface for teaching extension-flexion operations of the forearm, and have found its potential for teaching the initiation and continuation of flexion motion of the elbow.
The estimation of human attention has recently been addressed in the context of human robot interaction. Today, joint
work spaces already exist and challenge cooperating systems to jointly focus on common objects, scenes and work
niches. With the advent of Google Glass and increasingly affordable wearable eye-tracking, monitoring of human attention will soon become ubiquitous. The presented work describes, for the first time, a method for the estimation of
human fixations in 3D environments that does not require any artificial landmarks in the field of view and enables
attention mapping in 3D models. It enables full 3D recovery of the human view frustum and the gaze pointer in a
previously acquired 3D model of the environment in real time. The study on the precision of this method reports a mean projection error of ≈1.1 cm and a mean angle error of ≈0.6° within the chosen 3D model; the precision does not fall below that of the eye-tracking instrument itself (≈1°). This innovative methodology will open new opportunities for joint attention
studies as well as for bringing new potential into automated processing for human factors technologies.
Neya Systems, LLC competed in the CANINE program sponsored by the U.S. Army Tank Automotive Research
Development and Engineering Center (TARDEC) which culminated in a competition held at Fort Benning as part of the
2012 Robotics Rodeo. As part of this program, we developed a robot with the capability to learn and recognize the
appearance of target objects, conduct an area search amid distractor objects and obstacles, and relocate the target object
in the same way that mine dogs and sentry dogs are used within military contexts for exploration and threat detection.
Neya teamed with the Robotics Institute at Carnegie Mellon University to develop vision-based solutions for
probabilistic target learning and recognition. In addition, we used a Mission Planning and Management System (MPMS)
to orchestrate complex search and retrieval tasks using a general set of modular autonomous services relating to robot
mobility, perception and grasping.
This paper presents the Mobile Intelligence Team's approach to addressing the CANINE outdoor ground robot
competition. The competition required developing a robot that provided retrieving capabilities similar to a dog, while
operating fully autonomously in unstructured environments. The vision team consisted of Mobile Intelligence, the
Georgia Institute of Technology, and Wayne State University. Important computer vision aspects of the project were the ability to quickly learn the distinguishing characteristics of novel objects, searching images for the object as the robot drove a search pattern, identifying people near the robot for safe operation, correctly identifying the object among distractors, and localizing the object for retrieval. The classifier used to identify the objects is discussed, including an analysis of its performance, and an overview of the entire system architecture is presented. A discussion of the robot's
performance in the competition will demonstrate the system’s successes in real-world testing.
The U.S. Army Tank Automotive Research, Development and Engineering Center (TARDEC) held
an autonomous robot competition called CANINE in June 2012. The goal of the competition was to
develop innovative and natural control methods for robots. This paper describes the winning
technology, including the vision system, the operator interaction, and the autonomous mobility. The
rules stated that only gestures or voice commands could be used for control. The robots would learn a
new object at the start of each phase, find the object after it was thrown into a field, and return the
object to the operator. Each of the six phases became more difficult, including clutter of the same
color or shape as the object, moving and stationary obstacles, and finding the operator who moved
from the starting location to a new location. The Robotic Research Team integrated techniques in
computer vision, speech recognition, object manipulation, and autonomous navigation. A multi-filter
computer vision solution reliably detected the objects while rejecting objects of similar color or
shape, even while the robot was in motion. A speech-based interface with short commands provided
close to natural communication of complicated commands from the operator to the robot. An
innovative gripper design allowed for efficient object pickup. A robust autonomous mobility and
navigation solution for ground robotic platforms provided fast and reliable obstacle avoidance and
course navigation. The research approach focused on winning the competition while remaining
cognizant and relevant to real world applications.
Autonomous robotic “fetch” operation, where a robot is shown a novel object and then asked to locate it in the field, retrieve it and bring it back to the human operator, is a challenging problem that is of interest to the military. The CANINE competition presented a forum for several research teams to tackle this challenge using state-of-the-art robotics technology. The SRI-UPenn team fielded a modified Segway RMP 200 robot with multiple cameras and lidars. We implemented a unique computer-vision-based approach for textureless colored object training and detection to robustly locate previously unseen objects out to 15 meters on moderately flat terrain. We integrated SRI’s state-of-the-art Visual Odometry for
GPS-denied localization on our robot platform. We also designed a unique scooping mechanism which allowed retrieval
of up to basketball-sized objects with a reciprocating four-bar linkage mechanism. Further, all software, including a novel target localization and exploration algorithm, was developed using ROS (Robot Operating System), which is open source
and well adopted by the robotics community. We present a description of the system, our key technical contributions and
experimental results.
As part of the TARDEC-funded CANINE (Cooperative Autonomous Navigation in a Networked Environment)
Program, iRobot developed LABRADOR (Learning Autonomous Behavior-based Robot for Adaptive Detection and
Object Retrieval). LABRADOR was based on the rugged, man-portable, iRobot PackBot unmanned ground vehicle
(UGV) equipped with an explosive ordnance disposal (EOD) manipulator arm and a custom gripper. For
LABRADOR, we developed a vision-based object learning and recognition system that combined a TLD (track-learn-detect)
filter based on object shape features with a color-histogram-based object detector. Our vision system was able to
learn in real-time to recognize objects presented to the robot. We also implemented a waypoint navigation system based
on fused GPS, IMU (inertial measurement unit), and odometry data. We used this navigation capability to implement
autonomous behaviors capable of searching a specified area using a variety of robust coverage strategies – including
outward spiral, random bounce, random waypoint, and perimeter following behaviors. While the full system was not
integrated in time to compete in the CANINE competition event, we developed useful perception, navigation, and
behavior capabilities that may be applied to future autonomous robot systems.
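As an illustration of one of the coverage strategies named above, the sketch below generates outward-spiral waypoints around a start position; the spacing and density parameters are arbitrary placeholders rather than values from the system.

```python
# Sketch: outward-spiral waypoints around a start position for an area search.
import math

def outward_spiral_waypoints(cx, cy, spacing=1.0, turns=5, pts_per_turn=12):
    waypoints = []
    for k in range(turns * pts_per_turn):
        theta = 2 * math.pi * k / pts_per_turn
        r = spacing * theta / (2 * math.pi)     # radius grows by `spacing` per turn
        waypoints.append((cx + r * math.cos(theta), cy + r * math.sin(theta)))
    return waypoints
```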
With the growing application of visual tracking technology, the performance of visual tracking algorithms becomes important. Because of the many kinds of noise involved, tracking algorithms tend to lack robustness. To improve the identification and tracking rates for quickly moving targets, expand the tracking range and lower the sensitivity to varying illumination, an active visual tracking system based on illumination invariants is proposed. A camera motion pre-control method based on particle-filter pre-location is used to improve the responsiveness and accuracy of tracking for quickly moving targets by forecasting the target position and controlling the camera's pan, tilt and zoom. A pre-location method using a particle filter driven by the illumination invariants of the target is used to reduce the effect of varying illumination while tracking a moving target and to improve the robustness of the algorithm. Experiments in an intelligent space show that the robustness to illumination variation is improved and the accuracy is improved by actively adjusting the PTZ parameters.
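A hedged sketch of one predict/update cycle of particle-filter pre-location, assuming an externally supplied function that returns an illumination-invariant descriptor at an image position; the random-walk motion model and Gaussian weighting are illustrative choices, not the authors' system.

```python
# Sketch: one predict/update/resample cycle of a particle filter over image
# position, weighted by similarity of illumination-invariant descriptors.
import numpy as np

def particle_filter_step(particles, weights, descriptor_at, target_desc,
                         motion_std=5.0):
    rng = np.random.default_rng()
    # Predict: diffuse particles with a simple random-walk motion model.
    particles = particles + rng.normal(0, motion_std, particles.shape)
    # Update: weight by similarity of the descriptor at each particle to the
    # illumination-invariant target descriptor.
    dists = np.array([np.linalg.norm(descriptor_at(p) - target_desc)
                      for p in particles])
    weights = weights * np.exp(-0.5 * (dists / (dists.std() + 1e-9)) ** 2)
    weights /= weights.sum()
    # Resample and return the predicted target position for camera pre-control.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    particles = particles[idx]
    uniform = np.full(len(particles), 1.0 / len(particles))
    return particles, uniform, particles.mean(axis=0)
```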
Linear Dimensionality Reduction (LDR) techniques have been increasingly important in computer vision and
pattern recognition since they permit a relatively simple mapping of data onto a lower dimensional subspace,
leading to simple and computationally efficient classification strategies. Recently, many linear discriminant methods
have been developed in order to reduce the dimensionality of visual data and to enhance the discrimination
between different groups or classes. Many existing linear embedding techniques rely on the use of local margins in order to achieve good discrimination performance. However, dealing with outliers and within-class diversity has not been addressed by margin-based embedding methods. In this paper, we explore the use of different
margin-based linear embedding methods. More precisely, we propose to use the concepts of Median miss and
Median hit for building robust margin-based criteria. Based on such margins, we seek the projection directions
(linear embedding) such that the sum of local margins is maximized. Our proposed approach has been applied
to the problem of appearance-based face recognition. Experiments performed on four public face databases show
that the proposed approach can give better generalization performance than the classic Average Neighborhood
Margin Maximization (ANMM). Moreover, thanks to the use of robust margins, the proposed method degrades gracefully when label outliers contaminate the training data set. In particular, we show that the concept
of Median hit was crucial in order to get robust performance in the presence of outliers.
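The robust margin could be written roughly as in the sketch below, where each sample's margin is the difference between its squared distance to the median of the other-class samples (Median miss) and to the median of its own class (Median hit); the exact neighborhood definition used in the paper is not reproduced here, and this is only an assumed reading of the criterion.

```python
# Sketch: robust local margins built from median-hit and median-miss
# prototypes; a margin-maximizing embedding would maximize their sum.
import numpy as np

def median_margins(X, y):
    margins = []
    for i, (x, label) in enumerate(zip(X, y)):
        hits = X[(y == label) & (np.arange(len(X)) != i)]
        misses = X[y != label]
        med_hit = np.median(hits, axis=0)     # robust same-class prototype
        med_miss = np.median(misses, axis=0)  # robust other-class prototype
        margins.append(np.sum((x - med_miss) ** 2) - np.sum((x - med_hit) ** 2))
    return np.array(margins)
```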
We consider equivalency models, including matrix-matrix and matrix-tensor models with dual adaptive-weighted correlation, of multi-port neural-network auto-associative and hetero-associative memory (MP NN AAM and HAP), which form the equivalency paradigm and the theoretical basis of our work. We give a brief overview of the possible implementations of the MP NN AAM and of their architectures, proposed and investigated earlier by us. The main base unit of such architectures is a matrix-matrix or matrix-tensor equivalentor. We show that MP NN AAM based on the equivalency paradigm and on optoelectronic architectures with space-time integration and parallel-serial 2D image processing have such advantages as increased memory capacity (more than ten times the number of neurons!), high performance in different modes (10^10-10^12 connections per second!) and the ability to process, store and associatively recognize highly correlated images. Next, we show that, with minor modifications, such MP NN AAM can be successfully used for high-performance parallel clustering of images. We present simulation results of using these modifications for clustering and learning models and algorithms for cluster analysis of specific image arrays and their division into categories. We show an example of the division of 32 images (40x32 pixels) of letters and graphics into 12 clusters, with simultaneous formation of the output weighted space of the images allocated to each cluster. We discuss algorithms for learning and self-learning in such structures, and comparative evaluations based on Mathcad simulations are made. It is shown that, unlike traditional Kohonen self-organizing maps, the learning time in the proposed structures of a multi-port neural-network classifier/clusterizer (MP NN C) based on the equivalency paradigm decreases, due to their multi-port nature, by orders of magnitude and can in some cases be just a few epochs. Estimates show that in the test clustering of 32 1280-element images into 12 groups, the formation of the neural connection matrix with a dimension of 128x120 elements takes tens of iterative steps (a few epochs) for a set of learning patterns consisting of 32 such images, and with a processing time of 1-10 microseconds per step the total learning time does not exceed a few milliseconds. We offer criteria for the quality evaluation of pattern clustering with such MP NN AAM.
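As one possible reading of the equivalency measure, the sketch below scores an input against stored cluster prototypes by a normalized component-wise equivalence and assigns the input to the best-scoring cluster; this is an interpretation for illustration, not the authors' optoelectronic implementation.

```python
# Sketch: equivalence-based cluster assignment, assuming image vectors are
# normalized to [0, 1].
import numpy as np

def equivalence(a, b):
    # "Equivalence" of two vectors: mean of 1 - |a - b| per component.
    return np.mean(1.0 - np.abs(a - b))

def assign_cluster(image_vec, prototypes):
    scores = [equivalence(image_vec, p) for p in prototypes]
    return int(np.argmax(scores)), max(scores)
```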