This PDF file contains the front matter associated with SPIE
Proceedings Volume 7708, including the Title Page, Copyright
information, Table of Contents, and the Conference Committee listing.
Camera phones are ubiquitous, and consumers have been adopting them faster than any other technology in modern
history. When connected to a network, though, they are capable of more than just picture taking: Suddenly, they gain
access to the power of the cloud. We exploit this capability by providing a series of image-based personal advisory
services. These are designed to work with any handset over any cellular carrier using commonly available Multimedia
Messaging Service (MMS) and Short Message Service (SMS) features. Targeted at the unsophisticated consumer, these
applications must be quick and easy to use, not requiring download capabilities or preplanning. Thus, all application
processing occurs in the back-end system (i.e., as a cloud service) and not on the handset itself. Presenting an image to
an advisory service in the cloud, a user receives information that can be acted upon immediately. Two of our examples
involve color assessment (selecting cosmetics and home décor paint palettes); the third provides the ability to extract text from a scene. In the case of the color imaging applications, we have shown that our service rivals the advice quality of experts. The result of this capability is a new paradigm for mobile interactions: image-based information services exploiting the ubiquity of camera phones.
Speaker recognition plays a very important role in the field of biometric security. In order to improve recognition performance, many pattern recognition techniques have been explored in the literature. Among these techniques, the Gaussian Mixture Model (GMM) has proven to be an effective statistical model for speaker recognition and is used in most state-of-the-art speaker recognition systems. The GMM represents the 'voice print' of a speaker by modeling the spectral characteristics of the speaker's speech signals. In this paper, we implement a speaker recognition system consisting of preprocessing, Mel-Frequency Cepstrum Coefficient (MFCC) based feature extraction, and GMM based classification. We test our system with the TIDIGITS data set (325 speakers) and our own recordings of more than 200 speakers; our system achieves a 100% correct recognition rate. Moreover, we also test our system in the scenario where training samples are from one language but test samples are from a different language; our system again achieves a 100% correct recognition rate, which indicates that it is language independent.
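To make the pipeline concrete, here is a minimal sketch of an MFCC + GMM speaker-identification loop. The library choices (librosa for MFCC extraction, scikit-learn for the GMM) and all parameter values are illustrative assumptions, not details taken from the paper.

```python
# Minimal MFCC + GMM speaker-identification sketch (library and parameter
# choices are illustrative assumptions, not the authors' implementation).
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_features(wav_path, n_mfcc=13):
    """Return MFCC frames for one utterance, shape (frames, n_mfcc)."""
    y, sr = librosa.load(wav_path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

def enroll(speaker_wavs, n_components=16):
    """Fit one GMM 'voice print' per speaker from that speaker's recordings."""
    models = {}
    for speaker, paths in speaker_wavs.items():
        feats = np.vstack([mfcc_features(p) for p in paths])
        models[speaker] = GaussianMixture(n_components=n_components,
                                          covariance_type="diag",
                                          max_iter=200).fit(feats)
    return models

def identify(models, wav_path):
    """Pick the enrolled speaker whose GMM gives the highest log-likelihood."""
    feats = mfcc_features(wav_path)
    return max(models, key=lambda s: models[s].score(feats))
```

Identification then reduces to a maximum-likelihood decision over the enrolled speaker models, which is the standard GMM formulation referred to above.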
The vein structure in the sclera is stable over time, unique to each person, and well suited for human
identification. A few researchers have performed sclera vein pattern recognition and reported promising initial
results. Sclera recognition poses several challenges: the vein structure moves and deforms with the
movement of the eye; images of sclera patterns are often defocused and/or saturated; and, most importantly,
the vein structure in the sclera is multi-layered and has complex non-linear deformation. In this paper, we propose a new method for sclera recognition: First, we develop a color-based sclera region estimation scheme for sclera segmentation. Second, we design a Gabor wavelet-based sclera pattern enhancement method and an adaptive thresholding method to emphasize and binarize the sclera vein patterns. Third, we propose a line-descriptor-based feature extraction, registration, and matching method that is illumination-, scale-, orientation-, and deformation-invariant, can mitigate the multi-layered deformation effects exhibited in the sclera, and can tolerate segmentation error. It is empirically verified using the UBIRIS database that the proposed method performs accurate sclera recognition.
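As a rough illustration of the second step, the sketch below enhances a grayscale sclera region with a small bank of Gabor filters and then binarizes the result with adaptive thresholding (OpenCV). Kernel sizes, orientations, and threshold settings are assumed values, not the parameters used in the paper.

```python
# Illustrative Gabor-based vein enhancement followed by adaptive thresholding
# (OpenCV; all filter parameters below are assumptions, not from the paper).
import cv2
import numpy as np

def enhance_and_binarize(sclera_gray):
    # Accumulate Gabor responses over several orientations to emphasize veins.
    response = np.zeros_like(sclera_gray, dtype=np.float32)
    for theta in np.arange(0, np.pi, np.pi / 8):
        # (ksize, sigma, theta, lambda, gamma, psi)
        kernel = cv2.getGaborKernel((21, 21), 4.0, theta, 10.0, 0.5, 0)
        filtered = cv2.filter2D(sclera_gray.astype(np.float32),
                                cv2.CV_32F, kernel)
        response = np.maximum(response, filtered)

    # Normalize to 8 bits and binarize with a locally adaptive threshold.
    norm = cv2.normalize(response, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    binary = cv2.adaptiveThreshold(norm, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY, 25, -2)
    return binary
```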
Multimodal biometrics use more than one means of biometric identification to achieve higher recognition accuracy, since a unimodal biometric is sometimes not sufficient for identification and classification. In this paper, we propose a multimodal eye recognition system that can obtain both iris and sclera patterns from one color eye image. Gabor filter and 1-D Log-Gabor filter algorithms are applied as the iris recognition algorithms. For sclera recognition, we introduce automatic sclera segmentation, sclera pattern enhancement, sclera pattern template generation, and sclera pattern matching. We apply kernel-based matching score fusion to improve the performance of the eye recognition system. The experimental results show that the proposed eye recognition method achieves better performance than unimodal biometric identification, and that the accuracy of our kernel-based matching score fusion method is higher than that of two classic linear matching score fusion methods: Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA).
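The abstract does not spell out the kernel-based fusion rule. One common reading, sketched below, treats each (iris score, sclera score) pair as a two-dimensional point and trains an RBF-kernel SVM to separate genuine from impostor comparisons; this is a hedged interpretation, not necessarily the authors' exact formulation.

```python
# One possible reading of "kernel-based matching score fusion": classify
# (iris_score, sclera_score) pairs with an RBF-kernel SVM (scikit-learn;
# an assumption, not necessarily the authors' exact method).
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def train_score_fusion(iris_scores, sclera_scores, labels):
    """labels: 1 for genuine comparisons, 0 for impostor comparisons."""
    X = np.column_stack([iris_scores, sclera_scores])
    model = make_pipeline(StandardScaler(),
                          SVC(kernel="rbf", probability=True))
    return model.fit(X, labels)

def fused_score(model, iris_score, sclera_score):
    """Return the fused probability that the comparison is genuine."""
    return model.predict_proba([[iris_score, sclera_score]])[0, 1]
```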
This paper introduces a human action recognition framework based on multiple types of features. Taking advantage of the motion-selectivity property of the 3D dual-tree complex wavelet transform (3D DT-CWT) and the affine SIFT local image detector, spatio-temporal and local static features are first extracted. No assumptions are made about scene background, location, objects of interest, or point of view. Bidirectional two-dimensional PCA (2D-PCA) is employed for dimensionality reduction, which helps preserve the structure and correlation among neighboring pixels of a video frame. The proposed technique is significantly faster than traditional methods due to volumetric processing of the input video, and offers a rich representation of human actions with fewer artifacts. Experimental examples are given to illustrate the effectiveness of the approach.
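For the dimensionality-reduction step, a minimal bidirectional 2D-PCA sketch is shown below: each frame A is projected as B = Zᵀ A X, where X and Z hold the leading eigenvectors of the column- and row-direction image covariance matrices. This is a simplified stand-in under assumed parameter choices, not the paper's implementation.

```python
# Minimal bidirectional 2D-PCA sketch: project each frame A as B = Z^T A X,
# with X and Z from the column- and row-direction covariance matrices.
import numpy as np

def bidirectional_2dpca(frames, k_cols=10, k_rows=10):
    """frames: array of shape (N, H, W). Returns (Z, X) projection matrices."""
    frames = np.asarray(frames, dtype=np.float64)
    centered = frames - frames.mean(axis=0)

    # Column-direction covariance: G = (1/N) sum_i (A_i - mean)^T (A_i - mean)
    G_col = np.einsum('nhw,nhv->wv', centered, centered) / len(frames)
    # Row-direction covariance:    H = (1/N) sum_i (A_i - mean)(A_i - mean)^T
    G_row = np.einsum('nhw,nvw->hv', centered, centered) / len(frames)

    # Leading eigenvectors (np.linalg.eigh returns ascending eigenvalues).
    _, vec_col = np.linalg.eigh(G_col)
    _, vec_row = np.linalg.eigh(G_row)
    X = vec_col[:, -k_cols:]           # (W, k_cols)
    Z = vec_row[:, -k_rows:]           # (H, k_rows)
    return Z, X

def project(frame, Z, X):
    """Reduced representation B = Z^T A X with shape (k_rows, k_cols)."""
    return Z.T @ frame @ X
```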
Cancelable approaches for biometric person authentication have been studied to protect enrolled biometric data, and several algorithms have been proposed. One drawback of cancelable approaches is that their performance is inferior to that of non-cancelable approaches. As one solution, we previously proposed a scheme to enhance the performance of a cancelable approach for online signature verification by combining scores calculated from two transformed datasets generated using two keys. Generally, cancelable approaches use the same verification algorithm for transformed data as for raw (non-transformed) data, and, in our previous work, a verification system developed for a non-transformed dataset was used to calculate the scores from transformed data. In this paper, we modify the verification system by using transformed data for training. Several experiments were performed using public databases, and the results show that this modification improves performance. Our cancelable system combines two scores to make a decision; several fusion strategies are also considered, and the experimental results are reported here.
The iris is a stable and reliable biometric for positive human identification. However, the traditional iris recognition scheme raises several privacy concerns. A person's iris pattern is permanently bound to them and cannot be changed. Hence, once it is stolen, the biometric is lost forever, along with all the applications in which it is used. Thus, new methods are needed to secure the original pattern, ensure its revocability, and provide alternatives when it is compromised. In this paper, we propose a novel scheme that incorporates iris features, a non-invertible transformation, and data encryption to achieve cancelability and at the same time increase iris recognition accuracy.
In this paper, the automatic lip reading problem is investigated, and an innovative approach to this problem is proposed. This new visual speech recognition (VSR) approach depends on the signature of the word itself, which is obtained from a hybrid feature extraction method combining geometric, appearance, and image-transform features. The proposed VSR approach is termed "visual words".
The visual words approach consists of two main parts: 1) feature extraction/selection and 2) visual speech feature recognition. After localizing the face and lips, several visual features of the lips are extracted, such as the height and width of the mouth; the mutual information and quality measurement between the DWT of the current ROI and the DWT of the previous ROI; the ratio of vertical to horizontal features taken from the DWT of the ROI; the ratio of vertical to horizontal edges of the ROI; the appearance of the tongue; and the appearance of the teeth. Each spoken word is represented by 8 signals, one for each feature. These signals capture the dynamics of the spoken word, which carry a substantial amount of information. The system is then trained on these features using KNN and DTW.
This approach has been evaluated using a large database of different speakers and large experiment sets. The evaluation has demonstrated the efficiency of the visual words approach and shown that VSR is a speaker-dependent problem.
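A minimal sketch of the matching stage is given below: a classical dynamic time warping (DTW) distance between two 1-D feature signals, used inside a nearest-neighbour (KNN-style) decision over stored word templates. The absolute-difference local cost and all names are illustrative assumptions.

```python
# Classical DTW distance between two 1-D feature signals, plus a simple
# nearest-neighbour word decision over stored templates (illustrative sketch).
import numpy as np

def dtw_distance(a, b):
    """O(len(a)*len(b)) DTW with absolute-difference local cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def nearest_word(query_signal, templates):
    """templates: dict mapping word label -> list of stored 1-D signals."""
    best_word, best_dist = None, np.inf
    for word, signals in templates.items():
        for ref in signals:
            d = dtw_distance(query_signal, ref)
            if d < best_dist:
                best_word, best_dist = word, d
    return best_word
```

In practice each of the 8 feature signals would be compared this way and the per-feature distances combined before the KNN decision.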
Key management is one of the most important issues in cryptographic systems. Important challenges in this context include secure and efficient key generation, key distribution, and key revocation. Addressing these challenges requires a comprehensive solution that is robust, secure, and efficient. Compared to traditional key management schemes, key management using biometrics requires the presence of the user, which can reduce fraud and better protect the key. In this paper, we propose a novel key management scheme using iris-based biometrics. Our newly proposed scheme outperforms traditional key management schemes as well as some existing key-binding biometric schemes in terms of security, diversity, and/or efficiency.
Fingerprint recognition systems have been widely used by financial institutions, law enforcement, border control, and visa issuing, to mention just a few applications. Biometric identifiers can be counterfeited, but they are considered more reliable and secure than traditional ID cards or personal passwords. Fingerprint pattern fusion improves the performance of a fingerprint recognition system in terms of accuracy and security. This paper presents digital enhancement and fusion approaches that improve the performance of the fingerprint recognition system. It is a two-step approach: in the first step, raw fingerprint images are enhanced using high-frequency-emphasis filtering (HFEF); the second step is a simple linear fusion between the raw images and the HFEF ones. It is shown that the proposed approach improves the verification and identification performance of the fingerprint biometric recognition system, with the improvement quantified using the correlation performance metrics of the matching algorithm.
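As a sketch of the two steps, the code below applies a Gaussian high-frequency-emphasis filter in the Fourier domain and then blends the result linearly with the raw image. The filter form and the constants a, b, d0, and alpha are assumed for illustration rather than taken from the paper.

```python
# Sketch of high-frequency-emphasis filtering (HFEF) followed by a simple
# linear fusion with the raw fingerprint image (constants are assumptions).
import numpy as np

def hfef(image, d0=30.0, a=0.5, b=1.5):
    """High-frequency emphasis: H(u,v) = a + b * H_highpass(u,v)."""
    img = image.astype(np.float64)
    rows, cols = img.shape
    u = np.arange(rows) - rows / 2
    v = np.arange(cols) - cols / 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2        # squared distance from centre
    H_hp = 1.0 - np.exp(-D2 / (2.0 * d0 ** 2))    # Gaussian high-pass
    H = a + b * H_hp

    F = np.fft.fftshift(np.fft.fft2(img))
    out = np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))
    return np.clip(out, 0, 255)

def linear_fusion(raw, enhanced, alpha=0.5):
    """Pixel-wise linear blend of the raw and HFEF fingerprint images."""
    return np.clip(alpha * raw.astype(np.float64)
                   + (1.0 - alpha) * enhanced, 0, 255).astype(np.uint8)
```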
The use of mobile communication devices with advanced sensors is growing rapidly. These sensors enable functions such as image capture, location applications, and biometric authentication such as fingerprint verification and face and handwritten signature recognition. Such ubiquitous devices are essential tools in today's global economic activities, enabling anywhere-anytime financial and business transactions. Cryptographic functions and biometric-based authentication can enhance the security and confidentiality of mobile transactions.
Biometric template security techniques for real-time biometric-based authentication are key factors for successful identity verification solutions, but they are vulnerable to determined attacks by both fraudulent software and hardware. The EU-funded SecurePhone project has designed and implemented a multimodal biometric user authentication system on a prototype mobile communication device. However, various implementations of this project have resulted in long verification times or reduced accuracy and/or security.
This paper proposes using built-in self-test techniques to ensure that no tampering has taken place in the verification process before the actual biometric authentication is performed. These techniques use the user's personal identification number as a seed to generate a unique signature, which is then used to test the integrity of the verification process. This study also proposes using a combination of biometric modalities to provide application-specific authentication in a secure environment, thus achieving an optimal security level with effective processing time; that is, ensuring that the necessary authentication steps and algorithms running on the mobile device's application processor cannot be undermined or modified by an impostor to gain unauthorized access to the secure system.
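The following is a minimal illustration, not the SecurePhone implementation, of the PIN-seeded integrity idea: a key derived from the user's PIN is used to compute a keyed digest over the verification module, which is compared with a stored reference tag before biometric verification is allowed to run. All function and file names are hypothetical.

```python
# Hypothetical PIN-seeded built-in self-test sketch (standard library only):
# derive a key from the PIN, HMAC the verification module, compare with a
# stored reference tag before allowing biometric authentication to proceed.
import hashlib
import hmac

def derive_key(pin: str, salt: bytes = b"bist-salt") -> bytes:
    # PBKDF2 stretches the short PIN into a usable key (parameters assumed).
    return hashlib.pbkdf2_hmac("sha256", pin.encode(), salt, 100_000)

def signature_of(path: str, key: bytes) -> bytes:
    with open(path, "rb") as f:
        return hmac.new(key, f.read(), hashlib.sha256).digest()

def verification_process_untampered(pin: str, module_path: str,
                                    expected_tag: bytes) -> bool:
    key = derive_key(pin)
    return hmac.compare_digest(signature_of(module_path, key), expected_tag)

# Only if verification_process_untampered(...) returns True would the actual
# biometric authentication be executed.
```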
In computer vision applications, image matching performed on quality-degraded imagery is difficult due to image
content distortion and noise effects. State-of-the-art keypoint-based matchers, such as SURF and SIFT, work very well
on clean imagery. However, performance can degrade significantly in the presence of high noise and clutter levels.
Noise and clutter cause the formation of false features which can degrade recognition performance. To address this
problem, previously we developed an extension to the classical amplitude and phase correlation forms, which provides
improved robustness and tolerance to image geometric misalignments and noise. This extension, called Alpha-Rooted
Phase Correlation (ARPC), combines Fourier domain-based alpha-rooting enhancement with classical phase correlation.
ARPC provides tunable parameters to control the alpha-rooting enhancement. These parameter values can be optimized
to trade off between high, narrow correlation peaks and more robust, wider but smaller peaks. Previously, we applied ARPC in the Radon transform domain for logo image recognition in the presence of rotational image misalignments. In
this paper, we extend ARPC to incorporate quaternion Fourier transforms, thereby creating Alpha-Rooted Quaternion
Phase Correlation (ARQPC). We apply ARQPC to the logo image recognition problem. We use ARQPC to perform
multiple-reference logo template matching by representing multiple same-class reference templates as quaternion-valued
images. We generate recognition performance results on publicly-available logo imagery, and compare recognition
results to those generated from standard approaches. We show that small deviations in reference templates of same-class logos can lead to improved recognition performance using the joint matching inherent in ARQPC.
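For reference, a sketch of ARPC in its basic (non-quaternion) form is shown below: classical phase correlation with the cross-power magnitude raised to a tunable exponent alpha, which controls the trade-off between peak sharpness and robustness. The exact parameterisation here is an assumption, not the published formulation.

```python
# Sketch of Alpha-Rooted Phase Correlation (ARPC) in its basic form: the
# cross-power spectrum magnitude is raised to a tunable exponent alpha while
# the phase is kept (alpha = 0 gives pure phase correlation, alpha = 1 gives
# plain correlation). Parameterisation is an assumption.
import numpy as np

def arpc_surface(reference, target, alpha=0.7, eps=1e-12):
    """Return the correlation surface; its peak location gives the shift."""
    F1 = np.fft.fft2(reference.astype(np.float64))
    F2 = np.fft.fft2(target.astype(np.float64))
    cross = F1 * np.conj(F2)
    mag = np.abs(cross) + eps
    weighted = (mag ** alpha) * (cross / mag)   # alpha-rooted magnitude, same phase
    return np.real(np.fft.ifft2(weighted))

def match_score(reference, target, alpha=0.7):
    surface = arpc_surface(reference, target, alpha)
    return surface.max() / (np.linalg.norm(surface) + 1e-12)
```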
Baggage scanning systems are used for detecting the presence of explosives and other prohibited items in baggage at
security checkpoints in airports. However, the CT baggage images contain projection noise and are of low resolution.
This paper introduces a new enhancement algorithm combining alpha-weighted mean separation and histogram
equalization to enhance the CT baggage images while removing the background projection noise. A new enhancement
measure is introduced for quantitative assessment of image enhancement. Simulations and a comparative analysis are
given to demonstrate the new algorithm's performance.
The manner of holding a pen is distinctive among people. Therefore, pen holding style is useful for person
authentication. In this paper, we propose a biometric person authentication method using features extracted
from images of pen holding style. Images of the pen holding style are captured by a camera, and several features are extracted from the captured images. These features are compared with a reference dataset to calculate dissimilarity scores, and these scores are combined for verification using a three-layer perceptron. Preliminary experiments were performed by using a private database. The proposed system yielded an equal error rate (EER) of 2.6%.
Computing the similarity between videos is a challenging problem in many applications of video processing such as
video retrieval and video-based action recognition. The main difficulty in evaluating similarity between videos is the
lack of an effective distance measure. In this paper, we propose a clustering-based distance measure to solve this problem. In our approach, a video sequence is represented by a set of spatiotemporal descriptors. Therefore, computing
the similarity between two video sequences can be achieved by estimating the distance between two sets of descriptors
extracted from the videos. To compute the distance measure, we use clustering to obtain distributions and "constrained
distributions" of the two sets of descriptors, and use Kullback-Leibler (KL) divergence [6] with the distributions. We
apply our distance measure to the problems of distinguishing different human actions and content-based video retrieval.
Our experimental results show that the proposed distance measure is an effective metric for these applications.
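One concrete way to realise such a clustering-based distance is sketched below under assumed choices (k-means clustering and a symmetrised KL divergence over cluster-occupancy histograms); it is a stand-in for the construction described above, not the authors' exact algorithm.

```python
# Clustering-based video distance sketch: cluster the pooled descriptors,
# turn each video into a cluster-occupancy distribution, and compare the two
# distributions with a symmetrised KL divergence (assumed realisation).
import numpy as np
from sklearn.cluster import KMeans

def kl(p, q, eps=1e-10):
    p = p + eps
    q = q + eps
    return float(np.sum(p * np.log(p / q)))

def video_distance(desc_a, desc_b, k=32):
    """desc_a, desc_b: arrays of shape (n_i, d) of spatiotemporal descriptors."""
    pooled = np.vstack([desc_a, desc_b])
    km = KMeans(n_clusters=k, n_init=10).fit(pooled)

    hist_a = np.bincount(km.predict(desc_a), minlength=k).astype(float)
    hist_b = np.bincount(km.predict(desc_b), minlength=k).astype(float)
    hist_a /= hist_a.sum()
    hist_b /= hist_b.sum()

    # Symmetrised KL divergence between the two occupancy distributions.
    return 0.5 * (kl(hist_a, hist_b) + kl(hist_b, hist_a))
```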
Noise is generally considered a degradation of image quality. Moreover, image quality is often judged by the appearance and clarity of image edges. The performance of most applications is affected by image quality and by the level of different types of degradation. In general, measuring image quality and identifying the type of noise or degradation is a key factor in raising application performance, yet this task can be very challenging. The wavelet transform is nowadays widely used in many applications, most of which benefit from the wavelet's localisation in the frequency domain. The coefficients of the high-frequency sub-bands in the wavelet domain are well represented by a Laplace histogram. In this paper, we propose to use the Laplace distribution histogram to measure image quality and also to identify the type of degradation affecting a given image.
Image quality and the level of degradation are usually measured using a reference image of reasonable quality. The Laplace distribution histogram discussed here instead provides a self-testing measurement of image quality. This measurement is based on constructing the theoretical Laplace distribution histogram of the high-frequency wavelet sub-band from its actual standard deviation and comparing it with the empirical Laplace distribution histogram. The comparison is performed using the histogram intersection method. All experiments are performed using the extended Yale database.
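The sketch below illustrates the self-testing measure: fit a zero-mean Laplace model to a high-frequency wavelet sub-band using its standard deviation, then compare the theoretical and empirical coefficient histograms with histogram intersection. The wavelet, decomposition level, and bin count are assumptions, and PyWavelets/SciPy are used purely for convenience.

```python
# Self-referential Laplace-histogram quality sketch (wavelet and bin count
# are assumed parameters, not the paper's settings).
import numpy as np
import pywt
from scipy.stats import laplace

def laplace_quality(image_gray, wavelet="db2", bins=101):
    # Diagonal (HH) detail coefficients of a one-level wavelet decomposition.
    _, (_, _, hh) = pywt.dwt2(image_gray.astype(np.float64), wavelet)
    coeffs = hh.ravel()

    # Empirical histogram (as probabilities).
    edges = np.linspace(coeffs.min(), coeffs.max(), bins + 1)
    empirical, _ = np.histogram(coeffs, bins=edges)
    empirical = empirical / empirical.sum()

    # Theoretical Laplace histogram with scale b = std / sqrt(2).
    b = coeffs.std() / np.sqrt(2.0)
    theoretical = np.diff(laplace.cdf(edges, loc=0.0, scale=b))
    theoretical = theoretical / theoretical.sum()

    # Histogram intersection: 1.0 means a perfect match to the Laplace model.
    return float(np.minimum(empirical, theoretical).sum())
```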
We describe a method for searching videos in large video databases based on the activity contents present in the videos.
Being able to search videos based on the contents (such as human activities) has many applications such as security,
surveillance, and other commercial applications such as on-line video search. Conventional video content-based retrieval
(CBR) systems are either feature based or semantics based, with the former trying to model the dynamic video content using the statistics of image features, and the latter relying on automated scene understanding of the video content.
Neither approach has been successful. Our approach is inspired by the success of visual vocabulary of "Video Google"
by Sivic and Zisserman, and the work of Nister and Stewenius who showed that building a visual vocabulary tree can
improve performance in both scalability and retrieval accuracy for 2-D images. We apply the visual vocabulary and vocabulary tree approach to spatio-temporal video descriptors for video indexing, taking advantage of the discrimination power of these descriptors as well as the scalability of the vocabulary tree for indexing. Furthermore, this
approach does not rely on any model-based activity recognition. In fact, training of the vocabulary tree is done off-line
using unlabeled data with unsupervised learning; the approach is therefore widely applicable. Experimental results on standard human activity recognition videos are presented that demonstrate the feasibility of this approach.
The high intra-class variability of acquired biometric data can be attributed to several factors, such as the quality of the acquisition sensor (e.g. thermal), environmental conditions (e.g. lighting), and behavioural factors (e.g. changes in face pose). Such large variability in biometric data can cause a big difference between acquired and stored biometric data, which eventually leads to reduced performance. Many systems store multiple templates during the enrolment stage in order to account for such variations in the biometric data. The number and typicality of these templates affect system performance more than any other factor. In this paper, a novel offline approach is proposed for systematically modelling intra-class variability and typicality in biometric data by regularly selecting new templates from a set of available biometric images. Our proposed technique is a two-stage algorithm: in the first stage, image samples are clustered in terms of their image quality profile vectors, rather than their biometric feature vectors, and in the second stage a per-cluster template is selected from a small number of samples in each cluster to create the final template set. Experiments conducted on five face image databases demonstrate the effectiveness of the proposed quality-guided approach.
Digital steganography has been used extensively for electronic copyright stamping, but also for criminal or covert activities. While a variety of techniques exist for detecting steganography, the identification of semagrams (messages transmitted visually in a non-textual format) remains elusive. The work presented here describes the creation of a novel application which uses hierarchical neural network architectures to detect the likely presence of a semagram message in an image. The application was used to detect semagrams containing Morse code messages with over 80% accuracy. These preliminary results indicate a significant advance in the detection of complex semagram patterns.
In this paper, a robust watermarking technique based on the fractional cosine transform and singular value decomposition is presented to improve the protection of images. A meaningful grayscale image is used as the watermark instead of a randomly generated Gaussian-noise-type watermark. First, the host image is transformed by means of the fractional cosine transform. The positions of all frequency coefficients are then changed according to a rule that is secret and known only to the owner/creator. The inverse fractional cosine transform is then performed to obtain the reference image. The watermark logo is embedded in the reference image by modifying its singular values: the singular values of the reference image are computed and then modified by adding the singular values of the watermark image. A reliable watermark extraction algorithm is developed for extracting the watermark from a possibly attacked image. The experimental results show good visual imperceptibility and resiliency of the proposed scheme against a variety of intentional and unintentional attacks.
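A simplified sketch of the SVD embedding step is shown below; it omits the fractional cosine transform scrambling stage and uses an assumed scaling factor alpha, so it is a stand-in for that part of the scheme rather than the full method.

```python
# Simplified SVD watermark embedding sketch: the reference image's singular
# values are modified by adding a scaled copy of the watermark's singular
# values (alpha and the omission of the scrambling stage are simplifications).
import numpy as np

def embed_svd_watermark(reference, watermark, alpha=0.05):
    """reference, watermark: 2-D float arrays of the same shape."""
    U, S_ref, Vt = np.linalg.svd(reference, full_matrices=False)
    S_wm = np.linalg.svd(watermark, compute_uv=False)

    S_marked = S_ref + alpha * S_wm           # modify singular values
    watermarked = U @ np.diag(S_marked) @ Vt  # rebuild the marked image
    return watermarked, S_ref                 # keep S_ref for extraction

def extract_singular_values(attacked, S_ref, alpha=0.05):
    """Recover an estimate of the watermark's singular values."""
    S_att = np.linalg.svd(attacked, compute_uv=False)
    return (S_att - S_ref) / alpha
```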
This paper introduces a new, effective, and lossless image encryption algorithm that uses a Sudoku matrix to scramble and encrypt the image. The new algorithm encrypts an image through a three-stage process. In the first stage, a reference Sudoku matrix is generated as the foundation for the encryption and scrambling processes. The image pixels' intensities are then changed using the reference Sudoku matrix values, and finally the pixels' positions are shuffled using the Sudoku matrix as a mapping process. An advantage of this method is that it can efficiently encrypt a variety of digital images, such as binary, grayscale, and RGB images, without any quality loss. The security keys of the presented algorithm are the combination of the parameters of a 1D chaotic logistic map, a parameter controlling the size of the Sudoku matrix, and the number of scrambling iterations. The possible security key space is extremely large. The principles of the presented scheme could be applied to provide security for a variety of systems, including image, audio, and video systems.
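The sketch below illustrates only the chaotic-key component of such a scheme: a 1D logistic map, seeded by secret parameters (x0, r), drives both a pixel-value substitution and a position shuffle. The Sudoku matrix construction itself is omitted, so this is a simplified, lossless stand-in rather than the presented algorithm.

```python
# Logistic-map-driven substitution and permutation sketch (the Sudoku-matrix
# stage of the presented algorithm is deliberately omitted here).
import numpy as np

def logistic_sequence(x0, r, n, burn_in=1000):
    """Iterate x <- r*x*(1-x); discard a burn-in to decorrelate from the seed."""
    x = x0
    seq = np.empty(n)
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    for i in range(n):
        x = r * x * (1.0 - x)
        seq[i] = x
    return seq

def encrypt(image, x0=0.3456, r=3.99):
    flat = image.reshape(-1).astype(np.uint16)
    chaos = logistic_sequence(x0, r, flat.size)
    substituted = (flat + (chaos * 256).astype(np.uint16)) % 256  # value change
    perm = np.argsort(chaos)                                      # position shuffle
    return substituted[perm].reshape(image.shape).astype(np.uint8)

def decrypt(cipher, x0=0.3456, r=3.99):
    flat = cipher.reshape(-1).astype(np.uint16)
    chaos = logistic_sequence(x0, r, flat.size)
    perm = np.argsort(chaos)
    unshuffled = np.empty_like(flat)
    unshuffled[perm] = flat                                       # undo shuffle
    offset = (chaos * 256).astype(np.uint16)
    return ((unshuffled - offset) % 256).reshape(cipher.shape).astype(np.uint8)
```

The round trip is exact (lossless), and the key here is just the pair (x0, r), mirroring the logistic-map portion of the key described above.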
This paper presents a new Region-of-Interest (ROI) based bit-allocation scheme for advanced video coding
(H.264). We use a group-of-picture (GOP) structure to balance bit consumption between Intra and Inter
frames in one GOP, and simultaneously adjust the quantization parameter (QP) for each macroblock, according
to the region-of-interest on Intra(I) or Inter(P) frames, respectively. Our algorithm is implemented on the
H.264/AVC codec JM14.0 and is fully compatible with existing H.264/AVC decoders. The experimental results
show that compared to the standard JM14.0 encoder, under the same bitrate budget, the proposed algorithm
achieves a subjective visual quality improvement as well as an objective PSNR gain. The ROI-based bit allocation scheme shows potential for low-bitrate and real-time video applications, such as video conferencing and mobile TV broadcasting.
Recently, Compressive Sensing (CS) has emerged as a more efficient sampling method for sparse signals. Compared to the traditional Nyquist-Shannon sampling theory, CS provides a great reduction in sampling rate, power consumption, and
computational complexity to acquire and represent sparse signals. In this paper, we propose a new block based image/video
compression scheme, which uses CS to improve coding efficiency. In the traditional lossy coding schemes, such as JPEG
and H.264, the dominant coding error comes from scalar quantization. The CS recovery procedure can help mitigate the
quantization error in the decoding process. We use rate distortion optimization (RDO) for mode selection (MS) between
the traditional inverse DCT transform and projection onto convex sets (POCS) algorithm. In our experiment, the new image
compression method is able to achieve up to 1 dB gain over standard JPEG.
In this paper, we propose a novel compound video compression method for real-time applications of computer
screen video transmission. The compound video is a very special kind of video, which usually has less motion
and contains a mixture of text, graphics, and natural pictures in each frame. A variety of algorithms have been proposed for compound image compression, which try to remove the spatial redundancy within a single image. However, few works have addressed the problem of compound video compression. Therefore, we review the previous work on compound image compression and discuss how to extend the existing algorithms to compound video compression. We propose a new video compression framework based on H.264. In order to improve the visual performance and network efficiency, we propose an adaptive quantization algorithm and a rate control algorithm for compound video compression. The experimental results show that the new compression method is able to compress compound video in real time with relatively low computational complexity, and that, with motion compensation, the visual quality of the compound video is much better than with simple compound image compression.
The control roots of latency information theory (LIT) are reviewed in this first paper of a three-paper series.
LIT is the universal guidance theory for efficient system designs that has inherently surfaced from the confluence of five
ideas. They are: 1) The source entropy and channel capacity performance bounds of Shannon's mathematical theory of
communication; 2) The latency time (LT) certainty of Einstein's relativity theory; 3) The information space (IS)
uncertainty of Heisenberg's quantum physics; 4) The black hole Hawking radiation and its Boltzmann thermodynamics
entropy S in SI J/K; and 5) The author's 1978 conjecture of a structural-physical LT-certainty/IS-uncertainty duality for
stochastic control. LIT is characterized by a four quadrants revolution with two mathematical-intelligence quadrants and
two physical-life ones. Each quadrant of LIT is assumed to be physically independent of the others and guides its
designs with an entropy if it is IS-uncertain and an ectropy if it is LT-certain. While LIT's physical-life quadrants I and
III address the efficient use of life time by physical signal movers and of life space by physical signal retainers,
respectively, its mathematical-intelligence quadrants II and IV address the efficient use of intelligence space by
mathematical signal sources and of processing time by mathematical signal processors, respectively. The theoretical and
practical relevance of LIT has already been demonstrated using real-world adaptive radar, physics and biochemistry
applications. It is the objective of this paper to demonstrate that the structural dualities that are exhibited by the four
quadrants of LIT are similar to those that were earlier identified by the author for the practical solution of stochastic
control problems. More specifically, his 1978 conjecture of a structural-physical LT-certainty/IS-uncertainty duality between bit detection communication and deterministic quantized control problem solutions led him to the discovery of a Matched Processors practical alternative to Bellman's Dynamic Programming.
Statistical physics bridges for latency information theory (LIT) are revealed in this second paper of a three-paper series, including the discovery of the time dual of thermodynamics. LIT is the universal guidance theory for
efficient system designs that has inherently surfaced from the confluence of five ideas. They are: 1) The source entropy
and channel capacity performance bounds of Shannon's mathematical theory of communication; 2) The latency time
(LT) certainty of Einstein's relativity theory; 3) The information space (IS) uncertainty of Heisenberg's quantum
physics; 4) The black hole Hawking radiation and its Boltzmann thermodynamics entropy S in SI J/K; and 5) The
author's 1978 conjecture of a structural-physical LT-certainty/IS-uncertainty duality for stochastic control. LIT is
characterized by a four quadrants revolution with two mathematical-intelligence quadrants and two physical-life ones.
Each quadrant of LIT is assumed to be physically independent of the others and guides its designs with an entropy if it is
IS-uncertain and an ectropy if it is LT-certain. While LIT's physical-life quadrants I and III address the efficient use of
life time by physical signal movers and of life space by physical signal retainers, respectively, its mathematical-intelligence
quadrants II and IV address the efficient use of intelligence space by mathematical signal sources and of
processing time by mathematical signal processors, respectively. Seven results are stated next that relate to the revelation
of statistical physics bridges for LIT. They are: 1) Thermodynamics, a special case of statistical physics, has a time dual
named lingerdynamics; 2) Lingerdynamics has a dimensionless lingerdynamics-ectropy Z that is the LT-certainty dual of
a dimensionless thermodynamics-entropy, and like thermodynamics has four physical laws that drive the Universe; 3) S
advances a bridge between quadrant II's source-entropy H in bit units and quadrant III's retainer-entropy N in SI m² units; 4) Z advances a bridge between quadrant I's mover-ectropy A in SI seconds and quadrant IV's processor-ectropy K in
binary operator (bor) units; 5) Statistical physics bridges are discovered between the LIT entropies and the LIT
ectropies; 6) Half of the statistical physics bridges between the LIT entropies and LIT ectropies are found to be medium
independent, thus yielding the same entropy-ectropy relationships for black holes, ideal gases, biological systems, etc.;
and 7) A medium independent quadratic relationship τ = l(M/ΔM)² relates the lifespan τ of a retained mass M to the ratio of M to the fractional mass ΔM that escapes it every l seconds; e.g., for a human with M = 70 kg, an expected lifespan of τ = 83.9 years (or 2.65 Gsec), and l = 1 day (or 86.4 ksec), the daily escaping mass is ΔM = 0.4 kg. In turn, this requires him/her to consume 2,000 kcal per day (i.e., 5,000 kcal/kg times 0.4 kg) to replace the 0.4 kg lost from day to day, which correlates well with expectations.
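A quick numeric check of the quoted example for τ = l(M/ΔM)², with the year-to-second conversion as the only added assumption:

```python
# Numeric check of the stated example: tau = l * (M / dM)**2.
M = 70.0                # retained mass, kg
dM = 0.4                # daily escaping mass, kg
l = 86.4e3              # one day, in seconds

tau = l * (M / dM) ** 2
print(tau)              # ~2.65e9 s, i.e. ~2.65 Gsec
print(tau / 3.156e7)    # ~83.9 years (1 year ~ 3.156e7 s)
print(5000 * dM)        # ~2000 kcal/day at 5,000 kcal per kg of replaced mass
```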
Knowledge unaided power centroid (KUPC) adaptive radar and its latency information theory (LIT) roots
are reviewed in this third paper of a three-paper series. LIT is the universal guidance theory for efficient system designs
that has inherently surfaced from the confluence of five ideas. They are: 1) The source entropy and channel capacity
performance bounds of Shannon's mathematical theory of communication; 2) The latency time (LT) certainty of
Einstein's relativity theory; 3) The information space (IS) uncertainty of Heisenberg's quantum physics; 4) The black
hole Hawking radiation and its Boltzmann thermodynamics entropy S in SI J/K; and 5) The author's 1978 conjecture of
a structural-physical LT-certainty/IS-uncertainty duality for stochastic control. LIT is characterized by a four quadrants
revolution. While the first and third quadrants are concerned with the life time of physical signal movers and the life
space of physical signal retainers, respectively, the second and fourth quadrants are about the intelligence space of
mathematical signal sources and the processing time of mathematical signal processors, respectively. The four quadrants
of LIT are assumed to be physically independent with their system design methodologies guided by dualities and
performance bounds. Moreover, all the LIT quadrants are bridged by statistical physics, inclusive of a recently
discovered time dual for thermodynamics that has been named lingerdynamics. The theoretical and practical relevance
of LIT has already been demonstrated using real-world control, physics, biochemistry and the KUPC adaptive radar
application that is reviewed in this paper. KUPC adaptive radar is a technique that falls within the fourth quadrant of
LIT, and is thus a mathematical signal processing technique whose goal is the efficient detection of moving targets in
real-world taxing environments. As is highlighted in this review, KUPC adaptive radar is found to come relatively close to the signal-to-interference-plus-noise ratio (SINR) radar performance of DARPA's knowledge-aided sensory signal processing expert reasoning (KASSPER), even though it is knowledge unaided.
In this paper, an integrated framework comprising computer vision algorithms, a database system, and batch processing techniques is developed to facilitate effective automatic threat recognition and detection for security applications. The novel features of this framework include utilizing the Human Visual System model for segmentation and a new ratio-based edge detection algorithm that includes a new adaptive hysteresis thresholding method. The feature vectors of the baseline images are generated and stored in a relational database system using a batch window. The batch window is a special process in which image processing tasks with similar needs are grouped together and processed efficiently to save computing and memory requirements. The feature vectors of the segmented objects are generated using the CED method and are classified using a support vector machine (SVM) based classifier to identify threat objects. The experimental results demonstrate the presented framework's efficiency in reducing classification time and providing accurate detection.
This paper is concerned with lip localization for a visual speech recognition (VSR) system. We present an efficient method for localizing the human lips/mouth in video images. The method first uses the YCbCr approach to find at least some part of the lip as an initial step. Then all the available information about the segmented lip pixels, such as r, g, b, warped hue, etc., is used to segment the rest of the lip. The mean of each value is calculated; then, for each pixel in the ROI, the Euclidean distance from the mean vector is computed. Pixels with smaller distances are further clustered as lip pixels. Thus, the remaining pixels in the ROI are clustered (as lip or non-lip pixels) depending on their distance from the mean vector of the initially segmented lip region.
The method is evaluated on a newly recorded database of 780,000 frames; the experiments show that the method localizes the lips efficiently, with a high level of accuracy (91.15%) that outperforms existing lip detection approaches.
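A minimal sketch of the lip-growing step is given below: the mean feature vector of the initially segmented lip pixels is computed, and the remaining ROI pixels are labelled lip or non-lip by their Euclidean distance to that mean. The distance-threshold heuristic is an assumption, not the paper's rule.

```python
# Lip-region growing sketch: classify ROI pixels by Euclidean distance to the
# mean feature vector of the initially segmented lip pixels.
import numpy as np

def grow_lip_region(roi_features, initial_lip_mask, threshold=None):
    """
    roi_features: (H, W, F) per-pixel feature array (e.g. r, g, b, warped hue).
    initial_lip_mask: (H, W) boolean mask from the initial YCbCr segmentation.
    """
    feats = roi_features.astype(np.float64)
    mean_vec = feats[initial_lip_mask].mean(axis=0)            # mean lip features

    dist = np.linalg.norm(feats - mean_vec, axis=2)            # per-pixel distance
    if threshold is None:
        # Assumed heuristic: accept pixels no farther from the mean than the
        # initial lip pixels typically are.
        threshold = dist[initial_lip_mask].mean() + dist[initial_lip_mask].std()

    return initial_lip_mask | (dist <= threshold)              # final lip mask
```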
This paper proposes a wireless mesh network implementation consisting of both Wi-Fi ad-hoc networks and Bluetooth piconet/scatternet networks, organised in an energy- and throughput-efficient structure. This type of network can be easily constructed for crisis management applications, for example in an earthquake disaster. The motivation of this research is to form a mesh network from the mass availability of Wi-Fi and Bluetooth enabled electronic devices, such as mobile phones and PCs, that are normally present in most regions where major crises occur.
The target of this study is to achieve an effective solution that enables Wi-Fi and/or Bluetooth nodes to seamlessly configure themselves to act as a bridge between their own network and the other network, in order to achieve continuous routing for our proposed mesh networks.
Histogram equalization is one of the common tools for improving contrast in digital photography, remote sensing,
medical imaging, and scientific visualization. It is a process for recovering lost contrast in an image by remapping the brightness values so that they are equalized or more evenly distributed. However, Histogram
Equalization may significantly change the brightness of the entire image and generate undesirable artifacts. Therefore,
many Histogram Equalization based algorithms have been developed to overcome this problem. This paper presents a
comprehensive review study of Histogram Equalization based algorithms. Computer simulations and analysis are
provided to compare the enhancement performance of several Histogram Equalization based algorithms. A second-derivative-like enhancement measure is introduced to quantitatively evaluate their performance for image enhancement.
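For reference, the baseline that these variants modify is plain global histogram equalization, which remaps grey levels through the normalized cumulative histogram; a minimal sketch follows.

```python
# Plain global histogram equalization via the standard CDF remapping
# (the baseline the surveyed algorithms build on).
import numpy as np

def histogram_equalize(image):
    """image: 2-D uint8 array. Returns the equalized uint8 image."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum().astype(np.float64)
    cdf_min = cdf[cdf > 0][0]                      # first non-zero CDF value
    scale = max(cdf[-1] - cdf_min, 1.0)            # guard against flat images

    # Map each grey level so the output CDF is (approximately) uniform.
    lut = np.round((cdf - cdf_min) / scale * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]
```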
This paper proposes two image enhancement algorithms that are based on utilizing histogram data gathered from
wavelet transform domain coefficients. Computer simulations demonstrate that combining the spatial method of
histogram equalization with the logarithmic transform domain coefficient histograms achieves a much more balanced
enhancement, which outperforms classical histogram equalization algorithms.
In this paper, we introduce a human visual system (HVS)-based edge detection algorithm. The introduced algorithm
integrates image enhancement, edge detection and logarithmic ratio filtering algorithms to develop an effective edge
detection method. A parameter (â) is also introduced to control the level of detected edge details and functions as a primary threshold parameter. The introduced algorithm tracks and segments significant dark gray levels in an image. Simulation results have shown the effectiveness of the introduced algorithm compared to other traditional methods, such as Canny's algorithm, in preserving an object's topology and shape. The developed algorithm is suited to various applications where measurement and segmentation of dark gray-level spots are required for classification and tracking purposes, such as road cracks, lunar surface images, and remote objects.