Research shows that a learner's emotional state has an important impact on the affective and cognitive processes that influence learning. A positive emotional state can enhance learning outcomes. It is therefore important to detect a learner's emotional state unobtrusively during the learning process. Generally, emotions can be classified along two dimensions, valence and activation. Happiness is an activating emotional state with positive valence. This paper presents a happiness detection method based on deep learning. First, a number of face images showing static emotions are selected from an image database. Faces are detected with a face detector and aligned using the eye locations. The face images are then cropped to the size required by the convolutional neural network input. In our classifier, the input layer accepts a single channel in order to process grayscale images, and the output layer produces two classes: happiness and non-happiness. Fourfold cross-validation is performed on the facial expression image dataset, which is randomly divided into four subsets. In each round of cross-validation, one subset is used for testing and the other three are used for training. The experimental results show that the average accuracy reaches 98.78 percent, which is sufficient for use in learning outcome evaluation.
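The fourfold cross-validation protocol described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `fourfold_splits` and `average_accuracy`, the fixed seed, and the round-robin way of forming the four subsets are all assumptions, and training the classifier itself is left out.

```python
import random


def fourfold_splits(n_samples, seed=0):
    """Yield (train_indices, test_indices) for each of the four rounds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # random division of the dataset
    folds = [indices[k::4] for k in range(4)]   # four near-equal subsets
    for k in range(4):
        test = folds[k]                         # one subset for testing
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test                       # the other three for training


def average_accuracy(per_round_accuracies):
    """Mean of the accuracies measured in the individual rounds."""
    return sum(per_round_accuracies) / len(per_round_accuracies)
```

Each sample lands in exactly one test subset, so every image is used for testing once and for training three times.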
Cultural relics recovered through archaeology are often incomplete fragments. Old and classical designs are preserved on the surfaces of these objects, but because the objects are incomplete, each carries only part of the full design. Since the sherds or fragments suffer from serious corrosion, the fragment designs are often obscure and do not cover the complete design. Manual matching is inefficient and practically impossible for some complicated partial-to-global matches. This paper presents a feature matching algorithm to overcome these difficulties. First, the color image of the sherd is converted to a grayscale image. We then detect the edges and feature curves of the design and obtain one-pixel-wide edges and curves. The grayscale image is enhanced and denoised. Evidently missing curves are added and incorrect curves are removed manually. A fast matching algorithm is used to exclude impossible candidates from the design image database. For each remaining candidate design, we use a parallel image matching algorithm based on the Hausdorff distance. The matching process consists of a translation and a rotation transform. We divide the set of translations t into a number of subsets, which are assigned to different processors; each processor computes the Hausdorff distance over all rotations and matches the image of the fragment. All processors stop as soon as any processor finds a successful match. An experiment on a set of fragment designs shows that the algorithm is efficient and outperforms the traditional matching method.
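The per-processor search described above might look roughly like the sketch below. Several things here are assumptions rather than details from the abstract: curves are represented as lists of (x, y) points, the one-sided (directed) Hausdorff distance from fragment to design is used because the fragment covers only part of the design, and the names `partition`, `search_subset`, and the tolerance `tol` are hypothetical. The actual dispatch to processors and the early-stop signalling are omitted.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def transform(points, angle, t):
    """Rotate points by angle (radians) about the origin, then translate by t."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in points]


def partition(translations, n_workers):
    """Divide the candidate translations into one subset per processor."""
    return [translations[k::n_workers] for k in range(n_workers)]


def search_subset(fragment, design, translations, angles, tol):
    """Work of one processor: try every rotation for each of its translations."""
    for t in translations:
        for angle in angles:
            d = directed_hausdorff(transform(fragment, angle, t), design)
            if d <= tol:
                return t, angle, d      # successful match: others may stop
    return None
```

In a parallel setting each subset returned by `partition` would be handed to a separate worker, and the first non-`None` result would end the search.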
Automatically recognizing people through visual surveillance is necessary for public security. Gait-based
identification focuses on recognizing a person automatically from video of his or her walking, using computer vision
and image processing approaches. As a potential biometric measure, human gait identification has attracted a growing
number of researchers. Current human gait identification methods can be divided into two categories: model-based
methods and motion-based methods. This paper proposes a human gait identification method based on Two-Dimensional
Principal Component Analysis (2DPCA) and temporal-space analysis. Using background estimation and image subtraction,
we obtain a binary image sequence from the surveillance video. By taking the difference of each pair of adjacent
images in the gait image sequence, we obtain a sequence of binary difference images. Each binary difference image
indicates the mode of body movement while the person walks. We extract the temporal-space features from the
difference image sequence as follows. Projecting one difference image onto the Y axis and the X axis yields two
vectors. Projecting every difference image in the sequence onto the Y axis and the X axis yields two matrices. These
two matrices characterize the style of one walk. 2DPCA is then used to transform these two matrices into two vectors
while preserving maximum separability. Finally, the similarity of two gait sequences is calculated as the Euclidean
distance between their vectors. The performance of our method is illustrated using the CASIA Gait Database.
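The projection step described above can be sketched as follows. This is an illustration under stated assumptions: silhouettes are nested lists of 0/1 pixels, adjacent frames are differenced with a pixelwise XOR, and the function names are hypothetical. The 2DPCA reduction is not shown; only the construction of the two projection matrices and the final Euclidean distance are.

```python
import math


def difference_images(frames):
    """Pixelwise XOR of each pair of adjacent binary silhouettes."""
    return [[[a ^ b for a, b in zip(r1, r2)] for r1, r2 in zip(f1, f2)]
            for f1, f2 in zip(frames, frames[1:])]


def project(image):
    """Return the projections of one image onto the Y axis and the X axis."""
    onto_y = [sum(row) for row in image]         # one value per image row
    onto_x = [sum(col) for col in zip(*image)]   # one value per image column
    return onto_y, onto_x


def projection_matrices(frames):
    """Stack the per-image projections into two matrices (one row per image)."""
    pairs = [project(d) for d in difference_images(frames)]
    return [p[0] for p in pairs], [p[1] for p in pairs]


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Each matrix has one row per difference image, so its rows run along time and its columns along one spatial axis.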
A novel multi-scale cue combination method for contour detection is presented. The contour detector is derived from
the local brightness, color, and texture channels at each image pixel (x, y). To build the contour detector, the
brightness, color, and texture gradients of the image are defined. The posterior probability model of boundary G is
then introduced, using learning techniques for multi-scale cue combination. Finally, experiments show the performance
of the raw detector and of the multi-scale cue combination detector.
It is important to fit the unknown probability density functions of the background or object accurately. To solve
this problem, the Burr distribution is introduced. The three-parameter Burr distribution can cover a wide range of
distributions. The expectation maximization algorithm is used to deal with the difficulty of parameter estimation in
the Burr distribution model. The algorithm starts from a set of appropriately selected initial parameter values and
then iterates the expectation step and the maximization step until convergence to produce the resulting parameters.
The experimental results show that the Burr distribution can describe the probability density function of a
significant class of images quite successfully, and that the method has comparatively low computational complexity.
This paper presents a novel region merging method based on interactive information from users. An image is first
partitioned into homogeneous regions by an initial segmentation, and the regions are then labeled through an
interactive scheme. In this scheme, users only roughly specify the position and main features of the object and the
background; each region is then either an unlabeled region or a labeled region, i.e. object or background. A
similarity rule guides the merging process with the help of the users' markers, and the object of interest is then
extracted from the image. Experimental results show that the proposed method extracts the object of interest from
complex backgrounds efficiently.
The Gaussian distribution model is often used to characterize the statistical behavior of images and other multimedia
signals, and is applied in fitting the probability density function of a signal. In practice, however, the
probability density function of a data source may be inherently non-Gaussian. Because the Johnson distribution family
covers most of the common distribution types and provides frequency curves as varied as those in general use, this
paper uses the Johnson family to estimate the unknown parameters and approximate the empirical distribution. The
method uses moments to initialize the parameters of the distribution family and then calculates the parameters with
the EM algorithm. The experimental results show that the fitted model can describe both Gaussian and non-Gaussian
probability density functions of image intensity quite successfully, and that the method has comparatively low
computational complexity.
Time complexity is one of the biggest problems for fractal image compression algorithms, which can achieve high
compression ratios. However, fractal image compression is inherently data-parallel, so a parallel computation scheme
is a natural way to deal with the problem. This paper uses an "equal division load" balancing algorithm to design a
parallel fractal coding algorithm and implement fractal image compression. The "equal division load" balancing
algorithm distributes computation tasks equally among all processors. The load on each node is divided into smaller
tasks according to the computing power of all nodes on the network, and these smaller tasks are then sent to the
corresponding nodes to balance the load among nodes. Analysis shows that the algorithm greatly reduces the component
task execution time.
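One way to read the load division described above is sketched below. It is an interpretation, not the paper's algorithm: it assumes the work is a list of independent tasks (for example, range blocks to encode) and that "based on the power of nodes" means each node receives a share proportional to its relative computing power. The name `divide_load` and the contiguous-chunk layout are hypothetical.

```python
def divide_load(n_tasks, powers):
    """Split range(n_tasks) into contiguous chunks proportional to node power.

    powers holds one positive number per node; with equal powers every
    node receives an (almost) equal number of tasks.
    """
    total = sum(powers)
    bounds = [0]
    accumulated = 0
    for power in powers:
        accumulated += power
        # Cumulative rounding keeps the chunks contiguous and loses no task.
        bounds.append(round(n_tasks * accumulated / total))
    return [range(bounds[i], bounds[i + 1]) for i in range(len(powers))]
```

For example, `divide_load(100, [1, 1, 2])` gives the third node, which is twice as powerful, half of the tasks.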
A novel content-based image retrieval data structure is developed in the present work. It can improve search
efficiency significantly. All images are organized into a tree in which every node comprises images with similar
features. Images in a child node are more similar to each other (have less variance) than those in its parent. This
means that every node is a cluster and each of its child nodes is a sub-cluster. The information contained in a node
includes not only the number of images but also the center and the variance of these images. When new images are
added, the tree structure changes dynamically to minimize the total variance of the tree. A heuristic method has been
designed to retrieve information from this tree. Given a sample image, the probability that a tree node contains
similar images is computed from the center of the node and its variance. If the probability is higher than a certain
threshold, the node is checked recursively to locate the similar images, as are its child nodes if their
probabilities also exceed that threshold. If not enough similar images are found, a reduced threshold is adopted and
a new search is started from the root node. The search terminates when it has found enough similar images or when the
threshold is too low to be meaningful. Experiments have shown that the proposed dynamic cluster tree improves search
efficiency notably.
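The threshold-driven search described above can be sketched as follows. The details are assumptions made for illustration: features are single numbers, the node probability is replaced by a Gaussian-shaped score built from the node's center and variance, and the names `Node`, `node_score`, `collect`, `search`, `decay`, and `floor` are hypothetical. Dynamic insertion into the tree is not shown.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    center: float                                  # mean feature of the cluster
    variance: float                                # spread of the cluster (> 0)
    images: list = field(default_factory=list)     # image names held by a leaf
    children: list = field(default_factory=list)   # sub-clusters


def node_score(node, query):
    """Score in (0, 1]; an assumed stand-in for the paper's probability."""
    return math.exp(-((query - node.center) ** 2) / (2 * node.variance))


def collect(node, query, threshold, found):
    """Descend only into nodes whose score reaches the threshold."""
    if node_score(node, query) < threshold:
        return
    if not node.children:
        found.extend(node.images)
    for child in node.children:
        collect(child, query, threshold, found)


def search(root, query, wanted, threshold=0.9, decay=0.5, floor=0.01):
    """Restart from the root with a reduced threshold until enough are found."""
    found = []
    while threshold >= floor:
        found = []
        collect(root, query, threshold, found)
        if len(found) >= wanted:
            break
        threshold *= decay
    return found
```

Because sub-clusters far from the query are pruned at the first failing node, most of the tree is never visited when the query sits close to one cluster.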
To improve the retrieval accuracy of content-based video retrieval systems, researchers face the hard challenge of
reducing the 'semantic gap' between the features extracted by the systems and the richness of human semantics. This
paper presents a novel video retrieval system to bridge the semantic gap. First, the video captions are segmented
from the video and transformed into text. To extract semantic information from the video stream, we apply a text
mining process, with a clustering algorithm at its core, to the caption text. In addition, users of the system are
asked to comment on the videos they download once they have watched them, and we associate these comments with the
corresponding video in the system. The same text mining process is applied to the comment texts. We combine the
captions of a video with the comments on it to extract the semantic information of the video more accurately.
Finally, taking advantage of both the comments and the captions, we performed experiments on a set of videos and
obtained promising results.