In this paper, we propose a novel method to recognize human actions using 3D human skeleton joint points. First,
we represent a skeleton pose by a feature vector with three descriptors: limb orientation, joint motion orientation
and body part relation. Then, we mine discriminative local basic motions based on the sequences of feature
vectors. These local basic motions contain the discriminative motions of key joints and can well represent human
actions. Experiments conducted on the MSR Action3D and MSR Daily Activity3D datasets demonstrate
the effectiveness of the proposed algorithm and its superior performance over state-of-the-art techniques.
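As a rough illustration of the first descriptor, limb orientation can be expressed as the unit vector between the two joints that bound a limb. The sketch below is only an assumption about how such a descriptor might be computed; the joint names, the limb list and the function names are hypothetical and do not come from the paper.

```python
import math

def limb_orientation(joint_a, joint_b):
    """Unit vector pointing from joint_a to joint_b (3D coordinates)."""
    diff = [b - a for a, b in zip(joint_a, joint_b)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        # Degenerate limb (coincident joints): no direction is defined.
        return [0.0, 0.0, 0.0]
    return [d / norm for d in diff]

def pose_limb_features(joints, limbs):
    """Concatenate the limb orientations of one skeleton pose into a flat list."""
    features = []
    for parent, child in limbs:
        features.extend(limb_orientation(joints[parent], joints[child]))
    return features

# Hypothetical three-joint arm, used only to exercise the functions.
joints = {
    "shoulder": (0.0, 0.0, 0.0),
    "elbow": (0.0, -3.0, 4.0),
    "wrist": (0.0, -3.0, 6.0),
}
limbs = [("shoulder", "elbow"), ("elbow", "wrist")]
features = pose_limb_features(joints, limbs)
```

Normalizing each limb vector removes the dependence on limb length, which is one plausible reason to use orientations rather than raw joint coordinates.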
Traditional forward view synthesis prediction enables the efficient use of depth to provide synthesized frames for texture
reference in non-base layers. However, the high complexity of the forward warping technique, which results from edge
detection, hole-filling, upsampling and downsampling, offsets this performance benefit. Hence, backward view synthesis
prediction (BVSP) has been proposed to remove these drawbacks while maintaining the performance. However, the fixed
depth block size used in BVSP limits the performance gain and the number of motion compensation operations, which is
a key concern in complexity analysis. In this paper, a block-based BVSP for
inter-layer prediction with only high-level syntax changes is implemented and an adaptive depth block size selection
method is proposed. The experimental results show that an average bitrate reduction of 3.5% is achieved and that, after
enabling adaptive depth block size selection, this gain is largely maintained while the number of motion
compensation operations is reduced to a designated level.
Mean Shift is popular in object tracking due to its simplicity and efficiency. It finds a local maximum of the similarity
measure between the target model and the target candidate, and works well in many situations. However, it has two
weaknesses. First, the Mean Shift tracker ignores background knowledge. As a result, it may fail when the background color is
similar to that of the target or the initial target region contains too much background. Second, the Mean Shift tracker discards
geometric structure by using a global color histogram as the target model. Therefore, it may not work in the case of
partial occlusion. To solve the first problem, we introduce a background color histogram into a MAP formulation. To
address the second problem, we divide the target into hierarchical blocks. Each block is described by its own histogram,
but the blocks are tracked as a whole. These two ideas lead to a new algorithm, named MAP spatial pyramid (MAP-SP) Mean
Shift. The efficiency of MAP-SP Mean Shift is demonstrated via comparative experiments on both standard and our own
video sequences.
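The abstract does not name the similarity measure. In the classic color-histogram Mean Shift tracker it is commonly the Bhattacharyya coefficient, so the sketch below assumes that choice. The histograms and function names are hypothetical, and the MAP and spatial pyramid extensions are not reproduced.

```python
import math

def normalize(hist):
    """Scale a histogram of non-negative bin counts so that it sums to 1."""
    total = float(sum(hist))
    if total <= 0.0:
        raise ValueError("histogram must contain at least one count")
    return [h / total for h in hist]

def bhattacharyya(p, q):
    """Bhattacharyya coefficient of two normalized histograms (1 = identical, 0 = disjoint)."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

# Hypothetical 4-bin color histograms for a target model and two candidates.
target = normalize([8, 4, 2, 2])
same_colors = normalize([4, 2, 1, 1])      # same distribution, different size
other_colors = normalize([0, 0, 0, 16])    # mostly a different color

score_same = bhattacharyya(target, same_colors)
score_other = bhattacharyya(target, other_colors)
```

A tracker of this kind moves the candidate window toward the location where this score is locally largest. Because a global histogram carries no spatial layout, the score cannot tell which part of the target is occluded.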
Video adaptation has proven to be an efficient technique for dealing with various constraints, such as bandwidth
limitation and user requirements, in multimedia applications. However, existing methods, including Scalable Video Coding
and transcoding, do not perform well under bandwidth constraints in various scenarios, particularly in real-time
applications. In this paper, we propose a novel rate control scheme based on intermediate description. The proposed
scheme can provide fast rate control for narrow and time-varying transmission channels in scenarios such as video
streaming, video sharing and video on demand. In this scheme, the distribution of Discrete Cosine Transform (DCT)
coefficients is modeled by a generalized Gaussian distribution, and the parameters of this model are
stored as side information for rate control. With the stored parameter information, the encoder and transcoder can achieve
the target bit-rate with low complexity. Furthermore, an initial Quantization Parameter (QP) determination method is
also presented to calculate a proper QP for the Instantaneous Decoding Refresh (IDR) picture. Experimental results show
that, compared with JVT-G012 in H.264, the proposed rate control scheme saves more than 85% of the encoding time,
meets the required bit-rate more precisely, and improves performance by 0.2 dB on average.
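For reference, a zero-mean generalized Gaussian density is fully described by a scale and a shape parameter, which is why storing only these parameters as side information is compact. The sketch below uses one common parameterization. The symbols alpha (scale) and beta (shape) are assumptions, and the paper's parameter estimation and rate model are not reproduced.

```python
import math

def generalized_gaussian_pdf(x, alpha, beta):
    """Zero-mean generalized Gaussian density.

    f(x) = beta / (2 * alpha * Gamma(1 / beta)) * exp(-(|x| / alpha) ** beta)
    alpha > 0 is the scale, beta > 0 is the shape
    (beta = 2 gives a Gaussian, beta = 1 gives a Laplacian).
    """
    if alpha <= 0.0 or beta <= 0.0:
        raise ValueError("alpha and beta must be positive")
    coeff = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return coeff * math.exp(-((abs(x) / alpha) ** beta))

# Sanity check against the two familiar special cases.
sigma = 1.5
gauss_at_zero = generalized_gaussian_pdf(0.0, math.sqrt(2.0) * sigma, 2.0)
laplace_at_one = generalized_gaussian_pdf(1.0, 2.0, 1.0)
```

With beta = 2 and alpha = sqrt(2) * sigma the density reduces to a normal distribution with standard deviation sigma, and with beta = 1 it reduces to a Laplacian with scale alpha.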
Scalable Video Coding (SVC) is an extension of the H.264/AVC standard. The base layer of SVC is compatible with
the H.264/AVC standard, while the enhancement layers provide the desired temporal, quality and/or spatial scalability.
Bit-stream rewriting in the SVC standard allows an SVC bit-stream to be converted to an H.264/AVC bit-stream without
quality loss and preferably with low computational complexity. However, rewriting is currently supported only for quality
scalability and not for spatial scalability, which limits its application in many practical scenarios. In this paper, a hybrid
bit-stream rewriting approach to support both quality and spatial scalability is proposed based on the principle of residue
upsampling in the transform domain. The computational complexity of the proposed approach is much lower than that of the
conventional cascading transcoding scheme. Extensive experimental results demonstrate that the
rate-distortion (RD) performance loss of the proposed rewritable SVC bit-stream is acceptable compared with the conventional
SVC bit-stream, while its RD performance remains better than that of simulcast. Furthermore, the RD performance of the
H.264/AVC bit-stream rewritten from the rewritable SVC bit-stream is even better than that of the input SVC bit-stream.
Compared with the cascading transcoding scheme, the proposed hybrid rewriting achieves a Y-PSNR gain of 0.8 dB while
saving 80% of the processing time on average.
VOx/TiOx/Ti multilayer thin films were deposited on glass and molybdenum substrates by magnetron reactive sputtering.
The structure and properties of the thin films were measured with X-ray diffraction (XRD), a QJ31 Wheatstone bridge and an
internal friction instrument. The preparation process and the internal friction of the VOx/TiOx/Ti multilayer thin films were
studied. Based on analysis of the crystal structure and of the resistance versus temperature and
Young's modulus versus temperature curves, the phase transformation of the VOx multilayer thin film occurs at 66°C and its
temperature coefficient of resistance is -4.35%/°C.
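The temperature coefficient of resistance (TCR) is conventionally defined as (1/R)(dR/dT). The sketch below estimates it from a sampled resistance versus temperature curve with a central difference. The data are synthetic and chosen only for illustration; they are not the measurements reported above, and the function name is hypothetical.

```python
import math

def tcr_percent_per_degree(temps, resistances, i):
    """Estimate TCR = (1/R) * dR/dT at interior sample i, in percent per degree.

    dR/dT is approximated by a central difference over the neighbouring samples.
    """
    if not 0 < i < len(temps) - 1:
        raise ValueError("i must be an interior index")
    d_r = resistances[i + 1] - resistances[i - 1]
    d_t = temps[i + 1] - temps[i - 1]
    return 100.0 * (d_r / d_t) / resistances[i]

# Synthetic curve (NOT measured data): R(T) = 1000 * exp(-0.02 * T),
# whose exact TCR is -2 %/degC at every temperature.
temps = [20.0 + k for k in range(11)]
resistances = [1000.0 * math.exp(-0.02 * t) for t in temps]
tcr_mid = tcr_percent_per_degree(temps, resistances, 5)
```

A negative value means the resistance falls as the temperature rises, which is the sign reported for the film above.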
Different N-doped ZnO thin films were grown on glass substrates by RF magnetron sputtering with varying
ratios of O2 to N2. XRD and photoluminescence (PL) spectra were measured. The results show that the intensities and
positions of the PL peaks change with nitrogen content. Two peaks, at 374 nm and 391 nm, appear in the
fluorescence spectrum when the Ar:O2:N2 ratio is 15:7:8. The fluorescence peak at 374 nm is
characteristic of p-type ZnO films.
The hole-drilling strain gage method is an effective semi-destructive technique for determining residual stresses in a
component. Because drilling is a mechanical process, a work-hardening layer forms on the surface of the hole
and affects the strain relaxation. The work-hardening layer is modeled as a heterogeneous annulus by increasing
Young's modulus of the material near the hole. As an example, two finite rectangular plates subjected to different initial stresses
are treated, and the relieved strains are obtained by finite element simulation. The accuracy of the measurement is
estimated by comparing the simulated residual stresses with the given initial ones. Results are shown for various
hardness values of the work-hardening layer. The influence of the position of the gages relative to the thickness of the
work-hardening layer and the effect of the ratio of hole diameter to work-hardening layer thickness are analyzed as well.
Using a physical vapor deposition (PVD) method, a (cBN/nano-diamond)3 multilayer film with phase purity, high
hardness and low residual stress was synthesized on silicon substrates supported by a thick nano-diamond buffer. The
method is characterized by direct cBN growth on diamond without soft, non-cubic BN interface layers;
by the synthesis of multilayer films with extraordinary adhesion to the substrates and higher hardness than a single cBN
film; and by a multilayer film stress that can be reduced to only one fourth of that of a single cBN film. These key
technological properties open the route to the mechanical exploitation of cBN films.
ZnO thin films were prepared by two methods. One was ion beam sputtering followed by annealing at 700°C in O2; the other was
RF magnetron sputtering followed by annealing at 600°C in O2. The structures, morphologies, and electrical resistivities of the
ZnO films prepared by the two methods were investigated and compared. The influence of the two methods on the
properties of the ZnO thin films was studied by XRD, AFM and LCR HITESTER. Compared with those prepared by RF magnetron sputtering,
the ZnO films fabricated by ion beam sputtering deposition have a disordered growth orientation, greater surface roughness
and higher electrical resistivity.
Video coding in wireless environments requires lower computational complexity and lower energy consumption than storage-oriented or network-oriented applications. Although the H.264/AVC standard provides considerably higher compression efficiency than previous standards, its complexity is significantly increased as well. In an H.264/AVC encoder, the most time-consuming components are variable block-size motion estimation and mode decision using rate-distortion optimization (RDO). In this paper, we propose a novel fast inter-prediction mode decision that exploits the high correlation between the rate-distortion costs (RD costs) of macroblocks in the current inter frame and those of their co-located macroblocks in the previous inter frame. Using this new algorithm, we can reduce the number of inter mode candidates and skip motion estimation for the excluded modes. In addition, our algorithm can also decrease the number of tested intra modes. Simulation results show that our approach can save 20% to 50% of the encoding time, with a negligible PSNR loss of less than 0.1 dB and a bit rate increase of no more than 2% for almost all the test sequences.
The application of error concealment in video communication is very important when compressed video sequences are transmitted over error-prone networks and erroneously received. In this paper, we propose a novel error concealment scheme, in which the concealment problem is formulated as minimizing, in a weighted manner, the difference between the gradient of the reconstructed data and a prescribed vector field under given boundary condition. Instead of using the motion compensated block as the final recovered pixel values, we use the gradient of the motion compensated block together with the surrounding correctly decoded pixels of the damaged block to reconstruct the lost data. Both temporal and spatial correlations of the video signals are exploited in the proposed scheme. A well designed weighting factor is used to control the regulation level at a desired direction according to the local blockiness degree at the boundaries of the recovered block. The experimental results show that the proposed algorithm is able to achieve higher PSNR as well as better visual quality in comparison with the error concealment feature implemented in the H.264 reference software. The blocking effects are greatly alleviated while the structural information in the interior of the recovered block is well preserved.
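To make the gradient-matching idea concrete, the sketch below works through a heavily simplified one-dimensional, unweighted case: the lost samples are rebuilt so that their differences match those of a motion-compensated guide signal while the end samples equal the correctly decoded boundary values. In 1D this least-squares problem has a closed form (the guide plus a linear ramp of the boundary mismatch). The paper's 2D formulation and weighting factor are not reproduced, and all names and values here are hypothetical.

```python
def fill_from_gradient(guide, left, right):
    """Rebuild a 1D run of samples from the gradient of `guide`.

    `guide` holds motion-compensated samples at the two boundary positions and
    at every lost position in between. The result minimizes the squared
    difference between its own finite differences and those of `guide`, subject
    to result[0] == left and result[-1] == right. The minimizer is the guide
    plus a linear interpolation of the two boundary offsets.
    """
    n = len(guide)
    if n < 2:
        raise ValueError("guide must cover both boundary positions")
    offset_left = left - guide[0]
    offset_right = right - guide[-1]
    result = []
    for i, g in enumerate(guide):
        t = i / (n - 1)  # 0 at the left boundary, 1 at the right boundary
        result.append(g + (1.0 - t) * offset_left + t * offset_right)
    return result

# Hypothetical example: the guide has the right shape but the wrong brightness.
guide = [10.0, 12.0, 15.0, 13.0, 11.0]
recovered = fill_from_gradient(guide, left=20.0, right=23.0)
```

Copying the guide directly would leave a visible step at both ends of the run. Matching gradients instead keeps the guide's interior structure while spreading the boundary mismatch smoothly across the run.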
Conventional video coding techniques make use of the most recently decoded reference frame(s) for motion-compensated
inter prediction. However, it has been shown that using reference frames flexibly, so that
not only the latest reference frames are used, is beneficial. A typical use of flexible reference frames is feedback
based reference picture selection, wherein error-free reference frames available in both the encoder and decoder sides
are selected and used for inter prediction reference. This paper first overviews support of reference picture selection in
different video coding standards, and then presents three specific feedback based reference picture selection methods
using flexible reference frames. In addition, a novel, simple reference frame management method that enables the use of
flexible reference frames is proposed. This method enables much simpler video codec
implementations than the complex reference frame management methods in H.263 Annex U and H.264/AVC.
The proposed coding methods and some conventional methods are compared with each other. Simulation results show
significantly improved error resiliency performance of the proposed reference picture selection methods compared to
conventional methods. The effect of feedback delay variation on performance is also shown. Owing to these
merits, support for flexible reference frames and the proposed reference frame management method has been adopted into the AVS-M video
coding standard.