Transform and partitioning are core components of video coding architectures. Compared with HEVC, VVC features a higher number of transform types, an additional transform level (LFNST), and more flexible partitioning via binary and ternary trees. This flexibility in transform and partitioning provides about 2% and 10% coding gain, respectively. Nevertheless, the current design is optimized not for the highest coding gain but for a compromise with design complexity. That is, the potential of combining higher transform and partitioning diversity exceeds what VVC currently realizes. This can be demonstrated using some early transform and partitioning proposals from the VVC development that were not adopted due to complexity concerns. In this paper, we revisit these designs, targeting the maximum bitrate saving, to establish a new state-of-the-art anchor for post-VVC development.
In this paper, we propose a novel interpolation of reference samples for intra prediction in Versatile Video Coding (VVC). To interpolate a predictor value between two reference samples, the method uses the four nearest reference samples, as does the existing cubic filter in VVC, but with a simpler design that does not require pre-computing the filter coefficients. We model the signal as a sum of two components: the first is the classical linear interpolation, and the second is a corrective term that accounts for the change due to the two farther samples. To arrive at this model, we represent the signal as a sum of two quadratic functions, each of which models the signal with three adjacent samples. The corrective term is attributed to the quadratic term in the resulting model and can be calculated on the fly. Since models based on four samples can lead to large errors at object edges, we propose a thresholding method to decide between the proposed model and the usual linear interpolation. Besides the BD-rate performance gain, the advantages of the proposed method are lower complexity, no memory storage of filter coefficients, and a uniform method for both Luma and Chroma components. Applied to intra prediction in VTM 7.0, the proposed method yields BD-rate gains of 0.13% for Luma and 0.30% for Chroma with lower decoding complexity.
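The interpolation idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's exact formulation: it assumes the two three-sample quadratics are averaged, which gives linear interpolation plus a correction proportional to t(1 - t), and it assumes the edge test is a threshold on the magnitude of the summed second differences. The function name `interpolate` and the `threshold` parameter are hypothetical.

```python
def interpolate(p_m1, p0, p1, p2, t, threshold=None):
    """Predict a value at fractional position t in [0, 1] between p0 and p1.

    p_m1, p0, p1, p2 are four consecutive reference samples.
    Assumption: the two quadratics (through p_m1, p0, p1 and through
    p0, p1, p2) are averaged; this is an illustrative choice.
    """
    # First component: classical linear interpolation.
    linear = (1 - t) * p0 + t * p1

    # Sum of the second differences of the two 3-sample quadratics.
    curvature = (p_m1 - 2 * p0 + p1) + (p0 - 2 * p1 + p2)

    # Assumed edge test: fall back to linear interpolation when the
    # curvature is large, since a 4-sample model can overshoot at edges.
    if threshold is not None and abs(curvature) > threshold:
        return linear

    # Second component: corrective term, computed on the fly
    # (no stored filter coefficients).
    return linear - t * (1 - t) * curvature / 4


if __name__ == "__main__":
    # Samples of f(x) = x^2 at x = -1, 0, 1, 2; the model is exact here.
    print(interpolate(1, 0, 1, 4, 0.5))
    # An edge: without the threshold the model undershoots below 0.
    print(interpolate(0, 0, 0, 100, 0.5))
    print(interpolate(0, 0, 0, 100, 0.5, threshold=50))
```

Under these assumptions the correction vanishes on linear signals and reproduces quadratic signals exactly, while the threshold keeps the predictor within the range of the two nearest samples at sharp edges.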
Current video coding standards such as HEVC, VP9, VVC, and AV1 partition a picture into coding tree units (CTUs), typically corresponding to 64x64 or 128x128 picture areas. Each CTU is partitioned into coding blocks following a recursive coding tree. In recently published perceptual video encoding methods, the CTU is used as the spatial unit for assigning a QP value to a given picture area. Such an approach fits well with the usual rate distortion optimization used to decide the coding tree representation of a CTU, since a constant QP is used inside the CTU and Lagrangian rate distortion optimization therefore works. However, for some applications, a finer spatial granularity of the adaptive QP may be desired. A perceptual video coding scheme may use a codec-agnostic QP allocation process that operates on a 16x16 block basis. The issue raised in such a case is that the rate distortion trade-off among split modes no longer works with the Lagrangian method. This paper proposes several methods to perform the rate distortion optimization of a coding tree when multiple QPs may be assigned inside the same CTU. First, a theoretical method to solve the problem is described. It consists of a coding tree RD optimization using multiple Lagrange parameters. Then, some simpler empirical methods that emulate the theoretical approach are proposed. Experimental results show the benefit of the proposed methods on top of VP9 and HEVC video encoders.
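As a rough illustration of the setting described in the abstract above, the sketch below shows the classical Lagrangian split decision, J = D + λR, and how the multiplier changes once blocks carry different QPs. It does not reproduce the paper's theoretical or empirical methods. The relation λ = c · 2^((QP − 12)/3) is a form commonly used in H.264/HEVC reference encoders; the constant `c = 0.85` and all function names are assumptions made for the example.

```python
def lagrange_multiplier(qp, c=0.85):
    """Lagrange multiplier as a function of QP (illustrative constant c)."""
    return c * 2.0 ** ((qp - 12) / 3.0)


def rd_cost(distortion, rate, qp):
    """Lagrangian cost J = D + lambda(QP) * R for one block."""
    return distortion + lagrange_multiplier(qp) * rate


def choose_split(parent, children):
    """Return True if splitting has the lower total cost.

    parent and each child are (distortion, rate, qp) tuples.
    Naive assumption: each child is costed with its own lambda and the
    costs are simply summed. With a single QP this is the usual decision;
    with mixed QPs the summed costs are not directly comparable, which is
    the difficulty the abstract points out.
    """
    no_split = rd_cost(*parent)
    split = sum(rd_cost(*child) for child in children)
    return split < no_split


if __name__ == "__main__":
    parent = (1000.0, 50.0, 12)
    children = [(200.0, 20.0, 12)] * 4
    print(choose_split(parent, children))
    # The multiplier doubles for every increase of 3 in QP.
    print(lagrange_multiplier(12), lagrange_multiplier(15))
```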
KEYWORDS: Receivers, Video, Computer programming, Distortion, Video coding, Multimedia, Quantization, Forward error correction, Internet, Signal to noise ratio
Delivering temporally constrained multimedia streams in heterogeneous environments that offer no guarantee in terms of bandwidth, packet loss, or delay is a very challenging problem faced today by both the networking and the coding communities. Layered coding is often proposed as a solution for rate-based congestion control of video transmission in heterogeneous environments. The problem addressed more specifically here is the design of a responsive mechanism for rate allocation in each layer that would guarantee the best bandwidth usage for all the receivers. After a review of solutions for congestion and loss control in unicast video communications, the paper describes a rate-based congestion control mechanism for multicast layered video transmission.
Targeting multimedia communications over the Internet, this paper describes a technique for improving the packet loss resiliency of compressed video streams. Aiming at the best trade-off between compression efficiency and packet loss resiliency, a procedure for adapting the video coding modes to varying network characteristics is introduced. The coding mode selection is based on a rate-distortion procedure with global distortion metrics that incorporate channel characteristics in the form of a two-state Markov model. This procedure has been incorporated in an MPEG-4 video encoder. It has been observed that, in error-free environments, the channel-adaptive mode selection technique allows a significant gain with respect to simple conditional replenishment. On the other hand, under loss conditions, it is shown that this procedure significantly improves the encoder's performance with respect to the original MPEG-4 encoder, approaching the robustness of conditional replenishment mechanisms.
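A two-state Markov channel of the kind mentioned in the abstract above can be sketched as follows. The stationary loss probability and mean burst length are standard properties of such a model; the way the loss probability is folded into an expected distortion is a simplifying assumption for illustration and is not taken from the paper. All function and parameter names are hypothetical.

```python
def stationary_loss_probability(p_good_to_bad, p_bad_to_good):
    """Long-run fraction of packets lost in a two-state Markov model."""
    return p_good_to_bad / (p_good_to_bad + p_bad_to_good)


def mean_burst_length(p_bad_to_good):
    """Average number of consecutive lost packets."""
    return 1.0 / p_bad_to_good


def expected_distortion(d_received, d_lost, loss_probability):
    """Assumed global distortion: a mix of the distortion when the packet
    arrives and the distortion after concealment when it is lost."""
    return (1.0 - loss_probability) * d_received + loss_probability * d_lost


if __name__ == "__main__":
    loss = stationary_loss_probability(0.05, 0.45)
    print(loss)
    print(mean_burst_length(0.45))
    print(expected_distortion(10.0, 100.0, loss))
```

A mode decision could then compare candidate coding modes on this expected distortion plus a rate term, so that modes which limit error propagation become more attractive as the loss probability grows.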