Considering industry needs for further coding efficiency improvements, the Joint Video Experts Team (JVET), established by ITU-T and MPEG to standardize VVC, has developed a new Enhanced Compression Model (ECM) based on VVC, which serves as a common platform for testing future video coding algorithms. Versatile Video Coding (VVC) supports Reference Picture Resampling (RPR) to change the frame resolution without inserting an Instantaneous Decoding Refresh (IDR) or Intra Random Access Point (IRAP) picture. This feature is particularly well suited to video streaming and low-delay scenarios since it allows seamless frame-based bit-rate adaptation, whereas traditional techniques that switch between coded video chunks at fixed resolution can generate bit-rate leaps. ECM implements several new tools that improve coding efficiency compared to VVC, but some of them were not designed to support RPR. In this paper, we first discuss the adaptations necessary to implement RPR in ECM for these new coding tools. At low bit rates, RPR may improve the coding performance of ECM for the luma component while reducing coding complexity. However, RPR may show a PSNR drop for the chroma components because it applies an additional down-scaling filter to samples that were already filtered when the 4:2:0 format was created from the original canonical 4:4:4 content. Then, in a second part, modifications of RPR that re-scale luma and chroma differently are proposed. These modifications are shown to improve ECM efficiency in both super-resolution and low-delay coding use cases.
Current video coding standards such as HEVC, VP9, VVC, and AV1 partition a picture into coding tree units (CTUs), typically corresponding to 64x64 or 128x128 picture areas. Each CTU is partitioned into coding blocks following a recursive coding tree. In recently published perceptual video encoding methods, the CTU is used as the spatial unit for assigning a quantization parameter (QP) value to a given picture area. Such an approach fits well with the usual rate-distortion optimization used to decide the coding tree representation of a CTU: because a constant QP is used inside the CTU, Lagrangian rate-distortion optimization works. However, some applications may call for an adaptive QP with finer spatial granularity. A perceptual video coding scheme may use a codec-agnostic QP allocation process that operates on a 16x16 block basis. The issue in such a case is that the Lagrangian method no longer handles the rate-distortion trade-off among split modes. This paper proposes several methods to perform the rate-distortion optimization of a coding tree when multiple QPs may be assigned inside the same CTU. First, a theoretical method to solve the problem is described. It consists of a coding tree RD optimization using multiple Lagrange parameters. Then, simpler empirical methods that emulate the theoretical approach are proposed. Experimental results show the benefit of the proposed methods on top of VP9 and HEVC video encoders.
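The split decision with several QPs inside one CTU can be illustrated with a minimal sketch. It is not the formulation used in the paper, which the abstract does not give. It assumes that each candidate partition reports a distortion and a rate for every region that carries its own QP, and that each region's Lagrange multiplier follows the commonly used relation lambda = c * 2^((QP - 12) / 3). The constant `c = 0.57`, the `Region` tuple, and all function names are hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Hypothetical per-region measurement: (distortion, rate_in_bits, qp).
Region = Tuple[float, float, int]


def lambda_from_qp(qp: int, c: float = 0.57) -> float:
    """Map a QP to a Lagrange multiplier (assumed relation, c is illustrative)."""
    return c * 2.0 ** ((qp - 12) / 3.0)


def multi_lambda_cost(regions: List[Region]) -> float:
    """Sum D_i + lambda(QP_i) * R_i over the regions of one candidate."""
    return sum(d + lambda_from_qp(qp) * r for d, r, qp in regions)


def choose_split(no_split: List[Region], split: List[Region]) -> str:
    """Return the candidate with the lower multi-lambda cost."""
    if multi_lambda_cost(split) < multi_lambda_cost(no_split):
        return "split"
    return "no_split"


# Two regions with different QPs inside the same CTU (made-up numbers).
no_split = [(400.0, 20.0, 12), (400.0, 20.0, 24)]
split = [(200.0, 30.0, 12), (380.0, 22.0, 24)]
decision = choose_split(no_split, split)
```

With a single QP in the CTU, every region shares the same multiplier and the comparison reduces to the usual J = D + lambda * R decision.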