Transform Coding Using the Residual Quadtree (RQT)

A big challenge for image and video coding is the non-stationary character of the image signal. To adapt to local statistics, image coding algorithms typically divide the signal into blocks and process each of them separately. The video coding standard H.264 implements this idea by using variable block sizes of 4x4 and 8x8 samples for transform coding.

To further increase adaptivity, a quadtree can be used to generalize this concept to arbitrary block sizes. A quadtree divides the image plane into blocks as shown in Fig.1.

Using this approach for transform coding enables the adaptation of the transform to the varying space-frequency characteristics of the residual signal. Larger transform block sizes, which have larger spatial support, provide better frequency resolution. On the other hand, smaller transform block sizes, which have smaller spatial support, provide better spatial resolution. The trade-off between the two, spatial and frequency resolution, is chosen by the encoder control, for example based on Lagrangian optimization techniques.

For video coding, we developed this technique into the residual quadtree (RQT) which enables the transform coding stage of a hybrid video coder to exploit the aforementioned adaptivity of varying transform block sizes. We introduced the RQT as a coding tool into the recently developed High Efficiency Video Coding (HEVC) standard, where it allows varying the transform block size from 8x8 to 32x32.

Fast Encoder Control: Since the number of possible tree structures grows exponentially with the maximum allowed tree depth, the encoder complexity (e.g. runtime) required to obtain the optimal partitioning in terms of rate-distortion (RD) also increases exponentially. Thus, searching for the optimal tree structure would limit application of the RQT approach. Therefore, we developed a fast RQT encoder control limiting the number of tree structures searched. This leads to a reduction of encoder runtime at the cost of a slightly inferior coding performance.

Starting at the root the encoder subdivides the image until a termination criterion is fulfilled: when all the absolute un-quantized transform coefficients are below a certain threshold, the algorithm stops and further child nodes (i.e. smaller blocks) are excluded from the search. By using a QP-dependent thresholding-rule, our encoder control provides a consistent speedup factor over the whole range of QP values.

Related publications

  1. D. Marpe, H. Schwarz, S. Bosse, B. Bross, P. Helle, T. Hinz, H. Kirchhoffer, H. Lakshman, T. Nguyen, S. Oudin, M. Siekmann, K. Sühring, M. Winken, and T. Wiegand, Video Compression Using Nested Quadtree Structures, Leaf Merging and Improved Techniques for Motion Representation and Entropy Coding, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 20, No. 12, pp. 1676-1687, Dec. 2010.
  2. T. Nguyen, P. Helle, M. Winken, B. Bross, D. Marpe, Transform Coding Techniques in HEVC, IEEE Journal of Selected Topics in Signal Processing, Vol. 7, No. 6, DEC. 2013.
  3. M. Winken, P. Helle, D. Marpe, H. Schwarz, and T. Wiegand, Transform Coding in the HEVC Test Model, 18th IEEE International Conference on Image Processing (ICIP), 2011, pp. 3693 – 3696, doi:10.1109/ICIP.2011.6116521
  4. M. Siekmann, H. Schwarz, B. Bross, D. Marpe, and T. Wiegand, Fast encoder control for RQT, JCTVC-E425, Mar. 2011.