In modern video coding standards such as AVC and HEVC, uniform scalar quantization is applied to the transform coefficients of the prediction residual. It is well known, however, that for a fixed region in a high-dimensional real vector space and a fixed number of reconstruction points, vector quantization yields a smaller average reconstruction error than scalar quantization. Vector quantization can therefore achieve a better rate-distortion performance than scalar quantization. To exploit this benefit for video compression, we designed a transform-coding scheme that uses a special form of trellis-coded quantization (TCQ), a low-complexity variant of vector quantization. For this scheme, we adapted the entropy coding of the quantized transform coefficients to the underlying TCQ. Our method was adopted into the current working draft of the Versatile Video Coding (VVC) standard.
In our realization of TCQ, two quantizers $Q_0$ and $Q_1$ are used, both of which contain zero and are symmetric. For each transform coefficient, the quantizer is selected by the current state, and the transitions between states are governed by a state machine. The quantizers and the state machine are depicted in the figure below.
At the decoder, the transform coefficients are reconstructed in a predefined order, with the initial state set to $s_0$. The quantizer chosen for the current transform coefficient therefore depends on the quantization indices of all preceding transform coefficients. This is the central difference between our quantization method and scalar quantization.
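The decoder-side reconstruction described above can be sketched as follows. Since the figure is not reproduced here, the details are assumptions: the transition table follows the four-state design used for dependent quantization in VVC (next state chosen by the parity of the quantization index), $Q_0$ is taken to contain the even integer multiples of the step size `delta`, and $Q_1$ the odd integer multiples plus zero. The function and variable names are illustrative only.

```python
# Assumed 4-state machine: next state = STATE_TRANSITION[state][parity of index].
STATE_TRANSITION = ((0, 2), (2, 0), (1, 3), (3, 1))


def sign(k):
    """Return -1, 0 or 1 depending on the sign of k."""
    return (k > 0) - (k < 0)


def reconstruct(indices, delta, state=0):
    """Map quantization indices (in coding order) to reconstruction values."""
    values = []
    for k in indices:
        if state < 2:
            # States 0 and 1 select Q0 (assumed: even multiples of delta).
            values.append(2 * k * delta)
        else:
            # States 2 and 3 select Q1 (assumed: odd multiples of delta, and zero).
            values.append((2 * k - sign(k)) * delta)
        # The parity of the current index drives the state transition.
        state = STATE_TRANSITION[state][k & 1]
    return values


print(reconstruct([1, 0, -2, 1], delta=1.0))
```

Note that the same index can map to different reconstruction values depending on the state: in the call above, the first index 1 is reconstructed with $Q_0$ and the last index 1 with $Q_1$.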
Because of this dependency between the quantization of different coefficients, an encoder must quantize the transform coefficients jointly in order to obtain an optimal result. To this end, the four possible states per coefficient and the transitions between them are represented by a trellis, as shown in the figure below. The possible choices of quantizers for all coefficients correspond to the paths through this trellis. Using the Viterbi algorithm, the path that minimizes the rate-distortion cost can be determined efficiently.
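A minimal sketch of such a trellis search is given below. It is not the encoder used in our implementation: the transition table and the two quantizers are the same assumptions as in VVC-style dependent quantization ($Q_0$: even multiples of the step size, $Q_1$: odd multiples plus zero), `rate` is a purely hypothetical bit estimate standing in for the entropy coder's model, and only three candidate indices per coefficient and state are examined.

```python
STATE_TRANSITION = ((0, 2), (2, 0), (1, 3), (3, 1))  # assumed 4-state machine


def recon(k, state, delta):
    """Reconstruction value of index k under the quantizer selected by state."""
    if state < 2:
        return 2 * k * delta                         # Q0 (assumed)
    return (2 * k - ((k > 0) - (k < 0))) * delta     # Q1 (assumed)


def rate(k):
    """Hypothetical rate estimate in bits, not a real entropy-coder model."""
    return 1.0 + 2.0 * abs(k).bit_length()


def viterbi_quantize(coeffs, delta, lam):
    """Return (indices, cost) minimizing distortion + lam * rate over the trellis."""
    inf = float("inf")
    cost = [0.0, inf, inf, inf]      # only the initial state s_0 is reachable
    paths = [[], [], [], []]         # best index sequence ending in each state
    for t in coeffs:
        new_cost = [inf] * 4
        new_paths = [None] * 4
        center = round(t / (2 * delta))
        for s in range(4):
            if cost[s] == inf:
                continue
            # Three neighboring indices cover both parities, hence both successors.
            for k in (center - 1, center, center + 1):
                c = cost[s] + (t - recon(k, s, delta)) ** 2 + lam * rate(k)
                nxt = STATE_TRANSITION[s][k & 1]
                if c < new_cost[nxt]:
                    new_cost[nxt] = c
                    new_paths[nxt] = paths[s] + [k]
        cost, paths = new_cost, new_paths
    best = min(range(4), key=lambda s: cost[s])
    return paths[best], cost[best]


indices, total_cost = viterbi_quantize([2.0, 0.0, -4.0, 1.0], delta=1.0, lam=0.0)
print(indices, total_cost)
```

Keeping only the best path per state at each coefficient is what makes the search linear in the number of coefficients, instead of exponential as for an exhaustive search over all index sequences.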
For the entropy coding of the quantization indices of transform coefficients, we designed an approach that is similar to the HEVC transform coefficient coding but includes additional improvements as well as adjustments for TCQ. The quantization indices are binarized as in the figure below.
Since the first non-zero reconstruction values of the two quantizers $Q_0$ and $Q_1$ have different distances from 0, two different sets of context models are used for coding the bins $sig$ and $gt1$. For such context modelling to be possible, the quantizer must be known when a given absolute value is parsed. The quantizer, however, depends on the absolute values of all previous quantization indices in coding order. In contrast to previous methods, all bins specifying the absolute value of a quantization index therefore need to be decoded in a single pass in our scheme. This single-pass decoding is further exploited by selecting the context models for the current quantization index depending on the already decoded absolute values in its local neighborhood, as illustrated in the figure above.
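The interplay between the state machine and the context selection can be illustrated with the following sketch. Everything specific in it is an assumption made for illustration: the transition table, the shape of the local template, the mapping from the neighborhood sum to a context index, and the callback `read_abs_level`, which stands in for the arithmetic decoding of all bins of one absolute value.

```python
STATE_TRANSITION = ((0, 2), (2, 0), (1, 3), (3, 1))  # assumed 4-state machine

# Hypothetical local template: offsets (dx, dy) to already decoded neighbors,
# assuming that coding proceeds from high to low frequency positions.
TEMPLATE = ((1, 0), (2, 0), (0, 1), (0, 2), (1, 1))


def sig_context(abs_levels, x, y, state):
    """Return (context_set, context_index) for the sig bin at position (x, y)."""
    total = sum(abs_levels.get((x + dx, y + dy), 0) for dx, dy in TEMPLATE)
    context_set = state >> 1                    # 0 for Q0 states, 1 for Q1 states
    context_index = min((total + 1) >> 1, 3)    # hypothetical mapping
    return context_set, context_index


def parse_block(positions, read_abs_level):
    """Parse absolute levels in one pass, updating the state after each one."""
    abs_levels = {}
    state = 0
    trace = []
    for x, y in positions:
        ctx = sig_context(abs_levels, x, y, state)
        level = read_abs_level(ctx)     # complete absolute value of this index
        abs_levels[(x, y)] = level
        # The next state, and hence the next context set, needs the full level.
        state = STATE_TRANSITION[state][level & 1]
        trace.append((ctx, level))
    return trace


levels = iter([1, 0, 2, 1])
trace = parse_block([(1, 1), (0, 1), (1, 0), (0, 0)], lambda ctx: next(levels))
print(trace)
```

Because the state update requires the parity of the complete absolute value, the context set for the next position cannot be determined before all bins of the current absolute value have been decoded, which is why a multi-pass bin ordering does not fit this scheme.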
- H. Kirchhofer, C. Rudat, M. Schäfer, J. Pfaff, H. Schwarz, D. Marpe, and T. Wiegand, A Study on Data-Driven Probability Estimator Design for Video Coding.
- H. Schwarz, T. Nguyen, D. Marpe, and T. Wiegand, Hybrid Video Coding with Trellis-Coded Quantization, in Proc. IEEE Data Compress. Conf. (DCC), Snowbird, UT, USA, Mar. 2019, pp. 182-191.
- H. Schwarz, T. Nguyen, D. Marpe, T. Wiegand, M. Karczewicz, M. Coban, and J. Dong, Improved Quantization and Transform Coefficient Coding for the Emerging Versatile Video Coding (VVC) Standard, in 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 2019, pp. 1183-1187.