Rate-Distortion Optimization

Goal

Development of rate-distortion optimized encoder control algorithms for hybrid video coders.

Introduction

The rate-distortion efficiency of today's hybrid video coders (see Fig. 1) is based on a sophisticated interaction between various motion representation possibilities, waveform coding of differences, and waveform coding of various refreshed regions. Hence, a key problem in high-compression video coding is the operational control of the encoder. This problem is compounded by the widely varying content and motion found in typical video sequences, which requires a selection between different representation possibilities with varying rate-distortion efficiency. This research project addresses the problem of video encoder optimization and identifies its consequences for the compression architecture of the overall coding system.

The proposed solutions are based on Lagrangian optimization techniques that try to answer the crucial question: "What part of the video signal should be coded using what method and parameter settings?"

Fig. 1: Basic structure of a hybrid video encoder

Lagrangian Coder Control for Error-free and Error-prone Transmission

The specifications of most video coding standards, including H.262 / MPEG-2 Visual, H.263, MPEG-4 Visual, and H.264 / MPEG-4 AVC, define only the bit-stream syntax and the decoding process in order to enable interoperability. The encoding process is left out of scope to permit flexible implementations. However, the development of an efficient encoder control is a key problem in video coding. This issue became even more important with the new H.264 / MPEG-4 AVC video coding standard, since it provides many more coding options (e.g. 259 possibilities to partition a macroblock for motion-compensated prediction) than older standards.

For the encoding of a video source, a variety of coding parameters such as macroblock modes, sub-macroblock modes, motion vectors, reference picture indices, intra prediction modes, and transform coefficient levels have to be determined. The chosen values determine the rate-distortion efficiency of the produced bit-stream for a given video coding standard. The coding parameters are often predicted from already transmitted parameters of preceding blocks inside the picture and/or coded using conditioned entropy codes. Moreover, motion-compensated prediction introduces a temporal dependency because reference is made to previously decoded pictures. Because of these spatial and temporal dependencies, a global optimization is practically infeasible. Instead, the selection of the coding options is separated into smaller sub-problems, and already determined coding parameters are considered as given. The selection of a coding parameter p proceeds by minimizing a Lagrangian cost measure D(p) + λ·R(p), where D(p) and R(p) are the distortion and the number of bits, respectively, that are associated with the choice p, and λ is the Lagrange multiplier.
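The selection rule described above can be illustrated with a minimal sketch. It assumes that the distortion and rate of every candidate have already been measured by trial encoding; the `Candidate` class, the mode names, and all numbers are hypothetical and serve only to show how the Lagrange multiplier shifts the decision between low distortion and low rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One coding option p with its measured distortion D(p) and rate R(p)."""
    name: str
    distortion: float  # e.g. sum of squared differences (assumed measure)
    rate_bits: float   # number of bits needed to code this option


def lagrangian_cost(candidate: Candidate, lam: float) -> float:
    """Return D(p) + lambda * R(p) for one candidate."""
    return candidate.distortion + lam * candidate.rate_bits


def select_mode(candidates: list[Candidate], lam: float) -> Candidate:
    """Pick the candidate with the smallest Lagrangian cost."""
    return min(candidates, key=lambda c: lagrangian_cost(c, lam))


# Hypothetical measurements for a single macroblock.
candidates = [
    Candidate("SKIP", distortion=900.0, rate_bits=1),
    Candidate("INTER_16x16", distortion=400.0, rate_bits=30),
    Candidate("INTRA_4x4", distortion=150.0, rate_bits=120),
]

for lam in (1.0, 5.0, 20.0):
    best = select_mode(candidates, lam)
    print(f"lambda={lam:5.1f}: {best.name} (cost {lagrangian_cost(best, lam):.1f})")
```

A small multiplier favors the low-distortion but expensive option, while a large multiplier favors the cheap one. In a real encoder the multiplier is typically tied to the quantization parameter; that coupling is not shown here.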

For error-free transmission, the distortion D can be determined directly at the encoder side, while for error-prone transmission, the distortion of the decoded video sequence depends on the transmission errors that actually occur. Consequently, the distortion term D(p) is replaced by the corresponding expectation value E{ D(p) }. In the Lagrangian coder control for error-prone transmission channels, the expected decoder distortion is determined on the basis of several decoder simulations with different error characteristics.
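As a rough illustration of replacing D(p) by E{ D(p) }, the following sketch averages the distortion observed in several simulated decoders, each assumed to have experienced a different error pattern. The data layout, the mode names, and all numbers are hypothetical; how the decoder simulations themselves are run is not shown.

```python
from statistics import fmean


def expected_cost(decoder_distortions, rate_bits, lam):
    """Estimate E{D(p)} as the mean over simulated decoders, then add lambda * R(p)."""
    return fmean(decoder_distortions) + lam * rate_bits


def select_mode(candidates, lam):
    """candidates maps a mode name to (per-decoder distortions, rate in bits)."""
    return min(candidates, key=lambda name: expected_cost(*candidates[name], lam))


# Hypothetical block: in one of four simulated decoders the reference picture
# was damaged, so inter prediction propagates the error and distortion is high.
candidates = {
    "INTER": ([100.0, 100.0, 2000.0, 100.0], 30),
    "INTRA": ([150.0, 150.0, 150.0, 150.0], 120),
}

# Error-free view: only the first (undamaged) decoder is considered.
error_free = {name: ([d[0]], r) for name, (d, r) in candidates.items()}

print(select_mode(error_free, lam=2.0))   # choice when errors are ignored
print(select_mode(candidates, lam=2.0))   # choice with expected distortion
```

With these numbers the error-free decision picks the cheaper inter mode, while the expected-distortion decision switches to intra coding because it stops the error propagation seen in one of the simulated decoders.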

The development of the Lagrangian coder control led to a change of the encoding strategy that is used in standardization. It now represents the recommended technique for H.263, MPEG-4 Visual, and H.264 / MPEG-4 AVC. Moreover, the increased coding efficiency of the highly flexible H.264 / MPEG-4 AVC standard is strongly connected to Lagrangian encoder control techniques.

Comparison of Video Coding Standards

The rate-distortion-optimized encoding strategy not only improves the coding performance relative to older techniques, but also allows a fair comparison between different hybrid video coding standards in terms of coding efficiency. We compared the video coding standards MPEG-2 using the popular MP@ML conformance point, H.263 using the High Latency Profile (HLP) features, MPEG-4 Visual using the Advanced Simple Profile, and H.264 / MPEG-4 AVC using the Main Profile. All coders used only one I-picture at the beginning of a sequence, and two B-pictures were inserted between each pair of successive P-pictures. Full search motion estimation with a range of ±32 samples was used by all encoders along with the Lagrangian coder control. The results for two test sequences, "Foreman" and "Mobile & Calendar", are depicted in Fig. 2 and Fig. 3.

Fig. 2: Comparison of video coding standards for the sequence "Foreman" in QCIF resolution with a frame rate of 10 Hz

Fig. 3: Comparison of video coding standards for the sequence "Mobile & Calendar" in CIF resolution with a frame rate of 30 Hz

Inter-frame Optimization of Transform Coefficient Selection

As an extension of the Lagrangian coder control, a novel strategy for selecting transform coefficient levels was developed that takes into account the inter-picture dependencies introduced by motion-compensated prediction. Based on a linear model of the decoding process and a simple rate model, the problem of an optimized selection of transform coefficient levels can be formulated as a discrete quadratic program (QP). This QP is solved using an iterative algorithm. First simulation results showed that considering future pictures in the selection of transform coefficient levels can significantly improve the coding efficiency.
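The text does not specify the linear model, the rate model, or the iterative algorithm, so the following is only a toy sketch under stated assumptions. It assumes that the decoded samples are modeled as A·u for integer levels u, that the rate is approximated by the sum of the absolute level values, and that a simple coordinate search with steps of ±1 stands in for the iterative solver. The matrix `A`, the target `t`, and the function names are hypothetical.

```python
def cost(A, t, u, lam):
    """Squared error between target t and modeled reconstruction A*u, plus lambda * rate."""
    residual = [t_k - sum(a * x for a, x in zip(row, u)) for row, t_k in zip(A, t)]
    distortion = sum(r * r for r in residual)
    rate = sum(abs(x) for x in u)  # assumed rate model: sum of absolute levels
    return distortion + lam * rate


def optimize_levels(A, t, lam, max_sweeps=50):
    """Coordinate search over integer levels: accept any +/-1 step that lowers the cost."""
    u = [0] * len(A[0])
    best = cost(A, t, u, lam)
    for _ in range(max_sweeps):
        improved = False
        for i in range(len(u)):
            for step in (-1, 1):
                trial = u.copy()
                trial[i] += step
                trial_cost = cost(A, t, trial, lam)
                if trial_cost < best:
                    u, best, improved = trial, trial_cost, True
        if not improved:
            break  # no single +/-1 change helps: local optimum reached
    return u, best


# Rows 1-2: samples of the current picture; row 3: a sample of a future picture
# that depends on both levels through motion-compensated prediction (assumed).
A = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
t = [3.0, 1.0, 2.0]

levels, total = optimize_levels(A, t, lam=0.5)
print(levels, total)
```

The third row of `A` is what distinguishes this from a picture-by-picture decision: a level change in the current picture is also charged for its effect on the future sample. The search only guarantees a local optimum with respect to single ±1 steps.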

Contact

Fraunhofer Institute for Telecommunications
Heinrich-Hertz-Institut
Image Processing
Einsteinufer 37
10587 Berlin
Germany