Image Processing Multi-View Video Coding

Multi-View Coding using MPEG4-AVC/H.264
Goal: To efficiently code multi-view video sequences with standardized MPEG4-AVC/H.264 video coding software.
Coding Approach
3D and free viewpoint video can take various forms, e.g. a synthetic 3D model with images or video sequences as textures, or a pure 2D representation consisting of several video sequences from a multi-camera system. In the latter case the video sequences often share corresponding content, so efficient video coding can be extended from single-view to multi-view coding by exploiting inter-view dependencies in addition to reducing temporal redundancies.
For such multi-view content we have taken MPEG4-AVC/H.264 as the best standard for single-view coding and applied its video coder to multi-view content. First, the multi-view video data is rearranged into one uncompressed bit stream. To benefit from the dependencies in both the temporal and the inter-view direction, a coding pattern is applied that uses hierarchical B pictures in the temporal direction for each view and in the inter-view direction for every second view. This pattern was determined by a statistical analysis. The construction of the coding pattern for a linear camera setting with a Group-of-Pictures (GoP) length of 8 is shown in the animation in Fig. 1.
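The prediction structure described above can be illustrated with a minimal sketch. It is not the coder's actual pattern: it assumes a generic dyadic hierarchical-B layout for the temporal direction and, for the inter-view direction, that every second view is bi-predicted from its two neighbours while the remaining views are chained to the previous even view. The function names `hierarchical_b_order` and `inter_view_references` are hypothetical.

```python
def hierarchical_b_order(gop_size=8):
    """Return (picture_index, level, references) tuples in coding order.

    Assumption: a dyadic hierarchy, so gop_size must be a power of two.
    Picture 0 is the key picture of the previous GoP.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    # Key picture closing the GoP, predicted from the previous key picture.
    order = [(gop_size, 0, (0,))]
    level, step = 1, gop_size
    while step > 1:
        half = step // 2
        # Each B picture sits midway between two already coded pictures.
        for mid in range(half, gop_size, step):
            order.append((mid, level, (mid - half, mid + half)))
        step = half
        level += 1
    return order


def inter_view_references(num_views):
    """Map each view of a linear camera array to its reference views.

    Assumption: even views form a chain (view 0 has no inter-view
    reference), odd views are bi-predicted from both neighbours.
    """
    refs = {}
    for v in range(num_views):
        if v % 2 == 0:
            refs[v] = () if v == 0 else (v - 2,)
        elif v + 1 < num_views:
            refs[v] = (v - 1, v + 1)
        else:
            refs[v] = (v - 1,)  # last view has only one neighbour
    return refs


if __name__ == "__main__":
    for picture, level, references in hierarchical_b_order(8):
        print(f"picture {picture}: level {level}, references {references}")
    print(inter_view_references(5))
```

For a GoP length of 8 this codes the key picture first, then the middle picture, then the quarter positions, and finally the odd pictures, which is why the decoder needs to hold several reference pictures at once.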
Coding patterns are adapted to the camera setting and the GoP length. At the decoder side, the single bit stream is decoded and split into the individual views. The multi-view rearrangement and multiplexing is carried out in a way that minimizes memory usage. The entire multi-view coding technology also remains standard-conformant, i.e. MPEG4-AVC/H.264 can be used for multi-view coding immediately, requiring only a larger decoded picture buffer.
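The rearrangement into one stream and the split back into views can be sketched as follows. The actual interleaving order is not given here, so this example assumes the simplest possible scheme, in which the frames of all views are interleaved time instant by time instant. It makes no attempt to minimize memory usage, and the names `multiplex` and `demultiplex` are hypothetical.

```python
def multiplex(views):
    """Interleave per-view frame lists into one sequence.

    Assumption: all views have the same number of frames and are
    interleaved one time instant at a time (view 0, view 1, ...).
    """
    if len({len(frames) for frames in views}) != 1:
        raise ValueError("all views must have the same number of frames")
    return [frame for instant in zip(*views) for frame in instant]


def demultiplex(stream, num_views):
    """Split a sequence produced by multiplex() back into single views."""
    if num_views < 1 or len(stream) % num_views:
        raise ValueError("stream length must be a multiple of num_views")
    return [stream[v::num_views] for v in range(num_views)]


if __name__ == "__main__":
    views = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"]]
    stream = multiplex(views)
    print(stream)
    print(demultiplex(stream, len(views)))
```

Because the decoder only needs the number of views to undo the interleaving, the single-view coder in between can stay unchanged.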
Coding Results
The capability of our coding approach has been demonstrated in objective as well as subjective tests, where it performed best among all coding approaches that were submitted to the MPEG "Call for Proposals on Multi-view Video Coding". The overall results at 3 rate points for the 8 test sequences, which vary in number of cameras and camera setting, are shown in Fig. 2.
Fig. 2: Overall Subjective MOS results for all submitted proposals
To obtain a finer resolution, the MOS values are shown on a scale from 0 (= disgustingly poor) to 10 (= excellent), although the original values range from 0 to 5.
For an in-depth analysis of the achievable coding gain, our coding approach was additionally compared against anchor simulcast coding with and without hierarchical B pictures, to evaluate the contributions of temporal and inter-view decorrelation to the overall coding gain.
Detailed subjective and objective coding results can be found below:
Publications
Publications in Conference Proceedings
2006
- Philipp Merkle, Karsten Müller, Aljoscha Smolic, and Thomas Wiegand:
Efficient Compression of Multi-View Video Exploiting Inter-View Dependencies Based on H.264/MPEG4-AVC,
IEEE International Conference on Multimedia and Expo (ICME'06), Toronto, Ontario, Canada, July 2006.
2005
- Philipp Merkle, Karsten Müller, Aljoscha Smolic, and Thomas Wiegand:
Statistical Evaluation of Spatio-Temporal Prediction for Multi-View Video Coding,
2nd Workshop on Immersive Communication and Broadcast Systems (ICOB'05), Berlin, Germany, October 2005.
- Karsten Müller, Xenophon Zabulis, Aljoscha Smolic, and Thomas Wiegand:
Multi-View Video Coding Based on H.264/AVC Using Hierarchical B-Frames,
2nd Workshop on Immersive Communication and Broadcast Systems (ICOB'05), Berlin, Germany, October 2005.
