Perceptually Optimized Video Coding

The efficiency of a video or image coding algorithm can be measured by the quality it achieves at a given maximum bit rate. While the bit rate is precisely defined mathematically, the definition of image quality remains ambiguous. Because it is simple and mathematically tractable, the mean squared error (MSE) between the original and the degraded image is the most widely used distortion measure in video coding applications. However, it is well known that MSE poorly reflects the subjective quality perceived by a human observer.
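To make this limitation concrete, the following minimal sketch computes the MSE and the derived peak signal-to-noise ratio (PSNR) for a tiny synthetic 8-bit image. The 4x4 test patterns and the function names are illustrative assumptions and are not taken from any particular codec. Both distortions yield the same MSE, although one spreads a small error over every pixel and the other concentrates it in a single pixel, a difference the measure cannot express.

```python
import math


def mse(reference, degraded):
    """Mean squared error between two equally sized grayscale images (lists of rows)."""
    ref_pixels = [p for row in reference for p in row]
    deg_pixels = [p for row in degraded for p in row]
    if not ref_pixels or len(ref_pixels) != len(deg_pixels):
        raise ValueError("images must be non-empty and have the same size")
    return sum((r - d) ** 2 for r, d in zip(ref_pixels, deg_pixels)) / len(ref_pixels)


def psnr(reference, degraded, peak=255):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit samples."""
    error = mse(reference, degraded)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


# Hypothetical test data: a flat gray 4x4 image and two degraded versions.
reference = [[100] * 4 for _ in range(4)]
uniform_shift = [[102] * 4 for _ in range(4)]   # every pixel off by 2
single_spike = [row[:] for row in reference]
single_spike[0][0] = 108                        # one pixel off by 8

mse_shift = mse(reference, uniform_shift)
mse_spike = mse(reference, single_spike)
print(mse_shift, mse_spike)                     # both are 4.0
print(round(psnr(reference, uniform_shift), 2))
```

Whether a viewer would judge the two degraded images as equally good is a perceptual question that neither MSE nor PSNR can answer.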

A distortion model that correlates more closely with the human visual system (HVS) has the potential to achieve significant video coding performance gains. This can be observed in the example images above, which are extracted from two encoded pictures with comparable bit rates. One is encoded with a fixed quantization parameter (QP) (left column), the other with an HVS-model-based QP adaptation (right column). This adaptation shifts coding bits from perceptually noncritical regions (bottom-right) to more critical regions (top-right).
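The idea of shifting bits between regions can be sketched with a generic variance-based heuristic. This is not the HVS model referred to in the text. It assumes, as many adaptive quantization schemes do, that distortion is more visible in smooth blocks than in highly textured ones. The function names, the `strength` parameter, and the test blocks are hypothetical. The QP range of 0 to 51 is the range HEVC uses for 8-bit video.

```python
import math


def block_variance(block):
    """Sample variance of the pixel values in one block (list of rows)."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def adapt_qp(blocks, base_qp, strength=0.5, qp_min=0, qp_max=51):
    """Return one QP per block, offset from base_qp by relative block activity.

    Blocks with above-average activity (texture) get a higher QP and thus
    fewer bits; blocks with below-average activity get a lower QP.
    """
    activity = [math.log2(block_variance(b) + 1.0) for b in blocks]
    mean_activity = sum(activity) / len(activity)
    qps = []
    for a in activity:
        offset = round(strength * (a - mean_activity))
        qps.append(max(qp_min, min(qp_max, base_qp + offset)))
    return qps


# Hypothetical 4x4 blocks: one perfectly smooth, one high-contrast checkerboard.
smooth = [[128] * 4 for _ in range(4)]
textured = [[0, 255, 0, 255], [255, 0, 255, 0]] * 2

qps = adapt_qp([smooth, textured], base_qp=32)
print(qps)  # the smooth block gets the lower QP, the textured block the higher one
```

Because the offsets are centred on the mean activity, the average QP stays close to `base_qp`. That keeps the bit rate roughly comparable to the fixed-QP case but does not guarantee it; a real encoder would rely on rate control for that.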

In the field of perceptual video coding, our research activities include evaluating existing, more advanced distortion measures and developing new quality models for video coding. As the example above shows, first models of this kind have already been developed; they improve the perceived quality by adapting quantization parameters to image statistics. The Fraunhofer HHI HEVC 4K real-time software encoder makes use of these models.

Finding and testing quality models requires assessing the actual perceived visual quality. This assessment is still largely a manual process. The most common approach to quantifying subjective distortion remains the psychophysical judgment experiment, in which a human observer is presented with a stimulus and gives an overt response.

Typically, the subject has to rank the quality of a set of test videos or images. These subjective tests are widely used in practice and deliver quality assessments for video signals when averaged over many subjects. They share the drawback that ratings are highly variable across subjects and prone to being affected by subjective factors (e.g., bias, expectations, strategies).
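Averaging over subjects and quantifying their disagreement can be sketched as follows. The sketch assumes ratings on a five-point scale, which the text does not specify, and uses a normal approximation for the 95% confidence interval. The function name and the ratings are made up for illustration.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (mean rating, half-width of an approximate 95% confidence interval)."""
    if len(ratings) < 2:
        raise ValueError("at least two ratings are needed to estimate the spread")
    mos = statistics.mean(ratings)
    # Normal approximation; for small panels a Student-t quantile would be wider.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings of one test clip by eight subjects (1 = bad, 5 = excellent).
ratings = [4, 4, 5, 3, 4, 4, 2, 5]
mos, half_width = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {half_width:.2f}")
```

A wide interval relative to the rating scale indicates that more subjects are needed before two test conditions can be distinguished reliably.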

We carry out such subjective tests in our vision lab and also use crowdsourcing for experiments that are typically bound to the lab. Our research also focuses on evaluating neural correlates of visual impairments measured by electroencephalography (EEG). Here, we investigate and make use of psychophysiological responses of the human brain to perceived visual distortions introduced by video compression algorithms, to degradations in 3D media representations, and to artifacts in computer graphics. The graphics below show steady-state visual evoked potential (SSVEP) EEG responses to stimuli with low (left) and high (right) degradation levels, respectively.
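An SSVEP appears as a spectral peak at the frequency with which the stimulus is presented, so one simple way to quantify such a response is to measure the signal amplitude at that frequency. The sketch below does this with a single-bin discrete Fourier transform on a synthetic signal. The sampling rate, stimulation frequency, and amplitudes are assumed values chosen for illustration; the signal is not real EEG data, and this is not necessarily the analysis behind the graphics.

```python
import math


def amplitude_at(signal, sample_rate, frequency):
    """Amplitude of one frequency component, estimated with a single-bin DFT."""
    n = len(signal)
    real = sum(x * math.cos(2 * math.pi * frequency * i / sample_rate)
               for i, x in enumerate(signal))
    imag = sum(x * math.sin(2 * math.pi * frequency * i / sample_rate)
               for i, x in enumerate(signal))
    return 2 * math.hypot(real, imag) / n


# Assumed recording parameters for the synthetic example.
SAMPLE_RATE = 250      # samples per second
STIMULUS_HZ = 10       # hypothetical stimulation frequency
DURATION_S = 2         # whole number of cycles for every component below

signal = []
for i in range(SAMPLE_RATE * DURATION_S):
    t = i / SAMPLE_RATE
    response = 1.5 * math.sin(2 * math.pi * STIMULUS_HZ * t)   # simulated SSVEP
    background = 0.3 * math.sin(2 * math.pi * 23 * t)          # unrelated activity
    signal.append(response + background)

at_stimulus = amplitude_at(signal, SAMPLE_RATE, STIMULUS_HZ)
off_stimulus = amplitude_at(signal, SAMPLE_RATE, 17)
print(round(at_stimulus, 3), round(off_stimulus, 3))
```

Comparing this amplitude across stimuli with different degradation levels gives one number per condition. Real recordings would additionally need artifact rejection and averaging over trials and electrodes.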