Volumetric video represents a new way of experiencing immersive media. It refers to the process of capturing objects (e.g., people) with multiple cameras so that they can later be viewed from any angle at any point in time. This enables a variety of applications, with particular relevance for Augmented Reality (AR) and Virtual Reality (VR). Volumetric objects are typically stored as point clouds or meshes. Compared to traditional two-dimensional video, volumetric video requires geometric information to be stored in addition to texture (i.e., color information), which results in an enormous amount of data. For example, a raw point cloud consisting of 2.8 million points requires a bandwidth of about 110 gigabits per second at 30 frames per second. Efficient compression is therefore essential for such applications.
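To see where numbers of this magnitude come from, the raw bit rate of an uncompressed point-cloud sequence is simply points × bits per point × frame rate. The per-point layout in the sketch below (32-bit coordinates, 8-bit RGB) is an illustrative assumption, not the exact format behind the figure above: even this minimal layout already yields on the order of 10 Gbps, and richer per-point data (normals, higher-precision coordinates, additional attributes) pushes the rate far higher.

```python
# Estimate the raw bit rate of an uncompressed dynamic point cloud.
# The attribute layout is an illustrative assumption; real capture
# pipelines differ in coordinate precision and stored attributes.

def raw_bitrate_gbps(num_points, bits_per_point, fps):
    """Raw bandwidth in gigabits per second (1 Gbps = 1e9 bit/s)."""
    return num_points * bits_per_point * fps / 1e9

# Assumed layout: three 32-bit float coordinates plus 8-bit RGB color.
xyz_bits = 3 * 32
rgb_bits = 3 * 8
bits_per_point = xyz_bits + rgb_bits  # 120 bits per point

print(raw_bitrate_gbps(2_800_000, bits_per_point, 30))  # 10.08 (Gbps)
```

The formula makes the scaling obvious: doubling the point count or the frame rate doubles the required bandwidth, which is why uncompressed volumetric streams are impractical to transmit.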
The Multimedia Communications group develops tools for the compression, packaging, and multiplexing of mesh-based volumetric video, optimizing the trade-off between compression efficiency and rendered quality. In addition, the group focuses on implementing volumetric video players for mobile AR and VR applications as well as cloud-based streaming solutions.
Content creation and compression
Figure 1 illustrates content preparation for mesh-based volumetric video. Each recorded object is typically represented by three media sources: a two-dimensional video that provides the texture for each frame, a sequence of three-dimensional meshes that describe the shape of the volumetric object, and an audio track. Each of these sources is compressed with its corresponding encoder, and the resulting bitstreams are then synchronously multiplexed into a single MP4 file.