KIVI: AI-based Generation of interactive Volumetric Assets

October 2020 – September 2022


Virtual LiVe


Virtualization of Live Events via Audiovisual Immersion

January 2021 – December 2021



INVICTUS: Innovative Volumetric Capture and Editing Tools for Ubiquitous Storytelling

October 2020 – September 2022



Berliner Digitaler Bahnbetrieb

May 2020 - December 2022



Telepresence for Surgical Assistance and Training using Augmented Reality

January 2020 - December 2020



D4Fly: Detecting Document frauD and iDentity on the fly

September 2019 – August 2022


Research Alliance Cultural Heritage

Phase I: January 2016 – December 2018

Phase II: December 2019 – November 2022

Funded by Fraunhofer Directorate, Prof. Raimund Neugebauer.



Multimodal AR-assisted ENT-Surgery

September 2018 - February 2022



Comprehensive Surgical Landscape Guidance System for Immersive Assistance in Minimally-invasive and Microscopic Interventions

September 2018 - September 2021



eXtended Reality for ALL

December 2019 – June 2021



Digital Tools and Workflow Integration for Building Lifecycles

March 2018 – February 2021



Robust Generic 3D Face Capture for Authentication and Identity Verification

July 2017 – January 2021



SS3D++: Single Sensor 3D ++

January 2020 – December 2020



Personalised Content Creation for the Deaf Community in a Connected Digital Single Market

September 2017 - November 2020



The Berlin Center for Digital Transformation

Phase I: July 2016 – June 2018

Phase II: July 2018 – June 2020



Ergonomics Assistance Systems for contactless Human-Machine-Operation

January 2017 – December 2019



Virtual worlds for digital diagnostics and cognitive rehabilitation

December 2017 – November 2019



Embedding of real persons in Virtual Reality productions using 3D video processing

July 2017 – June 2019



Anomaly detection to prevent attacks on facial recognition systems

June 2016 – May 2019



Demonstrator 3D-LivingLab for the 3Dsensation research alliance

March 2017 – February 2019



Audio-Visual Capturing and Reproduction of Traffic Noise for the Simulation of Noise Mitigation Methods

January 2017 – December 2018



cReative-asset harvEsting PipeLine to Inspire Collective-AuThoring and Experimentation

January 2016 – December 2018



Mobile 3D Capturing and 3D Printing for Industrial Applications

April 2016 – September 2018



Self-research project for medium-sized companies: Even layered Structures for Micro optics of broad imaging Systems

January 2016 – December 2017


Past projects



November 2015 – October 2016

Single Sensor 3D

Research project within 3Dsensation Alliance and supported by the Federal Ministry of Education and Research


3D Technologies for Industrial and Medical Applications

July 2015 – September 2017



January 2015 – June 2016

Advanced tools for 3D quality control

Co-funded by the European Regional Development Fund (EFRE)


January 2015 – June 2016

Co-funded by the European Commission's Horizon 2020 Programme.

The goal of the AutoPost project is to automate major parts of the daily workload in audio-visual post-production, particularly for small and medium-sized post houses, and thereby to make post-production more efficient by reducing time-consuming and costly manual processing.


Gesture interaction and fusion of 3D images

December 2014 – January 2016

Fraunhofer HHI is co-ordinating GestFus, a project focused on developing fundamentals for 3Dsensation R&D projects in the areas of 3D gesture interaction and fusion of 3D images from different sources. 3Dsensation partners in subsequent R&D projects who are new to these topics should benefit from GestFus outcomes and will not need to start "from scratch".

Topics related to "gesture interaction" and "facial expression":

  • Detailed gesture description framework
  • Definition of "basic gestures" useable for various partners and application areas
  • Algorithms and demonstrators for detection of basic gestures and facial expressions (Proof of Concept)

Topics related to "fusion of 3D images" or "Augmented Reality in 3D":

  • Description of psychological foundations of 3D perception
  • Description of technical challenges and possible solutions associated with the fusion of 3D images from different sources (e.g. by image processing)
  • Description of specific problems in 3D Augmented Reality and guidelines in order to avoid these problems

GestFus project consortium

  • Fraunhofer Heinrich-Hertz-Institute (Co-ordinator)
  • Humboldt-Universität zu Berlin
  • Otto-von-Guericke-Universität Magdeburg
  • Charité - Universitätsmedizin Berlin
  • Gesellschaft für Bild- und Signalverarbeitung mbH
  • Carl Zeiss AG

Joint research project within 3Dsensation Alliance and supported by the Federal Ministry of Education and Research.

SCU Leitwarte

SmartCareUnit Leitwarte - Subproject Proxemic Monitor

September 2014 – May 2017


IBRFace (DFG Project)

August 2014 – July 2015

This DFG funded project targets photo-realistic animation of faces using image-based rendering methods. It extends prior work on articulated pose space rendering to non-articulated objects like faces.


January 2014 – December 2016

User interaction aware content generation and distribution for next generation social television

ACTION-TV proposes an innovative mode of user interaction for broadcasting to relax the rigid and passive nature of present broadcasting ecosystems.

Co-funded by the European Commission's Seventh Framework Programme

Mobile 3D Visual Search in 3D Environments

January 2014 – December 2014

This project develops a novel mobile search solution for real 3D objects, such as buildings. Current mobile visual search solutions provide answers based on images of real 3D objects, but they fail when images of different objects are too similar. The novel solution developed in this project captures and uses the 3D geometry and thus produces more accurate results for 3D objects.

Computer Vision & Graphics Group


November 2013 – October 2016

Bridget – BRIDging the Gap for Enhanced broadcast

Co-funded by the European Commission's Seventh Framework Programme

Cluster LCE

January 2013 - December 2015

The aim of the new Fraunhofer innovation cluster LCE is to transfer the concept of Life Cycle Engineering to turbomachines: energy- and resource-efficient technologies shall be provided for all life cycles of turbomachines. The cluster is focussed on engines in the aviation industry and gas turbines in the energy generation industry.


Wavelength-selective image separation for autostereoscopic 3D displays

January 2013 – December 2016


Next Generation ID

January 2013 – December 2015

The Innovation Cluster Next Generation ID complies with the requirements to map, protect and network identities clearly and reliably in the virtual world. In the context of the cluster all relevant partners from industry, academia and government come together in order to realize innovative and industrially useful ID technologies, services and products - from development to implementation.


September 2012 – August 2015

AUVIS – Audio-visual data mining for event segmentation in multi-modal speech data

The project further develops algorithms for automatic audio-visual analysis of huge databases for humanities research. Specific research questions with respect to event segmentation will be tackled and solutions for automatic annotation will be developed. Fraunhofer HHI is responsible for the development of video analysis algorithms that perform automatic analysis of human motion and gestures as well as vision-based semantic annotation.

Funded by the Federal Ministry of Education and Research.


December 2011 – November 2014

RE@CT will introduce a new production methodology to create film-quality interactive characters from 3D video capture of actor performance. The project aims to revolutionise the production of realistic characters and significantly reduce costs by developing an automated process to extract and represent animated characters from actor performance capture in a multiple camera studio. The key innovation is the development of methods for analysis and representation of 3D video to allow reuse for real-time interactive animation. This will enable efficient authoring of interactive characters with video quality appearance and motion.

inEvent - Accessing Dynamic Networked Multimedia Events

November 2011 – October 2014

The main goal of inEvent is to develop new means to structure, retrieve, and share large archives of networked and dynamically changing multimedia recordings, mainly consisting here of meetings, video-conferences, and lectures. Exploiting, and going beyond, the current state of the art in audio, video, and multimedia processing and indexing, the project proposes research and development towards a system that addresses the above problem by breaking multimedia recordings into interconnected "hyper-events" (as opposed to hypertext) consisting of a particular structure of simpler "facets" which are easier to search, retrieve and share. Building and adaptively linking such "hyper-events", as a means to search and link networked multimedia archives, will result in a more efficient search system, in which information can be retrieved based on "insights" and "experiences" (in addition to the usual metadata).

Co-funded by the European Commission's Seventh Framework Programme


November 2011 – October 2014

The goal of the 3DTVS (3DTV Content Search) project is to devise scalable 3DTV AV content description, indexing, search and browsing methods across open platforms, by using mobile and desktop user interfaces, and to incorporate such functionalities in 3D audiovisual content archives. The major novelty in 3DTV content indexing/retrieval research will be on how to exploit 3D (depth) information for stereo and multiview video indexing, retrieval and browsing that could address semantic queries of the form ‘find stereo videos with shallow depth’ or ‘find stereo videos, where actor X approaches actor Y’. 3D multichannel audio analysis will perform better audio source (e.g. musical instrument) separation and localization that will be used in 3D audio/cross-modal indexing and retrieval. Multimodal 3D audiovisual content analysis will build on the results of 3D video and audio analysis. 3DTV content description and search mechanisms will be developed to enable fast reply to semantic queries. User-friendly mobile/desktop search interfaces will be built that could also permit adaptation to the user's needs and profile.

Co-funded by the European Commission's Seventh Framework Programme.

Capture & Display Systems Group


October 2011 – September 2014

To date, convincing AR has only been demonstrated on small mock-ups in controlled spaces; we haven’t yet seen key conditions being met to make AR a booming technology: seamless persistence and pervasiveness. VENTURI addresses such issues, creating a user appropriate, contextually aware AR system, through a seamless integration of core technologies and applications on a state-of-the-art mobile platform. VENTURI will exploit, optimize and extend current and next generation mobile platforms; verifying platform and QoE performance through life-enriching use-cases and applications to ensure device-to-user continuity.


October 2011 – September 2014

SCENE will develop novel scene representations for digital media that go beyond the ability of either sample based (video) or model-based (CGI) methods to create and deliver richer media experiences. The SCENE representation and its associated tools will make it possible to capture 3D video, combine video seamlessly with CGI, manipulate and deliver it to either 2D or 3D platforms in either linear or interactive form.


October 2011 – February 2014

The 3DLife consortium partners are committed to building upon the project’s collaborative activities and establishing a sustainable European Competence Centre, namely Excellence in Media Computing and Communication (EMC²). Within the scope of EMC², 3DLife will promote additional collaborative activities such as an Open Call for Fellowships, a yearly Grand Challenge, and a series of Distinguished Lectures.


September 2011 - May 2015

Reverie is a Large Scale Integrating Project funded by the European Union. The main objective of Reverie is to develop an advanced framework for immersive media capturing, representation, encoding and semi-automated collaborative content production, as well as transmission and adaptation to heterogeneous displays as a key instrument to push social networking towards the next logical step in its evolution: to immersive collaborative environments that support realistic inter-personal communication.


June 2011 – December 2013

FreeFace will develop a system for assisting the visual authentication of persons employing novel security documents which can store 3D representations of the human head. A person passing a security gate will be recorded by multiple cameras and a 3D representation of the person’s head will be created. Based on this representation, different types of queries such as pose and lighting adaptation of either the generated or the stored 3D data will ease manual as well as automatic authentication.

Computer Vision & Graphics Group

Camera deshaking for endoscopic video

June 2011 – February 2011

Endoscopic videokymography is a method for visualizing the motion of the plica vocalis (vocal folds) for medical diagnosis with time slice images from endoscopic video. The diagnostic interpretability of a kymogram deteriorates if camera motion interferes with vocal fold motion, which is hard to avoid in practice. For XION GmbH, a manufacturer of endoscopic systems, we developed an algorithm for compensating strong camera-to-scene motion in endoscopic video. Our approach is robust to low image quality, optimized to work with highly nonrigid scenes, and significantly improves the quality of vocal fold kymograms.

Computer Vision & Graphics Group


February 2010 – July 2013

The FascinatE (Format-Agnostic SCript-based INterAcTive Experience) project will develop a system to allow end-users to interactively view and navigate around an ultra-high resolution video panorama showing a live event, with the accompanying audio automatically changing to match the selected view. The output will be adapted to their particular kind of device, covering anything from a mobile handset to an immersive panoramic display. At the production side, this requires the development of new audio and video capture systems, and scripting systems to control the shot framing options presented to the viewer. Intelligent networks with processing components will be needed to repurpose the content to suit different device types and framing selections, and user terminals supporting innovative interaction methods will be needed to allow viewers to control and display the content.

Co-funded by the European Commission's Seventh Framework Programme


January 2010 – June 2013

3DLife is a research project funded by the European Union, a Network of Excellence (NoE), which aims at integrating research that is currently conducted by leading European research groups in the field of Media Internet. 3DLife's ultimate target is to lay the foundations of a European Competence Centre under the name "Excellence in Media Computing & Communication" or simply EMC². Collaboration is at the core of the 3DLife Network of Excellence.


January 2010 – December 2012

MUSCADE (Multimedia Scalable 3D for Europe) will create major innovations in the fields of production equipment and tools, production, transmission and coding formats allowing technology independent adaptation to any 3D display and transmission of multiview signals while not exceeding double the data rate of monoscopic TV, and robust transmission schemes for 3DTV over all existing and future broadcast channels.

Co-funded by the European Commission's Seventh Framework Programme.

Capture & Display Systems Group

Cloud Rendering

December 2009 – August 2010

In the CloudRendering project, we investigated methods to efficiently encode synthetically produced image sequences in cloud computing environments, enabling interactive 3D graphics applications on computationally weak end devices. One goal was to investigate possibilities to speed up the encoding process by exploiting different levels of parallelism: SIMD, multi-core CPUs/GPUs and multiple connected computers. Additional speedup was achieved by exploiting knowledge from the synthetic nature of the images paired with access to the 3D image generation machinery. The study was performed for Alcatel-Lucent.

Computer Vision & Graphics Group


June 2009 – May 2012

The goal of AVATecH - Advancing Video Audio Technology in Humanities Research - is to investigate and develop technology for semi-automatic annotation of audio and video recordings used in humanities research. Detectors that will be available via interactive annotation tools and also via batch processing can help for example with chunking, tagging, annotation and search.

Joint Max-Planck/Fraunhofer project


June 2009 – July 2011

3D@SAT will review and investigate the potential of future 3D Multiview (MVV) / Free viewpoint Video (FVV) technologies, build up a simulation system for 3D services over satellite and disseminate its findings to relevant fora (e.g. DVB 3D-SM, DVB TM–AVC, ITU-T/VCEG, ISO/MPEG, SMPTE, 3D@Home etc.).

Funded by the European Space Agency (ESA).

Fraunhofer Secure Identity Innovation Cluster

January 2009 – December 2011

The Fraunhofer Secure Identity Innovation Cluster is an alliance of five Fraunhofer Institutes, five universities and 12 private sector companies, supported by the federal states of Berlin and Brandenburg. The aim of this joint research & development project is to deliver technologies, processes and products that enable clear and unambiguous identification of persons, objects and intellectual property both in the real and the virtual world, thus enabling owners and users of identity to have individual control over clearly defined, recognizable identities. HHI is working on the passive 3D capture of faces for security documents of the future.

Computer Vision & Graphics Group

2020 3D Media

March 2008 – February 2012

2020 3D Media will research, develop and demonstrate novel forms of compelling entertainment experiences based on new technologies for the capture, production, networked distribution and display of three-dimensional sound and images.

Co-funded by the European Commission's Seventh Framework Programme.

Capture & Display Systems Group


March 2008 – February 2011

Eight leading firms and research institutes have formed a consortium in order to develop trend-setting techniques and business models for the implementation of 3D media in cinema, TV and video games. Our project, entitled "PRIME: PRoduction- and Projection-Techniques for Immersive MEdia", is funded by the German Federal Ministry for Economy and Technology (BMWi).

Co-funded by the German Federal Ministry for Economy and Technology (BMWi).

Capture & Display Systems Group


February 2008 – January 2011

The 3D4YOU project will develop the key elements of a practical 3D television system, particularly, the definition of a 3D delivery format and guidelines for a 3D content creation process. The project will develop 3D capture techniques, convert captured content for broadcasting and develop 3D coding for delivery via broadcast, i.e. suitable to transmit and make public.

Co-funded by the European Commission's Seventh Framework Programme.

Capture & Display Systems Group


January 2008 – June 2010

The 3DPresence project will implement a multi-party, high-end 3D videoconferencing concept that will tackle the problem of transmitting the feeling of physical presence in real-time to multiple remote locations in a transparent and natural way. In short, 3DPresence researches and implements one of the very first true 3D telepresence systems. In order to realize this objective, 3DPresence will go beyond the current state of the art by emphasizing the transmission, efficient coding and accurate representation of physical presence cues such as multiple user (auto)stereopsis, multi-party eye contact and multi-party gesture-based interaction.

Co-funded by the European Commission's Seventh Framework Programme.

Immersive Media & Communication Group


February 2007 – July 2009

The overall aim of the RUSHES (Retrieval of multimedia Semantic units for enhanced reusability) project is to design, implement, and validate a system for indexing, accessing and delivering raw, unedited audio-visual footage known in the broadcasting industry as "rushes". The viability of this system will be tested by means of trials. The goal is to promote the reuse of such material, and especially its content, in the production of new multimedia assets by offering semantic media search capabilities.

Co-funded by the European Commission's Sixth Framework Programme


Depth enabled workflow for flexible 2D and multiview video production.


In cooperation with Bitfilm, short video clips of robots were created for distribution on mobile phones via MMS. From 2D images of the robots, 3D models are created and animated using MPEG-4 facial animation parameters which are derived from text input. Automatic pan and zoom as well as speech alteration are applied in order to enhance the variability of the video clips.

Computer Vision & Graphics Group

Bundesdruckerei GmbH

In cooperation with the Bundesdruckerei GmbH, we have constructed a multi-view camera array for the synchronous capturing of people from different viewing directions and under varying illumination. Methods for calibration, image enhancements, interpolation, and background substitution have been developed in order to create large databases of faces with calibrated known properties.

Computer Vision & Graphics Group

Deutsche Flugsicherung GmbH

For the Deutsche Flugsicherung GmbH (German Air Traffic Control), we have created an MPEG-4 panorama from the tower of the Berlin Schönefeld airport. The interactive panorama was demonstrated at the Internationale Luftfahrt Ausstellung (ILA) in the context of the presentation of the future Berlin-Brandenburg International airport. For the creation of the virtual environment, image warping for the removal of objects and people and high dynamic range imaging techniques for local contrast adaptation were developed.

Computer Vision & Graphics Group


Games@Large is a European Integrated Project funded under the 6th framework IST programme. The project aims at designing a platform for running interactive rich-content multimedia applications such as games over local networks. Fraunhofer HHI contributes to this project with low-delay video and 3D graphics streaming.

Computer Vision & Graphics Group


In the Text2Video project, we have developed a system for the automatic conversion of SMS messages into video animations. From the written text, speech is synthesized and a 3D head model is synchronously animated. The recipient obtains an MMS message with a short video in which the chosen character reads the text. Either photorealistic images or cartoons can be selected for animation. Camera changes, additional head and eye motion as well as pitch shift enhance the variability of the output. The system is used, for example, by digitalVanity.

Computer Vision & Graphics Group

Virtual Mirror

In cooperation with adidas, a Virtual Mirror has been created that allows users to view themselves in a mirror wearing individually designed shoes. For that purpose, Fraunhofer HHI has developed a system that tracks the 3D motion of the left and right shoes in real time using a single camera. The real shoes are replaced by 3D computer graphics models, giving the user the impression of actually wearing the virtual shoes.


The BMBF-funded project VisionIC aims at the development of an intelligent vision platform including startup applications for the mass market. Within this project, Fraunhofer HHI developed an Advanced Videophone System which enables multiple partners to meet and discuss in a virtual room. Image-based rendering techniques in combination with 3D head model animation allow head pose correction and enhance communication in comparison to traditional video conferencing systems.

Computer Vision & Graphics Group


VISNET II builds on the success and achievements of the VISNET NoE to continue the progress towards achieving the NoE mission of creating a sustainable world force in Networked Audiovisual (AV) Media Technologies. VISNET II is a network of excellence with a clear vision for integration, research and dissemination plans. The research activities within VISNET II will cover 3 major thematic areas related to networked 2D/3D AV systems and home platforms. These are: video coding, audiovisual media processing, and security. VISNET II brings together 12 leading European organisations in the field of Networked Audiovisual Media Technologies. The 12 integrated organisations represent 7 European states spanning across a major part of Europe, thereby promising the efficient dissemination of resulting technological development and exploitation to larger communities.

Computer Vision & Graphics Group


VISNET is a European Network of Excellence funded under the 6th framework programme. Its strategic objectives revolve around its integration, research and dissemination activities. VISNET aims to create a sustainable world force of leading research groups in the field of networked audiovisual (AV) media technologies applied to home platforms. The member institutions have grouped together to set up a network of excellence with a clear vision for integration, research and dissemination plans. The research activities within VISNET will cover several disciplines related to networked AV systems and home platforms.

Computer Vision & Graphics Group