Searching large video collections usually relies on keywords such as “beach”, “flower” or “landscape”. Manually assigning such keywords to videos is, however, very laborious and inaccurate. The Fraunhofer Heinrich Hertz Institute HHI has developed technologies that automatically analyze video content and assign keywords to videos.

Technical Background

The approach for automatically assigning keywords to videos is divided into three steps:

  • Extraction of metadata
  • Training
  • Classification

In the first step, features suitable for describing video content (e.g. edges, color and geometric shapes) are automatically extracted from video frames. For each category (e.g. “beach”) to be learned by the system, a training set of video frames is required that contains positive and negative samples of the category.
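To make the idea of a frame feature concrete, the following is a minimal sketch of one possible color feature: a coarse, normalized RGB histogram. It is not the HHI implementation. The frame representation (a flat list of RGB tuples), the function name `color_histogram` and the parameter `bins_per_channel` are assumptions made for illustration only.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255
Frame = List[Pixel]           # hypothetical frame representation: flat list of pixels


def color_histogram(frame: Frame, bins_per_channel: int = 4) -> List[float]:
    """Return a normalized joint RGB histogram usable as a feature vector."""
    if not frame:
        raise ValueError("frame must contain at least one pixel")
    bin_width = 256 // bins_per_channel
    histogram = [0.0] * (bins_per_channel ** 3)
    for r, g, b in frame:
        # Map each channel value to its bin; clamp so 255 stays in the last bin.
        r_bin = min(r // bin_width, bins_per_channel - 1)
        g_bin = min(g // bin_width, bins_per_channel - 1)
        b_bin = min(b // bin_width, bins_per_channel - 1)
        # Combine the three bin numbers into a single histogram index.
        index = (r_bin * bins_per_channel + g_bin) * bins_per_channel + b_bin
        histogram[index] += 1.0
    # Normalize so that frames of different sizes are comparable.
    total = float(len(frame))
    return [count / total for count in histogram]


# Toy frame: three sand-colored pixels and one sky-blue pixel.
toy_frame = [(230, 210, 160)] * 3 + [(40, 120, 220)]
features = color_histogram(toy_frame)
```

With the default of 4 bins per channel, each frame is described by a 64-dimensional vector whose entries sum to 1. A real system would combine several such descriptors, for example edge and shape features alongside color.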

In the second step, the system is trained with the features extracted from the training set so that it can distinguish the learned category from other categories. In the third step, the trained system analyzes the features of previously “unseen” video frames to determine whether or not they belong to the learned category.
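The training and classification steps can be illustrated with a deliberately simple stand-in. The source does not name the learning method used, so the nearest-centroid rule below is an assumption chosen for brevity, not the HHI classifier. The function names `train` and `belongs_to_category` and the two-dimensional toy feature vectors are likewise hypothetical.

```python
from typing import List, Sequence, Tuple

FeatureVector = Sequence[float]
Model = Tuple[List[float], List[float]]  # (positive centroid, negative centroid)


def mean_vector(vectors: Sequence[FeatureVector]) -> List[float]:
    """Component-wise mean of a non-empty set of equally sized vectors."""
    if not vectors:
        raise ValueError("at least one sample is required")
    dimension = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dimension)]


def squared_distance(a: FeatureVector, b: FeatureVector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(positives: Sequence[FeatureVector],
          negatives: Sequence[FeatureVector]) -> Model:
    """Training: summarize positive and negative samples by their centroids."""
    return mean_vector(positives), mean_vector(negatives)


def belongs_to_category(model: Model, features: FeatureVector) -> bool:
    """Classification: True if the frame is closer to the positive centroid."""
    positive_centroid, negative_centroid = model
    return (squared_distance(features, positive_centroid)
            < squared_distance(features, negative_centroid))


# Toy training set for one category, using made-up 2-D feature vectors.
beach_samples = [[0.8, 0.2], [0.7, 0.3]]
other_samples = [[0.1, 0.9], [0.2, 0.8]]
model = train(beach_samples, other_samples)

# Classify a previously unseen frame by its features.
is_beach = belongs_to_category(model, [0.75, 0.25])
```

One model of this kind would be trained per category, so a frame can receive several keywords. In practice a more capable classifier would likely replace the centroid rule, while the overall flow of training on labeled features and then testing unseen frames stays the same.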


  • Augmenting broadcaster production and archiving workflows in a semi-automatic process by aiding archivists/documentalists in the generation of video metadata
  • Automatic video annotation for efficient and reliable archiving
  • Semantic video search and retrieval