Image Processing Pose & Gesture Analysis

Pose and gesture analysis plays an important role in applications such as human-machine interaction, behaviour analysis, video surveillance, annotation, search and retrieval, motion capture for the entertainment industry and interactive web-based applications. We have been researching and developing real-time video analysis algorithms for many years, focusing mainly on hand and head tracking and gesture analysis. The main features used for tracking and segmenting the hands and head are skin colour and motion. Combining them provides robust and temporally stable recognition results in relatively unconstrained scenarios. The technology has been developed for the following two application scenarios.
Annotation of human motion in large video corpora
In humanities research fields such as psycholinguistics or neuropsychology, video recordings of interview sessions are analysed as research material. Current practice is to annotate the video content manually in order to develop or validate theoretical studies. Given the tremendous amount of video material and the time required for manual annotation, the need for automatic video analysis is obvious. Our pose and gesture algorithms allow fast, robust and automatic annotation of human motion and behaviour for a large variety of scenarios. The development is part of the joint Max-Planck/Fraunhofer project AVATecH (www.mpi.nl/avatech).
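As an illustration of what automatic annotation can produce, the following minimal sketch turns per-frame detector output into time intervals that an annotation tool could import. It is an assumption for illustration, not the AVATecH implementation: the function name `frames_to_segments`, the boolean per-frame input and the `min_frames` filter are all hypothetical.

```python
from typing import List, Tuple


def frames_to_segments(active: List[bool], fps: float,
                       min_frames: int = 1) -> List[Tuple[int, int]]:
    """Group runs of consecutive active frames into (start_ms, end_ms) intervals.

    `active[i]` is True when a (hypothetical) detector reports motion in frame i.
    Runs shorter than `min_frames` are dropped as likely noise.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    # A trailing False acts as a sentinel that closes a run ending on the last frame.
    for i, flag in enumerate(list(active) + [False]):
        if flag and start is None:
            start = i                      # a new run begins at frame i
        elif not flag and start is not None:
            if i - start >= min_frames:    # keep only sufficiently long runs
                start_ms = round(start * 1000 / fps)
                end_ms = round(i * 1000 / fps)
                segments.append((start_ms, end_ms))
            start = None
    return segments


if __name__ == "__main__":
    flags = [False, True, True, True, False, False, True, False]
    print(frames_to_segments(flags, fps=25, min_frames=2))
```

Each interval could then be written to a tier of an annotation file. Raising `min_frames` trades recall for fewer spurious one-frame annotations.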
Avatar animation
Real-time video analysis of human motion can also enable new forms of video communication based on an artificial human, a so-called avatar, that is animated by the live motion and speech of the operator. The communication is thus enhanced by visual cues without transmitting live video, which is often undesirable for privacy reasons. Based on the stable and robust hand and head tracking, the 2D positions of the hands and the head rotation are converted to body animation parameters (BAP) as defined in the MPEG-4 Part 2 (Visual) standard. The resulting animation parameters are sent to the receiving side at very low bandwidth compared to full video transmission. In the figure below, the animated avatar (left) and the related input video (right) are shown.
Because the skin-colour-based segmentation is of high quality, the system also provides additional features such as gesture recognition. A set of 10 gestures from American Sign Language is currently recognized and immediately shown by the avatar. The complete system is designed for user friendliness and usability: no specific calibration or initialization is required, and the user can behave normally without any restrictions.
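The combination of skin colour and motion as segmentation cues can be sketched as follows. This is a minimal illustration under stated assumptions, not the method used in the system: the fixed RGB rule in `is_skin` is a commonly used rule of thumb (the actual skin-colour estimation is adaptive), motion is approximated by simple frame differencing, and all names and thresholds are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Frame = List[List[Pixel]]      # rows of pixels


def is_skin(p: Pixel) -> bool:
    """Fixed RGB rule of thumb for skin colour (an assumption, not the authors' model)."""
    r, g, b = p
    return (r > 95 and g > 40 and b > 20
            and max(p) - min(p) > 15
            and abs(r - g) > 15 and r > g and r > b)


def has_motion(prev: Pixel, curr: Pixel, threshold: int = 30) -> bool:
    """Frame differencing: summed absolute channel change above a threshold."""
    return sum(abs(a - b) for a, b in zip(prev, curr)) > threshold


def skin_motion_mask(prev: Frame, curr: Frame, threshold: int = 30) -> List[List[bool]]:
    """Mark pixels that are skin-coloured in `curr` AND changed since `prev`."""
    return [[is_skin(c) and has_motion(p, c, threshold)
             for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]


def centroid(mask: List[List[bool]]) -> Optional[Tuple[float, float]]:
    """Mean (x, y) of marked pixels, a crude 2D position estimate; None if empty."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, marked in enumerate(row) if marked]
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


if __name__ == "__main__":
    bg, skin = (30, 30, 30), (200, 120, 90)
    prev = [[bg, bg], [bg, bg]]
    curr = [[bg, skin], [bg, bg]]      # a skin-coloured pixel appears at x=1, y=0
    print(centroid(skin_motion_mask(prev, curr)))
```

Requiring both cues suppresses static skin-coloured background and moving non-skin objects. A real tracker would add temporal filtering and region analysis on top of such a mask.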
Competencies in Pose and Gesture Analysis
- Robust hand and head tracking for unconstrained scenarios and multiple persons
- Real-time gesture recognition of finger gestures and dynamic hand and arm movement
- Recognition of semantic features such as eye blinking, face and body expressions
Publications
S. Masneri, O. Schreer, A new Skin Colour Estimation Method based on Change Detection and Cluster Analysis, Proc. of 12th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS), Delft, Netherlands, April 13-15, 2011.
P. Wittenburg, E. Auer, H. Sloetjes, O. Schreer, S. Masneri, D. Schneider, and S. Tschöpel, Automatic Annotation of Media Field Recordings, ECAI 2010 Workshop on: Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH 2010), Lisbon, Portugal, August 16, 2010.
S. Masneri, O. Schreer, D. Schneider, S. Tschöpel, R. Bardeli, S. Bordag, E. Auer, H. Sloetjes, and P. Wittenburg, Towards Semi-Automatic Annotations for Video and Audio Corpora, 4th Workshop on Representation and Processing of Sign Languages: Corpora and Sign Language Technologies in conjunction with 7th Int. Conf. on Language Resources and Evaluation (LREC), Malta, June 22, 2010.
E. Auer, A. Russel, H. Sloetjes, P. Wittenburg, O. Schreer, S. Masneri, D. Schneider, and S. Tschöpel, ELAN as Flexible Annotation Framework for Sound and Image Processing Detectors, Proc. of 7th International Conference on Language Resources and Evaluation (LREC), Malta, June 19-21, 2010.
O. Schreer, P. Eisert, P. Kauff, R. Tanger, R. Englert, Towards Robust Intuitive Vision-Based User Interfaces, Proc. of Int. Conf. on Multimedia and Expo (ICME 2006), Toronto, Canada, July 2006.
O. Schreer, R. Tanger, P. Eisert, P. Kauff, B. Kaspar, R. Englert, Real-Time Avatar Animation Steered by Live Body Motion, Proc. of 13th Int. Conf. on Image Analysis and Processing (ICIAP 2005), pp.147-154, Cagliari, Italy, September 2005.
