Big Data Analytics

Databases of text, video and audio signals, often with associated metadata, are growing rapidly and now hold many petabytes of data. Processing and fusing these heterogeneous data sets efficiently is a major challenge. Unfortunately, popular fusion methods such as multiple kernel learning scale rather poorly. GPU-implemented deep neural networks and approximate techniques such as locality-sensitive hashing are promising, efficient alternatives for a variety of tasks.
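To illustrate why locality-sensitive hashing helps at this scale, the following is a minimal sketch of one common variant, random-hyperplane hashing for cosine similarity. It is an illustration under stated assumptions and not the method used in our work: the function names (`make_hyperplanes`, `signature`, `build_index`, `candidates`) and the parameters are hypothetical, and a practical system would use several hash tables and a vectorized or GPU implementation.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (Gaussian normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, planes):
    """Hash a vector to a bit tuple: one bit per hyperplane, set by the side it lies on."""
    return tuple(
        int(sum(p * x for p, x in zip(plane, vec)) >= 0.0) for plane in planes
    )


def build_index(vectors, planes):
    """Group vector indices into buckets keyed by their bit signature."""
    buckets = {}
    for i, vec in enumerate(vectors):
        buckets.setdefault(signature(vec, planes), []).append(i)
    return buckets


def candidates(query, planes, buckets):
    """Return indices sharing the query's bucket (approximate neighbours only)."""
    return buckets.get(signature(query, planes), [])


if __name__ == "__main__":
    # Hypothetical toy data: 1000 random 16-dimensional vectors.
    rng = random.Random(42)
    data = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(1000)]
    planes = make_hyperplanes(dim=16, n_bits=8)
    index = build_index(data, planes)
    hits = candidates(data[0], planes, index)
    print(f"{len(hits)} candidates out of {len(data)} vectors")
```

Vectors separated by a small angle are likely to fall on the same side of most hyperplanes and therefore to share a bucket, so a query is compared only against its bucket rather than the whole database. The price is approximation: some true neighbours can land in a different bucket and be missed.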

Our research focuses on developing scalable machine learning algorithms for tasks such as classification, regression, outlier detection and data visualization. Furthermore, we investigate the use of deep learning for multi-modal fusion of large data sets.


  1. S. Dähne, F. Bießmann, W. Samek, S. Haufe, D. Goltz, C. Gundlach, A. Villringer, S. Fazli, and K.-R. Müller, “Multivariate Machine Learning Methods for Fusing Multimodal Functional Neuroimaging Data”, Proceedings of the IEEE, 2015, in press.
  2. S. Fazli, S. Dähne, W. Samek, F. Bießmann, and K.-R. Müller, “Learning from more than one Data Source: Data Fusion Techniques for Sensorimotor Rhythm-based Brain-Computer Interfaces”, Proceedings of the IEEE, 103(6):891-906, June 2015.