Nature Machine Intelligence publishes Fraunhofer HHI study on SemanticLens

On August 14, 2025, the renowned journal Nature Machine Intelligence will publish an article by the Fraunhofer Heinrich Hertz Institute (HHI) presenting the newly developed SemanticLens method, a significant breakthrough in the explainability and validation of large AI models. The paper was written by a team led by Dr. Sebastian Lapuschkin, head of the Explainable Artificial Intelligence research group, and Prof. Dr. rer. nat. Wojciech Samek, head of the Artificial Intelligence department at Fraunhofer HHI.


SemanticLens: Transparency for neural networks

SemanticLens aims to decode large AI models, i.e., to understand the semantics and function of each individual neuron in the model. To do this, it transfers the knowledge hidden in individual model components into the semantically structured, multimodal space of a foundation model such as CLIP (Contrastive Language-Image Pre-Training). The hidden knowledge of a neuron (i.e., its concept) is represented using examples, namely images from the training or test dataset. Specifically, it uses example images that show the concept (e.g., a red nose, dog ears, car tires). This allows individual components of the AI model, i.e., its neurons, to be described by their content, searched for specifically, and linked to relevant training data and model predictions.
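The sketch below illustrates this core idea in simplified form: each neuron is represented by the CLIP embeddings of a few example images that activate it, and those concept vectors can then be queried with free text. The model choice, the neuron-to-example mapping, and all names below are illustrative assumptions, not the authors' reference implementation.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Minimal sketch: represent each neuron by the mean CLIP embedding of its
# concept example images, then search those representations with text.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed_images(paths):
    """Return L2-normalised CLIP image embeddings for a list of image files."""
    images = [Image.open(p).convert("RGB") for p in paths]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)

# Hypothetical mapping: neuron id -> example images that strongly activate it.
neuron_examples = {
    0: ["examples/neuron0_img1.png", "examples/neuron0_img2.png"],
    1: ["examples/neuron1_img1.png", "examples/neuron1_img2.png"],
}

# One semantic vector per neuron: the mean of its example embeddings.
neuron_vecs = torch.stack(
    [embed_images(paths).mean(dim=0) for paths in neuron_examples.values()]
)
neuron_vecs = neuron_vecs / neuron_vecs.norm(dim=-1, keepdim=True)

def search_neurons(query: str, top_k: int = 5):
    """Rank neurons by cosine similarity between a text query and their concept vectors."""
    inputs = processor(text=[query], return_tensors="pt", padding=True)
    with torch.no_grad():
        text_vec = model.get_text_features(**inputs)
    text_vec = text_vec / text_vec.norm(dim=-1, keepdim=True)
    scores = (neuron_vecs @ text_vec.T).squeeze(-1)
    return scores.topk(min(top_k, scores.numel()))

print(search_neurons("a dog's ear"))
```

Because both images and text live in the same CLIP space, the same concept vectors support text-based search, comparison, and auditing without any manual labelling of neurons.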

The method enables a wide range of analyses, such as finding neurons for certain concepts, comparing learned representations between models, evaluating the interpretability of components, or auditing concept and decision consistency. SemanticLens directly addresses the transparency gap of modern AI models, facilitates debugging and validation, and helps close the “trust gap” between AI and classic, well-understood technical systems.
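As one illustration of such an analysis, the following sketch compares two models at the concept level by matching their neuron concept vectors (computed as in the previous sketch) via cosine similarity. The matching criterion and the threshold are assumptions made only for this example.

```python
import torch

def compare_models(vecs_a: torch.Tensor, vecs_b: torch.Tensor, threshold: float = 0.8):
    """For each neuron of model A, report its best-matching neuron in model B."""
    a = vecs_a / vecs_a.norm(dim=-1, keepdim=True)
    b = vecs_b / vecs_b.norm(dim=-1, keepdim=True)
    sim = a @ b.T                      # pairwise cosine similarities
    best_score, best_idx = sim.max(dim=1)
    matched = best_score >= threshold  # neurons with a clear counterpart
    return best_idx, best_score, matched

# Toy usage with random stand-in embeddings (512-dim, like CLIP ViT-B/32).
vecs_a, vecs_b = torch.randn(100, 512), torch.randn(120, 512)
idx, score, matched = compare_models(vecs_a, vecs_b)
print(f"{matched.sum().item()} of {len(idx)} neurons have a counterpart above threshold")
```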


New analysis potential for AI systems

SemanticLens enables a range of previously unattainable analyses:

  • Concept-based search for model components via text or other modalities
  • Systematic knowledge description, including detection of missing or incorrect concepts
  • Comparisons between models at concept level
  • Audits for concept alignment with human-defined requirements
  • Interpretability metrics such as clarity, polysemy and redundancy (see the sketch after this list)
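The sketch below shows simplified, hedged proxies for two of the metrics named above; the exact definitions in the paper may differ. "Clarity" is approximated here by how tightly a neuron's example embeddings cluster, and "redundancy" by how close a neuron's concept vector lies to its nearest neighbour among the other neurons.

```python
import torch

def clarity(example_embs: torch.Tensor) -> float:
    """Mean pairwise cosine similarity of one neuron's example embeddings."""
    e = example_embs / example_embs.norm(dim=-1, keepdim=True)
    sim = e @ e.T
    n = sim.shape[0]
    off_diag = sim[~torch.eye(n, dtype=torch.bool)]  # drop self-similarities
    return off_diag.mean().item()

def redundancy(neuron_vecs: torch.Tensor) -> torch.Tensor:
    """For each neuron, cosine similarity to its most similar other neuron."""
    v = neuron_vecs / neuron_vecs.norm(dim=-1, keepdim=True)
    sim = v @ v.T
    sim.fill_diagonal_(-1.0)  # ignore self-similarity
    return sim.max(dim=1).values

# Toy usage with random stand-in embeddings.
print("clarity:", clarity(torch.randn(8, 512)))
print("redundancy:", redundancy(torch.randn(100, 512)).mean().item())
```

A neuron with low clarity responds to a mix of unrelated concepts (polysemy), while high redundancy indicates that several neurons encode nearly the same concept.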


Trustworthy AI through semantic transparency

SemanticLens creates a previously missing link between the complex internal structure of modern AI models and a knowledge representation that humans can understand.

The breakthrough is that SemanticLens is the first method in XAI research to create a direct semantic representation of individual model components that is at once scalable, multimodal, and free of human intervention. Perhaps the most apt illustration of this principle is a highly complex technical system such as the Airbus A340-600: the aircraft consists of over four million individual parts, whose function and reliability engineers must understand and document precisely in order to ensure overall performance. In contrast, the specific role of individual neurons in today's AI models remains largely unknown, which makes automated testing procedures and robust reliability analyses considerably more difficult. This methodology therefore makes it possible to translate the high-dimensional, difficult-to-interpret internal structure of modern models into knowledge that is both human-comprehensible and searchable, an essential step towards holistic, component-specific explainability and verifiability.

The article in the Nature Machine Intelligence journal is available here.

Further information on the technology and application of SemanticLens can be found on the Fraunhofer HHI project page.

The authors of the article are Maximilian Dreyer, Jim Berend, Tobias Labarta, Johanna Vielhaben, Thomas Wiegand, Sebastian Lapuschkin, and Wojciech Samek. “Mechanistic Understanding and Validation of Large AI Models with SemanticLens.” Computing Research Repository (CoRR), January 9, 2025. ISSN: 2331-8422. DOI: 10.48550/arXiv.2501.05398. arXiv: 2501.05398 [cs.AI].