Class Artifact Compensation

In recent years, deep neural networks have gained popularity because they achieve impressive performance on a growing variety of applications. These networks are usually trained on extremely large datasets, which often contain unnoticed artifacts and spurious correlations that can serve as a seemingly easy shortcut in place of far more complex relations. Consequently, the predictor may not learn a valid and fair strategy to solve the task at hand, and may instead make biased decisions. When training and test data are identically distributed, which is usually the case because they are often drawn from the same data corpus, the same biases are present in both sets, leading to a severe overestimation of the model’s reported generalization ability.

In recent collaborative work with the Machine Learning Group of TU Berlin, we propose Class Artifact Compensation (ClArC) as a tool for mitigating the influence of specific artifacts on a predictor’s decision-making, thereby enabling a more accurate estimate of its generalization ability. ClArC is a relatively general three-step framework for artifact removal: identifying artifacts, estimating a model of a specific artifact, and finally augmenting the predictor to compensate for that artifact.

To identify artifacts in practice, we employed our LRP-based SpRAy method, which can automatically analyze large sets of explanations. With different user objectives in mind, we developed two alternative techniques for the artifact compensation step: Augmentative ClArC, which fine-tunes a model to shift its focus away from a specific confounder, and the extremely resource-efficient Projective ClArC, which applies a projection in feature space during inference to mitigate the confounder’s influence.
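The idea behind a feature-space projection can be illustrated with a minimal NumPy sketch. This is not the published implementation. It assumes, purely for illustration, that the artifact is modeled as a single linear direction, estimated here as the difference between the mean features of artifact samples and clean samples. The function names, the synthetic data and the choice of estimator are all hypothetical.

```python
import numpy as np


def estimate_artifact_direction(clean_feats, artifact_feats):
    """Hypothetical artifact model: unit vector from the clean mean to the artifact mean."""
    direction = artifact_feats.mean(axis=0) - clean_feats.mean(axis=0)
    norm = np.linalg.norm(direction)
    if norm == 0:
        raise ValueError("clean and artifact features have identical means")
    return direction / norm


def project_out_artifact(feats, direction, clean_feats):
    """Shift each feature vector along `direction` so that its component
    in that direction equals the component of the clean mean."""
    target = clean_feats.mean(axis=0) @ direction
    offset = feats @ direction - target  # one scalar per sample
    return feats - np.outer(offset, direction)


# Synthetic features (assumption): the artifact shifts feature dimension 3.
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 8))
shift = np.zeros(8)
shift[3] = 5.0
artifact = rng.normal(size=(200, 8)) + shift

v = estimate_artifact_direction(clean, artifact)
corrected = project_out_artifact(artifact, v, clean)
```

In this sketch the correction is applied to the features at inference time only, so no weights are retrained. After the projection, every corrected sample has the same component along the estimated direction as the clean mean, while all orthogonal components are left unchanged.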

Publications

Christopher J. Anders, David Neumann, Talmaj Marinc, Wojciech Samek, Klaus-Robert Müller, Sebastian Lapuschkin (2020):
XAI for Analyzing and Unlearning Spurious Correlations in ImageNet,
ICML'20 Workshop on Extending Explainable AI Beyond Deep Models and Classifiers (XXAI), Vienna, Austria, July 2020

Christopher J. Anders, Leander Weber, David Neumann, Wojciech Samek, Klaus-Robert Müller, Sebastian Lapuschkin (2022):
Finding and removing Clever Hans: Using explanation methods to debug and improve deep models,
Information Fusion

Prof. Dr. rer. nat. Wojciech Samek

Dr. rer. nat. Sebastian Lapuschkin