In recent years, Deep Neural Networks have gained increasing popularity, as they achieve impressive performance on a growing variety of applications. These networks are usually trained on extremely large datasets, which often contain unnoticed artifacts and spurious correlations that can serve as a seemingly easy shortcut around far more complex relations. Consequently, the predictor may not learn a valid and fair strategy for the task at hand, and may instead make biased decisions. When training and test data are identically distributed - which is usually the case, as they are often drawn from the same data corpus - the same biases are present in both sets, leading to a severe overestimation of the model's reported generalization ability.
In our recent collaborative work with the Machine Learning Group of TU Berlin, we propose Class Artifact Compensation (ClArC) as a tool for mitigating the influence of specific artifacts on a predictor's decision-making, thereby enabling a more accurate estimation of its generalization ability. ClArC describes a fairly general three-step framework for artifact removal: identifying artifacts, estimating a model for a specific artifact, and finally augmenting the predictor in order to compensate for that artifact.
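In the simplest linear case, the artifact model of the second step can be estimated from feature activations of samples with and without the artifact. The following sketch is purely illustrative - the function name and the mean-difference heuristic are our assumptions here, not the implementation from the paper:

```python
import numpy as np

def estimate_artifact_direction(feats_artifact, feats_clean):
    """Estimate a linear artifact model as the unit-normalized
    difference between the mean feature activations of
    artifact-affected and clean samples.

    feats_artifact, feats_clean: arrays of shape (n_samples, n_features)
    returns: unit vector of shape (n_features,) pointing towards
             the artifact in feature space.
    """
    v = feats_artifact.mean(axis=0) - feats_clean.mean(axis=0)
    return v / np.linalg.norm(v)
```

In practice, such a direction could also be obtained by fitting a linear classifier that separates artifact-affected from clean activations and taking its weight vector.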
Since it can automatically analyze large sets of explanations, we employ our LRP-based SpRAy method to identify artifacts in practice. With different user objectives in mind, we developed two alternative techniques for the compensation step: Augmentative ClArC, which fine-tunes a model in order to shift its focus away from a specific confounder, and the extremely resource-efficient Projective ClArC, which applies a projection in feature space during inference to mitigate the confounder's influence.
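To make the projective variant concrete, one simple way such a feature-space projection can look is to remove the component of each activation along an estimated artifact direction. This is a minimal sketch under that assumption, not the exact transformation from the paper:

```python
import numpy as np

def project_out(features, v):
    """Remove the component of each feature vector along the
    artifact direction `v` (an orthogonal projection onto the
    subspace perpendicular to `v`).

    features: array of shape (n_samples, n_features)
    v:        artifact direction of shape (n_features,)
    """
    v = v / np.linalg.norm(v)          # ensure unit length
    coeffs = features @ v              # component of each sample along v
    return features - np.outer(coeffs, v)
```

Applied at inference time (e.g., via a forward hook on the chosen layer), this suppresses the artifact's contribution without any retraining, which is what makes the projective variant so resource-efficient.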
- Christopher J. Anders, David Neumann, Talmaj Marinc, Wojciech Samek, Klaus-Robert Müller, Sebastian Lapuschkin (2020):
XAI for Analyzing and Unlearning Spurious Correlations in ImageNet,
Vienna, Austria, ICML'20 Workshop on Extending Explainable AI Beyond Deep Models and Classifiers (XXAI), July 2020
- Christopher J. Anders, Leander Weber, David Neumann, Wojciech Samek, Klaus-Robert Müller, Sebastian Lapuschkin (2022):
Finding and removing Clever Hans: Using explanation methods to debug and improve deep models