Contextual Decomposition (CD) is a Shapley-based input feature attribution method that has been shown to work well for recurrent NLP models. We test the extent to which CD is useful for models that contain attention operations. We show that the English and Dutch models demonstrate similar processing behaviour, but that under the hood there are consistent differences between our attention and non-attention models. Our experiments confirm that CD can successfully be applied to attention-based models as well, providing an alternative Shapley-based attribution method for modern neural networks.

Author(s) : Tom Kersten, Hugh Mee Wong, Jaap Jumelet, Dieuwke Hupkes

Links : PDF - Abstract

Keywords : attention - models - cd - based - shapley
