Deep NLP models have been shown to learn spurious correlations, leaving them brittle to input perturbations. We develop a Retrieve-Generate-Filter (RGF) technique to create counterfactual evaluation and training data with minimal human supervision. Using an open-domain QA framework and a question generation model trained on original task data, we create counterfactuals that are fluent, semantically diverse, and automatically labeled. We find that RGF data leads to significant improvements in a model's robustness to local perturbations. Moreover, data augmentation with RGF improves performance on out-of-domain and challenging evaluation sets over and above existing methods, in both the reading comprehension and open-domain QA settings.
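The retrieve-generate-filter loop described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration of the general idea, not the authors' implementation; the `retriever`, `qgen`, and `reader` callables are placeholder assumptions standing in for an open-domain retrieval system, a question generation model, and a QA model.

```python
# Hypothetical sketch of a Retrieve-Generate-Filter (RGF) pipeline.
# All function names are illustrative placeholders, not the authors' code.

def rgf_counterfactuals(question, answer, retriever, qgen, reader, top_k=5):
    """Create automatically labeled counterfactual QA pairs.

    retriever(question, k) -> list of (passage, alternate_answer) pairs
    qgen(passage, alternate_answer) -> a new question targeting that answer
    reader(question, passage) -> the answer a QA model predicts
    """
    counterfactuals = []
    # Retrieve: find passages containing answers related to the original question.
    for passage, alt_answer in retriever(question, k=top_k):
        if alt_answer == answer:
            continue  # keep only answers that differ from the original
        # Generate: produce a question whose answer is the alternate answer.
        new_question = qgen(passage, alt_answer)
        # Filter: keep pairs that a QA model answers consistently
        # (a round-trip consistency check yields the automatic label).
        if reader(new_question, passage) == alt_answer:
            counterfactuals.append((new_question, alt_answer))
    return counterfactuals
```

The filtering step is what makes the data "automatically labeled": a generated question is kept only if a trained reader recovers the intended alternate answer, so no human annotation is needed.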

Author(s) : Bhargavi Paranjape, Matthew Lamm, Ian Tenney


