Transformer-based methods such as RoBERTa and GPT-3 have led to advances in natural language processing tasks such as question answering and commonsense reasoning. We study the generalization issue in detail by designing and conducting a rigorous scientific study. Using five common benchmarks, multiple controls, and statistical analysis, we find clear evidence that fine-tuned commonsense language models still do not generalize well, even with moderate changes to the experimental setup, and may, in fact, be susceptible to dataset bias. We also perform selective studies, including qualitative and consistency analyses, to gain deeper insight into the problem.
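For concreteness, the kind of generalization check the abstract describes can be sketched as follows: a model already fine-tuned on one multiple-choice commonsense benchmark is evaluated, with no further training, on a second benchmark. This is only an illustrative sketch, not the authors' exact protocol; the checkpoint name below is a placeholder, and HellaSwag is chosen purely as an example target benchmark.

import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForMultipleChoice

# Placeholder: any model fine-tuned on a multiple-choice commonsense benchmark.
CHECKPOINT = "your-org/roberta-finetuned-on-benchmark-a"

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForMultipleChoice.from_pretrained(CHECKPOINT).eval()

def predict(context, choices):
    # Pair the context with each candidate ending; the model scores each pair.
    enc = tokenizer([context] * len(choices), choices,
                    return_tensors="pt", padding=True, truncation=True)
    # Multiple-choice heads expect tensors of shape (batch, num_choices, seq_len).
    enc = {k: v.unsqueeze(0) for k, v in enc.items()}
    with torch.no_grad():
        logits = model(**enc).logits  # shape: (1, num_choices)
    return int(logits.argmax(dim=-1))

# Zero-shot (out-of-domain) accuracy on a benchmark the model was NOT tuned on.
data = load_dataset("hellaswag", split="validation")
correct = sum(predict(ex["ctx"], ex["endings"]) == int(ex["label"]) for ex in data)
print(f"out-of-domain accuracy: {correct / len(data):.3f}")

A large gap between in-domain accuracy and the out-of-domain accuracy measured this way is the kind of evidence of weak generalization and dataset bias that the study reports.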

Author(s) : Mayank Kejriwal, Ke Shen


