Unsupervised Text Style Transfer with Padded Masked Language Models

We propose Masker, an unsupervised text-editing method for style transfer. To tackle cases when no parallel source-target pairs are available, we train masked language models (MLMs) for both the source and the target domain. Then we find the text spans where the two models disagree the most in terms of likelihood. This allows us to identify the source tokens to delete to transform the source text to match the style of the target. The deleted tokens are replaced using the target MLM, and by using a padded MLM variant, we avoid having to predetermine the number of inserted tokens. In low-resource settings, Masker improves supervised methods’ accuracy by over 10 percentage points when pre-training them
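The span-selection and padded-replacement steps described above can be illustrated with a small sketch. This is not the authors' implementation: smoothed unigram models stand in for the source and target MLMs, the tie-break toward shorter spans is an assumption made for this toy example, and the names `unigram_logprob`, `best_disagreement_span`, `strip_padding` and `apply_edit` are hypothetical.

```python
import math
from collections import Counter

PAD = "[PAD]"  # placeholder the padded model may predict for unused slots


def unigram_logprob(corpus, alpha=1.0):
    """Toy stand-in for an MLM: add-alpha smoothed unigram log-probabilities."""
    counts = Counter(tok for sent in corpus for tok in sent.split())
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # +1 reserves mass for unseen tokens

    def logprob(token):
        return math.log((counts[token] + alpha) / (total + alpha * vocab_size))

    return logprob


def best_disagreement_span(tokens, src_lp, tgt_lp, max_len=3):
    """Return (start, end) of the span the source model likes most relative
    to the target model. Ties go to the shorter span (an assumption here)."""
    best_span, best_key = (0, 0), (float("-inf"), 0)
    for i in range(len(tokens)):
        score = 0.0
        for j in range(i, min(i + max_len, len(tokens))):
            score += src_lp(tokens[j]) - tgt_lp(tokens[j])
            key = (score, -(j + 1 - i))
            if key > best_key:
                best_span, best_key = (i, j + 1), key
    return best_span


def strip_padding(predicted):
    """Drop [PAD] predictions so the insertion length need not be fixed."""
    return [tok for tok in predicted if tok != PAD]


def apply_edit(tokens, start, end, predicted):
    """Replace tokens[start:end] with the non-pad predictions."""
    return tokens[:start] + strip_padding(predicted) + tokens[end:]


source_corpus = ["the food was terrible", "the service was terrible and slow"]
target_corpus = ["the food was great", "the service was great and fast"]
src_lp = unigram_logprob(source_corpus)
tgt_lp = unigram_logprob(target_corpus)

tokens = "the food was terrible".split()
start, end = best_disagreement_span(tokens, src_lp, tgt_lp)

# Hypothetical output of a padded target model asked to fill three slots.
predicted = ["great", PAD, PAD]
print(" ".join(apply_edit(tokens, start, end, predicted)))
```

In this toy setup the only token whose likelihood differs between the two models is "terrible", so it is the span selected for deletion; the padded prediction fills one of three slots and the remaining pad tokens are discarded.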


Code: https://github.com/google-research/lasertagger

Keywords: target - text - source - tokens - style
