We conduct an empirical study of unsupervised neural machine translation for truly low-resource languages. We show how adding comparable data mined using a bilingual dictionary, along with modest additional compute resources to train the model, can significantly improve its performance. With this weak supervision, our best method achieves BLEU scores that improve over supervised results for English$\rightarrow$Gujarati (+18.88) and Arabic (+5.84), showing the promise of weakly-supervised NMT. To the best of our knowledge, our work is the first to quantitatively showcase the impact of different modest compute resources on low-resource NMT in the low-resource regions of the world. We also demonstrate how the use of the dictionary to code-switch monolingual data can further improve performance.
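The dictionary-based code-switching mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy English–Gujarati dictionary entries, the function name, and the replacement probability `p` are all hypothetical.

```python
import random

# Toy bilingual dictionary (hypothetical entries, for illustration only).
en_gu_dict = {
    "water": "પાણી",
    "house": "ઘર",
    "good": "સારું",
}

def code_switch(sentence, dictionary, p=0.5, seed=0):
    """Replace each word that appears in the bilingual dictionary with its
    translation, independently with probability p, leaving other words as-is."""
    rng = random.Random(seed)
    out = []
    for token in sentence.split():
        translation = dictionary.get(token.lower())
        if translation is not None and rng.random() < p:
            out.append(translation)
        else:
            out.append(token)
    return " ".join(out)

print(code_switch("the house has good water", en_gu_dict, p=1.0))
# → "the ઘર has સારું પાણી"
```

Code-switched sentences like this pair source and target vocabulary in context, giving the model a weak cross-lingual signal from monolingual data alone.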

Author(s) : Garry Kuwanto, Afra Feyza Akyürek, Isidora Chara Tourni, Siyang Li, Derry Wijaya

Code :

https://github.com/neulab/xnmt

