Large-scale pre-trained models like BERT have achieved great success in various Natural Language Processing (NLP) tasks, while it remains a challenge to adapt them to math-related tasks. We propose a novel model, namely MathBERT, which is jointly trained with mathematical formulas and their corresponding contexts. We conduct experiments on three downstream tasks to evaluate the performance of the model, including mathematical information retrieval, formula topic classification and formula headline generation, and show that it significantly outperforms existing methods on all three tasks. We also qualitatively show that this pre-trained model effectively captures the semantic-level structural information of formulas. To the best of our knowledge, MathBERT is the first pre-trained model for mathematical formula understanding.
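The joint training over formulas and their surrounding contexts could, for instance, pack each (context, formula) pair into a single BERT-style sentence-pair input. The sketch below assumes standard [CLS]/[SEP] packing with segment ids; the helper name and toy tokens are illustrative, not the paper's actual implementation.

```python
# Sketch: packing a (context, formula) pair into a BERT-style joint input.
# Assumes standard sentence-pair packing; names here are hypothetical.

CLS, SEP, PAD = "[CLS]", "[SEP]", "[PAD]"

def pack_pair(context_tokens, formula_tokens, max_len=32):
    """Build a joint input: [CLS] context [SEP] formula [SEP], padded."""
    tokens = [CLS] + context_tokens + [SEP] + formula_tokens + [SEP]
    # Segment ids distinguish the natural-language context (0)
    # from the formula (1), as in BERT's sentence-pair encoding.
    segment_ids = [0] * (len(context_tokens) + 2) + [1] * (len(formula_tokens) + 1)
    # Attention mask is 1 over real tokens, 0 over padding.
    attention_mask = [1] * len(tokens)
    while len(tokens) < max_len:
        tokens.append(PAD)
        segment_ids.append(0)
        attention_mask.append(0)
    return tokens, segment_ids, attention_mask

# Toy example: a short context paired with a tokenized formula.
tokens, segs, mask = pack_pair(
    ["mass", "energy", "equivalence"],
    ["E", "=", "m", "c", "^", "2"],
)
```

A real implementation would additionally map tokens to vocabulary ids and feed the three sequences into the transformer encoder; this sketch only shows the pair-packing convention.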

Author(s) : Shuai Peng, Ke Yuan, Liangcai Gao, Zhi Tang

Keywords : tasks - formula - pre-training - mathematical - model
