An Empirical Study on Large-Scale Multi-Label Text Classification Including Few and Zero-Shot Labels

Large-scale Multi-label Text Classification (LMTC) has a wide range of Natural Language Processing (NLP) applications and presents interesting challenges. Current state-of-the-art LMTC models employ Label-Wise Attention Networks (LWANs), which typically treat LMTC as flat multi-label classification. We show that hierarchical methods based on Probabilistic Label Trees (PLTs) outperform LWANs. Furthermore, we show that Transformer-based approaches outperform the state of the art in two of the datasets, and we propose a new method which combines BERT with LWANs. Finally, we propose new models that leverage the label hierarchy to improve few- and zero-shot learning, considering on each dataset a graph-aware annotation proximity measure that we introduce. Our experiments cover frequent, few-, and zero-shot labels, as well as transfer learning, on three datasets from different domains.
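As a rough illustration of the label-wise attention idea mentioned in the abstract, the sketch below shows a minimal PyTorch layer in which each label has its own attention query and scoring vector over the token representations produced by some encoder (a BiGRU, BERT, etc.). The class name, parameter names, and the choice of PyTorch are illustrative assumptions, not the paper's actual code.

```python
import torch
import torch.nn as nn

class LabelWiseAttention(nn.Module):
    """Minimal label-wise attention head (illustrative sketch):
    one attention query and one scoring vector per label, applied
    on top of token representations from any encoder."""

    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.queries = nn.Parameter(torch.empty(num_labels, hidden_size))  # per-label attention queries
        self.scorers = nn.Parameter(torch.empty(num_labels, hidden_size))  # per-label scoring vectors
        self.bias = nn.Parameter(torch.zeros(num_labels))
        nn.init.xavier_uniform_(self.queries)
        nn.init.xavier_uniform_(self.scorers)

    def forward(self, hidden, mask=None):
        # hidden: (batch, seq_len, hidden_size) token representations
        # mask:   (batch, seq_len), 1 for real tokens, 0 for padding
        scores = torch.einsum("bth,lh->blt", hidden, self.queries)      # (batch, labels, seq_len)
        if mask is not None:
            scores = scores.masked_fill(mask.unsqueeze(1) == 0, float("-inf"))
        attn = torch.softmax(scores, dim=-1)                            # attention over tokens, per label
        label_docs = torch.einsum("blt,bth->blh", attn, hidden)         # per-label document vectors
        logits = (label_docs * self.scorers).sum(-1) + self.bias        # one logit per label
        return logits                                                   # apply sigmoid + BCE loss outside

# Example usage: token outputs from any encoder (e.g. BERT) can be fed in directly.
layer = LabelWiseAttention(hidden_size=768, num_labels=100)
hidden = torch.randn(2, 128, 768)       # stand-in for encoder token representations
probs = torch.sigmoid(layer(hidden))    # (2, 100) multi-label probabilities
```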

Code : None

Keywords : multi-label text classification - label-wise attention - BERT - label hierarchies - few-shot learning - zero-shot learning
