Deep Ensembles for Low-Data Transfer Learning

In the low-data regime, it is difficult to train good supervised models from scratch. In this work, we study different ways of creating ensembles from pre-trained models. We show that the nature of pre-training itself is a performant source of diversity. We propose a practical algorithm that efficiently identifies a subset of models for any downstream dataset. This achieves state-of-the-art performance at a much lower inference budget, even when selecting from over 2,000 models. The approach is simple: use nearest-neighbour accuracy to rank models, fine-tune the best ones with a small hyperparameter sweep, and greedily construct an ensemble to minimise validation cross-entropy. When evaluated together with strong baselines on 19 different downstream tasks (the Visual Task Adaptation Benchmark), this achieves state-of-the-art performance.
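
The last step, greedily constructing an ensemble to minimise validation cross-entropy, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `max_size` cap, selection without replacement, and the rule of stopping once the loss no longer improves are all assumptions made for illustration. It takes the validation-set class probabilities of already fine-tuned candidate models and averages the members' probabilities to form the ensemble prediction.

```python
import numpy as np


def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true labels."""
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(true_class_probs + eps)))


def greedy_ensemble(val_probs, labels, max_size=4):
    """Greedily pick models whose averaged predictions minimise validation loss.

    val_probs: array of shape (num_models, num_examples, num_classes).
    labels:    integer array of shape (num_examples,).
    max_size:  hypothetical cap on ensemble size (an assumption of this sketch).
    Returns the chosen model indices and the ensemble's validation loss.
    """
    chosen = []
    running_sum = np.zeros_like(val_probs[0], dtype=float)
    best_loss = np.inf
    while len(chosen) < max_size:
        # Assumption: each model may be selected at most once.
        candidates = [m for m in range(len(val_probs)) if m not in chosen]
        if not candidates:
            break
        # Loss of the ensemble if each candidate were added next.
        losses = {
            m: cross_entropy((running_sum + val_probs[m]) / (len(chosen) + 1), labels)
            for m in candidates
        }
        best_candidate = min(losses, key=losses.get)
        # Assumption: stop as soon as no candidate improves the loss.
        if losses[best_candidate] >= best_loss:
            break
        chosen.append(best_candidate)
        running_sum += val_probs[best_candidate]
        best_loss = losses[best_candidate]
    return chosen, best_loss
```

Because the selection criterion is the loss of the averaged prediction rather than each model's individual loss, a model that is weak on its own can still be chosen if its errors complement those of the members already selected.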
