The method compresses neural network weights into sparse ternary values in {-1, 0, 1}. It can compress a ResNet-18 model from 46 MB to 955 KB and a ResNet-50 model from 99 MB to 3.3 MB (~30x). The top-1 accuracy on ImageNet drops slightly, from 69.7% to 65.3% for ResNet-18 and from 76.15% to 74.47% for ResNet-50. Because the method unifies pruning and quantization, it provides a range of size-accuracy trade-offs. On the ResNet-18 structure it reaches a compression ratio of up to 46x with an acceptable accuracy of 65.36%.
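To make the idea of a sparse ternary representation concrete, here is a minimal Python sketch. It is not the authors' algorithm: the fixed magnitude threshold, the function names (`ternarize`, `sparsity`, `compression_ratio`), and the toy weights are all assumptions chosen for illustration. It only shows how small weights can be pruned to 0 while the remaining ones are quantized to -1 or +1, and how a compression ratio such as the ~30x figure for ResNet-50 follows from the reported sizes.

```python
def ternarize(weights, threshold):
    """Map each weight to -1, 0, or +1.

    Hypothetical rule: weights whose magnitude is at most `threshold`
    are pruned to 0; the rest keep only their sign.
    """
    return [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]


def sparsity(ternary):
    """Fraction of entries that were pruned to zero."""
    return sum(1 for t in ternary if t == 0) / len(ternary)


def compression_ratio(original_mb, compressed_mb):
    """Ratio of original model size to compressed model size."""
    return original_mb / compressed_mb


# Toy weights, made up for this example.
weights = [0.8, -0.05, 0.02, -0.6, 0.3, -0.01]
ternary = ternarize(weights, threshold=0.1)

print(ternary)                             # [1, 0, 0, -1, 1, 0]
print(sparsity(ternary))                   # 0.5
print(round(compression_ratio(99, 3.3)))   # 30 (ResNet-50 sizes from the summary)
```

Raising the threshold in this sketch prunes more weights to zero, which is one simple way to picture the size-accuracy trade-off the summary mentions; the actual method may choose which weights to prune and how to quantize them quite differently.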

Author(s) : Dan Liu, Xi Chen, Jie Fu, Xue Liu

Links : PDF - Abstract


