Sandwich Batch Normalization

Sandwich Batch Normalization ($\textbf{SaBN}$) is a frustratingly easy improvement of Batch Normalization (BN). SaBN factorizes the BN affine layer into one shared layer, followed by several parallel layers. Its variants include further decomposing the normalization layer into multiple parallel ones, and extending similar ideas to instance normalization. We demonstrate the effectiveness of SaBN (as well as its variants) as a drop-in replacement in four tasks: neural architecture search (NAS), image generation, adversarial training, and style transfer. We also provide visualizations and analysis to help understand why SaBN works.
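
The factorization described above can be illustrated with a minimal sketch in plain Python. It is not the reference implementation: the class name `SandwichBatchNorm`, the helper `batch_normalize`, the parameter names, and the choice of selecting a parallel branch per sample through an integer index (`branch_ids`) are all assumptions made for illustration. Running statistics and learning are omitted.

```python
import math


def batch_normalize(batch, eps=1e-5):
    """Normalize each feature column of `batch` (a list of rows) using the
    batch mean and the biased batch variance."""
    n = len(batch)
    dims = len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(dims)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(dims)
    ]
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(dims)]
        for row in batch
    ]


class SandwichBatchNorm:
    """Hypothetical sketch: normalization, then one shared affine layer,
    then one of several parallel affine layers."""

    def __init__(self, num_features, num_branches):
        # Shared affine parameters, applied to every sample.
        self.shared_gamma = [1.0] * num_features
        self.shared_beta = [0.0] * num_features
        # One independent set of affine parameters per parallel branch.
        self.branch_gamma = [[1.0] * num_features for _ in range(num_branches)]
        self.branch_beta = [[0.0] * num_features for _ in range(num_branches)]

    def forward(self, batch, branch_ids):
        # branch_ids[i] picks the parallel layer for sample i (an assumption;
        # it could stand for a class label, a domain, or an operation index).
        normalized = batch_normalize(batch)
        outputs = []
        for row, b in zip(normalized, branch_ids):
            shared = [
                g * x + s
                for x, g, s in zip(row, self.shared_gamma, self.shared_beta)
            ]
            outputs.append(
                [
                    g * h + s
                    for h, g, s in zip(
                        shared, self.branch_gamma[b], self.branch_beta[b]
                    )
                ]
            )
        return outputs


if __name__ == "__main__":
    layer = SandwichBatchNorm(num_features=2, num_branches=2)
    layer.branch_gamma[1] = [2.0, 2.0]  # make branch 1 differ from branch 0
    data = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0], [7.0, 14.0]]
    for row in layer.forward(data, branch_ids=[0, 1, 0, 1]):
        print(row)
```

With all affine parameters at their initial values, this sketch reduces to plain batch normalization, which is one way to read the "drop-in replacement" claim: the shared layer carries what is common to all branches, and each parallel layer only needs to model its own deviation.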

Keywords: normalization - sabn - batch - parallel - layer

