Element-wise activation functions play a critical role in deep neural networks by affecting their expressive power and learning dynamics. We propose a new perspective on learnable activation functions by formulating them with an element-wise attention mechanism. The resulting Attention-based Rectified Linear Unit (AReLU) significantly boosts the performance of most mainstream network architectures while introducing only two extra learnable parameters per layer. AReLU facilitates fast network training under small learning rates, which makes it especially well suited to transfer learning. Our source code has been released (https://github.com/densechen/AReLU).
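To make the "two extra learnable parameters per layer" concrete, here is a minimal sketch of an AReLU-style activation in plain Python. The text above does not state the functional form, so the details are assumptions: negative inputs are scaled by a clamped parameter `alpha`, positive inputs are amplified by `1 + sigmoid(beta)`, and the clamp range and default values are illustrative. In a real network both parameters would be trained by backpropagation; here they are ordinary floats.

```python
import math


def arelu(xs, alpha=0.9, beta=2.0):
    """Apply an AReLU-style activation element-wise to a list of floats.

    alpha and beta stand in for the two learnable per-layer parameters;
    in this sketch they are fixed floats, not trained values.
    """
    # Assumption: negative inputs are scaled by alpha clamped to [0.01, 0.99].
    neg_scale = min(max(alpha, 0.01), 0.99)
    # Assumption: non-negative inputs are amplified by 1 + sigmoid(beta).
    pos_scale = 1.0 + 1.0 / (1.0 + math.exp(-beta))
    return [pos_scale * x if x >= 0 else neg_scale * x for x in xs]


if __name__ == "__main__":
    # Negative values are attenuated, positive values are amplified.
    print(arelu([-2.0, -0.5, 0.0, 0.5, 2.0]))
```

Because the positive-side scale is always greater than 1 and the negative-side scale is nonzero, gradients pass through both branches. This is one plausible reading of why such an activation could train quickly under small learning rates.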

