Common Kernels and Convolutions in Binary- and Ternary-Weight Neural Networks
Abstract
A new algorithm for extracting common kernels and convolutions to maximally eliminate redundant operations among the convolutions in binary- and ternary-weight convolutional neural networks is presented. Specifically, we propose (1) a new common-kernel-extraction algorithm that overcomes the local and limited exploration of common kernel candidates in the existing method, and subsequently apply (2) a new concept of common convolution extraction to maximally eliminate the redundancy in the convolution operations. In addition, our algorithm can (3) be tuned to minimize the number of resulting kernels for the convolutions, thereby reducing the total memory access latency for kernels. Experimental results on ternary-weight VGG-16 demonstrate that our convolution optimization algorithm is highly effective: it reduces the total number of operations over all convolutions by 25.8∼26.3%, thereby reducing the total number of execution cycles on the hardware platform by 22.4%, while using 2.7∼3.8% fewer kernels than convolutions using the common kernels extracted by the state-of-the-art algorithm.
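To make the underlying idea concrete, the following is a minimal sketch (not the paper's actual algorithm) of how a shared "common kernel" lets two ternary-weight convolutions reuse one partial result. The kernels `K1` and `K2` are hypothetical examples; the common kernel `C` keeps only the positions where both kernels carry the same weight, and by linearity of convolution each original output is the common convolution plus a cheap residual convolution.

```python
import numpy as np

def conv2d_valid(x, k):
    """Naive valid-mode 2-D correlation, for illustration only."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

# Two hypothetical ternary kernels with weights in {-1, 0, +1}
K1 = np.array([[ 1, 0, -1],
               [ 1, 0, -1],
               [ 1, 0, -1]])
K2 = np.array([[ 1, 1, -1],
               [ 1, 0, -1],
               [ 0, 0, -1]])

# Common kernel: positions where both kernels hold the same weight
C = np.where(K1 == K2, K1, 0)

# Residual kernels: what remains after factoring out the common part
R1, R2 = K1 - C, K2 - C

x = np.random.randn(8, 8)

# The common convolution is computed once and reused for both outputs,
# eliminating the operations the two convolutions would have duplicated.
common = conv2d_valid(x, C)
y1 = common + conv2d_valid(x, R1)
y2 = common + conv2d_valid(x, R2)

assert np.allclose(y1, conv2d_valid(x, K1))
assert np.allclose(y2, conv2d_valid(x, K2))
```

In this toy case the common kernel covers 7 of the 9 weight positions, so most multiply-accumulate work is shared; the paper's contribution lies in how such common kernels and convolutions are chosen across many kernels to maximize exactly this kind of sharing.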
This paper was recommended by Regional Editor Emre Salman.