Common Kernels and Convolutions in Binary- and Ternary-Weight Neural Networks
Abstract
A new algorithm for extracting common kernels and convolutions to maximally eliminate redundant operations among the convolutions in binary- and ternary-weight convolutional neural networks is presented. Specifically, we propose (1) a new common-kernel-extraction algorithm that overcomes the local and limited exploration of common kernel candidates in the existing method, and subsequently apply (2) a new concept of common convolution extraction to maximally eliminate the redundancy in the convolution operations. In addition, our algorithm can (3) be tuned to minimize the number of resulting kernels for the convolutions, thereby reducing the total memory access latency for kernels. Experimental results on ternary-weight VGG-16 demonstrate that our convolution optimization algorithm is highly effective: it reduces the total number of operations over all convolutions by 25.8∼26.3%, thereby reducing the total number of execution cycles on the hardware platform by 22.4%, while using 2.7∼3.8% fewer kernels than convolutions using the common kernels extracted by the state-of-the-art algorithm.
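To make the underlying idea concrete, the following is a minimal sketch (not the paper's actual algorithm) of how a shared "common kernel" lets two ternary-weight convolutions reuse one partial result. The kernels `K1` and `K2` are hypothetical examples; the common kernel `C` keeps only the positions where both kernels carry the same weight, and by linearity of convolution each original output is the common convolution plus a cheap residual convolution.

```python
import numpy as np

def conv2d_valid(x, k):
    """Naive valid-mode 2-D correlation, for illustration only."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

# Two hypothetical ternary kernels with weights in {-1, 0, +1}
K1 = np.array([[ 1, 0, -1],
               [ 1, 0, -1],
               [ 1, 0, -1]])
K2 = np.array([[ 1, 1, -1],
               [ 1, 0, -1],
               [ 0, 0, -1]])

# Common kernel: positions where both kernels hold the same weight
C = np.where(K1 == K2, K1, 0)

# Residual kernels: what remains after factoring out the common part
R1, R2 = K1 - C, K2 - C

x = np.random.randn(8, 8)

# The common convolution is computed once and reused for both outputs,
# eliminating the operations the two convolutions would have duplicated.
common = conv2d_valid(x, C)
y1 = common + conv2d_valid(x, R1)
y2 = common + conv2d_valid(x, R2)

assert np.allclose(y1, conv2d_valid(x, K1))
assert np.allclose(y2, conv2d_valid(x, K2))
```

In this toy case the common kernel covers 7 of the 9 weight positions, so most multiply-accumulate work is shared; the paper's contribution lies in how such common kernels and convolutions are chosen across many kernels to maximize exactly this kind of sharing.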
This paper was recommended by Regional Editor Emre Salman.