Efficient deployment of modern deep convolutional neural networks on resource-constrained devices is hampered by the demanding computational cost of convolution operations. Quantization and Winograd convolutions over sufficiently large input tiles are two powerful strategies for speeding up convolutions. However, their combination leads to numerical instability, which manifests as severe accuracy degradation. We present an efficient learning scheme that either completely overcomes or strongly reduces the accuracy degradation of fully 8-bit quantized F(4, 3) and F(6, 3) Winograd convolutions. Using global particle swarm optimization (PSO), we derive a set of quantization-friendly Winograd transformations. Following the state-of-the-art (SOTA) training pipeline, we treat the Winograd transformations as learnable parameters during network training. Evolving the transformations from our PSO-derived initialization rather than from the standard Winograd transformations significantly reduces numerical error and improves accuracy. As a consequence, our approach significantly outperforms SOTA methods on a variety of tasks.
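As a concrete illustration of where the instability originates, below is a minimal NumPy sketch of a standard F(4, 3) Winograd convolution with naive symmetric 8-bit fake quantization applied in the transform domain. The transformation matrices are the standard ones (Lavin & Gray, 2016), not the paper's PSO-derived or learned ones, and the per-tensor quantization scheme is a simplified assumption for illustration only.

```python
import numpy as np

# Standard F(4, 3) Winograd transformation matrices (Lavin & Gray, 2016),
# derived from the interpolation points {0, 1, -1, 2, -2, inf}.
BT = np.array([[4,  0, -5,  0, 1, 0],
               [0, -4, -4,  1, 1, 0],
               [0,  4, -4, -1, 1, 0],
               [0, -2, -1,  2, 1, 0],
               [0,  2, -1, -2, 1, 0],
               [0,  4,  0, -5, 0, 1]], dtype=np.float64)
G = np.array([[ 1/4,     0,    0],
              [-1/6,  -1/6, -1/6],
              [-1/6,   1/6, -1/6],
              [1/24,  1/12,  1/6],
              [1/24, -1/12,  1/6],
              [   0,     0,    1]], dtype=np.float64)
AT = np.array([[1, 1,  1, 1,  1, 0],
               [0, 1, -1, 2, -2, 0],
               [0, 1,  1, 4,  4, 0],
               [0, 1, -1, 8, -8, 1]], dtype=np.float64)

def fake_quant_int8(x):
    """Symmetric per-tensor 8-bit fake quantization (a simplified stand-in
    for the full quantization scheme, which the abstract does not specify)."""
    scale = np.abs(x).max() / 127.0 + 1e-12
    return np.round(x / scale).clip(-127, 127) * scale

def winograd_f4x4_3x3(d, g, quantize=False):
    """One 6x6 input tile -> 4x4 output tile: Y = AT [(G g GT) . (BT d B)] A."""
    U = G @ g @ G.T      # transformed 3x3 kernel -> 6x6
    V = BT @ d @ BT.T    # transformed 6x6 input  -> 6x6
    if quantize:         # quantizing the transform-domain tensors is where
        U, V = fake_quant_int8(U), fake_quant_int8(V)  # the instability appears
    return AT @ (U * V) @ AT.T

rng = np.random.default_rng(0)
d = rng.standard_normal((6, 6))   # one input tile
g = rng.standard_normal((3, 3))   # one 3x3 kernel

# Reference: direct valid cross-correlation of the 6x6 tile with the kernel.
ref = np.array([[np.sum(d[i:i+3, j:j+3] * g) for j in range(4)]
                for i in range(4)])

print("float Winograd max error:     ", np.abs(winograd_f4x4_3x3(d, g) - ref).max())
print("int8-quantized Winograd error:", np.abs(winograd_f4x4_3x3(d, g, True) - ref).max())
```

Running this shows that the unquantized Winograd output matches direct convolution to floating-point precision, while quantizing the transformed tensors introduces a much larger error, growing with tile size; reducing this transform-domain error is what the PSO-derived and learned transformations target.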