Efficient deployment of modern deep convolutional neural networks on resource-constrained devices is hampered by the demanding computational cost of convolution operations. Quantization and Winograd convolutions over sufficiently large input tiles are two powerful strategies for speeding up convolutions. However, their combination leads to numerical instability, which manifests as severe accuracy degradation. We present an efficient learning scheme that either completely overcomes or strongly reduces the accuracy degradation of fully 8-bit quantized F(4, 3) and F(6, 3) Winograd convolutions. Using global particle swarm optimization (PSO), we derive a set of quantization-friendly Winograd transformations. Following the state-of-the-art (SOTA) training pipeline, we treat the Winograd transformations as learnable parameters during network training. Evolving the transformations from our PSO-derived initialization rather than from the standard Winograd transformations significantly reduces numerical error and improves accuracy. As a consequence, our approach significantly outperforms SOTA methods on a variety of tasks.
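As a concrete illustration of where the instability originates, below is a minimal NumPy sketch of a standard F(4, 3) Winograd convolution with naive symmetric 8-bit fake quantization applied in the transform domain. The transformation matrices are the standard ones (Lavin & Gray, 2016), not the paper's PSO-derived or learned ones, and the per-tensor quantization scheme is a simplified assumption for illustration only.

```python
import numpy as np

# Standard F(4, 3) Winograd transformation matrices (Lavin & Gray, 2016),
# derived from the interpolation points {0, 1, -1, 2, -2, inf}.
BT = np.array([[4,  0, -5,  0, 1, 0],
               [0, -4, -4,  1, 1, 0],
               [0,  4, -4, -1, 1, 0],
               [0, -2, -1,  2, 1, 0],
               [0,  2, -1, -2, 1, 0],
               [0,  4,  0, -5, 0, 1]], dtype=np.float64)
G = np.array([[ 1/4,     0,    0],
              [-1/6,  -1/6, -1/6],
              [-1/6,   1/6, -1/6],
              [1/24,  1/12,  1/6],
              [1/24, -1/12,  1/6],
              [   0,     0,    1]], dtype=np.float64)
AT = np.array([[1, 1,  1, 1,  1, 0],
               [0, 1, -1, 2, -2, 0],
               [0, 1,  1, 4,  4, 0],
               [0, 1, -1, 8, -8, 1]], dtype=np.float64)

def fake_quant_int8(x):
    """Symmetric per-tensor 8-bit fake quantization (a simplified stand-in
    for the full quantization scheme, which the abstract does not specify)."""
    scale = np.abs(x).max() / 127.0 + 1e-12
    return np.round(x / scale).clip(-127, 127) * scale

def winograd_f4x4_3x3(d, g, quantize=False):
    """One 6x6 input tile -> 4x4 output tile: Y = AT [(G g GT) . (BT d B)] A."""
    U = G @ g @ G.T      # transformed 3x3 kernel -> 6x6
    V = BT @ d @ BT.T    # transformed 6x6 input  -> 6x6
    if quantize:         # quantizing the transform-domain tensors is where
        U, V = fake_quant_int8(U), fake_quant_int8(V)  # the instability appears
    return AT @ (U * V) @ AT.T

rng = np.random.default_rng(0)
d = rng.standard_normal((6, 6))   # one input tile
g = rng.standard_normal((3, 3))   # one 3x3 kernel

# Reference: direct valid cross-correlation of the 6x6 tile with the kernel.
ref = np.array([[np.sum(d[i:i+3, j:j+3] * g) for j in range(4)]
                for i in range(4)])

print("float Winograd max error:     ", np.abs(winograd_f4x4_3x3(d, g) - ref).max())
print("int8-quantized Winograd error:", np.abs(winograd_f4x4_3x3(d, g, True) - ref).max())
```

Running this shows that the unquantized Winograd output matches direct convolution to floating-point precision, while quantizing the transformed tensors introduces a much larger error, growing with tile size; reducing this transform-domain error is what the PSO-derived and learned transformations target.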