Poster
AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer
Zhuguanyu Wu · Jiaxin Chen · Hanwen Zhong · Di Huang · Yunhong Wang
The Vision Transformer (ViT) has become one of the most prevalent backbone networks for a wide range of computer vision tasks, yet it incurs high computational cost and inference latency. Recently, post-training quantization (PTQ) has emerged as a promising way to improve the efficiency of Vision Transformers. Nevertheless, existing PTQ approaches for ViTs suffer from inflexible quantization of the post-Softmax and post-GeLU activations, which follow power-law-like distributions. To address this issue, we propose a novel non-uniform quantizer, dubbed AdaLog, which adapts its logarithmic base to the power-law-like distribution of the activations while allowing for hardware-friendly quantization and de-quantization. By further employing bias reparameterization, the AdaLog quantizer is applicable to both the post-Softmax and post-GeLU activations. Moreover, we develop a beam-search strategy to select the best logarithmic base for AdaLog quantizers, as well as the scaling factors and zero points for uniform quantizers. Extensive experiments on public benchmarks demonstrate the effectiveness of our approach on various ViT-based architectures and vision tasks, including classification, object detection, and instance segmentation.
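To make the idea of an adaptive-base logarithmic quantizer concrete, here is a minimal NumPy sketch. This is not the authors' implementation: the function names, the round-to-nearest log grid, and the grid search over candidate bases (a simple stand-in for the paper's beam search) are all illustrative assumptions; the sketch also omits the bias reparameterization that makes AdaLog applicable to post-GeLU activations.

```python
import numpy as np

def adalog_quantize(x, base=2.0, n_bits=4, eps=1e-8):
    """Quantize positive activations (e.g. post-Softmax) onto a
    logarithmic grid with an adjustable base.

    Hypothetical sketch only: the real AdaLog quantizer additionally
    supports hardware-friendly de-quantization and handles post-GeLU
    activations via bias reparameterization.
    """
    levels = 2 ** n_bits                 # number of quantization levels
    x = np.clip(x, eps, None)            # log is undefined at 0
    # Map to the log domain with the chosen base, round to integer levels.
    q = np.round(-np.log(x) / np.log(base))
    q = np.clip(q, 0, levels - 1)        # keep indices in range
    # De-quantize: x_hat = base ** (-q)
    return base ** (-q)

def search_base(x, candidates=np.linspace(1.5, 4.0, 26), n_bits=4):
    """Pick the base minimizing reconstruction error on calibration data
    (exhaustive grid search here; the paper uses a beam search)."""
    errors = [np.mean((x - adalog_quantize(x, b, n_bits)) ** 2)
              for b in candidates]
    return candidates[int(np.argmin(errors))]

# Toy calibration data with a power-law-like distribution,
# mimicking post-Softmax attention values.
rng = np.random.default_rng(0)
acts = rng.dirichlet(np.ones(197), size=64).ravel()
best = search_base(acts)
print(f"selected base: {best:.2f}")
x_hat = adalog_quantize(acts, base=best)
print(f"MSE: {np.mean((acts - x_hat) ** 2):.2e}")
```

The point of making the base a free parameter, rather than fixing base 2 as in prior log-quantization schemes, is that a single extra scalar per quantizer lets the log grid track how heavy-tailed each layer's activation distribution actually is; the search above simply picks that scalar from calibration data.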