

Poster

PQ-SAM: Post-training Quantization for Segment Anything Model

Xiaoyu Liu · Xin Ding · Lei Yu · Yuanyuan Xi · Wei Li · Zhijun Tu · Jie Hu · Hanting Chen · Baoqun Yin · Zhiwei Xiong

Strong Double Blind: This paper was not made available on public preprint services during the review process.
Fri 4 Oct 1:30 a.m. PDT — 3:30 a.m. PDT

Abstract:

Segment Anything Model (SAM) is a promising prompt-guided vision foundation model for segmenting objects of interest. However, SAM's extensive computational requirements have limited its applicability to resource-constrained edge devices. Post-training quantization (PTQ) is an effective approach for fast deployment of SAM. Nevertheless, SAM's billion-scale pretraining creates highly asymmetric activation distributions with detrimental outliers across a large number of channels, resulting in significant performance degradation under low-bit PTQ. In this paper, we propose PQ-SAM, the first PTQ method customized for SAM. To achieve a quantization-friendly tensor-wise distribution, PQ-SAM incorporates a novel grouped activation distribution transformation (GADT) based on a two-stage outlier hierarchical clustering (OHC) scheme that scales and shifts each channel. First, OHC identifies and truncates extreme outliers to reduce the scale variance across channels. Second, OHC iteratively allocates learnable shifting and scaling sizes to each group of channels with similar distributions, reducing the number of learnable parameters and easing the optimization difficulty. These shifting and scaling sizes are used to adjust activation channels and are jointly optimized with the quantization step sizes for optimal results. Extensive experiments demonstrate that PQ-SAM outperforms existing PTQ methods on nine zero-shot datasets and pushes 4-bit PTQ of SAM to a usable level.
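The abstract outlines a pipeline of outlier truncation, channel grouping, and learnable per-group shift/scale applied before uniform quantization. The PyTorch sketch below illustrates those steps in a minimal form; the quantile threshold, the plain k-means used as a stand-in for the paper's two-stage hierarchical clustering, and all function and variable names (truncate_outliers, group_channels, fake_quant) are assumptions of this sketch, not the authors' implementation.

# Minimal sketch of outlier truncation, channel grouping, and learnable
# per-group shift/scale before uniform quantization. Illustrative only;
# all names and thresholds are assumptions, not the PQ-SAM code.
import torch

def truncate_outliers(x, q=0.999):
    # Clamp extreme per-channel outliers at an assumed high quantile to
    # reduce the scale variance across channels.
    lo = torch.quantile(x, 1 - q, dim=0, keepdim=True)
    hi = torch.quantile(x, q, dim=0, keepdim=True)
    return torch.clamp(x, lo, hi)

def group_channels(x, num_groups=8, iters=10):
    # Cluster channels by (mean, std) so channels with similar distributions
    # share one shift/scale pair; plain k-means here stands in for the
    # paper's two-stage hierarchical clustering.
    stats = torch.stack([x.mean(dim=0), x.std(dim=0)], dim=1)        # (C, 2)
    centers = stats[torch.randperm(stats.size(0))[:num_groups]].clone()
    for _ in range(iters):
        assign = torch.cdist(stats, centers).argmin(dim=1)           # (C,)
        for g in range(num_groups):
            if (assign == g).any():
                centers[g] = stats[assign == g].mean(dim=0)
    return assign                                                    # group index per channel

def fake_quant(x, step, n_bits=4):
    # Uniform fake quantization with a learnable step size; the detach trick
    # (straight-through estimator) lets gradients flow through rounding.
    qmax = 2 ** (n_bits - 1) - 1
    q = x / step
    q = q + (q.round() - q).detach()
    return torch.clamp(q, -qmax - 1, qmax) * step

# Toy activations with channel-wise scale variation: (N, C).
x = torch.randn(1024, 64) * (torch.rand(64) * 5)
x = truncate_outliers(x)

num_groups = 8
groups = group_channels(x, num_groups)
shift = torch.zeros(num_groups, requires_grad=True)   # learnable per-group shift
scale = torch.ones(num_groups, requires_grad=True)    # learnable per-group scale
step = torch.tensor(0.1, requires_grad=True)          # learnable quantization step

# Shift/scale each channel by its group's parameters, quantize, then invert.
x_adj = (x - shift[groups]) / scale[groups]
x_rec = fake_quant(x_adj, step) * scale[groups] + shift[groups]
loss = (x_rec - x).pow(2).mean()   # reconstruction loss for joint optimization
loss.backward()

In practice the shift, scale, and step parameters would be optimized over a calibration set rather than a single random batch, as the abstract's "jointly optimized" description suggests.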
