Poster

Memory-Efficient Fine-Tuning for Quantized Diffusion Model

Hyogon Ryu · Seohyun Lim · Hyunjung Shim

Thu 3 Oct 1:30 a.m. PDT — 3:30 a.m. PDT

Abstract:

The rise of billion-parameter diffusion models such as Stable Diffusion XL, Imagen, and DALL-E 3 significantly propels the domain of generative AI. However, their large-scale architecture presents challenges in fine-tuning and deployment due to high resource demands and slow inference speed. This paper delves into the relatively unexplored yet promising realm of fine-tuning quantized diffusion models. Our analysis shows that the baseline neglects the distinct patterns in model weights and the differing roles of time-steps when fine-tuning the diffusion model. To address these limitations, we introduce a novel memory-efficient fine-tuning framework directly applicable to quantized diffusion models, dubbed TuneQDM. Our approach introduces quantization scales as separable functions to capture inter-channel weight patterns and optimizes these scales in a time-step-specific manner so that the role of each time-step is effectively reflected. TuneQDM demonstrates performance on par with its full-precision counterpart while offering a substantial advantage in memory efficiency. Experimental results show that our efficient framework consistently outperforms the baseline in single- and multi-subject generation, exhibiting subject fidelity and prompt fidelity comparable to the full-precision model.
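To make the core idea concrete, the sketch below illustrates one way a quantized linear layer could expose trainable quantization scales that are separable across channels and modulated per time-step group, while the quantized weights stay frozen. This is a minimal, hypothetical parameterization for illustration only; the class name `TimestepScaledQuantLinear`, the rank-1 factorization, and the time-step grouping are assumptions, not the authors' exact TuneQDM formulation.

```python
import torch
import torch.nn as nn


class TimestepScaledQuantLinear(nn.Module):
    """Illustrative quantized linear layer: frozen int8 weights, trainable
    separable (per-row x per-column) scale factors, and a per-time-step-group
    modulation of the scale. Hypothetical sketch, not the paper's exact method."""

    def __init__(self, weight_int8: torch.Tensor, base_scale: torch.Tensor,
                 num_timestep_groups: int = 4, max_timestep: int = 1000):
        super().__init__()
        out_ch, in_ch = weight_int8.shape
        self.register_buffer("weight_int8", weight_int8)   # frozen quantized weights
        self.register_buffer("base_scale", base_scale)     # per-output-channel PTQ scale, shape (out_ch,)
        # Trainable separable factors: a rank-1 correction over output/input channels.
        self.row_factor = nn.Parameter(torch.ones(out_ch, 1))
        self.col_factor = nn.Parameter(torch.ones(1, in_ch))
        # One trainable modulation per time-step group.
        self.t_factor = nn.Parameter(torch.ones(num_timestep_groups))
        self.num_groups = num_timestep_groups
        self.max_timestep = max_timestep

    def forward(self, x: torch.Tensor, t: int) -> torch.Tensor:
        # Map the diffusion time-step to a coarse group index (assumes a shared t per batch).
        group = min(int(t / self.max_timestep * self.num_groups), self.num_groups - 1)
        # Effective scale = base PTQ scale * separable channel factors * time-step factor.
        scale = self.base_scale.view(-1, 1) * self.row_factor * self.col_factor * self.t_factor[group]
        w = self.weight_int8.float() * scale               # dequantize on the fly
        return x @ w.t()
```

In such a setup, only `row_factor`, `col_factor`, and `t_factor` would be handed to the optimizer, so the trainable state and optimizer memory stay far below full-precision fine-tuning, which is the kind of memory saving the abstract describes.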
