Poster
Memory-Efficient Fine-Tuning for Quantized Diffusion Model
Hyogon Ryu · Seohyun Lim · Hyunjung Shim
# 55
The rise of billion-parameter diffusion models such as Stable Diffusion XL, Imagen, and DALL·E 3 has significantly advanced generative AI. However, their large-scale architectures make fine-tuning and deployment challenging because of high resource demands and slow inference. This paper explores the relatively unexplored yet promising area of fine-tuning quantized diffusion models. Our analysis shows that the baseline neglects the distinct patterns in model weights and the different roles of individual time steps when fine-tuning the diffusion model. To address these limitations, we introduce TuneQDM, a novel memory-efficient fine-tuning framework that is directly applicable to quantized diffusion models. Our approach models quantization scales as separable functions to capture inter-channel weight patterns, and optimizes the scales in a time-step-specific manner to reflect the role of each time step. TuneQDM performs on par with its full-precision counterpart while offering a substantial advantage in memory efficiency. Experimental results show that our framework consistently outperforms the baseline in single- and multi-subject generation, achieving subject fidelity and prompt fidelity comparable to the full-precision model.
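The two ideas named in the abstract, separable quantization scales and time-step-specific scales, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact formulation, so the rank-one form `S[i][j] = row[i] * col[j]`, the grouping of time steps by fixed boundaries, and every name below (`quantize_per_channel`, `TimestepSeparableScale`, `boundaries`) are assumptions made for illustration only.

```python
from bisect import bisect_right


def quantize_per_channel(weight, n_bits=4):
    """Symmetric per-output-channel quantization (a common baseline, assumed here).

    Returns integer codes and one scale per row (output channel).
    """
    qmax = 2 ** (n_bits - 1) - 1
    codes, scales = [], []
    for row in weight:
        peak = max(abs(w) for w in row)
        scale = peak / qmax if peak > 0 else 1.0
        scales.append(scale)
        codes.append([max(-qmax, min(qmax, round(w / scale))) for w in row])
    return codes, scales


class TimestepSeparableScale:
    """Hypothetical scale bank: one (row, column) scale pair per time-step group."""

    def __init__(self, base_row_scales, n_cols, boundaries):
        # Assumption: time steps are split into groups at fixed boundaries,
        # e.g. [500] gives two groups, t < 500 and t >= 500.
        self.boundaries = boundaries
        n_groups = len(boundaries) + 1
        # These scales would be the trainable parameters; the integer codes stay frozen.
        # Every group starts out reproducing the base per-channel quantizer.
        self.row_scales = [list(base_row_scales) for _ in range(n_groups)]
        self.col_scales = [[1.0] * n_cols for _ in range(n_groups)]

    def group(self, t):
        return bisect_right(self.boundaries, t)

    def dequantize(self, codes, t):
        g = self.group(t)
        rows, cols = self.row_scales[g], self.col_scales[g]
        # Separable scale: S[i][j] = rows[i] * cols[j]
        return [[q * rows[i] * cols[j] for j, q in enumerate(code_row)]
                for i, code_row in enumerate(codes)]


weight = [[0.7, -0.35, 0.1], [0.02, 0.14, -0.07]]
codes, row_scales = quantize_per_channel(weight, n_bits=4)
bank = TimestepSeparableScale(row_scales, n_cols=3, boundaries=[500])

# Stand-in for a gradient update that touches only the second time-step group.
bank.col_scales[1][0] = 1.5

early = bank.dequantize(codes, t=100)  # uses group 0
late = bank.dequantize(codes, t=900)   # uses group 1
```

Under these assumptions, each time-step group stores one scale per output channel plus one per input channel rather than a full-precision weight matrix, which is the kind of saving the abstract's memory-efficiency claim points to. How TuneQDM actually parameterizes and groups the scales is specified in the paper, not here.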