

Poster

Distilling Diffusion Models into Conditional GANs

Minguk Kang · Richard Zhang · Connelly Barnes · Sylvain Paris · Suha Kwak · Jaesik Park · Eli Shechtman · Jun-Yan Zhu · Taesung Park

# 181
Wed 2 Oct 1:30 a.m. PDT — 3:30 a.m. PDT

Abstract:

We propose a method to distill a complex multistep diffusion model into a single-step student model, dramatically accelerating inference while preserving image quality. Our approach interprets diffusion distillation as a paired image-to-image translation task, using noise-to-image pairs from the diffusion model’s ODE trajectory. For efficient regression loss computation, we propose E-LatentLPIPS, a perceptual loss that operates directly in the latent space of the diffusion model and uses an ensemble of augmentations. Despite dataset construction costs, E-LatentLPIPS converges more efficiently than many existing distillation methods. Furthermore, we adapt a diffusion model to construct a multi-scale discriminator with a text alignment loss to build an effective conditional GAN-based formulation. We demonstrate that our one-step generator outperforms all published diffusion distillation models on the zero-shot COCO2014 benchmark.
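
To make the idea of a latent-space perceptual loss with augmentations concrete, the following is a minimal illustrative sketch, not the authors' implementation. Everything in it is an assumption: `toy_features` is a hypothetical stand-in for a learned latent-space perceptual network, the flip and shift augmentations are chosen only for simplicity, and averaging over `n_aug` draws in a single call is one possible reading of "ensemble". The abstract does not specify any of these details. The one property the sketch is meant to show is that each sampled augmentation is applied identically to the predicted and target latents before their features are compared.

```python
import numpy as np

def sample_augmentation(rng):
    """Sample one random transform and return it as a function, so the
    identical transform can be applied to both latents of a pair."""
    flip = rng.random() < 0.5
    shift = int(rng.integers(-2, 3))  # circular horizontal shift in [-2, 2]

    def apply(z):
        out = z[..., ::-1] if flip else z
        return np.roll(out, shift, axis=-1)

    return apply

def toy_features(z, weights):
    """Hypothetical stand-in for a learned latent perceptual network:
    channel mixing, 2x2 average pooling, ReLU, channel-wise normalization."""
    f = np.einsum("oc,chw->ohw", weights, z)
    o, h, w = f.shape
    f = f.reshape(o, h // 2, 2, w // 2, 2).mean(axis=(2, 4))
    f = np.maximum(f, 0.0)
    norm = np.sqrt((f ** 2).sum(axis=0, keepdims=True)) + 1e-8
    return f / norm

def ensembled_latent_loss(pred, target, weights, n_aug=4, seed=0):
    """Average a feature-space squared distance over n_aug random
    augmentations, each applied to both pred and target."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_aug):
        aug = sample_augmentation(rng)
        diff = toy_features(aug(pred), weights) - toy_features(aug(target), weights)
        total += float((diff ** 2).mean())
    return total / n_aug

# Toy data: 4-channel 16x16 latents and a fixed random feature projection.
data_rng = np.random.default_rng(1)
weights = data_rng.normal(size=(8, 4))
target = data_rng.normal(size=(4, 16, 16))
pred = target + 0.1 * data_rng.normal(size=target.shape)

loss_same = ensembled_latent_loss(target, target, weights)
loss_diff = ensembled_latent_loss(pred, target, weights)
```

In this sketch the loss is zero when the two latents are identical and positive when they differ. In a distillation setting, `pred` would be the one-step generator's output latent for a given noise input and `target` the latent the diffusion model's ODE solver produced from the same noise.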
