Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models

by Muyang Li, et al.

During image editing, existing deep generative models tend to re-synthesize the entire output from scratch, including the unedited regions. This wastes a significant amount of computation, especially for minor editing operations. In this work, we present Spatially Sparse Inference (SSI), a general-purpose technique that selectively performs computation for edited regions and accelerates various generative models, including both conditional GANs and diffusion models. Our key observation is that users tend to make gradual changes to the input image. This motivates us to cache and reuse the feature maps of the original image. Given an edited image, we sparsely apply the convolutional filters to the edited regions while reusing the cached features for the unedited regions. Based on our algorithm, we further propose Sparse Incremental Generative Engine (SIGE) to convert the computation reduction to latency reduction on off-the-shelf hardware. With 1.2%-area edited regions, our method reduces the computation of DDIM by 7.5× and GauGAN by 18× while preserving the visual fidelity. With SIGE, we accelerate DDIM by 3.0× on RTX 3090 and 6.6× on Apple M1 Pro CPU, and GauGAN by 4.2× on RTX 3090 and 14× on Apple M1 Pro CPU.
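The caching idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' SIGE implementation: it assumes a single-channel image, one square odd-sized convolution kernel with zero padding, and a fixed block size, and the names `conv2d_same` and `sparse_update` are hypothetical helpers introduced only for this example. The output of the original image is cached; for the edited image, only output blocks whose receptive field overlaps a changed pixel are recomputed, and the cached values are reused everywhere else.

```python
import numpy as np

def conv2d_same(x, k):
    """Dense 'same' 2D cross-correlation with zero padding (single channel)."""
    r = k.shape[0] // 2
    xp = np.pad(x, r)
    out = np.zeros_like(x, dtype=float)
    for dy in range(k.shape[0]):
        for dx in range(k.shape[1]):
            out += k[dy, dx] * xp[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out

def sparse_update(edited, original, cached_out, k, block=4):
    """Recompute only the output blocks affected by the edit (illustrative).

    Assumes a square, odd-sized kernel. Returns the updated output and the
    number of blocks that were actually recomputed.
    """
    r = k.shape[0] // 2
    h, w = edited.shape
    changed = np.pad(edited != original, r)  # difference mask, padded like the input
    xp = np.pad(edited, r)
    out = cached_out.copy()                  # start from the cached features
    n_active = 0
    for y in range(0, h, block):
        for x in range(0, w, block):
            bh, bw = min(block, h - y), min(block, w - x)
            # Input window (block plus halo) that this output block depends on.
            window = (slice(y, y + bh + 2 * r), slice(x, x + bw + 2 * r))
            if not changed[window].any():
                continue                     # unedited: keep the cached result
            patch = xp[window]               # gather
            res = np.zeros((bh, bw))
            for dy in range(k.shape[0]):
                for dx in range(k.shape[1]):
                    res += k[dy, dx] * patch[dy:dy + bh, dx:dx + bw]
            out[y:y + bh, x:x + bw] = res    # scatter
            n_active += 1
    return out, n_active

# Toy usage: a 16x16 image with a small 2x2 edit.
rng = np.random.default_rng(0)
original = rng.random((16, 16))
k = rng.random((3, 3))
cached = conv2d_same(original, k)            # computed once for the original image

edited = original.copy()
edited[5:7, 5:7] += 1.0                      # the user's local edit

out, n_active = sparse_update(edited, original, cached, k)
total_blocks = (16 // 4) ** 2
print(f"recomputed {n_active} of {total_blocks} blocks")
```

In a real network the same gather, compute, scatter pattern would have to be applied per layer with multi-channel features, and the savings depend on how the edited region aligns with the block grid; the sketch only shows why the sparse result can match the dense one.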



FISEdit: Accelerating Text-to-image Editing via Cache-enabled Sparse Diffusion Inference

Due to the recent success of diffusion models, text-to-image generation ...

Zero-shot Image-to-Image Translation

Large-scale text-to-image generative models have shown their remarkable ...

Diffusion Brush: A Latent Diffusion Model-based Editing Tool for AI-generated Images

Text-to-image generative models have made remarkable advancements in gen...

Leveraging Off-the-shelf Diffusion Model for Multi-attribute Fashion Image Manipulation

Fashion attribute editing is a task that aims to convert the semantic at...

Fine-grained Image Editing by Pixel-wise Guidance Using Diffusion Models

Generative models, particularly GANs, have been utilized for image editi...

Cascading Modular Network (CAM-Net) for Multimodal Image Synthesis

Deep generative models such as GANs have driven impressive advances in c...

To Beta or Not To Beta: Information Bottleneck for DigitaL Image Forensics

We consider an information theoretic approach to address the problem of ...
